1. So your hardware is ServerReady?

    Recently I changed my assignment at Linaro. From Cloud to Server Architecture. Which means less time spent on Kolla things, more on server related things. And at the start I got a project I had managed to forget about :D

    SBSA reference platform in QEMU

    In 2017 someone got an idea to make a new machine for QEMU. Pure hardware emulation of an SBSA compliant reference platform. Without using virtio components.

    Hongbo Zhang wrote the code and got it merged into QEMU, Radosław Biernacki wrote basic support for EDK2 (also merged upstream). Out of the box it can boot to the UEFI shell. Linux is not bootable due to the lack of ACPI tables (DeviceTree is not an option here).

    ACPI tables in firmware

    Tanmay Jagdale works on adding ACPI tables in his fork of edk2-platforms. With this firmware Linux boots and can be used.

    Testing tools

    But what is the point of just having a reference platform if there is no testing? So I took a look and found two interesting tools:

    Server Base System Architecture — Architecture Compliance Suite

    The SBSA ACS tool requires ACPI tables to be present. Once started, it nicely checks how compliant your system is:

    FS0:\> Sbsa.efi -p
    
     SBSA Architecture Compliance Suite
        Version 2.4
    
     Starting tests for level  4 (Print level is  3)
    
     Creating Platform Information Tables
     PE_INFO: Number of PE detected       :    3
     GIC_INFO: Number of GICD             :    1
     GIC_INFO: Number of ITS              :    1
     TIMER_INFO: Number of system timers  :    0
     WATCHDOG_INFO: Number of Watchdogs   :    0
     PCIE_INFO: Number of ECAM regions    :    2
     SMMU_INFO: Number of SMMU CTRL       :    0
     Peripheral: Num of USB controllers   :    1
     Peripheral: Num of SATA controllers  :    1
     Peripheral: Num of UART controllers  :    1
    
          ***  Starting PE tests ***
       1 : Check for number of PE            : Result:  PASS
       2 : Check for SIMD extensions                PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       3 : Check for 16-bit ASID support            PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       4 : Check MMU Granule sizes                  PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       5 : Check Cache Architecture                 PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       6 : Check HW Coherence support               PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       7 : Check Cryptographic extensions           PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       8 : Check Little Endian support              PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       9 : Check EL2 implementation                 PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      10 : Check AARCH64 implementation             PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      11 : Check PMU Overflow signal         : Result:  PASS
      12 : Check number of PMU counters             PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
      13 : Check Synchronous Watchpoints            PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      14 : Check number of Breakpoints              PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      15 : Check Arch symmetry across PE            PSCI_CPU_ON: failure
    
           Reg compare failed for PE index=1 for Register: CCSIDR_EL1
           Current PE value = 0x0         Other PE value = 0x100FBDB30E8
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      16 : Check EL3 implementation                 PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      17 : Check CRC32 instruction support          PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
      18 : Check for PMBIRQ signal
           SPE not supported on this PE      : Result:  -SKIPPED- 1
      19 : Check for RAS extension                  PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
      20 : Check for 16-Bit VMID                    PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
      21 : Check for Virtual host extensions        PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
      22 : Stage 2 control of mem and cache         PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
      23 : Check for nested virtualization          PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
      24 : Support Page table map size change       PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
      25 : Check for pointer signing                PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
      26 : Check Activity monitors extension        PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
      27 : Check for SHA3 and SHA512 support        PSCI_CPU_ON: failure
           PSCI_CPU_ON: failure
    : Result:  -SKIPPED- 1
    
          *** One or more PE tests have failed... ***
    
          ***  Starting GIC tests ***
     101 : Check GIC version                 : Result:  PASS
     102 : If PCIe, then GIC implements ITS  : Result:  PASS
     103 : GIC number of Security states(2)  : Result:  PASS
     104 : GIC Maintenance Interrupt
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
    
          One or more GIC tests failed. Check Log
    
          *** Starting Timer tests ***
     201 : Check Counter Frequency           : Result:  PASS
     202 : Check EL0-Phy timer interrupt     : Result:  PASS
     203 : Check EL0-Virtual timer interrupt : Result:  PASS
     204 : Check EL2-phy timer interrupt     : Result:  PASS
     205 : Check EL2-Virtual timer interrupt
           v8.1 VHE not supported on this PE : Result:  -SKIPPED- 1
     206 : SYS Timer if PE Timer not ON
           PE Timers are not always-on.
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
     207 : CNTCTLBase & CNTBaseN access
           No System timers are defined      : Result:  -SKIPPED- 1
    
         *** Skipping remaining System timer tests ***
    
          *** One or more tests have Failed/Skipped.***
    
          *** Starting Watchdog tests ***
     301 : Check NS Watchdog Accessibility
           No Watchdogs reported          0
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
     302 : Check Watchdog WS0 interrupt
           No Watchdogs reported          0
           Failed on PE -    0 for Level=  4 : Result:  --FAIL-- 1
    
          ***One or more tests have failed... ***
    
          *** Starting PCIe tests ***
     401 : Check ECAM Presence               : Result:  PASS
     402 : Check ECAM value in MCFG table    : Result:  PASS
    
            Unexpected exception occured
            FAR reported = 0xEBDAB180
            ESR reported = 0x97800010
         -------------------------------------------------------
         Total Tests run  =   42;  Tests Passed  =   11  Tests Failed =   22
         ---------------------------------------------------------
    
          *** SBSA tests complete. Reset the system. ***
    

    As you can see there is still a lot of work to do.
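
    For a CI job it helps to reduce such a log to a few numbers. Below is a minimal Python sketch, assuming the log was captured from the serial console into a string. The function name tally_acs_results and the regular expression are my own illustration, not part of SBSA ACS:

```python
import re
from collections import Counter

# Matches the verdict that ends each test in the log, for example
# "Result:  PASS", "Result:  --FAIL-- 129" or "Result:  -SKIPPED- 1".
RESULT_RE = re.compile(r"Result:\s+-*(PASS|FAIL|SKIPPED)-*")


def tally_acs_results(log_text: str) -> Counter:
    """Count PASS/FAIL/SKIPPED verdicts found in an SBSA ACS log."""
    return Counter(RESULT_RE.findall(log_text))


if __name__ == "__main__":
    sample = """\
   1 : Check for number of PE            : Result:  PASS
       Failed on PE -    1 for Level=  4 : Result:  --FAIL-- 129
       SPE not supported on this PE      : Result:  -SKIPPED- 1
"""
    for verdict, count in sorted(tally_acs_results(sample).items()):
        print(f"{verdict}: {count}")
```

    Counting verdict lines this way is only a rough view. The summary printed by the tool itself at the end of the run stays the authoritative number.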

    ACPI Tables View

    This tool displays the content of ACPI tables in hex/ASCII format and then interprets the information field by field.

    What makes it more useful is the “-r 2” argument, as it enables checking tables against the Server Base Boot Requirements (SBBR) v1.2 specification. On the SBSA reference platform with Tanmay’s firmware it lists two errors:

    ERROR: SBBR v1.2: Mandatory DBG2 table is missing
    ERROR: SBBR v1.2: Mandatory PPTT table is missing
    
    Table Statistics:
            2 Error(s)
            0 Warning(s)
    

    So the situation looks good, as those can be easily added.

    CI

    So we have code to check and tools to do that. Add one to the other and you have a clear need for a CI job. So I wrote one for the Linaro CI infrastructure: “LDCG SBSA firmware”. It builds the current top of the QEMU and EDK2 trees, then boots the result and runs the above tools. Results are sent to a mailing list.

    ServerReady?

    The Arm ServerReady compliance program provides a solution for servers that “just works”, allowing partners to deploy Arm servers with confidence. The program is based on industry standards and the Server Base System Architecture (SBSA) and Server Base Boot Requirement (SBBR) specifications, alongside Arm’s Server Architectural Compliance Suite (ACS). Arm ServerReady ensures that Arm-based servers work out-of-the-box, offering seamless interoperability with standard operating systems, hypervisors, and software.

    In other words: if your hardware is SBSA compliant then you can go with SBBR compliance tests and then go and ask for a certification sticker or something like that.

    But if your hardware is not SBSA compliant then EBBR is all you can get. Far from being ServerReady. No matter what people try to say, ServerReady requires SBBR, which requires SBSA.

    Future work

    More tests to integrate. ARM Enterprise ACS is next on my list.

    Written by Marcin Juszkiewicz on
  2. NAS update

    In 2014 I bought a Synology DS214se NAS and two 4 TB hard drives. It worked fine for me for years and served files. But it was a system with low CPU power and just 256 MB of RAM, so it was too easy to overload.

    Let’s move to x86-64

    So a few years ago a friend was selling an ASUS M5A78L-M LX3 mainboard with an AMD FX-6300 processor. I bought it, added 8 GB of RAM from my desktop (which got an additional 16 GB instead) and put it into a Node 804 case from Fractal Design.

    The case fits a MicroATX board and has plenty of space for storage (I think ten 3.5” drives, two 2.5” drives and a slot-in optical drive).

    Machine got several hard drives (from other home machines or drawers):

    • WD Red 4 TB x2
    • Toshiba 2 TB
    • Samsung 1.5 TB

    [Photo: hard drives in cages]
    [Photo: hard drive cage]

    FreeNAS

    I installed FreeNAS 11 on it and started using it. The machine was named ‘lumpek’ (Lumpy the Heffalump) to follow my way of naming computers.

    The 4 TB drives went into a simple mirror, the 2 TB one is for less important data and the 1.5 TB one for virtual machines and related storage (like installation iso files).

    ZFS works nicely, and some extra FreeNAS plugins allowed me to offload some services from my desktop to the NAS (like the Transmission daemon for fetching torrents or a MySQL server for local needs).

    Memory upgrade

    Many people say that a NAS machine should have ECC memory. So at some point it got 16 GB (2x 8 GB sticks) of DDR3-1866 ECC memory recovered from an old server:

    Handle 0x0026, DMI type 16, 15 bytes
    Physical Memory Array
            Location: System Board Or Motherboard
            Use: System Memory
            Error Correction Type: Single-bit ECC
            Maximum Capacity: 16 GB
            Error Information Handle: Not Provided
            Number Of Devices: 2
    

    More disks

    4 TB of space runs out one day. So I went and bought another WD Red 4 TB disk. The idea was to move data from the mirror to some spare storage, create a new RAID-Z1 array from 3x 4 TB drives and migrate the data back.

    But… Lumpek already had 4 hard drives and that was the maximum this mainboard supported.

    Dell H310 aka LSI 9211-8i

    Luckily the mainboard has on-board graphics, so the PCI Express x16 slot was empty. I asked friends, checked some internet pages and ordered a used Dell H310 SAS controller. This is probably the most popular (along with the IBM M1015) storage solution in the FreeNAS community.

    The card arrived with a SAS cable I did not need, and the SFF-8087 cables came in a separate order.

    Crossflashing

    How to make the best use of a server class RAID controller? Strip it of any RAID functionality ;D

    It turns out that the Dell H310 is basically an LSI 9211-8i card. Which means we can flash it with generic firmware to switch to “initiator target” (also called “IT mode”). The card will then present each drive individually to the host.

    There are several pages describing the process. One of them is JC-LAN. I do not remember which set of instructions I followed but they do not differ much.

    In the end I got a generic LSI SAS2008 controller:

    root@lumpek:~ # sas2flash -listall
    LSI Corporation SAS2 Flash Utility
    Version 16.00.00.00 (2013.03.01) 
    Copyright (c) 2008-2013 LSI Corporation. All rights reserved 
    
            Adapter Selected is a LSI SAS: SAS2008(B2)   
    
    Num   Ctlr            FW Ver        NVDATA        x86-BIOS         PCI Addr
    ----------------------------------------------------------------------------
    
    0  SAS2008(B2)     20.00.07.00    14.01.00.08    07.39.02.00     00:02:00:00
    
            Finished Processing Commands Successfully.
            Exiting SAS2Flash.
    root@lumpek:~ # 
    

    And as a bonus all my hard drives got a bit more bandwidth:

    da2: <ATA WDC WD40EFRX-68W 0A82> Fixed Direct Access SPC-4 SCSI device
    da2: 600.000MB/s transfers
    da2: Command Queueing enabled
    da2: 3815447MB (7814037168 512 byte sectors)
    da2: quirks=0x8<4K>
    

    Not that the 300->600 MB/s transfer rate upgrade changes anything with spinning rust ;D

    Summary

    The FreeNAS based machine serves me well. Five hard drives give a lot of space for data. The 1 GbE network connection is probably my main limit now but there are no plans so far for moving to 10 GbE cards/switch due to their price.

    Virtual machines run from the NAS with good speed, and if I need more speed then I can move them to NVMe in my desktop or laptop.

    Written by Marcin Juszkiewicz on
  3. Installing Fedora on RockPro64

    Continuing tests of distribution installers. This time I installed Fedora ‘rawhide’ from the netinst iso (2020.06.20). Fetched, wrote to a USB pen drive and booted. Due to U-Boot being present in on-board SPI flash I did not have to mess with installation media.

    Issues

    There were some issues:

    1. Panfrost failing to initialize
    2. U-Boot unable to load grub efi

    Panfrost initialization failure

    The Panfrost kernel module needs a devfreq governor. The kernel has four of them, and Fedora enables one. There are no dependencies between those modules, so nothing loads the governor before panfrost, which ends with the same error as with Debian:

    panfrost ff9a0000.gpu: devfreq_add_device: Unable to find governor for the device
    panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init [panfrost]] *ERROR* Couldn't initialize GPU devfreq
    panfrost ff9a0000.gpu: Fatal error during devfreq init
    panfrost: probe of ff9a0000.gpu failed with error -22
    

    The solution was the same as before: boot without the ‘panfrost’ module. I interrupted grub before it started booting and added rd.driver.blacklist=panfrost to the “linux” command. This allowed me to boot into the Fedora installer, and system installation went smoothly.

    First boot of the installed system showed a working Panfrost driver:

    panfrost ff9a0000.gpu: clock rate = 500000000
    panfrost ff9a0000.gpu: mali-t860 id 0x860 major 0x2 minor 0x0 status 0x0
    panfrost ff9a0000.gpu: features: 00000000,100e77bf, issues: 00000000,24040400
    panfrost ff9a0000.gpu: Features: L2:0x07120206 Shader:0x00000000 Tiler:0x00000809 Mem:0x1 MMU:0x00002830 AS:0xff JS:0x7
    panfrost ff9a0000.gpu: shader_present=0xf l2_present=0x1
    [drm] Initialized panfrost 1.1.0 20180908 for ff9a0000.gpu on minor 0
    

    U-Boot can not load Grub EFI

    After reboot U-Boot was not able to load Grub from EFI System Partition:

    Device 0: Vendor: ADATA    Rev: 1.00 Prod: USB Flash Drive 
                Type: Removable Hard Disk
                Capacity: 59200.0 MB = 57.8 GB (121241600 x 512)
    ... is now current device
    Scanning usb 0:1...
    Found EFI removable media binary efi/boot/bootaa64.efi
    libfdt fdt_check_header(): FDT_ERR_BADMAGIC
    Card did not respond to voltage select!
    Scanning disk mmc@fe310000.blk...
    Disk mmc@fe310000.blk not ready
    Card did not respond to voltage select!
    Scanning disk mmc@fe320000.blk...
    Disk mmc@fe320000.blk not ready
    Card did not respond to voltage select!
    Scanning disk sdhci@fe330000.blk...
    Disk sdhci@fe330000.blk not ready
    Scanning disk usb_mass_storage.lun0...
    ** Unrecognized filesystem type **
    ** Unrecognized filesystem type **
    Found 4 disks
    BootOrder not defined
    EFI boot manager: Cannot load any image
    858216 bytes read in 25 ms (32.7 MiB/s)
    libfdt fdt_check_header(): FDT_ERR_BADMAGIC
    System BootOrder not found.  Initializing defaults.
    Could not read \EFI\: Invalid Parameter
    Error: could not find boot options: Invalid Parameter
    start_image() returned Invalid Parameter
    ## Application terminated, r = 2
    EFI LOAD FAILED: continuing...
    

    It was already reported as ‘shim’ bug 1733817.

    How to work around it?

    1. connect your Fedora storage to another computer
    2. copy “/efi/fedora/grubaa64.efi” to “/efi/boot/bootaa64.efi”

    This way U-Boot will find a grub EFI binary to load in the default location.
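
    The copy step can be sketched in Python, assuming the EFI System Partition of the installed system is mounted on the other computer. The function name and any mount point you pass in are hypothetical:

```python
import shutil
from pathlib import Path


def install_fallback_loader(esp_root: Path, distro: str = "fedora") -> Path:
    """Copy the distribution's grub binary to the removable-media path.

    esp_root is the mount point of the EFI System Partition (an assumption,
    adjust it to wherever you mounted the partition).
    """
    source = esp_root / "efi" / distro / "grubaa64.efi"
    target = esp_root / "efi" / "boot" / "bootaa64.efi"
    if not source.is_file():
        raise FileNotFoundError(f"no grub binary at {source}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(source, target)
    return target
```

    With the partition mounted at, say, /mnt/esp this would be called as install_fallback_loader(Path("/mnt/esp")). Note that it overwrites whatever was stored as bootaa64.efi before.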

    Final effect

    The board boots directly to the graphical login manager and then to a GNOME 3 session. Extreme Tux Racer and Xonotic worked out of the box. Speed-wise it feels slower than the KDE Plasma session on Debian.

    Written by Marcin Juszkiewicz on
  4. Installing Debian on RockPro64

    I installed Debian ‘testing’ from the netinst iso (2020.06.15) today. Fetched, wrote to a USB pen drive and booted. Due to U-Boot being present in on-board SPI flash I did not have to mess with installation media.

    Issues

    There were some issues:

    1. no graphics on default installer (known, someone promised to fix it)
    2. grub refusing to install (bug against installer reported)
    3. Panfrost failing to initialize

    Serial console FTW!

    OK, this time I am joking. There are two choices: text and graphical installer. The first option lacks kernel modules for graphics, so only the serial console is available. The graphical installer works fine.

    EFI Grub and lack of EFI variables storage

    As I booted the board with U-Boot there was no EFI variable storage. Grub was not satisfied:

    os-prober: debug: running /usr/lib/os-probes/50mounted-tests on /dev/sdb2
    50mounted-tests: debug: mounted using GRUB fat filesystem driver
    50mounted-tests: debug: running subtest /usr/lib/os-probes/mounted/40lsb
    50mounted-tests: debug: running subtest /usr/lib/os-probes/mounted/90linux-distro
    grub-installer: info: Installing grub on 'dummy'
    grub-installer: info: grub-install does not support --no-floppy
    grub-installer: info: Running chroot /target grub-install  --force "dummy"
    grub-installer: Installing for arm64-efi platform.
    grub-installer: grub-install: warning: Cannot set EFI variable Boot0000.
    grub-installer: grub-install: warning: vars_set_variable: write() failed: Invalid argument.
    grub-installer: grub-install: warning: _efi_set_variable_mode: ops->set_variable() failed: No such file or directory.
    grub-installer: grub-install: error: failed to register the EFI boot entry: No such file or directory.
    grub-installer: error: Running 'grub-install  --force "dummy"' failed.
    

    How to work around it?

    1. chroot into target system and run update-grub by hand
    2. copy “/efi/debian/grubaa64.efi” to “/efi/boot/bootaa64.efi”

    This way U-Boot will find an EFI binary to load in the default location.

    Panfrost initialization failure

    The Panfrost kernel module needs a devfreq governor. The kernel has four of them, and Debian enables one. There are no dependencies between those modules, so nothing loads the governor before panfrost, which ends with this error:

    panfrost ff9a0000.gpu: devfreq_add_device: Unable to find governor for the device
    panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init [panfrost]] *ERROR* Couldn't initialize GPU devfreq
    panfrost ff9a0000.gpu: Fatal error during devfreq init
    panfrost: probe of ff9a0000.gpu failed with error -22
    

    Solution:

    1. boot system
    2. rmmod panfrost, modprobe governor_simpleondemand, modprobe panfrost
    3. update-initramfs -u -kall

    A good option at this phase is changing the configuration of update-initramfs to include only the needed kernel modules (by setting “MODULES=dep” in its configuration). This allowed me to shrink the initramfs from 37 to 13 megabytes (removal of plymouth and ntfs-3g shrank it to 6.6 MB).
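
    A small Python sketch of that configuration change, working on the file content as text. On Debian the MODULES setting normally lives in /etc/initramfs-tools/initramfs.conf. The helper name set_initramfs_modules is made up for this example:

```python
import re


def set_initramfs_modules(conf_text: str, value: str = "dep") -> str:
    """Return conf_text with the MODULES= line set to the given value.

    If no MODULES= line exists, one is appended at the end.
    """
    pattern = re.compile(r"^MODULES=.*$", flags=re.MULTILINE)
    new_line = f"MODULES={value}"
    if pattern.search(conf_text):
        return pattern.sub(new_line, conf_text, count=1)
    separator = "" if conf_text.endswith("\n") or not conf_text else "\n"
    return f"{conf_text}{separator}{new_line}\n"


if __name__ == "__main__":
    original = "# initramfs.conf\nMODULES=most\nBUSYBOX=auto\n"
    print(set_initramfs_modules(original), end="")
```

    The initramfs has to be regenerated afterwards with update-initramfs -u -kall, as in step 3 above.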

    Final effect

    The board boots directly to the graphical login manager and then to a KDE Plasma session. Some OpenGL games work, some do not (Nexuiz). Looks good.

    Written by Marcin Juszkiewicz on
  5. EBBR on RockPro64

    SBBR or GTFO

    Me.

    But the Arm world no longer ends on “SBBR compliant or complete mess”. For over a year there has been a new specification called EBBR (Embedded Base Boot Requirements).

    WTH is EBBR?

    In short it is a kind of SBBR for devices which can not comply. So you still need to have some subset of UEFI Boot/Runtime Services but it can be provided by whatever bootloader you use. So U-Boot is fine as long as its EFI implementation is enabled.

    ACPI is not required but may be present. DeviceTree is perfectly fine. You may provide both or one of them.

    Firmware can be stored wherever you wish. Even MBR partitioning is available if really needed.

    Make it nice way

    RockPro64 has 16 MB of SPI flash on board. This is far more than needed for storing firmware (I remember the time when it was enough for a palmtop Linux).

    During the last month I sent a bunch of patches to U-Boot to make this board as comfortable to use as possible. Including storing all firmware parts in the on-board SPI flash.

    To have U-Boot there you need to fetch two files:

    • idbloader.img
    • u-boot.itb

    Their sha256 sums:

    3985f2ec63c2d31dc14a08bd19ed2766b9421f6c04294265d484413c33c6dccc  idbloader.img
    35ec30c40164f00261ac058067f0a900ce749720b5772a759e66e401be336677  u-boot.itb
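
    Before flashing anything it is worth checking the downloads against those sums. A minimal Python sketch, assuming both files sit in one directory. The function names are my own:

```python
import hashlib
from pathlib import Path

# Checksums as published above.
EXPECTED_SHA256 = {
    "idbloader.img": "3985f2ec63c2d31dc14a08bd19ed2766b9421f6c04294265d484413c33c6dccc",
    "u-boot.itb": "35ec30c40164f00261ac058067f0a900ce749720b5772a759e66e401be336677",
}


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_firmware(directory: Path) -> dict:
    """Map each expected file name to True when its checksum matches."""
    return {
        name: (directory / name).is_file()
        and sha256_of(directory / name) == expected
        for name, expected in EXPECTED_SHA256.items()
    }
```

    Calling verify_firmware(Path(".")) in the download directory should report True for both files. A missing or damaged file shows up as False.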
    

    Store them as files on a USB pen drive and plug it into any of the RockPro64 USB ports. Then reboot to U-Boot as you did before (stored in SPI, on an SD card or on an eMMC module).

    Next run this set of commands to update U-Boot:

    Hit any key to stop autoboot:  0 
    => usb start
    
    => ls usb 0:1
       163807   idbloader.img
       867908   u-boot.itb
    
    2 file(s), 0 dir(s)
    
    => sf probe
    SF: Detected gd25q128 with page size 256 Bytes, erase size 4 KiB, total 16 MiB
    
    => load usb 0:1 ${fdt_addr_r} idbloader.img
    163807 bytes read in 16 ms (9.8 MiB/s)
    
    => sf update ${fdt_addr_r} 0 ${filesize}
    device 0 offset 0x0, size 0x27fdf
    163807 bytes written, 0 bytes skipped in 2.93s, speed 80066 B/s
    
    => load usb 0:1 ${fdt_addr_r} u-boot.itb
    867908 bytes read in 53 ms (15.6 MiB/s)
    
    => sf update ${fdt_addr_r} 60000 ${filesize}
    device 0 offset 0x60000, size 0xd3e44
    863812 bytes written, 4096 bytes skipped in 11.476s, speed 77429 B/s
    

    And reboot board.

    After this your RockPro64 will have firmware stored in the on-board SPI flash. No need to wonder which offsets to use to store them on an SD card etc.
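
    The numbers in that session can be cross-checked with a little arithmetic. U-Boot reads command arguments such as 60000 as hexadecimal, so u-boot.itb lands at byte offset 0x60000. A Python sketch using the file sizes and flash size from the log above (the names in the code are my own):

```python
FLASH_SIZE = 16 * 1024 * 1024   # "total 16 MiB" reported by "sf probe"

# (offset in flash, size in bytes) taken from the session above.
IMAGES = {
    "idbloader.img": (0x0, 163807),
    "u-boot.itb": (0x60000, 867908),
}


def check_layout(images: dict, flash_size: int) -> None:
    """Raise ValueError if images overlap or do not fit in the flash."""
    end_of_previous = 0
    for name, (offset, size) in sorted(images.items(), key=lambda i: i[1][0]):
        if offset < end_of_previous:
            raise ValueError(f"{name} overlaps the previous image")
        if offset + size > flash_size:
            raise ValueError(f"{name} does not fit in the flash")
        end_of_previous = offset + size


if __name__ == "__main__":
    check_layout(IMAGES, FLASH_SIZE)
    for name, (offset, size) in IMAGES.items():
        print(f"{name}: offset {offset:#x}, size {size:#x}")
```

    The sizes printed in hex are the same 0x27fdf and 0xd3e44 that "sf update" reported, and idbloader.img ends well before the 0x60000 offset where u-boot.itb starts.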

    Booting installation media

    The nicest part of it is that you no longer need to mess with installation media. Fetch a Debian/Fedora installer ISO, write it to a USB pen drive, plug it into a port and reboot the board.

    Should work with any generic AArch64 installation media. Of course kernel on media needs to support RockPro64 board. I played with Debian ‘testing’ and Fedora 32 and rawhide and they booted fine.

    My setup

    My board boots to either Fedora rawhide or Debian ‘testing’ (two separate pen drives).

    Written by Marcin Juszkiewicz on
  6. OpenDev CI speed-up for AArch64

    I have been working with OpenDev CI for a while. My first Kolla patches were over three years ago. We (Linaro) added AArch64 nodes a few times: some nodes were taken down, some replaced, some added.

    Speed or lack of it

    Whenever you want to install a Python package using pip, it is downloaded from PyPI (directly or from a mirror). If there is a binary package for your platform then you get it; if not, then a “noarch” package is fetched.

    In the worst case the source tarball is downloaded and the whole build process starts. You need to have all the required compilers installed, development headers for Python and all required libraries, and the rest of the needed tools. And then wait. And wait, as some packages require a lot of time.

    And then repeat it again and again, as you are not allowed to upload packages to PyPI for projects you do not own.

    Argh you, protobuf

    There was a new release of protobuf package. OpenStack bot picked it up, sent patch for review and it got merged.

    And all AArch64 CI jobs failed…

    Turned out that protobuf 3.12.0 was released with x86 wheels only. No source tarball. At all.

    This turned out to be a new maintainer’s mistake. After 2-3 weeks it was fixed in the 3.12.2 release.

    Another CI job then

    So I started looking at the ‘requirements’ project and created a new CI job for it. To check whether new package versions are available for AArch64. It took some time and several side updates as well (yak shaving all the way again).
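
    The core idea of such a check can be sketched in Python: given the file names published for one release, decide whether pip on AArch64 has anything to install. This is a simplification I am assuming for illustration, not how the actual job is written. Real pip matches complete compatibility tags (Python version, ABI, platform), and the file names below are made-up examples in the standard wheel naming scheme:

```python
SDIST_SUFFIXES = (".tar.gz", ".zip", ".tar.bz2")


def installable_on_aarch64(filenames: list) -> bool:
    """Return True if a release has a source tarball or a usable wheel.

    A wheel counts as usable when its platform tag (the last dash-separated
    field before ".whl") is "any" or mentions aarch64.
    """
    for name in filenames:
        if name.endswith(SDIST_SUFFIXES):
            return True  # can be built locally, even if slowly
        if name.endswith(".whl"):
            platform_tag = name[: -len(".whl")].split("-")[-1]
            if platform_tag == "any" or "aarch64" in platform_tag:
                return True
    return False


if __name__ == "__main__":
    # Hypothetical file lists shaped like an x86-only release and a fixed one.
    x86_only = ["example-1.0-cp38-cp38-manylinux1_x86_64.whl"]
    fixed = x86_only + ["example-1.0.tar.gz"]
    print(installable_on_aarch64(x86_only), installable_on_aarch64(fixed))
```

    A release shaped like the first list is exactly the protobuf 3.12.0 situation: x86 wheels only and no source tarball to fall back to.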

    Stuff got merged and works now.

    Wheels cache

    While working on the above CI job I had a discussion with the OpenDev infra team on how to make it work properly. It turned out that there were old jobs doing exactly what I wanted: building wheels and caching them for the next CI tasks.

    It took several talks and patches from Ian Wienand, Clark Boylan, Jeremy ‘fungi’ Stanley and others. Several CI jobs got renamed, some were moved from one project to another. Servers got configuration changes etc.

    Now we have wheels built for both x86-64 and AArch64 architectures. Covering CentOS 7/8, Debian ‘buster’ and Ubuntu ‘xenial/bionic/focal’ releases. For OpenStack ‘master’ and few stable branches.

    Effect

    Requirements project has quick ‘check-uc’ job running on AArch64 to make sure that all packages are available for both architectures. All OpenStack projects profit from it.

    In Kolla the ‘openstack-base’ image build went from 23:49 to just 5:21 minutes. The whole Debian/source build now takes 57 minutes instead of 2 hours 20 minutes.
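
    Expressed as ratios, and assuming the image times are minutes:seconds, that is roughly 4.5 times faster for the image and 2.5 times faster for the whole build. A tiny Python sketch of the arithmetic (the helper names are my own):

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes:seconds duration to seconds."""
    return minutes * 60 + seconds


def speedup(before: int, after: int) -> float:
    """How many times faster the new duration is."""
    return before / after


if __name__ == "__main__":
    image = speedup(to_seconds(23, 49), to_seconds(5, 21))
    full_build = speedup(to_seconds(2 * 60 + 20), to_seconds(57))
    print(f"openstack-base image: {image:.1f}x faster")
    print(f"Debian/source build:  {full_build:.1f}x faster")
```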

    Nice result, isn’t it?

    Written by Marcin Juszkiewicz on
  7. RockPro64 some time later

    Some time passed since I got that board. And some things changed as well.

    U-Boot changes

    I went through U-Boot and submitted a bunch of changes. Most of them were small tweaks to RockPro64 board configuration:

    • enable USB OHCI host (so USB 1.1 things can be plugged directly into ports)
    • enable USB keyboard
    • enable RNG support
    • enable SPI boot
    • start USB support

    With the above changes I have all firmware parts stored in the on-board SPI chip and do not have to worry about storing anything on removable media. I also do not need a serial console to play with U-Boot, as everything can be done with a USB keyboard and an HDMI monitor.

    Board configuration

    To be able to work comfortably with RockPro64 board I mounted it in acrylic case from Pine A64 and then added some tweaks.

    RockPro64 setup on my desk

    Breadboard wires and switches

    Black and white cables are SPI clock and ground. Using the button I can short them during boot, so the on-board flash gets ignored and U-Boot gets loaded from the SD card.

    The two orange wires are the serial Rx line going through a hardware switch. The reason is simple: for some reason the board does not boot when the Rx line is connected during first power on (right after plugging in the power, not after pressing the Power button). So instead of disconnecting the wire each time I can just move the switch left, power on, wait for U-Boot messages, move the switch right and have both directions of the serial console working.

    Too bad that there are no Power/Reset pins, as I would move the buttons to the breadboard as well to make them more reachable.

    USB hubs

    Next to the board I have a bunch of USB hubs:

    • USB-C to USB-A/HDMI/Power one from one of previous Linaro Connects
    • USB 3.0 one connected with a microUSB cable so it works as a USB 2.0 one
    • USB 1.1 one

    Each of them was in use when I played with enabling USB keyboard as I do not have USB 2.0 keyboard and needed a way to connect standard 1.1 one.

    Other stuff

    Tincantools SPI Hook works as serial console and there is PCI Express x8->x8 (open ended) riser so I can play with different PCIe cards.

    And there is a jumper on ‘disable eMMC’ pins as I lack such module.

    What next?

    The next step would be testing distribution installers a bit. And checking how they work. I have made some attempts already and am waiting for fixes to be merged.

    Written by Marcin Juszkiewicz on
  8. From a diary of AArch64 porter — firefighting

    When I was a kid there was a children’s book about Wojtek who wanted to be a firefighter. It is part of the culture for my generation.

    I never wanted to follow Wojtek’s dreams. But during the last years I became a firefighter. And this is not a good thing in the long term.

    CI failures

    During the last months we (Linaro) took care of AArch64 support in OpenStack infrastructure. There are nodes with CentOS 7 and 8, Debian ‘stretch’ and ‘buster’, Ubuntu ‘xenial’, ‘bionic’ and ‘focal’. And several CI jobs in some projects (Disk Image Builder, Kolla, Nova and some others).

    And those CI jobs tend to fail. As usual, right? Not quite…

    Missing Python packages

    One day when I joined IRC in the morning I was greeted with “aarch64 ci fails, can you take a look?” from one of the project developers.

    Quick look into logs:

    ERROR: Could not find a version that satisfies the requirement ntplib (from versions: none)
    ERROR: No matching distribution found for ntplib
    

    Then the usual path. PyPI, check release, history, go to homepage, file an issue. Curse. And wait for upstream to fix the problem. They fixed it and CI was working again.

    Last Monday I started work ready to do something interesting. And then I was greeted with the same story: “aarch64 ci fails, can you take a look?”.

    ERROR: Could not find a version that satisfies the requirement protobuf==3.12.0 
    (from versions: 2.0.0b0, 2.0.3, 2.3.0, 2.4.1, 2.5.0, 2.6.0, 2.6.1, 3.0.0a2,
    3.0.0a3, 3.0.0b1, 3.0.0b1.post1, 3.0.0b1.post2, 3.0.0b2, 3.0.0b2.post1,
    3.0.0b2.post2, 3.0.0b3, 3.0.0b4, 3.0.0, 3.1.0, 3.1.0.post1, 3.2.0rc1,
    3.2.0rc1.post1, 3.2.0rc2, 3.2.0, 3.3.0, 3.4.0, 3.5.0.post1, 3.5.1, 3.5.2,
    3.5.2.post1, 3.6.0, 3.6.1, 3.7.0rc2, 3.7.0rc3, 3.7.0, 3.7.1, 3.8.0rc1, 3.8.0,
    3.9.0rc1, 3.9.0, 3.9.1, 3.9.2, 3.10.0rc1, 3.10.0, 3.11.0rc1, 3.11.0rc2, 3.11.0,
    3.11.1, 3.11.2, 3.11.3)
    

    PyPI, check, homepage, there was an issue already filed. So far no upstream response.

    Problem got solved by moving all OpenStack projects to previous (3.11.3) release.
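    Both failures above come down to the same question: does a release have any file that pip on AArch64 can use? A minimal sketch of that check, based only on file names. The package name `examplepkg`, the file lists and the function name are made up for illustration, and real pip also checks Python and ABI tags, which this sketch ignores.

```python
def installable_on(filenames, machine="aarch64"):
    """Return True if any release file could be used by pip on `machine`.

    Only the platform part of the wheel file name is checked;
    Python and ABI tags are ignored in this sketch.
    """
    for name in filenames:
        # A source distribution can always be tried (it may still fail to build).
        if name.endswith((".tar.gz", ".zip")):
            return True
        if name.endswith(".whl"):
            # Wheel names end with ...-{python tag}-{abi tag}-{platform tag}.whl
            platform_tag = name[: -len(".whl")].split("-")[-1]
            # Several platform tags may be joined with dots.
            for tag in platform_tag.split("."):
                if tag == "any" or tag.endswith("_" + machine):
                    return True
    return False


# Hypothetical file lists: a release with x86-64 wheels only and no sdist,
# and an older release which still ships a source tarball.
new_release = [
    "examplepkg-3.12.0-cp38-cp38-manylinux1_x86_64.whl",
    "examplepkg-3.12.0-cp38-cp38-win_amd64.whl",
]
old_release = [
    "examplepkg-3.11.3-cp38-cp38-manylinux1_x86_64.whl",
    "examplepkg-3.11.3.tar.gz",
]

print(installable_on(new_release))
print(installable_on(old_release))
```

    With a file list like the first one pip has nothing to offer on AArch64, so it behaves as if the release did not exist.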

    Missing/deprecated versions in distributions

    So I started work on adding another CI job. This time for the ‘requirements’ OpenStack project. To make sure that whatever Python package gets upgraded will be available on AArch64 as well.

    As usual I had to add a pile of distro dependencies to get ‘numpy’ and ‘scipy’ built correctly. And bump the timeout to 2 hours. The build was going nicely.

    And then ‘confluent_kafka’ hit hard:

    /tmp/pip-install-ld4fzu94/confluent-kafka/confluent_kafka/src/confluent_kafka.h:65:2: 
    error: #error "confluent-kafka-python requires librdkafka v1.4.0 or later.
    Install the latest version of librdkafka from the Confluent repositories, 
    see http://docs.confluent.io/current/installation.html"                                                            
    
    

    “librdkafka v1.4.0 or later” is not available in any distribution used on OpenStack infra nodes. Fantastic!

    And repositories mentioned in error message are x86-64 only. Fun!

    Sure, I can do the usual PyPI, release, homepage, issue. I even went that way. A chain of GitHub projects building components into repositories. x86-64 only all the way.

    External CI services

    Which gets us to another thing: external CI services. You know: Azure Pipelines, GitHub Actions or Travis CI (alphabetical order). Used in various ways by most FOSS projects nowadays.

    Each of them has some kind of AArch64 support nowadays. But it looks like only Travis provides you with hardware. Azure and GitHub can only connect your external machines to their CI service.

    Speed or lack of it

    So you have a project where you need AArch64 support and upstream is already using Travis for their CI needs. Lucky you!

    You work with the project developers, you get the test suite running and then it times out. It does not matter that the CI machine you got has a few cores, because ‘pytest’ does not know how to run tests in parallel on its own.

    So you cut tests completely or partially. Or just abandon the idea.

    No hardware, no binaries

    If you are less lucky then you may get such an answer from upstream. I have had it a few times in the past. And I fully understand them: why support something when you cannot even test whether it works?

    Firefighting or yak shaving?

    When I discussed it with friends, one of them mentioned that this reminds him more of yak shaving than firefighting. To be honest, it is both most of the time.

    Written by Marcin Juszkiewicz on
  9. New toy arrived — RockPro64

    The AArch64 world has toys on the cheap side, servers on the expensive side and some attempts in the middle.

    Me.

    I started working on AArch64 in 2012. Before hardware was available. Even before toolchain bits were available. Then the first prototype server arrived at Red Hat, then it got replaced with an Applied Micro Mustang. Then more servers, a Mustang under the desk. I joined Linaro (as a Red Hat assignee) and started using their server lab.

    And during all those years I never played with an AArch64 SBC.

    If it lacks proper storage and 16+GB ram then I am not interested.

    Some months ago I decided to change it.

    What to choose?

    There are many AArch64 boards to choose from. Some are better, some are not. Popular ones and those with features. As usual.

    So I asked Peter Robinson and a few other folks for their suggestions and most of them agreed that Rockchip RK3399 looks the most interesting. It has a PCI Express slot on several boards, uses Mali T860 so I can play with the Panfrost driver, USB 3 ports are present etc.

    RockPro64

    After some market research I decided on the RockPro64 from Pine64. The board has USB 3 in both A and C type, a PCIe x4 slot is present and I can reuse the acrylic case from the Pine A64 board I bought a few years ago. And it has “the most expensive chip in ARM world” present (16MB SPI flash for boot firmware).

    Ordered in March, board arrived two days ago.

    Setup

    I dug in my boxes and found a 12V/5A power supply (3A is the recommended minimum), used some Ethernet cable flying under the desk, an old 64GB USB 3 thumb drive for storage and my lovely IOGEAR GKM561R keyboard/trackpoint which I use for all board bring-ups.

    Mounted board into acrylic case and started wiring extra stuff.

    Disable SPI button

    As boot firmware can be stored in the onboard SPI chip there has to be a way to ignore it on boot. According to the disable SPI note on the vendor website it is a matter of shorting SPI_CLK to GND during boot.

    So small breadboard, two wires, button and I am ready.

    Serial console

    As usual the serial console is on pins. So the breadboard got one of my cheap USB-Serial dongles plugged in and then the wiring started.

    The serial console setup post on the forum says that pins 6/8/10 are all I need. Wired them, connected the USB cable and started picocom at 1.5Mbps speed:

    Terminal ready
    ��tU�����UU���˕�U���TU J���UU���Q�UR�)���Q�
    *�R��uT��*��U�
          �VQ��Y��ս+UJ.���I��zT����UQ])
    

    Lovely! It turned out that CP2102 dongles are only “expected” to work at that speed. I dug deeper in my drawers and found an SPI Hook device from TinCanTools. It uses FT2232HL so it should be fine with up to 12Mbps speed.

    And it worked without problems:

    U-Boot TPL 2020.04 (Apr 20 2020 - 00:00:00)
    Channel 0: LPDDR4, 50MHz
    BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
    Channel 1: LPDDR4, 50MHz
    BW=32 Col=10 Bk=8 CS0 Row=16/15 CS=1 Die BW=16 Size=2048MB
    256B stride
    256B stride
    lpddr4_set_rate: change freq to 400000000 mhz 0, 1
    lpddr4_set_rate: change freq to 800000000 mhz 1, 0
    Trying to boot from BOOTROM
    Returning to boot ROM...
    
    U-Boot SPL 2020.04 (Apr 20 2020 - 00:00:00 +0000)
    Trying to boot from MMC1
    
    U-Boot 2020.04 (Apr 20 2020 - 00:00:00 +0000)
    
    SoC: Rockchip rk3399
    Reset cause: POR
    Model: Pine64 RockPro64
    DRAM:  3.9 GiB
    PMIC:  RK808 
    MMC:   dwmmc@fe320000: 1, sdhci@fe330000: 0
    Loading Environment from MMC... Card did not respond to voltage select!
    *** Warning - No block device, using default environment
    

    So I was ready to play!
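    Garbage like the CP2102 output above is easy to spot by eye, but the same check can be scripted when bringing up boards without watching the terminal. A rough sketch, assuming the console is expected to print plain ASCII text. The function name and the 0.8 threshold are my own arbitrary choices, not something from picocom or the dongle vendors.

```python
import string

# Byte values which count as normal console text (letters, digits,
# punctuation and whitespace such as CR/LF).
PRINTABLE = set(string.printable.encode("ascii"))


def looks_like_baud_mismatch(data: bytes, threshold: float = 0.8) -> bool:
    """Guess whether captured serial output is garbage.

    Returns True when less than `threshold` of the bytes are printable
    ASCII. An empty capture says nothing, so it returns False.
    """
    if not data:
        return False
    printable = sum(1 for byte in data if byte in PRINTABLE)
    return printable / len(data) < threshold


# A clean line as printed by U-Boot, and a made-up burst of noise.
good = b"U-Boot TPL 2020.04 (Apr 20 2020 - 00:00:00)\r\n"
noise = bytes([0xFF, 0xFD, 0x74, 0x55, 0xFF, 0xFD, 0x00, 0xCB])

print(looks_like_baud_mismatch(good))
print(looks_like_baud_mismatch(noise))
```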

    Mali

    So when will Arm open source Mali drivers?

    This question is asked at each “Ask Arm Anything” session during Linaro Connect (Arm partners only session, badges are checked on room entry). And the answer is always more or less “never”.

    But there is Panfrost now. A fully open source driver for the Mali family used in the RK3399 SoC.

    Booted board and it got stuck:

    [   29.043106] panfrost ff9a0000.gpu: clock rate = 500000000
    [   29.053073] rockchip-vop ff8f0000.vop: Adding to iommu group 1
    [   29.058099] panfrost ff9a0000.gpu: mali-t860 id 0x860 major 0x2 minor 0x0 status 0x0
    [   29.058948] panfrost ff9a0000.gpu: features: 00000000,100e77bf, issues: 00000000,24040400
    [   29.059820] panfrost ff9a0000.gpu: Features: L2:0x07120206 Shader:0x00000000 Tiler:0x00000809 Mem:0x1 MMU:0x00002830 AS:0xff JS:0x7
    [   29.061053] panfrost ff9a0000.gpu: shader_present=0xf l2_present=0x1
    [   29.063559] rockchip-vop ff900000.vop: Adding to iommu group 2
    [   29.147966] panfrost ff9a0000.gpu: devfreq_add_device: Unable to find governor for the device
    [   29.151244] panfrost ff9a0000.gpu: [drm:panfrost_devfreq_init [panfrost]] *ERROR* Couldn't initialize GPU devfreq
    [   29.173499] panfrost ff9a0000.gpu: Fatal error during devfreq init
    

    I discussed the issue with people on the #panfrost IRC channel and they pointed out that the “governor_simpleondemand” kernel module was missing. So I rebuilt the initramfs with the “dracut -v --force-drivers governor_simpleondemand --force” command and it booted to a Full HD framebuffer.

    Some work is needed here as the monitor has 3440x1440 resolution so at least 2560x1440 or 2560x1080 should be used. But that can wait for now. I plan to move this board somewhere else on the desk and connect it to a 24” Full HD monitor.

    Plans?

    I do not plan to use that board for any serious stuff. More as a device to learn how the EBBR world looks. The boot sequence with U-Boot loading Grub from the EFI system partition looks good.

    There are some things which need work. I want to have boot firmware stored in SPI flash, as right now I have a 1GB microSD card in the slot just to have U-Boot loaded.

    There are some changes in U-Boot waiting for someone to test, USB-C port is not enabled at all (it should work as USB 3 and DisplayPort).

    So who knows :D

    Written by Marcin Juszkiewicz on
  10. Another blog updates

    Recently one of my friends worked on his websites and we exchanged some hints on how to make things better for accessibility and SEO. I checked my website and decided to work on another set of improvements.

    Headers

    The first thing was headers (h1-h6) and their hierarchy. Loaded 51 articles, fixed their markup and then worked a bit on CSS to get the styling the way I wanted.

    Now there is just one H1 on the website (page title), the title of each article is in H2 and then H3-H5 are used in article content. Related posts are H3 as well, as they are part of the article.
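    The rules above (one H1 per page, no skipped levels) are simple enough to check automatically. A small sketch using only the Python standard library. The class and function names are mine, and this is not the tool I used to fix the articles.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the levels (1-6) of all heading tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser gives tag names in lower case: h1 ... h6.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of human-readable problems with the heading hierarchy."""
    collector = HeadingCollector()
    collector.feed(html)
    problems = []
    h1_count = collector.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    # Going deeper by more than one level at a time skips a heading level.
    for prev, cur in zip(collector.levels, collector.levels[1:]):
        if cur > prev + 1:
            problems.append(f"h{prev} is followed by h{cur}, skipping a level")
    return problems


good_page = "<h1>Blog</h1><h2>Post</h2><h3>Section</h3><h3>Related posts</h3><h2>Next post</h2>"
bad_page = "<h1>Blog</h1><h1>Post</h1><h4>Section</h4>"

print(heading_problems(good_page))
print(heading_problems(bad_page))
```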

    Image captions

    This thing was on my list for a while. In WordPress times some articles used images with captions, some did not. Loaded 99 articles, fixed image titles and added the ‘yafg’ plugin to Markdown to get all images described. Then some CSS styling and the effect is quite OK:

    Fog somewhere in Spanish mountains

    Small tweaks

    PageSpeed Insights complained that there can be a moment with no visible text (a flash of invisible text (FOIT) in webdev speak). Three small changes to get ‘font-display:swap’ in CSS and problem solved.
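    Finding the ‘@font-face’ rules which still lack that setting can be scripted too. A rough regex-based sketch, good enough for a hand-written stylesheet but not a real CSS parser. The function name and the sample stylesheet are invented for illustration.

```python
import re

# An @font-face rule has no nested braces, so a lazy match up to the
# first closing brace is enough for this sketch.
FONT_FACE = re.compile(r"@font-face\s*\{(.*?)\}", re.DOTALL | re.IGNORECASE)


def font_faces_missing_swap(css):
    """Return font-family names of @font-face rules without font-display: swap."""
    missing = []
    for match in FONT_FACE.finditer(css):
        body = match.group(1)
        if re.search(r"font-display\s*:\s*swap", body, re.IGNORECASE):
            continue
        family = re.search(r"font-family\s*:\s*([^;]+)", body, re.IGNORECASE)
        name = family.group(1).strip().strip("'\"") if family else "(unnamed)"
        missing.append(name)
    return missing


sample_css = """
@font-face { font-family: 'Body'; src: url(body.woff2); font-display: swap; }
@font-face { font-family: "Heading"; src: url(heading.woff2); }
"""

print(font_faces_missing_swap(sample_css))
```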

    Also added some extra tags into head part of page — mostly OpenGraph junk.

    Hope that it will be more usable now.

    Written by Marcin Juszkiewicz on