This post is part 3 of the "SBSA Reference Platform in QEMU" series:
- Versioning of sbsa-ref machine
- SBSA Reference Platform update
- Testing *BSD on SBSA Reference Platform
- Running SBSA Reference Platform
- DT-free EDK2 on SBSA Reference Platform
- ConfigurationManager in EDK2: just say no
The SystemReady specification says that a system to be certified needs to be able to boot several operating systems:
In addition, OS installation and boot logs are required:
- Windows PE boot log, from a GPT partitioned disk, is required.
- VMware ESXi-Arm installation and boot logs are recommended.
- Installation and boot logs from two of the Linux distros or BSDs are required.
All logs must be submitted using the ES/SR template.
In choosing the Linux distros or BSDs, maximize the coverage by diversifying the heritage. For example, the following shows the grouping of the heritage:
- RHEL/Fedora/CentOS/AlmaLinux/Rocky Linux/Oracle Linux/Anolis OS
- SLES/openSUSE
- Ubuntu/Debian
- CBL-Mariner
- NetBSD/OpenBSD/FreeBSD
So last week I went through the *BSD ones.
OpenBSD
I started with the “download OpenBSD” page and found out that there is no installation ISO for the aarch64 architecture. Not good.
So I fetched the miniroot73.img disk image instead and went on with booting:
>> OpenBSD/arm64 BOOTAA64 1.16
boot>
cannot open sd0a:/etc/random.seed: No such file or directory
booting sd0a:/bsd: 2798224+1058776+12709688+630920 [229059+91+651336+254968]=0x13ce628
FACP DBG2 MCFG SPCR APIC SSDT PPTT GTDT BGRT
Copyright (c) 1982, 1986, 1989, 1991, 1993
        The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2023 OpenBSD. All rights reserved.  https://www.OpenBSD.org
OpenBSD 7.3 (RAMDISK) #1941: Sat Mar 25 14:42:22 MDT 2023
    deraadt@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/RAMDISK
real mem  = 4287451136 (4088MB)
avail mem = 4073807872 (3885MB)
random: boothowto does not indicate good seed
mainbus0 at root: ACPI
psci0 at mainbus0: PSCI 1.1, SMCCC 1.2
cpu0 at mainbus0 mpidr 0: ARM Cortex-A57 r1p0
cpu0: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu0: 2048KB 64b/line 16-way L2 cache
cpu0: CRC32,SHA2,SHA1,AES+PMULL,ASID16
efi0 at mainbus0: UEFI 2.7
efi0: EFI Development Kit II / SbsaQemu rev 0x10000
smbios0 at efi0: SMBIOS 3.4.0
smbios0: vendor EFI Development Kit II / SbsaQemu version "1.0" date 09/15/2023
smbios0: QEMU QEMU SBSA-REF Machine
agintc0 at mainbus0 shift 4:3 nirq 256 nredist 2: "interrupt-controller"
agtimer0 at mainbus0: 62500 kHz
acpi0 at mainbus0: ACPI 6.0
acpi0: tables DSDT FACP DBG2 MCFG SPCR APIC SSDT PPTT GTDT BGRT
acpimcfg0 at acpi0
acpimcfg0: addr 0xf0000000, bus 0-255
pluart0 at acpi0 COM0 addr 0x60000000/0x1000 irq 33
pluart0: console
ahci0 at acpi0 AHC0 addr 0x60100000/0x10000 irq 42: AHCI 1.0
ahci0: port 0: 1.5Gb/s
ahci0: port 1: 1.5Gb/s
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0: <ATA, QEMU HARDDISK, 2.5+> t10.ATA_QEMU_HARDDISK_QM00001_
sd0: 43MB, 512 bytes/sector, 88064 sectors, thin
sd1 at scsibus0 targ 1 lun 0: <ATA, QEMU HARDDISK, 2.5+> t10.ATA_QEMU_HARDDISK_QM00003_
sd1: 504MB, 512 bytes/sector, 1032192 sectors, thin
ehci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43panic: uvm_fault failed: ffffff800034c3e8 esr 96000050 far ffffff8066ef5048
The operating system has halted.
Please press any key to reboot.
As you can see, it panicked while trying to initialize the USB controller. This shows that our move from EHCI to XHCI was not properly tested ;(
The problem was that our virtual hardware (QEMU) had an XHCI (USB 3) controller on the non-discoverable platform bus, but the firmware (EDK2) described it as an EHCI (USB 2) one.
This got solved with Yuquan Wang’s patch, which makes EDK2 initialize and describe an XHCI USB controller (the change is already merged upstream). After rebuilding EDK2, OpenBSD booted fine right to the installation prompt (earlier messages skipped):
xhci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43, xHCI 0.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 addr 1
acpipci0 at acpi0 PCI0
pci0 at acpipci0
0:1:0: rom address conflict 0xfffc0000/0x40000
0:2:0: rom address conflict 0xffff8000/0x8000
"Red Hat Host" rev 0x00 at pci0 dev 0 function 0 not configured
em0 at pci0 dev 1 function 0 "Intel 82574L" rev 0x00: msi, address 52:54:00:12:34:56
"Bochs VGA" rev 0x02 at pci0 dev 2 function 0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
simplefb0 at mainbus0: 1280x800, 32bpp
wsdisplay0 at simplefb0 mux 1
wsdisplay0: screen 0 added (std, vt100 emulation)
uhidev0 at uhub0 port 1 configuration 1 interface 0 "QEMU QEMU USB Keyboard" rev 2.00/0.00 addr 2
uhidev0: iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0 mux 1
wskbd0: connecting to wsdisplay0
uhidev1 at uhub0 port 2 configuration 1 interface 0 "QEMU QEMU USB Tablet" rev 2.00/0.00 addr 3
uhidev1: iclass 3/0
uhid at uhidev1 not configured
softraid0 at root
scsibus1 at softraid0: 256 targets
root on rd0a swap on rd0b dump on rd0b
WARNING: CHECK AND RESET THE DATE!
erase ^?, werase ^W, kill ^U, intr ^C, status ^T
Welcome to the OpenBSD/arm64 7.3 installation program.
(I)nstall, (U)pgrade, (A)utoinstall or (S)hell?
After this I added booting OpenBSD to the QEMU tests for the SBSA Reference Platform to make sure that we have something non-Linux based there.
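A boot test of this kind boils down to watching the serial console for a known good or known bad message. Below is a minimal sketch of that idea. The function name `classify_boot_log` and the choice of marker strings are assumptions taken from the OpenBSD logs above; the actual QEMU test may look for different text.

```python
# Hypothetical helper: classify a captured serial console log.
# Marker strings are copied from the OpenBSD logs shown above; the real
# QEMU test for sbsa-ref may check for something else.

PANIC_MARKERS = ("panic:", "The operating system has halted.")
SUCCESS_MARKER = "Welcome to the OpenBSD/arm64"


def classify_boot_log(log: str) -> str:
    """Return 'panic', 'booted' or 'incomplete' for a console log."""
    if any(marker in log for marker in PANIC_MARKERS):
        return "panic"
    if SUCCESS_MARKER in log:
        return "booted"
    # Neither marker was seen: the boot hung or the log was cut short.
    return "incomplete"


if __name__ == "__main__":
    bad = "ehci0 at acpi0 USB0 addr 0x60110000/0x10000 irq 43panic: uvm_fault failed"
    good = "Welcome to the OpenBSD/arm64 7.3 installation program."
    print(classify_boot_log(bad))
    print(classify_boot_log(good))
```

Checking for the panic first means that a log containing both markers is still reported as a failure.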
FreeBSD
The next one was FreeBSD. And here the situation got weird…
First I took the 13.2 release. I used firmware with the XHCI information and was greeted with:
Copyright (c) 1992-2021 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 13.2-RELEASE releng/13.2-n254617-525ecfdad597 GENERIC arm64
FreeBSD clang version 14.0.5 (https://github.com/llvm/llvm-project.git llvmorg-14.0.5-0-gc12386ae247c)
VT(efifb): resolution 1280x800
module firmware already present!
real memory  = 4294967296 (4096 MB)
avail memory = 4160204800 (3967 MB)
Starting CPU 1 (1)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
arc4random: WARNING: initial seeding bypassed the cryptographic random device because it was not yet seeded and the knob 'bypass_before_seeding' was enabled.
random: entropy device external interface
MAP 100fbdf0000 mode 2 pages 128
MAP 100fbe70000 mode 2 pages 160
MAP 100fbf10000 mode 2 pages 80
MAP 100fbfb0000 mode 2 pages 80
MAP 100ff500000 mode 2 pages 400
MAP 100ff690000 mode 2 pages 592
MAP 10000000 mode 0 pages 1728
MAP 60010000 mode 0 pages 1
kbd0 at kbdmux0
acpi0: <LINARO SBSAQEMU>
acpi0: Power Button (fixed)
acpi0: Sleep Button (fixed)
acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
psci0: <ARM Power State Co-ordination Interface Driver> on acpi0
gic0: <ARM Generic Interrupt Controller v3.0> iomem 0x40060000-0x4007ffff,0x40080000-0x4407ffff on acpi0
its0: <ARM GIC Interrupt Translation Service> mem 0x44081000-0x440a0fff on gic0
generic_timer0: <ARM Generic Timer> irq 3,4,5 on acpi0
Timecounter "ARM MPCore Timecounter" frequency 62500000 Hz quality 1000
Event timer "ARM MPCore Eventtimer" frequency 62500000 Hz quality 1000
efirtc0: <EFI Realtime Clock>
efirtc0: registered as a time-of-day clock, resolution 1.000000s
uart0: <PrimeCell UART (PL011)> iomem 0x60000000-0x60000fff irq 0 on acpi0
uart0: console (115200,n,8,1)
ahci0: <AHCI SATA controller> iomem 0x60100000-0x6010ffff irq 1 on acpi0
ahci0: AHCI v1.00 with 6 1.5Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich5: <AHCI channel> at channel 5 on ahci0
xhci0: <Generic USB 3.0 controller> iomem 0x60110000-0x6011ffff irq 2 on acpi0
xhci0: 32 bytes context size, 32-bit DMA
And it hung there…
Let's check newer FreeBSD
I contacted people on the #freebsd IRC channel, and Mina Galić (meena on IRC) asked me to boot FreeBSD 14 or 15 images. So I tried both:
xhci0: <Generic USB 3.0 controller> iomem 0x60110000-0x6011ffff irq 2 on acpi0
xhci0: 32 bytes context size, 64-bit DMA
usbus0 on xhci0
The system booted further. Note the “64-bit DMA” information instead of “32-bit DMA” from the 13.2 release. I reported bug 274237 for it. On the same day the required change was identified and marked for a potential backport.
AHCI issue
But that was not the only problem. It turned out that none of the AHCI devices were found… So there was no way to run the installer:
ahci0: <AHCI SATA controller> iomem 0x60100000-0x6010ffff irq 1 on acpi0
ahci0: AHCI v1.00 with 6 1.5Gbps ports, Port Multiplier not supported
ahcich0: <AHCI channel> at channel 0 on ahci0
ahcich1: <AHCI channel> at channel 1 on ahci0
ahcich2: <AHCI channel> at channel 2 on ahci0
ahcich3: <AHCI channel> at channel 3 on ahci0
ahcich4: <AHCI channel> at channel 4 on ahci0
ahcich5: <AHCI channel> at channel 5 on ahci0
[..]
Release APs...done
Trying to mount root from cd9660:/dev/iso9660/13_2_RELEASE_AARCH64_BO [ro]...
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
Root mount waiting for: CAM
ahcich0: Poll timeout on slot 1 port 0
ahcich0: is 00000000 cs 00000002 ss 00000000 rs 00000002 tfd 170 serr 00000000 cmd 0000c017
(aprobe0:ahcich0:0:0:0): SOFT_RESET. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
(aprobe0:ahcich0:0:0:0): CAM status: Command timeout
(aprobe0:ahcich0:0:0:0): Error 5, Retries exhausted
I checked QEMU 7.2 (from the Fedora package) and it booted fine. 8.0.5 failed, 8.0.0 booted. Hm… I started “git bisect” to find out which change broke it. After several rebuilds I found the commit to blame:
commit 7bcd32128b227cee1fb39ff242d486ed9fff7648
Author: Niklas Cassel <niklas.cassel@wdc.com>
Date:   Fri Jun 9 16:08:40 2023 +0200
    hw/ide/ahci: simplify and document PxCI handling
    The AHCI spec states that:
    For NCQ, PxCI is cleared on command queued successfully.
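A bisect like this can be automated with `git bisect run`, which treats exit code 0 as good, 125 as "cannot test, skip" and other codes from 1 to 127 as bad. Below is a minimal sketch of the decision step only. The function name `bisect_verdict` and the `success_marker` parameter are hypothetical, the failure strings are taken from the FreeBSD log above, and rebuilding QEMU and capturing the console are assumed to happen in a separate wrapper that is not shown.

```python
#!/usr/bin/env python3
# Hypothetical predicate for `git bisect run`; not the script used in this post.
import sys

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125

# Strings from the failing FreeBSD boot shown above.
FAILURE_MARKERS = (
    "Poll timeout on slot",
    "CAM status: Command timeout",
)


def bisect_verdict(build_ok: bool, console_log: str, success_marker: str) -> int:
    """Map one build-and-boot attempt to a `git bisect run` exit code."""
    if not build_ok:
        return SKIP  # this revision cannot be tested
    if any(marker in console_log for marker in FAILURE_MARKERS):
        return BAD
    if success_marker in console_log:
        return GOOD
    return SKIP  # no clear evidence either way


# Usage (assumed): check_boot.py CONSOLE_LOG SUCCESS_MARKER
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1], errors="replace") as fh:
        sys.exit(bisect_verdict(True, fh.read(), sys.argv[2]))
```

Returning 125 for an ambiguous log keeps a broken build or a truncated capture from being recorded as good or bad, which would send the bisect in the wrong direction.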
Is it AArch64 only or not?
The next step was checking whether it is a global problem or an aarch64-only one.
I built the x86-64 emulation component and checked the Q35 machine (which also uses AHCI). And FreeBSD failed in exactly the same way. This made bug reporting a lot easier, as several architectures and more users were affected.
I mailed the author and the QEMU developers about it, described the problem, gave the exact QEMU command line arguments and so on. Niklas Cassel replied:
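For a comparison like this it helps to keep the two invocations as similar as possible. Below is a minimal sketch that builds both command lines. The helper name `qemu_argv` and all file names are placeholders, and these are not the exact arguments from the bug report. It assumes that `-cdrom` attaches the ISO to the AHCI controller on both machines, and that sbsa-ref gets its firmware from two pflash images.

```python
# Hypothetical helper: build comparable QEMU command lines for two machines.
# File names are placeholders; the arguments used in the post are not shown.
from typing import List, Sequence


def qemu_argv(binary: str, machine: str, iso: str,
              pflash: Sequence[str] = ()) -> List[str]:
    """Return an argv list booting `iso` on `machine` with 4 GB RAM and 2 CPUs."""
    argv = [binary, "-machine", machine, "-m", "4096", "-smp", "2"]
    for image in pflash:
        # Firmware images; needed for sbsa-ref, left out for q35 here.
        argv += ["-drive", f"if=pflash,file={image},format=raw"]
    argv += ["-cdrom", iso]
    return argv


if __name__ == "__main__":
    sbsa = qemu_argv("qemu-system-aarch64", "sbsa-ref", "freebsd-aarch64.iso",
                     pflash=("SBSA_FLASH0.fd", "SBSA_FLASH1.fd"))
    q35 = qemu_argv("qemu-system-x86_64", "q35", "freebsd-amd64.iso")
    print(" ".join(sbsa))
    print(" ".join(q35))
```

The resulting lists can be passed directly to `subprocess.run`.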
I will have a look at this.
So it will be done.
NetBSD
Here the situation was a bit similar to the FreeBSD one.
I fetched the NetBSD 9.3 image and booted it, just to see it hang (timestamps removed from the output):
Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
    2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017,
    2018, 2019, 2020, 2021, 2022
    The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
    The Regents of the University of California.  All rights reserved.
NetBSD 9.3 (GENERIC64) #0: Thu Aug  4 15:30:37 UTC 2022
 mkrepro@mkrepro.NetBSD.org:/usr/src/sys/arch/evbarm/compile/GENERIC64
total memory = 4075 MB
avail memory = 3929 MB
running cgd selftest aes-xts-256 aes-xts-512 done
armfdt0 (root)
simplebus0 at armfdt0: QEMU QEMU SBSA-REF Machine
simplebus1 at simplebus0
acpifdt0 at simplebus0
acpifdt0: using EFI runtime services for RTC
ACPI: RSDP 0x00000100FC020018 000024 (v02 LINARO)
ACPI: XSDT 0x00000100FC02FE98 00006C (v01 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: FACP 0x00000100FC02FB98 000114 (v06 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: DSDT 0x00000100FC02E998 000CD8 (v02 LINARO SBSAQEMU 20200810 INTL 20220331)
ACPI: DBG2 0x00000100FC02FA98 00005C (v00 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: MCFG 0x00000100FC02FE18 00003C (v01 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: SPCR 0x00000100FC02FF98 000050 (v02 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: IORT 0x00000100FC027518 0000DC (v00 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: APIC 0x00000100FC02E498 000108 (v04 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: SSDT 0x00000100FC02E898 000067 (v02 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: PPTT 0x00000100FC02FD18 0000B8 (v02 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: GTDT 0x00000100FC02E618 000084 (v03 LINARO SBSAQEMU 20200810 LNRO 00000001)
ACPI: 2 ACPI AML tables successfully acquired and loaded
acpi0 at acpifdt0: Intel ACPICA 20190405
cpu0 at acpi0: unknown CPU (ID = 0x411fd402)
cpu0: package 0, core 0, smt 0
cpu0: IC enabled, DC enabled, EL0/EL1 stack Alignment check enabled
cpu0: Cache Writeback Granule 16B, Exclusives Reservation Granule 16B
cpu0: Dcache line 64, Icache line 64
cpu0: L1 0KB/64B 4-way PIPT Instruction cache
cpu0: L1 0KB/64B 4-way PIPT Data cache
cpu0: L2 0KB/64B 8-way PIPT Unified cache
cpu0: revID=0x0, 4k table, 16k table, 64k table, 16bit ASID
cpu0: auxID=0x1011111110212120, GICv3, CRC32, SHA1, AES+PMULL, rounding, NaN propagation, denormals, 32x64bitRegs, Fused Multiply-Add
cpu1 at acpi0: unknown CPU (ID = 0x411fd402)
cpu1: package 0, core 1, smt 0
gicvthree0 at acpi0: GICv3
gicvthree0: ITS #0 at 0x44081000
gicvthree0: ITS [#0] Devices table @ 0x10009210000/0x80000, Cacheable WA WB, Inner shareable
gicvthree0: ITS [#1] Collections table @ 0x10009290000/0x10000, Cacheable WA WB, Inner shareable
As the 9.3 release is quite old, I tested NetBSD 10-Beta:
gicvthree0 at acpi0: GICv3
gicvthree0: ITS #0 at 0x44081000
gicvthree0: ITS [#0] Devices table @ 0x10008a60000/0x80000, Cacheable WA WB, Inner shareable
gicvthree0: ITS [#1] Collections table @ 0x10008ae0000/0x10000, Cacheable WA WB, Inner shareable
gtmr0 at acpi0: irq 27
armgtmr0 at gtmr0: Generic Timer (62500 kHz, virtual)
plcom0 at acpi0 (COM0, ARMH0011-0): mem 0x60000000-0x60000fff irq 33
plcom0: console
[..]
NetBSD-10.0_BETA Install System
I went to the #netbsd channel on IRC and started a discussion. Michael van Elst (mlelstv on IRC) gave me a helping hand and debugged the problem. It looks like the kernel went into an infinite loop while parsing the GTDT table from ACPI. Newer branches of NetBSD have an additional check there.
I filed bug 57642 for it. And, like in the FreeBSD case, it looks like a backport to the stable branch is needed.
Summary
Testing platforms for SystemReady compliance needs to include *BSD systems. Linux and NetBSD were fine with our USB controller mess: they gave a “something is wrong” message and went on. FreeBSD and OpenBSD complained and stopped the boot process.
We also need to do more testing before merging big changes in the future. This USB controller mess could have been avoided or handled better.