One of the things that made AArch64 servers so successful was agreeing on a set of standards and keeping them implemented. But that was not always the case…
Wild, Wild West
I started working with the Arm architecture in 2004. This was a time when nearly every device required its own kernel… You had those ‘board files’ inside the arch/arm directory, each vendor made their own versions of the same drivers etc.
From a distribution perspective it was a nightmare. I was maintaining OpenZaurus at that time, and with ten models supported we had to build a whole set of kernels. Luckily four of them differed only in the amount of memory and flash, so we were able to handle them as one machine, leaving the details to the kernel once it booted. The difference between PXA250 and PXA255 processors was also handled by the kernel.
Those times also meant different bootloaders. The Zaurus ones were awful. We even had to ignore the kernel cmdline the bootloader gave us, as it did not fit even our 2.4.18-crappix kernels and was completely wrong once we moved to the 2.6 line.
The Nokia 770/N8x0 had another one. Developer boards had RedBoot, U-Boot (if lucky) or whatever the vendor invented. Some had a way to change and store boot commands, some did not. Space for the kernel could be so limited that getting something which fit was a challenge.
Basically, for most devices you had to handle booting, kernel updates etc. separately.
Linaro to the rescue
In 2010 Arm and some partners created Linaro to improve the Linux situation on Arm devices. I was one of the first engineers there. We were present in many areas: porting software, benchmarking, improving performance etc.
And cleaning up the kernel/boot situation. I do not know how many people remember this post by Linus Torvalds:
Gaah. Guys, this whole ARM thing is a f*cking pain in the ass.
You need to stop stepping on each others toes. There is no way that your changes to those crazy clock-data files should constantly result in those annoying conflicts, just because different people in different ARM trees do some masturbatory renaming of some random device. Seriously.
This was a reaction to a moment when someone created yet another copy of some drivers. It was a popular way to do things on the Arm architecture: each vendor had their own version of the PL011 serial driver etc.
Some time later the “arm-soc” subsystem was created to handle merging code touching device support, drivers etc. This allowed Russell King to concentrate on maintaining Arm architecture support.
Over the next years most vendor versions were merged into single drivers and moved where they belong: from arch/arm/ to the drivers/ area of the kernel.
At some point adding new board files was forbidden, as the Arm architecture was migrating into the DeviceTree world.
Why move to DeviceTree (DT for short)? What did it give us, other than new problems?
There were several such questions. The good part is that it was not something new to the Linux kernel. DT was already in use on the Power architecture (and iirc SPARC). After some adaptations Arm devices became more maintainable.
DeviceTree solved one crucial problem of Arm: the lack of hardware discovery. A System on Chip (SoC) can contain several controllers, processor cores etc. Before, this was handled inside the ‘board file’, which also meant building a kernel for nearly every device. Now the kernel was finally able to boot, parse the DT information and get an idea of what is available and which drivers need to be used.
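To make the idea concrete, here is a toy model of that matching step. It is only a sketch and not kernel code: the tree is a plain dictionary, the node addresses are illustrative, the driver names are made up, and the `example,unknown-widget` node is hypothetical. The "compatible" strings `arm,pl011` and `arm,pl031` are the ones used for those Arm peripherals in real DeviceTree files.

```python
# Toy model of DeviceTree-driven driver matching (not kernel code).
# Each node lists its "compatible" strings, most specific first.
DEVICE_TREE = {
    "serial@9000000": {"compatible": ["arm,pl011", "arm,primecell"]},
    "rtc@9010000": {"compatible": ["arm,pl031", "arm,primecell"]},
    # Hypothetical device that no driver in our table knows about.
    "widget@9020000": {"compatible": ["example,unknown-widget"]},
}

# Hypothetical driver table: compatible string -> driver name.
DRIVERS = {
    "arm,pl011": "pl011-uart",
    "arm,pl031": "pl031-rtc",
}


def match_drivers(tree, drivers):
    """Map each node to the driver for its first known compatible string.

    Nodes without any known compatible string map to None.
    """
    result = {}
    for node, props in tree.items():
        result[node] = next(
            (drivers[c] for c in props.get("compatible", []) if c in drivers),
            None,
        )
    return result


if __name__ == "__main__":
    for node, driver in match_drivers(DEVICE_TREE, DRIVERS).items():
        print(f"{node}: {driver or 'no driver bound'}")
```

The point of the sketch is that the same code (one kernel) handles any board: only the data describing the hardware changes.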
That way one kernel was able to support several devices, and their number grew with each release. At some point you could build one kernel for all Arm v4 and v5 devices plus a second one for v6 and v7 ones. A huge improvement.
When it comes to bootloaders, the situation changed as well. Most of the ones used in the past vanished and U-Boot became a kind of ‘gold standard’. DeviceTree support was present, but each device still had its own way of booting: different commands, storage options etc.
Distributions handled that in various ways: extlinux support, ‘flash-kernel’ scripts etc.
At some point Dennis Gilmore took some time and introduced a generic boot command for U-Boot. It was merged in July 2014. So instead of having different ways of handling stuff there was now one command on all devices (once they migrated).
The kernel and initramfs were looked for on sd/mmc/emmc, sata, scsi, ide and usb, with a fallback to tftp. The command has been expanded since then to support several more options and is now standard in U-Boot.
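The search order described above boils down to "try each boot target in turn and take the first one that has something bootable". A minimal sketch of that logic follows. It is not U-Boot code (U-Boot implements this with its own scripting), and the target labels and the `has_kernel` callback are assumptions made for this illustration.

```python
# Search order taken from the text above; the labels are for this sketch only.
BOOT_ORDER = ["mmc", "sata", "scsi", "ide", "usb", "tftp"]


def pick_boot_target(has_kernel, order=BOOT_ORDER):
    """Return the first target in `order` that has a kernel, or None.

    `has_kernel` is a hypothetical callback: it takes a target label and
    returns True when a kernel (and initramfs) was found there.
    """
    for target in order:
        if has_kernel(target):
            return target
    return None


if __name__ == "__main__":
    # Pretend only a USB stick and the network server hold a kernel.
    populated = {"usb", "tftp"}
    print(pick_boot_target(lambda target: target in populated))
```

With local storage empty, the USB stick wins over tftp because it comes earlier in the list, and the network is only used when nothing else matched.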
At the beginning of 2013 several AArch64 systems started to appear. SBC ones followed what was done on 32-bit Arm, but servers were driven in a different direction.
They were supposed to be as boring as x86 ones were. You unpack the box, put it into a rack, connect standard power/network cables and boot it without worrying whether it will work or not. At the same time they had to provide administrators with the same environment as they had on x86.
So it meant UEFI as firmware and ACPI as hardware description. And I have simplified a bit.
To get it right, standards had to be defined, and then vendors had to follow them.
SBSA defined hardware
The first specification was the Server Base System Architecture (SBSA for short). It defined the hardware part: each AArch64 machine needs to use a PL011 serial port, PL031 RTC, PL061 GPIO controller etc. And PCI Express support without quirks. Without that, a machine cannot be called a server.
SBSA has several levels of compliance. Nowadays level 3 is the minimum.
Level 0 was funny as it covered only X-Gene1 boxes (the SoC was older than the specification).
SBBR defined firmware
The simplest definition of the Server Base Boot Requirements specification? A server needs to run UEFI and use ACPI to describe hardware. And it has to be SBSA compliant.
Someone may ask why UEFI and ACPI. One reason is that they are present in x86 servers, and AArch64 ones follow them as closely as possible in behaviour. The other is that this way some things can be done with the firmware's help.
But ACPI was x86 only, so it needed to be adapted to the AArch64 architecture. Work started by making ACPI an open specification under the UEFI Forum, so it became open to anyone (before it was Intel, Microsoft, Phoenix and Toshiba only). Many changes have been made since then, and several new tables were defined.
I heard several rumours about why ACPI. Someone said that ACPI was forced by Microsoft. In reality it was a decision taken by all major distros and Microsoft.
So what does SBBR compliance give? For a start, it allows running generic distribution kernels out of the box. Each server SoC has the same basic components and uses the same standards to boot the system. So far Linux distributions, several *BSD systems and Microsoft Windows support SBBR machines out of the box.
For example, getting Qualcomm Centriq or Huawei TaiShan servers supported in Debian ‘buster’ was a very easy task. Both booted with the distribution kernel. The Huawei one required enabling the on-board network card, while the Centriq needed a SAS controller module enabled to connect to storage (a module which was already enabled on a few other architectures).
EBBR for those who can not follow
In short, the Embedded Base Boot Requirements are a kind of SBBR for non-server class hardware.
A device can use ACPI and/or DeviceTree to describe hardware. It may boot whatever it wants, as long as it provides EFI Boot Services to the bootloader used by distributions (grub2, gummiboot etc).
The specification feels as if it was made especially for distributions, to make their life easier. This way there is one way to boot both SBCs and SBBR compliant machines.
Getting a distribution kernel running on an EBBR board is usually more work than it is with an SBBR compliant server. All hardware specific options need to be found and enabled (from SoC support to all its drivers etc).
During Arm DevSummit 2020 there was an announcement of new standards for Arm devices:
Arm is extending the system architecture standards compliance from servers to other segments of the market, edge and IoT. We introduce the new BSA specification with market segment-specific supplements and provide the operating system-oriented boot requirements recipes in the new BBR specification.
I need to spend some time reading the specifications and then write the second part of this article.