One of the things that made AArch64 servers so successful was agreeing on a set
of standards and keeping them implemented. But that was not always the case…
Wild, Wild West
I started working with the Arm architecture in 2004. This was a time when nearly
every device required its own kernel… You had those ‘board files’ inside the
arch/arm directory, each vendor made their own versions of the same drivers etc.
From a distribution perspective it was a nightmare. I was maintaining OpenZaurus
at that time and with ten models supported we had to build a whole set of kernels.
Luckily four of them differed only in the amount of memory and flash, so we
were able to handle them as one machine, leaving the kernel to check the details
once it booted. Whether the processor was a PXA250 or a PXA255 was also handled
by the kernel.
Those times also meant different bootloaders. The Zaurus ones were awful. We even
had to ignore the kernel cmdline the bootloader passed, as it did not fit even our
2.4.18-crappix kernels and was completely wrong once we moved to the 2.6 line.
The Nokia 770/N8x0 had another one. Developer boards had RedBoot, U-Boot (if
lucky) or whatever the vendor invented. Some had a way to change and store boot
commands, some did not. Space for the kernel could be so limited that getting
something which fit was a challenge.
Basically, for most devices you had to handle booting, kernel updates etc. separately.
Linaro to the rescue
In 2010 Arm and some partners created Linaro to improve the Linux situation on
Arm devices. I was one of the first engineers there. We were present in many
areas: porting software, benchmarking, improving performance etc.
And cleaning up the kernel/boot situation. I do not know how many people remember
this post by Linus Torvalds:
Gaah. Guys, this whole ARM thing is a f*cking pain in the ass.
You need to stop stepping on each others toes. There is no way that
your changes to those crazy clock-data files should constantly result
in those annoying conflicts, just because different people in
different ARM trees do some masturbatory renaming of some random
device. Seriously.
This was a reaction to a moment when someone created yet another copy of some
drivers. It was a popular way of doing things on the Arm architecture: each
vendor had their own version of the PL011 serial driver etc.
Some time later the “arm-soc” subsystem was created to handle merging code
touching device support, drivers etc. This allowed Russell King to concentrate on
maintaining Arm architecture support.
Over the next few years most of the vendor versions were merged into single
drivers and moved where they belong: from arch/arm/ to the drivers/ area of the
kernel.
At some point adding new board files was forbidden as the Arm architecture was
migrating into the DeviceTree world.
DeviceTree migration
Why move to DeviceTree (DT for short)? What did it give us, other than new problems?
There were several such questions. The good part is that it was not something
new to the Linux kernel. DT was already in use on the Power architecture (and
iirc SPARC). After some adaptations Arm devices became more maintainable.
DeviceTree solved one crucial problem of Arm: the lack of hardware discovery.
A System on Chip (SoC) can contain several controllers, processor cores etc.
Before, this was handled inside a ‘board file’, which also meant building a
kernel for nearly every device. Now the kernel was finally able to boot, parse
the DT information and get an idea of what is available and which drivers need
to be used.
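To make the idea more concrete, here is a minimal Python sketch of the matching step. It is not how the kernel implements it. The tree is a simplified stand-in for a parsed DeviceTree, the node names and addresses are made up, and the driver labels in `DRIVERS` are hypothetical. Only the `compatible` strings for the PL011 and PL031 follow the usual DT bindings.

```python
# Simplified stand-in for a parsed DeviceTree: each node lists its
# "compatible" strings from most specific to most generic.
# Node names and addresses are invented for this example.
BOARD_DT = {
    "serial@9000000": {"compatible": ["arm,pl011", "arm,primecell"]},
    "rtc@9010000": {"compatible": ["arm,pl031", "arm,primecell"]},
    "widget@9020000": {"compatible": ["vendor,unknown-widget"]},
}

# Hypothetical driver table: compatible string -> driver label.
DRIVERS = {
    "arm,pl011": "pl011-serial",
    "arm,pl031": "pl031-rtc",
}


def match_drivers(tree, drivers):
    """Return {node: driver label or None}; the first matching string wins."""
    result = {}
    for node, props in tree.items():
        result[node] = next(
            (drivers[c] for c in props.get("compatible", []) if c in drivers),
            None,
        )
    return result


if __name__ == "__main__":
    for node, driver in match_drivers(BOARD_DT, DRIVERS).items():
        print(node, "->", driver or "no driver")
```

The point of the design is that the table of drivers lives in one kernel while the description of the board comes from outside, so a new board needs a new tree rather than a new kernel build.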
That way one kernel was able to support several devices. And the number of them
grew with each release. At some point you could build one kernel for all Arm v4
and v5 devices plus a second one for v6 and v7 ones. A huge improvement.
Bootloaders?
When it comes to bootloaders, the situation changed as well. Most of the ones
used in the past vanished and U-Boot became a kind of ‘gold standard’. DeviceTree
support was present but each device still had its own way of booting: different
commands, storage options etc.
Distributions handled that in various ways: extlinux support, ‘flash-kernel’
scripts etc.
At some point Dennis Gilmore took some time and introduced a generic boot command
for U-Boot. It was merged in July 2014. So instead of having different ways of
handling stuff there was now one command on all devices (once they migrated).
The kernel and initramfs were looked for on sd/mmc/emmc, sata, scsi, ide and usb,
with a fallback to tftp. It has since been expanded to support several more
options and is now standard in U-Boot.
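The search order described above can be sketched in a few lines of Python. This is only an illustration of the "first source that works wins" idea. The real logic lives in U-Boot's own environment scripts, and the names `BOOT_ORDER` and `pick_boot_source` are made up for this example.

```python
# Order taken from the description above; tftp is the last resort.
# sd/mmc/emmc are folded into a single "mmc" entry for brevity.
BOOT_ORDER = ["mmc", "sata", "scsi", "ide", "usb", "tftp"]


def pick_boot_source(available, order=BOOT_ORDER):
    """Return the first source in `order` that offers a kernel, or None."""
    for source in order:
        if source in available:
            return source
    return None


if __name__ == "__main__":
    # A board with a bootable USB stick and a reachable TFTP server.
    print(pick_boot_source({"usb", "tftp"}))
```

What matters for distributions is that the order is the same on every board, so an installer only has to put the kernel and initramfs somewhere on that list.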
AArch64 arrival
At the beginning of 2013 several AArch64 systems started to appear. SBC ones
followed what was done on 32-bit Arm, but servers were driven in a different
direction.
Servers
They were supposed to be as boring as x86 ones were. You unpack it, put it into
a rack, connect standard power/network cables and boot it without worrying
whether it will work or not. At the same time it should give administrators the
same environment they had on x86.
So it meant UEFI as firmware and ACPI as hardware description. And I have
simplified a bit.
So to get it right, standards had to be defined and then vendors had to follow
them.
SBSA defined hardware
The first specification was the Server Base System Architecture (SBSA for short).
It defined the hardware part: each AArch64 machine needs to use a PL011 serial
port, PL031 RTC, PL061 GPIO controller etc. And PCI Express support without
quirks. Without that it cannot be called a server.
SBSA has several levels of compliance. Nowadays level 3 is the minimum.
Level 0 was funny as it covered only X-Gene1 boxes (the SoC was older than the specification).
SBBR defined firmware
The simplest definition of the Server Base Boot Requirements specification? A
server needs to run UEFI and use ACPI to describe hardware. And it has to be
SBSA compliant.
Someone may ask why UEFI and ACPI. One reason is that they are present in x86
servers and AArch64 ones follow them as closely as possible in behaviour. The
other is that this way there are things which can be done with the firmware's
help.
But ACPI was x86 only so it needed to be adapted to the AArch64 architecture.
Work started by making ACPI an open specification under the UEFI Forum, so it
became open to anyone (before it was Intel, Microsoft, Phoenix and Toshiba
only). Many changes have been made since then, and several new tables defined.
I heard several rumours about why ACPI. Someone said that ACPI was forced by
Microsoft. In reality it was a decision taken by all major distros and Microsoft.
So what does SBBR compliance give? For a start it allows running generic
distribution kernels out of the box. Each server SoC has the same basic
components and uses the same standards to boot the system. So far Linux
distributions, several *BSD systems and Microsoft Windows support SBBR machines
out of the box.
For example, getting Qualcomm Centriq or Huawei TaiShan servers supported in
Debian ‘buster’ was a very easy task. Both booted with the distribution kernel.
The Huawei one required enabling the on-board network card; the Centriq needed
the SAS controller module enabled to connect to storage (it was already enabled
on a few other architectures).
EBBR for those who can not follow
In short, the Embedded Base Boot Requirements are a kind of SBBR for non-server class hardware.
A device can use ACPI and/or DeviceTree to describe hardware. It may boot however
it likes as long as it provides EFI Boot Services to the bootloader used by
distributions (grub2, gummiboot etc).
The specification feels made especially for distributions to make their life
easier. This way there is one way to boot both SBC and SBBR compliant machines.
Getting a distribution kernel running on an EBBR board is usually more work than
it is with an SBBR compliant server. All hardware specific options need to be
found and enabled (from SoC support to all its drivers etc).
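If you are curious which of these interfaces a running Linux system was booted with, a rough check is to look under /sys/firmware. The Python sketch below is a heuristic only, and `describe_firmware` is a name made up for this example. It assumes the usual sysfs layout. A directory being present is just a hint, since some kernels expose a small DeviceTree even on ACPI systems, and none of this says anything about actual SBBR or EBBR compliance.

```python
import os


def describe_firmware(sysfs_root="/sys/firmware"):
    """Report which firmware interfaces appear under `sysfs_root`.

    Heuristic: checks only whether the directories exist.
    """
    return {
        "uefi": os.path.isdir(os.path.join(sysfs_root, "efi")),
        "acpi": os.path.isdir(os.path.join(sysfs_root, "acpi", "tables")),
        "devicetree": os.path.isdir(os.path.join(sysfs_root, "devicetree")),
    }


if __name__ == "__main__":
    for name, present in describe_firmware().items():
        print(f"{name}: {'present' if present else 'not found'}")
```

The root directory is a parameter so the function can be pointed at a copy of the tree taken from another machine.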
BSA, BBR?
During Arm DevSummit 2020 new standards for Arm devices were announced:
Arm is extending the system architecture standards compliance from servers to
other segments of the market, edge and IoT. We introduce the new BSA
specification with market segment-specific supplements and provide the
operating system-oriented boot requirements recipes in the new BBR specification.
They are described in the second part of this article.