Hotplug in VM. Easy to say…

You run a VM instance. It does not matter whether it is part of an OpenStack setup or just a local one started using Boxes, virt-manager, virsh or some other frontend to the libvirt daemon. And then you want to add some virtual hardware to it. And another card, and one more controller…

An easy scenario to imagine, right? What can go wrong, you say? A “No more available PCI slots.” message can happen. On the second or third card/controller… But how? Why?

As I wrote in one of my previous posts, most VM instances are ’90s PC hardware virtual boxes, with a simple PCI bus which accepts cards being added or removed at any moment.

But not on the AArch64 architecture. Nor on x86-64 with the Q35 machine type. What is the difference? Both are PCI Express machines. And by default they have far too few PCIe slots (called pcie-root-port in QEMU/libvirt language). More about PCI Express support can be found on the PCI topology and hotplug page of the libvirt documentation.

So I wrote a patch for Nova to make sure that enough slots will be available. Then I started testing. I tried a few different approaches, discussed ways of solving the problem with upstream libvirt developers, and finally we selected the one and only proper way of doing it. Then I discussed failures with UEFI developers. And went to QEMU authors for help. And explained what I want to achieve, and why, to everyone in each of those four projects. At some point I was seeing pcie-root-port things everywhere…

It turned out that the method of fixing it is quite simple: we have to create the whole PCIe structure with root ports and slots ourselves. This tells libvirt not to attempt any automatic adding of slots (which may be tricky if not configured properly, as you may end up with too few slots for basic devices).
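In libvirt domain XML terms the idea looks roughly like this: a pcie-root controller plus a batch of pre-created pcie-root-port entries, one per slot that should stay free for hotplug. This is only a sketch; the indexes and the number of ports are illustrative, not what the Nova patch generates:

    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'/>
    <controller type='pci' index='2' model='pcie-root-port'/>
    <controller type='pci' index='3' model='pcie-root-port'/>
    <!-- ...one pcie-root-port per device that may be added later -->

Devices plugged (or hotplugged) into the guest then land on those ports instead of libvirt having to grow the topology on its own.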

Then I went with the idea of using insane values. A VM with one hundred PCIe slots? Sure. So I made one, booted it and then something weird happened: I landed in the UEFI shell instead of getting the system booted. Why? How? Where is my storage? Network? Etc.?

It turns out that QEMU has limits. And libvirt has limits… All ports/slots went onto one bus, and the memory for MMCONFIG and/or the I/O space was exhausted. There are two interesting threads about it on the qemu-devel mailing list.

So I added a magic number to my patch: 28. That many pcie-root-port entries in my aarch64 VM instance gave me a bootable system. I still have to check it on an x86-64/q35 setup, but it should be more or less the same. I expect this patch to land in ‘Rocky’ (the next OpenStack release) and I will probably have to find a way to get it into ‘Queens’ as well, because that is what we are planning to use for the next edition of the Linaro Developer Cloud.

Conclusion? Hotplug may be complicated. But issues with it can be solved.

Developers Planet is online

People write blogs. People read blogs. But sometimes it is hard to find the blogs of all those interesting people. That’s where so-called “planets” are the solution.

Years ago there was a “Planet Linaro” website filled with blog posts from Linaro developers. Then it vanished. Later it got replaced by a poor substitute.

But I do not want to have to track down each Linaro developer to find their blog and add it to Feedly. So instead I decided to create a new planet website. And that’s how Developers Planet was born.

So far it lists a bunch of blogs by Linaro developers. I used Venus to run it. The code is a few years old but it runs. I will adapt the HTML/CSS template to be a bit more modern.

And why .cf domain? It is free — that’s why.

Graphical console in OpenStack/aarch64

OpenStack users are used to having a graphical console available. They even take it for granted. But we lacked it…

When we started working on OpenStack on 64-bit ARM there were many things missing. Most of them have been sorted out already. One thing was still in the queue: the graphical console. So two weeks ago I started looking at the issue.

Whenever someone tried to use it, Nova reported one simple message: “No free USB ports.” You may ask what that has to do with the console. I thought the same and started digging…

As usual the reason was simple: yet another aarch64<>x86 difference in libvirt. It turned out that arm64 is one of those few architectures which do not have a USB host controller in the default setup. When Nova is told to provide a graphical console it adds Spice (or VNC) graphics, a video card and a USB tablet. But in our case the VM instance does not have any USB ports, so the VM start fails with the “No free USB ports” message.

The solution was simple: let’s add a USB host controller to the VM instance. But where? Should libvirt do that or should Nova? I discussed it with libvirt developers and got mixed answers. I opened a bug for it and went on to Nova hacking.
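In libvirt domain XML terms the missing piece is small. A sketch of the relevant fragment (the controller and video models here are just examples, not necessarily what the final patch uses):

    <!-- USB host controller so the tablet below has a port to plug into -->
    <controller type='usb' model='qemu-xhci'/>
    <!-- what Nova adds when a graphical console is requested -->
    <graphics type='spice' autoport='yes'/>
    <video>
      <model type='virtio'/>
    </video>
    <input type='tablet' bus='usb'/>

Without the controller line the tablet has no USB bus to attach to, which is exactly the “No free USB ports” failure.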

It turned out that the Nova code for building the guest configuration is not that complicated. I created a patch to add a USB host controller and waited for code reviews. There were many suggestions, opinions and questions. So I rewrote the code. And then again. And again. Finally the 15th version got a “looks good” opinion from all reviewers and was merged.

And how does the result look? Boring, as it should:

OpenStack graphical console on aarch64

Devconf.cz? FOSDEM? PTG? Linaro Connect?

What connects those names? All of them are conferences or team meetings, spread over Europe and Asia. And I will be at all of them this year.

The first one will be in Brno, Czechia. A terrible city to travel to, but the number of people I can meet there is enormous. Some guys from my Red Hat team will be there, my boss’ boss (and the boss of my boss’ boss) and several people from the CentOS, Fedora, RHEL (and some other names) communities. Meetings, sessions… Nice to be there. I will stay in the A-Sport hotel as it is the closest to the Red Hat office where I have some meetings to attend.

Then FOSDEM. The only “week-long conference squeezed into two days” I know. Good luck with meeting me, probably. As usual my list of sessions covers all the buildings and has several conflicts. I will meet many people, miss even more, and have some beers. This year I am staying in the Floris Arlequin Grand-Place hotel.

The next one? The OpenStack PTG in Dublin, Ireland. I will finally meet all those developers who review my patches and help me understand the source code of several OpenStack projects. And probably answer several questions about the state of AArch64 support. Conference hotel.

Linaro Connect. Hong Kong again. Not registered yet, and I have not looked at flights. I have to do that sooner rather than later, and checking which airline sucks less on intercontinental connections sucks too. Together with a few members of the SDI team I will talk about our journey through/with OpenStack on our beloved architecture. Conference hotel.

What after those? Probably OpenSource Day in Poland, maybe some others. We will see.

I am now a core reviewer in Kolla

Months of work, tens of patches, hundreds of changed lines. My work in the Kolla project got rewarded this week. I am now one of the core reviewers 🙂

What does it mean? I think that Gema summarised it best:

For those of you who don’t know, this means Kolla has recognised our contributions to the project as first class and are giving Marcin and ARM64 a vote of confidence, they realise we are there to stay.

It helps in my daily work, as now I can suggest that my coworkers send their patches directly instead of proxying them through me ;D

Can the Socionext SynQuacer be the first 96boards desktop machine?

During Linaro Connect SFO17 I had an occasion to take a look at the first 96boards Enterprise Edition MicroATX format board: the Socionext SynQuacer. Can it be called the first 96boards desktop machine?

Just as a reminder: the 96boards EE specification defined two form factors:

  • custom 160x120mm
  • MicroATX

There were attempts to build boards in that custom format (Husky, Cello) but they both failed terribly. It turns out that companies which are able to produce 96boards CE boards are not able to make more complicated ones.

A Connect ago I wrote about the Systart Oxalis LS1020A board as the first 96boards EE one, but it used that custom format.

So, going back to the SynQuacer board…

I would say that it looks like a typical MicroATX mainboard:

  • four memory slots (DDR4, up to 64GB, ECC or not ECC)
  • CPU under heatsink (24 Cortex-A53 cores, 1GHz clock)
  • PCI-Express slots (x1, x1, x16 with just 4 lanes)
  • two SATA ports
  • Gigabit Ethernet port
  • two USB 3.0 ports at the back
  • connector for additional USB 3.0 ports
  • 96boards low-speed connector (think sensors, serial console, TPM etc.)
  • 24pin ATX power connector (no extra +12V ones)
  • power and reset buttons
  • fan connector
  • JTAG port

Socionext SynQuacer

The official announcement did not provide information about the price. The only info present was that it will be available in December 2017. During discussions with Socionext representatives I was told that a full developer box will cost around 1000 USD and will include the mainboard, memory, storage (rather not SSD), case and graphics card. A price for just the mainboard was not provided, as it looked like such an option is not planned.

From the software point of view, UEFI was presented, with graphical boot. Upstreaming of kernel support is in progress (Linaro provides a 4.14-rc tree with the required changes).

Will it satisfy the need for an AArch64 desktop? Time will tell. From what I heard from developers already using it, performance is quite OK as long as the work is multithreaded (so a kernel build goes nicely with -j24 until the linking phase kicks in).

Another option for an AArch64 desktop would be the Macchiatobin. The latest revisions are needed, as PCI support got fixed (I was told that the first revisions were unable to fully use the PCI Express port). Bernhard Rosenkränzer was demoing such a setup and it was running nicely.

Moar X-Genes!

At Linaro we have one of those HPE Moonshot beasts. Basically it is a chassis with some Ethernet switches built in. Then you can plug cartridges with processors into it. There are some x86-64 ones and there are M400 ones with an X-Gene CPU, 64GB of RAM and some SSD storage.

And then there was a delivery at the Linaro office, with a huge pile of M400 cartridges. Gema opened the chassis and started to plug them in one after another until all 45 slots were used (we had 15 cartridges before):

Moonshot chassis filled with m400 cartridges

It turned out that one slot is dead, so we have to live without the c22n1 cartridge. But that still gives us 44 octa-core systems. Each has 64GB of RAM; storage size varies (some have 480GB, some 120GB, some do not want to tell).

We are waiting for another chassis to fill with the rest of the M400s ;D

There will be some work, as we need to get them updated to be SBSA/SBBR compliant (the U-Boot -> kernel way is something I leave to some other company; it is not what Linaro expects): we need to replace the firmware setup.

Plans for use? Linaro Developer Cloud, OpenStack 3rdparty CI and probably several other targets.

We need some thermite…

Time passes and it is that time of year when the Linaro Enterprise Group is working on a new release. And as usual, the jokes about the lack of thermite start…

Someone may ask “Why?”. The reason is simple: the X-Gene 1 processor. I think that its hate club grows and grows with time.

When it was released it was a nice processor. Eight cores, normal SATA, PCI Express, USB, DDR3 memory with ECC etc. It was used for distribution builders, development platforms etc. Not that there was any choice 😀

Nowadays, with all those other AArch64 processors on the market, it starts to be painful. PCI support requires quirks, the serial console requires patching etc. We have the X-Gene 1 in Applied Micro Mustang servers and HPE Moonshot M400 cartridges. Officially those machines may not be listed as supported, but we still use them, so testing that a new release works there has to be done.

And each time there are some issues to work around. Some could probably be fixed with firmware updates, but I do not know whether the vendors still support that hardware.

So if you have some spare thermite (and a way to handle that legally) then contact us.

Is my work on Kolla done?

During the last few months I was working on getting Kolla running on the AArch64 and POWER architectures. It was a long journey with several bumps, but it has finally ended.

When I started in January I had no idea how much work it would be and how it would go. I just followed my typical “give me something to build and I will build it” style. You can find some background information in my previous post about my work on Kolla.

A lot of failures were present at the beginning. Or rather: there was only a small number of images which built. So my first merged change was to do something with Kolla’s output ;D

  • build: sort list of built/failed images before printing

Debian support was marked for removal, so I first checked how it looked, then enabled all possible images and migrated from the ‘jessie’ to the ‘stretch’ release. The reason was simple: ‘jessie’ (even with backports) lacked packages required to build some images.

  • debian: import key for download.ceph.com repository
  • debian: install gnupg and dirmngr needed for apt-key
  • debian: enable all images enabled for Ubuntu
  • handle rtslib(-fb) package names and dependencies
  • debian: move to stretch
  • Debian 8 was not released yet

Both the YUM and APT package managers got some changes. For the first one I made sure that it fails if there are missing packages (which happened very often during builds for aarch64/ppc64le). That allowed me to catch a typo in the ‘ironic-conductor’ image. In other words: I made YUM behave closer to APT (which always complains about missing packages). Then I made a change for APT to behave more like YUM, by making sure that the package lists were updated before packages were installed.

  • ironic-conductor: add missing comma for centos/source build
  • make yum fail on missing packages
  • always update APT lists when install packages

Of course many images could not be built at all for the aarch64/ppc64le architectures, mostly due to a lack of packages and/or external repositories. For each case I was checking whether there was some way of fixing it. Sometimes I had to disable an image, sometimes update packages to a newer version. There were also discussions with maintainers of external repositories about getting their stuff available for non-x86 architectures.

  • kubernetes: disable for architectures other than x86-64
  • gnocchi-base: add some devel packages for non-x86
  • ironic-pxe: handle non-x86 architectures
  • openstack-base: Percona-Server is x86-64 only
  • mariadb: handle lack of external repos on non x86
  • grafana: disable for non-x86
  • helm-repository: update to v2.3.0
  • helm-repository: make it work on non-x86
  • kubetoolbox: mark as x86-64 only
  • magnum-conductor: mark as x86-64 only
  • nova-libvirt: handle ppc64le
  • ceph: take care of ceph-fuse package availability
  • handle mariadb for aarch64/ubuntu/source
  • opendaylight: get it working on CentOS/non-x86
  • kolla-toolbox: use proper mariadb packages on CentOS/non-x86

At some moment I had over ten patches in review and all of them depended on the base one. So with any change I had to refresh the whole series and reviewers had to review it again… Painful it was. So I decided to split out the most basic stuff so that the whole patch set could be divided into separate ones. After the “base_arch” variable was merged, life became much simpler for reviewers and a bit more complicated for me, as from then on each patch was kept in a separate git branch.

  • add base_arch variable for future non-x86 work

At Linaro we support CentOS and Debian. Kolla supports CentOS/RHEL/OracleLinux, Debian and Ubuntu. I was not making builds with RHEL or OracleLinux, but had to make sure that the Ubuntu ones work too. There was a funny moment when I realised that everyone using Kolla/master was building images with Ocata packages instead of Pike ;D

  • Ubuntu: use Pike repository

But all those patches meant “nothing” without the first one. Kolla had information about which packages are available for the aarch64/ppc64le/x86-64 architectures, but still had no idea that aarch64 or ppc64le exist. Finally the 50th revision of the patch got merged, so now it knows ;D

  • Support non-x86 architectures (aarch64, ppc64le)

I also learnt a lot about Gerrit and code reviews. OpenStack community members were very helpful with their comments and suggestions. We had hours of talk on the #openstack-kolla IRC channel. Thanks go to Alicja, Duong Ha-Quang, Jeffrey, Kurt, Mauricio, Michał, Qin Wang, Steven, Surya Prakash, Eduardo, Paul, Sajauddin and many others. You people rock!

So is my work on Kolla done now? Some of it is. But we need to test the resulting images, make a Docker repository with them and update the official documentation with information on how to deploy on aarch64 boxes (I hope that no changes will be needed). We also need to make sure that the OpenStack Kolla CI gets Debian-based gates operational, and provide them with a 3rdparty AArch64-based CI so new changes can be checked.

So you run OpenStack on your phone?

For about a year I have been working on OpenStack on the AArch64 architecture. And the question from the title gets asked from time to time, in this or other forms.

Yes, I do have an AArch64-powered phone nowadays. But it has just 4GB of memory and runs Android. So it is not a good platform for running OpenStack.

I am aware that for many people anything which came from ARM Ltd means small, embedded, not worth serious effort etc. For me they are not wrong, they are just ‘not up to date’.

We have servers. Sure, someone can say that we had them years ago, and they would be right too. There were Marvell server boards, and Calxeda had their “high density” boxes with a huge number of quad-core CPUs. But now we have ‘boring’ ones which can be used in the same way as x86-64 ones.

ARM Ltd published the SBSA and SBBR specifications which define what an ARM server is nowadays. The short version is “a boring box which you put into a rack, plug in power and network, power it on and install any enterprise Linux distribution”. No need to deal with weird bootloaders (looking from the server perspective), random kernel versions etc. Just unpack, connect and use.

But what do you get inside? It depends on the product. It can be 1 CPU with 8 cores, but it can also be 1-2 CPUs with 48 cores per CPU. Or even more (I heard about 240-core products but have no idea whether they are on the market now). And processors mean memory. What about 1TB (terabyte) of memory per CPU? Cavium ThunderX mainboards allow such a setup with 8 memory DIMMs per CPU.

Then comes the network. With 32-bit ARM machines the problem was “will it support 1GbE?”, and with AArch64 servers that problem can reappear too, as some systems do not support ports slower than 10GbE (some ThunderX boards have 3x40GbE + 4x10GbE ports). The RJ-45 connector is usually there to connect to the BMC (think IPMI).

Storage is Serial ATA, whatever you plug into PCI Express, or something on the network. Choose your way. I would not be surprised by M.2 connectors either.

Usually that means several PCI Express chips present on the board to provide all of that. On AArch64 most of those controllers are already part of the SoC, which makes things easier and faster.

On top of that we run standard distributions like CentOS, Debian, Fedora or openSUSE. Out of the box, with distro kernels based on mainline kernels. And then we install OpenStack: from packages, as Docker containers, using devstack or any other way we tend to use.

And when I really have to use OpenStack on my phone, it looks like this: