You run a VM instance. It does not matter whether it is part of an OpenStack setup or just a local one started with Boxes, virt-manager, virsh or some other frontend to the libvirt daemon.
Let's hotplug some hardware
Then you want to add some virtual hardware to it. And another card, and one more controller…
An easy scenario to imagine, right? What can go wrong, you ask? A “No more available PCI slots.” message can appear. On the second or third card or controller… But how? Why?
As I wrote in one of my previous posts, most VM instances are 90s-era PC hardware in virtual form, with a simple PCI bus that accepts several cards being added or removed at any moment.
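For illustration, hotplugging in the libvirt world usually means handing the daemon a small device XML fragment. A minimal sketch (the VM name and file name below are hypothetical):

```xml
<!-- nic.xml: a virtio NIC to hotplug into a running guest.
     "default" is libvirt's standard NAT network. -->
<interface type='network'>
  <source network='default'/>
  <model type='virtio'/>
</interface>
```

Attached live with something like `virsh attach-device myvm nic.xml --live`. On a classic PCI machine this just works; on a PCI Express machine each such device needs a free hotpluggable slot, which is where the trouble described below starts.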
PCI Express is different
Things are different on the AArch64 architecture, or on x86-64 with the Q35
machine type. What is the difference? Both are PCI Express machines. And by
default they have far too few PCIe slots (called pcie-root-port
in qemu/libvirt
language). More about PCI Express support can be found on the PCI topology and
hotplug page of the libvirt documentation.
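A sketch of what a single such port looks like in libvirt domain XML (the chassis, port and address values are example numbers; libvirt can also pick them itself):

```xml
<!-- One hotpluggable PCIe slot: a root port hanging off bus 0 (pcie-root).
     Each hotplugged device occupies one of these. -->
<controller type='pci' index='1' model='pcie-root-port'>
  <target chassis='1' port='0x10'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
</controller>
```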
So I wrote a patch for Nova to make
sure that enough slots will be available. Then I started testing. I tried a few
different approaches, discussed ways of solving the problem with upstream
libvirt developers, and finally we selected the one and only proper way of doing
it. Then I discussed failures with UEFI developers. And went to the Qemu
authors for help. And explained what I wanted to achieve, and why, to everyone
in each of those four projects. At some point I was seeing pcie-root-port
things everywhere…
It turned out that the fix is fairly simple: we have to create the whole PCIe structure, with root port and slots, ourselves. This tells libvirt not to try any automatic adding of slots (which can be tricky if not configured properly, as you may end up with too few slots even for basic add-ons).
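The structure described above can be sketched in domain XML like this; when only `model` and `index` are given, libvirt fills in the PCI addresses itself (the count of three ports here is just for illustration):

```xml
<!-- The root complex plus explicitly declared hotpluggable slots.
     Declaring them ourselves stops libvirt's automatic slot handling. -->
<controller type='pci' index='0' model='pcie-root'/>
<controller type='pci' index='1' model='pcie-root-port'/>
<controller type='pci' index='2' model='pcie-root-port'/>
<controller type='pci' index='3' model='pcie-root-port'/>
```

Add one pcie-root-port entry per device you expect to hotplug; each port gives you exactly one hotpluggable slot.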
More PCI Express slots
Then I went with the idea of using insane values. A VM with one hundred PCIe slots? Sure. So I made one, booted it, and then something weird happened: I landed in the UEFI shell instead of getting the system booted. Why? How? Where is my storage? My network? Etc.?
Limits, limits everywhere…
It turns out that Qemu has limits. And libvirt has limits too… All the ports/slots went onto one bus, and memory for MMCONFIG and/or I/O space ran out. There are two interesting threads about it on the qemu-devel mailing list.
So I added a magic number into my patch: 28. That many pcie-root-port
entries in my aarch64 VM instance gave me a bootable system. I still have to
check it on an x86-64/q35 setup, but it should be more or less the same. I
expect this patch to land in ‘Rocky’ (the next OpenStack release), and I will
probably have to find a way to get it into ‘Queens’ as well, because that is
what we are planning to use for the next edition of the Linaro Developer Cloud.
Conclusion
Hotplug may be complicated. But issues with it can be solved.