Is my work on Kolla done?

During last few months I was working on getting Kolla running on AArch64 and POWER architectures. It was a long journey with several bumps but finally ended.

When I started in January I had no idea how much work it will be and how it will go. Just followed my typical “give me something to build and I will build it” style. You can find some background information in previous post about my work on Kolla.

A lot of failures were present at beginning. Or rather: there was a small amount of images which built. So my first merged change was to do something with Kolla output ;D

  • build: sort list of built/failed images before printing

Debian support was marked for removal so I first checked how it looked, then enabled all possible images and migrated from ‘jessie’ to ‘stretch’ release. Reason was simple: ‘jessie’ (even with backports) lacked packages required to build some images.

  • debian: import key for download.ceph.com repository
  • debian: install gnupg and dirmngr needed for apt-key
  • debian: enable all images enabled for Ubuntu
  • handle rtslib(-fb) package names and dependencies
  • debian: move to stretch
  • Debian 8 was not released yet

Both YUM and APT package managers got some changes. For first one I took care to make sure that it fails if there were missing packages (which was very often during builds for aarch64/ppc64le). It allowed to catch some typo in ‘ironic-conductor’ image. In other words: I make YUM behave closer to APT (which always complain about missing packages). Then I made change for APT to behave more like YUM by making sure that update of packages lists was done before packages were installed.

  • ironic-conductor: add missing comma for centos/source build
  • make yum fail on missing packages
  • always update APT lists when install packages

Of course many images could not be built at all for aarch64/ppc64le architectures. Mostly due to lack of packages and/or external repositories. For each case I was checking is there some way for fixing it. Sometimes I had to disable image, sometimes update packages to newer version. There were also discussions with maintainers of external repositories on getting their stuff available for non-x86 architectures.

  • kubernetes: disable for architectures other than x86-64
  • gnocchi-base: add some devel packages for non-x86
  • ironic-pxe: handle non-x86 architectures
  • openstack-base: Percona-Server is x86-64 only
  • mariadb: handle lack of external repos on non x86
  • grafana: disable for non-x86
  • helm-repository: update to v2.3.0
  • helm-repository: make it work on non-x86
  • kubetoolbox: mark as x86-64 only
  • magnum-conductor: mark as x86-64 only
  • nova-libvirt: handle ppc64le
  • ceph: take care of ceph-fuse package availability
  • handle mariadb for aarch64/ubuntu/source
  • opendaylight: get it working on CentOS/non-x86
  • kolla-toolbox: use proper mariadb packages on CentOS/non-x86

At some moment I had over ten patches in review and all of them depended on the base one. So with any change I had to refresh whole series and reviewers had to review again… Painful it was. So I decided to split out the most basic stuff to get whole patch set split into separate ones. After “base_arch” variable was merged life became much simpler for reviewers and a bit more complicated for me as from now on each patch was kept in separate git branch.

  • add base_arch variable for future non-x86 work

At Linaro we support CentOS and Debian. Kolla supports CentOS/RHEL/OracleLinux, Debian and Ubuntu. I was not making builds with RHEL nor OracleLinux but had to make sure that Ubuntu ones work too. There was funny moment when I realised that everyone using Kolla/master was building images with Ocata packages instead of Pike ;D

  • Ubuntu: use Pike repository

But all those patches meant “nothing” without first one. Kolla had information about which packages are available for aarch64/ppc64le/x86-64 architectures but still had no idea that aarch64 or ppc64le exist. Finally the 50th revision of patch got merged so it now knows ;D

  • Support non-x86 architectures (aarch64, ppc64le)

I also learnt a lot about Gerrit and code reviews. OpenStack community members were very helpful with their comments and suggestions. We had hours of talk on #openstack-kolla IRC channel. Thanks goes to Alicja, Duong Ha-Quang, Jeffrey, Kurt, Mauricio, Michał, Qin Wang, Steven, Surya Prakash, Eduardo, Paul, Sajauddin and many others. You people rock!

So is my work on Kolla done now? Some of it is. But we need to test resulting images, make a Docker repository with them, update official documentation with information how to deploy on aarch64 boxes (hope that there will be no changes needed). Also need to make sure that OpenStack Kolla CI gets Debian based gates operational and provide them with 3rdparty AArch64 based CI so new changes could be checked.

My work on changing CirrOS images

What is CirrOS and why I was working on it? This was quite common question when I mentioned what I am working on during last weeks.

So, CirrOS is small image to run in a cloud. OpenStack developers use it to test their projects.

Technically it is yet another Frankenstein OS. Built using Buildroot 2015.05 uses uclibc or glibc (depending on target architecture). Then Ubuntu 16.04 kernel is applied on top and “grub” (also from Ubuntu) is used to make it bootable.

The problem was that it was not done in UEFI bootable way…

My first changes were: switch images to GPT, create EFI system partition and put some bootloader there. I first used CentOS “grub2-efi” packages (as they provided ready to use EFI binaries) and later switched to Ubuntu ones as upstream maintainer (Scott Moser) prefers to have all external binaries to came from one source.

When he was on vacations (so merge request had to wait) I started digging more and more into scripts.

Fixed getopt use as arguments passed between scripts were read partly via getopt, partially by assigning variables to ${X} (where X is a number).

All scripts were moved to use Bash (as /bin/sh in Ubuntu is usually Dash which is minimalist POSIX shell), whitespace got unified between all scripts and some other stuff happened as well.

At one moment all scripts had 1835 lines and my diff was 2250 lines (+1018/-603) long. Hopefully Scott was back and we got most of that stuff merged.

Recent (2016.07.21) images are available and work fine on all platforms. If someone uses them with OpenStack then please remember about setting “short_id” property to “ubuntu16.04” — otherwise there may be a problem with finding rootfs (no virtio-scsi in disk images).

Summary:

architecture booting before booting after
aarch64 direct kernel UEFI or direct kernel
arm direct kernel UEFI or direct kernel
i386 BIOS or direct kernel BIOS, UEFI or direct kernel
powerpc direct kernel direct kernel
ppc64 direct kernel direct kernel
ppc64le direct kernel direct kernel
x86-64 BIOS or direct kernel BIOS, UEFI or direct kernel

My workflow for building big sets of RPM packages

In last months I did two rebuilds: NodeJS 4.x in Fedora and OpenStack Mitaka in CentOS. Both were targeting AArch64 and both were not done before. During latter one I was asked to write about my workflow, so will describe it with OpenStack one as base.

identify what needs to be done

At first I had to figure out what exactly needs to be built. Haïkel Guémar (aka number80) pointed me to openstack-mitaka directory on CentOS’ vault where all source packages are present. Also told me that EPEL repository is not required which helped a lot as it is not yet built for CentOS.

structure of sources

OpenStack set of packages in CentOS is split into two parts: “common” shared through all OpenStack versions and “openstack-mitaka” containing OpenStack Mitaka packages and build dependencies not covered by CentOS itself or “common” directory.

prepare build space

I used “mockchain” for such rebuilds. It is simple tool which does not do any ordering tricks just builds set of packages in given order and do it three times hoping that all build dependencies will be solved that way. Of course what got built once is not tried again.

To make things easier I used shell alias:

alias runmockchain mockchain -r default -l /home/hrw/rpmbuild/_rebuilds/openstack/mockchain-repo-centos

With this I did not have to remember about those two switches. Common call was “runmockchain –recurse srpms/*” which means “cycle packages three times and continue on failures”.

Results of all builds (packages and logs) were kept in “~/_rebuilds/openstack/mockchain-repo-centos/results/default/” subdirectories. I put all extra packages there to have all in one repository.

populate “noarch” packages

Then I copied x86-64 build of OpenStack Mitaka into “_IMPORT-openstack-mitaka/” to get all “noarch” packages for satisfying build dependencies. I built all those packages anyway but having them saved me several rebuilds.

extra rpm macros

When I started first build it turned out that some Python packages lack proper “Provides” fields. I was missing newer rpm build macros (“%python_provide” was added after Fedora 19 which was base for RHEL7). Asked Haïkel and added “rdo-rpm-macros” to mock configuration.

But had to scrap everything I built so far.

surprises and failures

Building big set of packages for new architecture most of time generate failures which were not present with x86-64 build. Same was this time as several build dependencies were missing or wrong.

packages missing in CentOS/aarch64

Some were from CentOS itself — I told Jim Perrin (aka Evolution) and he added them to build queue to fill gaps. I built them in meantime or (if they were “noarch”) imported into “_IMPORT-extras” or “_IMPORT-os” directories.

packages imported from other CBS tags

Other packages were originally imported from other tags at CBS (CentOS koji). For those I created directory named “_IMPORT-cbs”. And again — if they were “noarch” I just copied them. For rest I did full build (using “runmockchain”) and they end in same repository as rest of build.

For some packages it turned out that they got built long time ago with older versions of build dependencies and are not buildable from current versions. For them I tracked proper versions on CBS and imported/built (sometimes with their build dependencies and build dependencies of build dependencies).

downgrading packages

There was a package “python-flake8” which failed to build spitting out Python errors. I checked how this version was built on CBS and turned out that “python-mock” 1.3.0 (from “openstack-mitaka” repository) was too new… Downgraded it to 1.0 allowed me to build “python-flake8” one (upgrading of it is in queue).

merging fixes from Fedora

Both “galera” and “mariadb-galera” got AArch64 support merged from Fedora and got built with “.hrw1” added into “Release” field.

directories

I did whole build in “~/rpmbuild/_rebuilds/openstack/” directory. Extra folders were:

  • vault.centos.org/centos/7/cloud/Source/openstack-mitaka/common/
  • vault.centos.org/centos/7/cloud/Source/openstack-mitaka/
  • mockchain-repo-centos/results/default/_HACKED-by-hew/
  • mockchain-repo-centos/results/default/_IMPORT-cbs/
  • mockchain-repo-centos/results/default/_IMPORT-extras/
  • mockchain-repo-centos/results/default/_IMPORT-openstack-mitaka/
  • mockchain-repo-centos/results/default/_IMPORT-os/

Vault ones were copy of OpenStack source packages and their build dependencies. Hacked ones got AArch64 support merged from Fedora. Both “extras” and “os” directories were for packages missing in CentOS/AArch64 repositories. CBS one was for source/noarch packages which had to be imported/rebuilt because they came from other CBS tags.

status page

In meantime I prepared web page with build results so anyone interested can see what builds, what not and check logs, packages etc. It has simple description and then table with list of builds (data can be sorted by clicking on column headers).

thanks

Whole job would take much more time if not help from CentOS developers: Haïkel Guémar, Alan Pevec, Jim Perrin, Karanbir Singh and others from #centos-devel and #centos-arm IRC channels.

How to speed up mock

Fedora and related distributions use “mock” to build packages. It creates chroot from “root cache” tarball, updates it, installs build dependencies and runs build. Similar to “pbuilder” under Debian. Whole build time can be long but there are some ways to get it faster.

kill fsync()

Everytime something calls “fsync()” storage slow downs because all writes need to be done. Build process goes in a chroot which will be removed at the end so why bother?

Run “dnf install nosync” and then enable it in “/etc/mock/site-defaults.cfg” file:

config_opts['nosync'] = True

local cache for packages

I do lot of builds. Often few builds of same package. And each build requires fetching of RPM packages from external repositories. So why not cache it?

In LAN I have one machine working as NAS. One of services running there is “www cache” with 10GB space which I use only for fetching packages — both in mock and system and I use it on all machines. This way I can recreate build chroot without waiting for external repositories.

config_opts['http_proxy'] = 'http://nas.lan:3128'

Note that this also requires editing mock distribution config files to not use “mirrorlist=” but “baseurl=” instead so same server will be used each time. There is a script to convert mock configuration files if you go that way.

decompress root cache

Mock keeps tarball of base chroot contents which gets unpacked at start of build. By default it is gzip compressed and unpacking takes time. On my systems I switched off compression to gain a bit at cost of storage:

config_opts['plugin_conf']['root_cache_opts']['compress_program'] = ""
config_opts['plugin_conf']['root_cache_opts']['extension'] = ""

tmpfs

If memory is not an issue then tmpfs can be used for builds. Mock has own plugin for it but I do not use it. Instead I decide on my own about mounting “/var/lib/mock” as tmpfs or not. Why? I only have 16GB ram in pinkiepie so 8-12GB can be spent on tmpfs while there are packages which would not fit during build.

parallel decompression

Sources are compressed. Nowadays CPU has more than one core. So why not using it with multithreaded gzip/bzip2 depackers? This time distro file (like “/etc/mock/default.cfg” one) needs to be edited:

config_opts['chroot_setup_cmd'] = 'install @buildsys-build /usr/bin/pigz /usr/bin/lbzip2'
config_opts['macros']['%__gzip'] = '/usr/bin/pigz'
config_opts['macros']['%__bzip2'] = '/usr/bin/lbzip2'

extra mockchain tip

For those who use “mockchain” a lot this shell snippet may help finding which packages failed:

for dir in *
do 
    if [ -e $dir/fail ];then
        rm $dir/fail
        mv $dir _fail-$dir
    fi
done

summary

With this set of changes I can do mock builds faster than before. Hope that it helps someone else too.

And now something more enterprise — CentOS on AArch64

Everyone can grab and install Fedora 21-22 on APM Mustang. But what if you want something more enterprise ready? Answer is simple: you can install CentOS 7 (at Alpha stage now).

Requirements are simple:

First fetch iso and then copy to USB thumb drive (simple “dd if=boot.iso of=/dev/sdc bs=4M” is enough). Then just reboot to UEFI shell and run “FS1:\EFI\BOOT\BOOTAA64.EFI” to get into installation.

All files will be read from USB (for Mustang you need official APM firmware 1.1.0 to get proper DTB) and installation goes like with Fedora, RHEL, CentOS — standard Anaconda.

Finally reboot and system is ready for use:

CentOS Linux 7 (Core)
Kernel 4.0.0-1.el7.aarch64 on an aarch64

pinkiepie login: 

AArch64 port is now in progress so not all packages are built yet but they will be.

Generic Linux cross toolchain for tests

Some time ago we agreed that not everyone here uses Ubuntu distribution and decided to provide so called ‘generic linux’ cross toolchain. Recently I managed to get it done and now need brave testers to tell is it working or not.

Get it here: http://people.linaro.org/~hrw/generic-linux/ (64bit only)

Needed files are toolchain-11.07.tar.xz and init.sh script. Unpack tarball from / so /opt/linaro/11.07/ will be populated and put init.sh anywhere you want (it will be integrated into tarball later).

How to use:

$ source init.sh

this will add cross toolchain into PATH and also set LD_LIBRARY_PATH to two directories:

  • one with binutils libraries
  • second with all extra libraries which may be needed

Feel free to experiment with second dir by removing files from there and checking are system provided libs are fine too.

So far I checked this toolchain under few distributions:

  • Ubuntu 10.04 ‘lucid’ LTS
  • Ubuntu 11.04 ‘natty’
  • Fedora 14
  • OpenSUSE 11.4
  • CentOS 5.6

It failed only under CentOS (which was expected due to it’s age).

How did I checked? So far compilation of ‘gpm’ and ‘zlib’ were tested.