GDPR?

General Data Protection Regulation (or something like that). Everyone in the EU knows about it (those in the UK too) thanks to the amount of spam from all those services/pages you registered on in the past.

I would not bother writing anything about it, but we recently had a discussion in a pub (beer was involved) and I decided to blog.

So, to make sure you know: there is some data stored in this system. Every time you leave a comment, everything you wrote is recorded. And it can be used to identify the author, so we can agree that those are personal details, right?

If by any chance you want that data removed, write to me with the URL of the comment, from the email address used in that comment. I will remove your email and the link to your website (if present) and replace your name with some random words (like Herman Humpalla, for example).

If I remember correctly, no other data is stored in my system. All statistics gathered by WordPress are anonymous.

Android at Google I/O: what’s the point?

Another year, another Google I/O. Another set of articles with “what’s new in xyz Google product”. Maps, Photos, AI, this, that. And then all those Android P features which nearly no one will see on their phones (tablets already look like a dead part of the market).

I have a feeling that this part is more or less useless given the current state of Android. The latest release is Oreo. On 5.7% of devices. Which sounds like a “feel free to ignore” value. Every 4th device runs a three-year-old version (and usually lacks two years of security updates). Every 3rd one runs the two-year-old Nougat.

How many users will remember what’s new in their phones when Android P finally lands on their devices? Probably a very small group of crazy geeks. Some features will get renamed by device vendors. Others will be removed. Or changed (not always in a positive way). Reviewers will write “OMG that feature added by VENDORNAME is so awesome” because no one will remember that it is part of the base system.

In other words: I stopped caring about what is happening in the Android space. With the most popular version being a few years old, I do not see a point in tracking new features. Who would use them in their apps when you have to care about running on four-year-old Android?

Mass removal of image tags on Docker hub

At Linaro we moved from packaged OpenStack to virtualenv tarballs. Then we packaged those. But as that took us a lot of maintenance time, we switched to Docker container images for OpenStack and whatever it needs to run. Then we added a CI job to our Jenkins to generate hundreds of images per build. So now we have a lot of images with a lot of tags…

Finding out which tags are the latest is quite easy — you just go to the Docker hub page of the linaro/debian-source-base image and switch to the tags view. But how do you know which build is complete? We had some builds where all images except one got built and pushed. And the missing one was first in the deployment… So the whole set was b0rken.

How do you remove those tags? One solution is to log in to the Docker hub website, go image by image, and click through all the tags to be removed. No one is insane enough to suggest that. And we do not have the credentials to do it anyway.

So let’s handle it the way we do things in the SDI team: by automation. Docker has an API, so its hub should have one too, right? Hmm…

I went through some pages, then issues, bug reports, random projects. I saw code in JavaScript, Ruby and Bash, but nothing usable in Python. Some projects assume that no one has more than a hundred images (no paging when getting the list of images) and limit themselves to a few queries.

I started reading docs and some code. I learnt that GET/POST are not the only HTTP methods. There is also DELETE, which was exactly what I needed. I sorted out authentication and web paths, and something started to work.

The first version was simple: log in and remove a tag from an image. Then I added querying for the whole list of images (with proper paging) and looping through the list, removing the requested tags from the requested images:

15:53 (s) hrw@gossamer:docker$ ./delimage.py haerwu debian-source 5.0.0
haerwu/debian-source-memcached:5.0.0 removed
haerwu/debian-source-glance-api:5.0.0 removed
haerwu/debian-source-nova-api:5.0.0 removed
haerwu/debian-source-rabbitmq:5.0.0 removed
haerwu/debian-source-nova-consoleauth:5.0.0 removed
haerwu/debian-source-nova-placement-api:5.0.0 removed
haerwu/debian-source-glance-registry:5.0.0 removed
haerwu/debian-source-nova-compute:5.0.0 removed
haerwu/debian-source-keystone:5.0.0 removed
haerwu/debian-source-horizon:5.0.0 removed
haerwu/debian-source-neutron-dhcp-agent:5.0.0 removed
haerwu/debian-source-openvswitch-db-server:5.0.0 removed
haerwu/debian-source-neutron-metadata-agent:5.0.0 removed
haerwu/debian-source-heat-api:5.0.0 removed
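The core of such a script boils down to three HTTP calls. Below is a rough sketch of the approach, not the actual delimage.py code — the endpoint paths and the JWT header match what Docker hub exposed at the time, but treat the exact URLs as assumptions and check them against the current API:

```python
import json
import urllib.request

HUB = "https://hub.docker.com/v2"


def hub_login(username, password):
    """POST credentials to Docker hub, get a JWT token back."""
    payload = json.dumps({"username": username, "password": password}).encode()
    req = urllib.request.Request(
        f"{HUB}/users/login/", data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["token"]


def image_list_url(owner, page=1, page_size=100):
    """One page of the repository list - follow 'next' in the reply for more."""
    return f"{HUB}/repositories/{owner}/?page={page}&page_size={page_size}"


def tag_url(owner, image, tag):
    """Address of a single tag; an HTTP DELETE here removes it."""
    return f"{HUB}/repositories/{owner}/{image}/tags/{tag}/"


def delete_tag(token, owner, image, tag):
    """Remove one tag, return True on success."""
    req = urllib.request.Request(
        tag_url(owner, image, tag), method="DELETE",
        headers={"Authorization": f"JWT {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.status in (200, 202, 204)
```

Paging matters here: with hundreds of images you have to keep following the ‘next’ link in the list reply instead of assuming everything fits on one page.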

The final version got the MIT license as usual; I created a git repo for it and pushed the code. Next step? Probably creating a job on Linaro CI to have a way of removing no-longer-supported builds. And some more helper scripts.

XGene1: cursed processor?

Years ago Applied Micro (APM) released the XGene processor. It went into the APM BlackBird, APM Mustang, HPe M400 and several other systems. For some time there was no other AArch64 CPU available on the market, so those machines got popular as distribution builders, developer machines, etc.

Then APM got acquired by someone, the CPU part got bought by someone else, and any support just vanished. Their developers moved on to work on XGene2/XGene3 CPUs (APM Merlin etc. systems). And people woke up with unsupported hardware.

For some time it was not an issue – Linux boots, the system works. Some companies got rid of their XGene systems by sending them to the Linaro lab, some moved them to an ‘internal use only, no external support’ queue, etc.

Each mainline kernel release was “let us check what is broken on XGene this time” time. No serial console output again? OK, we have that ugly patch for it (it got cleaned up and upstreamed). Now we have kernel 4.16 and guess what? Yes, it broke. It turned out that 4.15 was already faulty (we skipped it at Linaro).

Red Hat bugzilla has a Fedora bug for it. It turns out that the firmware has wrong ACPI tables. Nothing new, right? We already know that it lacks PPTT, for example (but that is quite a new table, used for processor topology). This time the bug is in the DSDT.

Sounds familiar? If you had an x86 laptop about 10 years ago, it might. DSDT stands for Differentiated System Description Table. It is a major ACPI table used to describe what peripherals the machine has. And the serial ports are described wrongly there, so the kernel ignores them.

One solution is bundling a fixed DSDT into the kernel/initrd, but that would require adding support for it into Debian, and it would probably not get merged as no one needs that nowadays (unless they have an XGene1).

So far I have decided to stay on 4.14 for my development cartridges. It works and allows me to continue my Nova work. I do not plan to move to another platform, as at Linaro we probably have over a hundred XGene1 systems (M400s and Mustangs) which will stay there for development (it is hard to replace a 4.3U case with 45 cartridges with something else).

Shenzhen trip

A few months ago, at the end of the previous Linaro Connect gathering, it was announced that the next one would take place in Hong Kong. This gave me the idea of repeating the Shenzhen trip, but in a bit longer version.

So I mailed people at Linaro and there were some responses. We quickly agreed on going there before Connect. Alex, Arnd, Green and I were landing around noon, Riku a few hours later, so we decided to meet in Shenzhen.

We crossed the border at Lok Ma Chau, my visa had the highest price again, and then we took a taxi to the Maker Hotel (still called “Quchuang Hotel” in Google Maps and on Booking.com), next to all those shops we wanted to visit. Then we went for a quick walk through the Seg Electronics Market. Lots of mining gear: 2000W power supplies, strange PCI Express expanders, etc. Dinner, meeting with Riku, and the day ended.

I woke up at 02:22 and was not able to fall asleep. Around 6:00 it turned out that the rest of the team was awake as well, so we decided to walk around and search for some breakfast. The deserted streets looked a bit weird.

Back at the hotel we discussed random things. Then someone from Singapore joined in and we talked about changes in how Shenzhen stores/factories operate. He told us that there are fewer and fewer stores as business moves to the Internet. Then a Chinese family came in with a boy of about seven. He said something, his mother translated, and it turned out that he wanted to touch my beard. As it was not the first time my beard got such attention, I let him. The surprise on his face was worth it. And then we realized that we had not seen a bearded Chinese man on the street.

As the stores were opening at 10:00, we still had a lot of time, so we went for a random walk. Including Shenzhen Center Park, which is a really nice place.

Then the stores started to open. Fake phones, real phones, tablets, components, devices, misc things… Walking there was fun in itself. I bought some items from my list.

They also had a lot of old things. An Intel Overdrive system, for example, or 386/486-era processors and FPUs.

Among the weird things: 3.5″ floppy disks and an Intel Xeon Platinum 8175 made for the Amazon cloud only.

Lots and lots of stuff everywhere. Need a power supply? There were several stores with industrial ones, regulated ones, etc. Used computers/laptops? Piles after piles. New components? Lots to choose from. Etc, etc, etc…

After several hours we finally decided to go back to Hong Kong and rest. The whole trip was fun. I really enjoyed it. Even without getting half of the items from my ‘buy while in Shenzhen’ list ;D

And I ordered a Shenzhen fridge magnet on Aliexpress… They were not available to buy at any place we visited.

25 years of Red Hat

Years ago I bought the Polish translation of “Under the Radar”, the book about how Red Hat was started. It was a good read, and then it went to the bookshelf.

Years passed. In the meantime I got hired by Red Hat. To work on Red Hat Enterprise Linux. For the AArch64 architecture.

Then one day I was talking with my wife about books and looked at the shelf. And found that book again. I took it and said:

You know, when I bought this book I did not even dream that one day I would be working at Red Hat.

Today the company turned 25. That is longer than my whole career. I remember how surprised I was when I realised that some of my friends had already worked at one company for 20 years.

This is the oldest company I have worked for. Directly, at least, as some customers of the companies I worked for in the past were probably older. And I hope that one day my work title will be “Retired Software Engineer”, as my wife once said. And that it will be at this company.

Android pisses me off

If you want a smartphone, you are limited to Android or iOS. Other options just do not count. The iOS philosophy and the devices which run it are not something I want to own or use, so I am left with Android.

My first Android device was a Nokia N900 with Froyo (Android 2.2) based NITdroid. When I saw K-9 Mail on it I knew that Maemo was going to trash (its mail client, Modest, worked only in landscape and used a font size meant for visually impaired people). So a few weeks later I bought a Nexus S. Then a Nexus 4. Next was a Samsung Galaxy S4, which I won in some contest. Then I moved to a Nexus 5, an LG G3, and now use a ZTE Axon 7. I had/have a few tablets as well: first some Tegra2-based one with Honeycomb (sold quickly), an Archos G9, a Nexus 7 (2012) and finally a Lenovo S8.

For most of that time I tried to run the latest possible Android on my devices. Non-vendor builds, of course, because the Android world cares about a device for a year (or a year and a half at best) and then ignores it. I stopped caring whether there are any updates for my devices. Sure, they are full of security holes etc., but sorry, I am not planning to spend a few hundred euros every year to replace three phones and a tablet.

With Android Oreo (not available for any of my devices) Google announced ‘Project Treble’, which should fix some of that. I suppose that by 2020 40-50% of new devices may support it. With old versions of Android anyway, because binary blobs will be too old to keep up with newer releases.

Switching devices is the other thing. Doing backups, restoring backups, (re)configuring applications, etc. The last time I did a factory reset on one of my phones, it took 2 hours before Google Play Store finished installing applications. Including those I had removed half a year earlier. And of course forget about text messages or call history. WTF, Google?

Backups are fun anyway. The official way is “hope that Google keeps backups of your app settings in the cloud”. Most apps that do a sensible backup require root. Which usually requires a factory reset first. Or all they do is provide another UI for the ‘adb backup’ command (which does some backup and then decides to do nothing for a random amount of time).

ADB itself is a joke. Sure, it can be used to send files over a USB connection, but it looks like its authors live in the 90s and all they have are USB 1.1 host controllers in their PCs. I cannot find another excuse for its speed of 3 MB/s (yes, THREE megabytes per second). Again: WTF?

My current plan is to use my Axon 7 with Nougat for about a year (or two), until it finally dies or meets the ground one time too many. And to still be pissed off any time backups are involved (changing devices in the family or sending them for repair).

SnowpenStack in Dublin

Last week I was in Dublin, Ireland at the OpenStack PTG. It was also the worst weather since 1982. There was snow and strong wind, so the conference quickly got renamed to SnowpenStack.

The main reasons for me to be there were:

  • meet all those developers who took some time and looked at my changes
  • discuss some other changes/plans
  • share aarch64 knowledge with OpenStack projects

The conference took place in the Croke Park stadium. We used meeting rooms on the 4th, 5th and 6th floors. One day, by mistake, I took the wrong stairs and ended up on top of the stadium in just a T-shirt… I quickly ran to an elevator to get back to the proper floor ;D

The schedule was split into two parts: Monday and Tuesday were for mixed-team sessions, while Wednesday to Friday were for discussions within teams. I spoke with the Kolla, Nova and Infra teams mostly. There were some discussions with Ironic, Kuryr and some other teams too. I also met several Polish developers, so there was time to speak in my native language ;D

On Tuesday I went to the city centre to buy some souvenirs for my family (and the 99th fridge magnet for myself). I launched Ingress, did one mosaic to see more of the city, and after 11 kilometres I was back at the hotel just in time for a small party in the GAA museum. And then a pub trip with the Polish guys. When I finally reached the hotel (about 01:30) there were still discussions in the lobby, and I took part in one of them.

Team discussions started on Wednesday. I visited the Nova one summarizing the ‘Queens’ release. It turned out that it went better than previous ones. The main problem was a lack of reviews — not everyone likes to pester developers on IRC to get some attention for their patches. I was asked a few times for my opinion, as I was one of the few fresh contributors.

The Kolla sessions were a bit chaotic, in my opinion. The recently chosen PTL was not present and the person supposed to replace him got stuck at home due to the weather. One of the discussions I remember was about Ceph: should we keep using our own images or move to ‘ceph-ansible’ instead. The final decision was to keep ours, as it looked like there were more cons than pros to moving to ‘ceph-ansible’ images.

I discussed Arm64 support with the Infra team. We (Linaro) provided them resources on one of our developer clouds to get aarch64 present in the OpenStack CI gates. It turned out that the machines work and some initial tests were done. I was also informed that the diskimage-builder patches adding GPT/UEFI support will be reviewed soon.

And then there were some weather-related issues. On Wednesday every attendee got an email saying that the Irish government had issued a Red Alert, which strongly suggested staying inside unless you really had to go out. And as attendance was not mandatory, people should first check whether they were comfortable with going to Croke Park (especially those not staying in the hotel nearby). The next day the organization team announced that the venue would close after lunch to make sure that everyone was safe. And the whole conference moved to the hotel…

Imagine developers discussing EVERYWHERE in a hotel. The lobby was occupied by a few teams, Infra found a table in the library corner, Nova joined Neutron and occupied the breakfast room. The bar area was quite popular, and soon some beers were visible here and there. A few teams went to meeting rooms, etc. The WiFi bandwidth was gone… Some time later the hotel staff created a separate wireless network for our use. The situation on Friday was similar.

Another thing happened on Wednesday too: people started receiving information that their flights were cancelled. There were some connections on Thursday, and then nothing was flying on Friday. Kudos to the hotel staff for being aware of it — they stopped taking external reservations to make sure that PTG attendees had a place to stay longer (as some people got rebooked even to Thursday).

And even on Saturday it was hard to get to the airport. No taxis were coming to the hotel due to snow on the street, but if you walked 500 meters a cab could be hailed. Many people went for buses (line 700 was the only one working). The crowd at the airport was huge. Some of those people looked like they lived there (which was probably true). Several flights were delayed (even by 4-5 hours), others got cancelled, but most of them flew.

Despite the weather, sitting in a hotel in Dublin was safe, and walking around was too, as there were about 15-20 centimetres of snow on the streets. There were several snowmen around, and people had fun playing with the snow. But at the same time local news was reporting that 30,000 homes lacked electricity and some people got stuck in their cars. There was no public transport, no trains, no buses. Far fewer people on the streets.

Was it worth attending? Yes. Will I attend the next ones? Probably not, as this is a very developer-oriented event, while I spend most of my OpenStack time building its components or doing some testing.

OpenStack ‘Queens’ release done

The OpenStack community released the ‘Queens’ version this week. IMHO it is quite an important moment for the AArch64 community as well, because it works out of the box for us.

Gone are things like setting hw_firmware_type=uefi for each image you upload to Glance — Nova now assumes UEFI to be the default firmware on AArch64 (unless you set the property to a different value for some reason). This simplifies things, as users do not have to worry about it and we should get fewer support questions on new setups of the Linaro Developer Cloud (which will be based on ‘Queens’ instead of ‘Newton’).

There is a working graphical console if your guest image uses a properly configured kernel (4.14 from Debian/stretch-backports works fine; 4.4 from Ubuntu/xenial (used by CirrOS) does not have graphics enabled). A handy feature which some users have already asked us for.

The sad thing is the state of live migration on AArch64. It simply does not work through the whole stack (Nova, libvirt, QEMU) because we have no idea exactly which CPU we are running on and how compatible it is with other CPU cores. In theory, live migration between the same type of processors (like XGene1 -> XGene1) should be possible, but we do not have even that level of information available. More information can be found in bug 1430987 reported against libvirt.

The less sad part? We now set cpu_mode to ‘host-passthrough’ by default (in Nova), so no matter which deployment method is used, it should work out of the box.
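In nova.conf terms the new default corresponds to something like the fragment below (shown only as an illustration — with ‘Queens’ you should not need to set it by hand):

```ini
[libvirt]
# Expose the host CPU directly to guests instead of a named CPU model.
# This sidesteps the missing AArch64 CPU model information, at the cost
# of tying the guest to the host CPU type.
cpu_mode = host-passthrough
```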

When it comes to building (Kolla) and deploying (Kolla Ansible), most of the changes were done during the Pike cycle. During the Queens one, most of the changes were small tweaks here and there. I think our biggest change was convincing everyone in Kolla(-ansible) to migrate from MariaDB 10.0.x (usually from external repositories) to 10.1.x taken from the distribution (Debian) or from RDO.

What will Rocky bring? Better hotplug for PCI Express machines (AArch64/virt, x86/q35 models) is one thing. I hope that the live migration situation will improve as well.

Hotplug in VM. Easy to say…

You run a VM instance. Never mind whether it is part of an OpenStack setup or just a local one started using Boxes, virt-manager, virsh or another such frontend to the libvirt daemon. And then you want to add some virtual hardware to it. And another card, and one more controller…

An easy to imagine scenario, right? What can go wrong, you say? A “No more available PCI slots.” message can happen. On the second/third card/controller… But how? Why?

As I wrote in one of my previous posts, most VM instances are 90s-PC-hardware virtual boxes. With a simple PCI bus which accepts several cards being added/removed at any moment.

But not on the AArch64 architecture. Nor on x86-64 with the Q35 machine type. What is the difference? Both are PCI Express machines. And by default they have far too few PCIe slots (called pcie-root-port in qemu/libvirt language). More about PCI Express support can be found on the ‘PCI topology and hotplug’ page of the libvirt documentation.

So I wrote a patch for Nova to make sure that enough slots will be available. And then started testing. I tried a few different approaches, discussed ways of solving the problem with upstream libvirt developers, and finally we selected the one and only proper way of doing it. Then I discussed failures with UEFI developers. And went to the Qemu authors for help. And explained what I wanted to achieve, and why, to everyone in each of those four projects. At some point I was seeing pcie-root-port things everywhere…

It turned out that the method of fixing it is kind of simple: we have to create the whole PCIe structure with a root port and slots. This tells libvirt not to try any automatic adding of slots (which may be tricky if not configured properly, as you may end up with too few slots for basic addons).
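In libvirt domain XML such a pre-created structure looks roughly like this (a minimal sketch; the number of ports and the index values are illustrative, and libvirt fills in the PCI addresses itself):

```xml
<controller type='pci' index='0' model='pcie-root'/>
<!-- one pcie-root-port per device you may ever want to hotplug -->
<controller type='pci' index='1' model='pcie-root-port'/>
<controller type='pci' index='2' model='pcie-root-port'/>
<controller type='pci' index='3' model='pcie-root-port'/>
<controller type='pci' index='4' model='pcie-root-port'/>
```

Each pcie-root-port gives you exactly one hotpluggable slot, which is why the count has to be decided up front.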

Then I went with the idea of using insane values. A VM with one hundred PCIe slots? Sure. So I made one, booted it, and then something weird happened: I landed in the UEFI shell instead of getting the system booted. Why? How? Where is my storage? Network? Etc.?

It turns out that Qemu has limits. And libvirt has limits… All ports/slots went onto one bus and the memory for MMCONFIG and/or I/O space was gone. There are two interesting threads about it on the qemu-devel mailing list.

So I put a magic number into my patch: 28 — that amount of pcie-root-port entries in my aarch64 VM instance gave me a bootable system. I still have to check it on an x86-64/q35 setup, but it should be more or less the same. I expect this patch to land in ‘Rocky’ (the next OpenStack release) and will probably have to find a way to get it into ‘Queens’ as well, because this is what we are planning to use for the next edition of the Linaro Developer Cloud.

Conclusion? Hotplug may be complicated. But issues with it can be solved.