Mass removal of image tags on Docker Hub

At Linaro we moved from packaged OpenStack to virtualenv tarballs, and then we packaged those. Because that took a lot of maintenance time, we switched to Docker container images for OpenStack and everything it needs to run. Then we added a CI job to our Jenkins that generates hundreds of images per build. So now we have a lot of images with a lot of tags…

Finding out which tags are the latest is quite easy: go to the Docker Hub page of the linaro/debian-source-base image and switch to the tags view. But how do you know which build is complete? We had some builds where all images except one got built and pushed, and the missing one was the first one used in deployment… So the whole set was b0rken.

How to remove those tags? One solution is to log in to the Docker Hub website, go image by image and click every tag that has to be removed. No one is insane enough to suggest that. And we do not have the credentials to do it anyway.

So let’s handle it the way we do things in the SDI team: by automation. Docker has an API, so its hub should have one too, right? Hmm…

I went through some pages, then issues, bug reports and random projects. I saw code in JavaScript, Ruby and Bash, but nothing usable in Python. Some of the projects assume that no one has more than one hundred images (no paging when fetching the list of images) and limit themselves to a few queries.

I started reading docs and some code, and learnt that GET and POST are not the only HTTP methods. There is also DELETE, which was exactly what I needed. I sorted out authentication and web paths, and something started to work.
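
To give an idea of what such requests can look like, here is a minimal sketch using only the Python standard library. It is not the code of delimage.py. The base URL, the /users/login/ and /repositories/…/tags/…/ paths and the "JWT" authorization scheme are assumptions about the Docker Hub web API, and the function names are made up for illustration.

```python
import json
import urllib.request

# Assumed base URL of the Docker Hub web API (not confirmed by this post).
HUB = "https://hub.docker.com/v2"


def login_request(username, password):
    """Build (but do not send) a POST that trades credentials for a token."""
    body = json.dumps({"username": username, "password": password}).encode()
    return urllib.request.Request(
        f"{HUB}/users/login/",  # assumed login path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_tag_request(token, namespace, image, tag):
    """Build (but do not send) a DELETE for one tag of one image."""
    url = f"{HUB}/repositories/{namespace}/{image}/tags/{tag}/"  # assumed path
    return urllib.request.Request(
        url,
        headers={"Authorization": f"JWT {token}"},  # assumed auth scheme
        method="DELETE",
    )


def send(request):
    """Send a prepared request and decode a JSON body if there is one."""
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else None
```

Building the request separately from sending it makes it easy to check the method, URL and headers before anything gets deleted for real.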

The first version was simple: log in and remove a tag from an image. Then I added querying for the whole list of images (with proper paging) and looping through the list to remove the requested tags from the requested images:

15:53 (s) hrw@gossamer:docker$ ./delimage.py haerwu debian-source 5.0.0
haerwu/debian-source-memcached:5.0.0 removed
haerwu/debian-source-glance-api:5.0.0 removed
haerwu/debian-source-nova-api:5.0.0 removed
haerwu/debian-source-rabbitmq:5.0.0 removed
haerwu/debian-source-nova-consoleauth:5.0.0 removed
haerwu/debian-source-nova-placement-api:5.0.0 removed
haerwu/debian-source-glance-registry:5.0.0 removed
haerwu/debian-source-nova-compute:5.0.0 removed
haerwu/debian-source-keystone:5.0.0 removed
haerwu/debian-source-horizon:5.0.0 removed
haerwu/debian-source-neutron-dhcp-agent:5.0.0 removed
haerwu/debian-source-openvswitch-db-server:5.0.0 removed
haerwu/debian-source-neutron-metadata-agent:5.0.0 removed
haerwu/debian-source-heat-api:5.0.0 removed
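
The paging and the loop could be sketched roughly like this. Again, this is not the actual script: the listing URL and the "results"/"next" keys are assumptions about what the Hub returns, matching images by name prefix is a guess based on the command line above, and fetch_json and delete_tag are hypothetical callables passed in so the logic can be tried without touching the network.

```python
def list_images(fetch_json, namespace, page_size=100):
    """Yield every image name in a namespace, following 'next' links."""
    # Assumed listing URL; each page is assumed to be a dict with a
    # 'results' list and a 'next' URL (None on the last page).
    url = f"https://hub.docker.com/v2/repositories/{namespace}/?page_size={page_size}"
    while url:
        page = fetch_json(url)
        for entry in page["results"]:
            yield entry["name"]
        url = page.get("next")


def remove_tag(fetch_json, delete_tag, namespace, prefix, tag):
    """Remove one tag from every image whose name starts with prefix.

    delete_tag(namespace, image, tag) is expected to return True on success.
    Returns the list of 'namespace/image:tag' entries that were removed.
    """
    removed = []
    for name in list_images(fetch_json, namespace):
        if name.startswith(prefix) and delete_tag(namespace, name, tag):
            removed.append(f"{namespace}/{name}:{tag}")
    return removed
```

Printing each returned entry followed by "removed" would give output in the same shape as the session above. Following the "next" link until it runs out is what avoids the "no one has more than one hundred images" trap.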

The final version got the MIT license as usual; I created a git repo for it and pushed the code. Next step? Probably creating a job on Linaro CI so we have a way of removing builds that are no longer supported. And some more helper scripts.
