TensorFlow build times

A big part of my work during the last few months was related to TensorFlow. We build it on Linaro CI, and the resulting wheels are available on the Linaro snapshots server.

Versions

When you dig there you can find those versions:

How to install?

TensorFlow requires several other Python packages, and some of them are not distributed as AArch64 binary wheels. For those we have a Python cache repository at Linaro snapshots.

So, to install TensorFlow (version 2.6.0 or 2.7.0):

$ export PIP_EXTRA_INDEX_URL=https://snapshots.linaro.org/ldcg/python-cache/
$ pip install tensorflow-aarch64

And it is done. The package is renamed from “tensorflow-cpu” because there is a plan to upload the wheels to PyPI. For older versions, please check our snapshots server.

How do we build?

Jenkins runs a shell script which starts a container and installs Ansible; the rest of the build is then done with Ansible. Playbooks, roles and shell scripts are stored in the Jenkins jobs repository.

The process is quite simple: we choose which versions to build (from a selection of 1.5, 2.4, 2.5, 2.6 and git HEAD) and then Ansible loops over them. All dependency versions are stored in a variables file.

All the work is done inside a “manylinux2014” container so that we can build for a wide selection of Python releases. A single run covers Python versions from 3.6 to 3.9 (we plan to enable 3.10 when possible).
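The loop described above can be pictured with a small dry-run sketch. The real work is done by Ansible, not by this script; the version list here is only an illustrative pick, and the interpreter paths assume the usual manylinux layout of one CPython per directory under /opt/python.

```shell
#!/bin/sh
# Dry-run sketch of the build matrix: every chosen TensorFlow version is
# paired with every CPython release shipped in the manylinux2014 image.
# Both lists are illustrative; the real ones live in the Ansible variables.
tf_versions="2.5 2.6"
py_tags="cp36-cp36m cp37-cp37m cp38-cp38 cp39-cp39"

build_matrix() {
    for tf in $tf_versions; do
        for tag in $py_tags; do
            # manylinux images keep each interpreter in /opt/python/<tag>/bin
            echo "tensorflow ${tf} with /opt/python/${tag}/bin/python"
        done
    done
}

build_matrix
```

It only prints the combinations, which is enough to see why adding one more Python release adds one more build per TensorFlow version.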

Build times

To compare the speed of the several systems I have available, I ran a build on each and compared the times with the Linaro CI machine.

Some details:

Machine name        processor     cores  threads  memory (GB)  note
Oracle cloud A1     Altra         16     16       96           VM.Standard.A1.Flex
Linaro CI           ThunderX2     2x28   56       240          SMT disabled
SolidRun HoneyComb  LX2160        16     16       32
my work laptop      i7-8665U      4      8        32           SMT enabled
my desktop          Ryzen 5 3600  6      12       32           SMT enabled

Build time includes fetching files from the network. Python packages come from either PyPI or the Linaro Python cache repository.

Procedure

Install Docker, pull the manylinux2014 image, fetch the script from the Linaro CI job git repository, and run it.

$ docker pull quay.io/pypa/manylinux2014_aarch64
$ wget https://git.linaro.org/ci/job/configs.git/plain/ldcg-python-manylinux-tensorflow/build.sh
$ mkdir BUILD
$ cd BUILD
$ export WORKSPACE=
$ time build26=true bash ../build.sh

The BUILD directory is created because the job starts by cleaning its workspace (the directory stored in the WORKSPACE variable).

Results

Machine name        build time (h:mm:ss)
Oracle cloud A1     3:15:53.250
Linaro CI           1:56:15.994
SolidRun HoneyComb  7:29:01.922
my work laptop      10:01:19.044
my desktop          3:30:34.821
[Chart: build time (in hours)]
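To make the build times above easier to compare, here is a small sketch that converts each one to seconds and prints it relative to the Linaro CI result. The helper names (to_seconds, relative_times) are made up for this example; the times are copied from the results.

```shell
#!/bin/sh
# Convert an h:mm:ss.mmm duration to seconds.
to_seconds() {
    echo "$1" | awk -F: '{ printf "%.3f\n", $1 * 3600 + $2 * 60 + $3 }'
}

# Print how many times longer each build took than the Linaro CI one.
relative_times() {
    base=$(to_seconds "1:56:15.994")   # Linaro CI (ThunderX2)
    while IFS='|' read -r name t; do
        secs=$(to_seconds "$t")
        awk -v n="$name" -v s="$secs" -v b="$base" \
            'BEGIN { printf "%-20s %.2fx\n", n, s / b }'
    done <<EOF
Oracle cloud A1|3:15:53.250
Linaro CI|1:56:15.994
SolidRun HoneyComb|7:29:01.922
my work laptop|10:01:19.044
my desktop|3:30:34.821
EOF
}

relative_times
```

Run like this, the Oracle instance and my desktop land a bit under twice the Linaro CI time, the HoneyComb at just under four times and the laptop at a bit over five times.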

Comments

HoneyComb is a good system for development work, as long as there are other systems to do the hard work of building things.

ThunderX2 is a workhorse. Give it something to do and it delivers. It was also very expensive (Avantek workstations started at 13k USD).

The Oracle cloud instance used an Ampere Altra CPU. I would not be surprised if it beat the ThunderX2 system when given more cores (16 cores was the limit of the free tier).
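A rough way to look at that hunch is to multiply the core count by the build time. This is only a back-of-the-envelope sketch: a TensorFlow build does not scale linearly with cores, so core-hours hint at per-core efficiency and do not predict what a bigger Altra instance would actually do. The core_hours helper is made up for this example.

```shell
#!/bin/sh
# core_hours CORES H:MM:SS -> cores multiplied by build time in hours
core_hours() {
    echo "$2" | awk -F: -v c="$1" \
        '{ printf "%.1f\n", c * ($1 + $2 / 60 + $3 / 3600) }'
}

echo "Altra, 16 cores:     $(core_hours 16 3:15:53.250) core-hours"
echo "ThunderX2, 56 cores: $(core_hours 56 1:56:15.994) core-hours"
```

With the numbers from the results, the Altra spends roughly half as many core-hours on the same build as the ThunderX2.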

I used my work laptop because it was available. I did not expect much from it, although in past benchmarks it was close to my previous desktop system.

And my desktop… It was quite a cheap solution 2 years ago. It looks like price/performance is where x86-64 is king.
