From the diary of AArch64 porter — Arm CPU features table

This post is part 13 of the "From the diary of AArch64 porter" series:

  1. From the diary of AArch64 porter — autoconf
  2. From the diary of AArch64 porter — rpm packaging
  3. From the diary of AArch64 porter — testsuites
  4. From the diary of AArch64 porter — POSIX.1 functionality
  5. From the diary of AArch64 porter — PAGE_SIZE
  6. From the diary of AArch64 porter — vfp precision
  7. From the diary of AArch64 porter — system calls
  8. From the diary of AArch64 porter — parallel builds
  9. From the diary of AArch64 porter — firefighting
  10. From the diary of AArch64 porter — drive-by coding
  11. From the diary of AArch64 porter — manylinux2014
  12. From the diary of AArch64 porter — handling big patches
  13. From the diary of AArch64 porter — Arm CPU features table

Last week I had some discussions about the future and the projects I am involved in. As a kind of break, I started yet another personal project for fun…

AArch64 SoC features table

Let's make a table showing which AArch64 SoCs support which processor features, and how bad the situation is.

Source of data

On a Linux system, the /proc/cpuinfo file describes each processor: the CPU core implementer, its variant and revision, and the features it supports:

processor       : 0
BogoMIPS        : 26.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd05
CPU revision    : 0

Most of the AArch64 systems I have used have a much shorter list, so I started wondering which SoC has which features listed.
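A minimal sketch of how such a dump can be turned into a set of feature names. The function names parse_cpuinfo and cpu_features are made up for illustration, and the sample text is the dump shown above; on a real system the text would come from reading /proc/cpuinfo.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per processor block."""
    cpus = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def cpu_features(cpu):
    """Return the 'Features' entry of one processor block as a set."""
    return set(cpu.get("Features", "").split())


# Sample taken from the dump quoted in this post.
SAMPLE = """\
processor       : 0
BogoMIPS        : 26.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd05
CPU revision    : 0
"""

cpus = parse_cpuinfo(SAMPLE)
features = cpu_features(cpus[0])
print(cpus[0]["CPU part"], len(features), sorted(features))
```

To try it on a live system, replace SAMPLE with the contents of /proc/cpuinfo, for example open("/proc/cpuinfo").read().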

First version

I took some code from my Linux system calls table and started with cpuinfo dumps from my own systems. Later I added some entries from the Internet, then a bunch of mobile phones.

The first version of the AArch64 SoC features table was created.
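The idea of such a table can be sketched in a few lines: one row per system, one column per feature seen in any dump. This is only an illustration and not the code behind the real table. The system names are placeholders, and the "example-soc-a" feature list is made up; "example-soc-b" reuses the Features line quoted earlier in this post.

```python
# Placeholder input: system name -> "Features" line from its cpuinfo dump.
DUMPS = {
    "example-soc-a": "fp asimd evtstrm crc32 cpuid",  # made-up list
    "example-soc-b": (
        "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp "
        "asimdhp cpuid asimdrdm lrcpc dcpop asimddp"
    ),
}


def build_table(dumps):
    """Return (columns, rows): sorted feature names and per-system flags."""
    per_soc = {name: set(line.split()) for name, line in dumps.items()}
    columns = sorted(set().union(*per_soc.values()))
    rows = {
        name: [col in feats for col in columns]
        for name, feats in per_soc.items()
    }
    return columns, rows


def render(columns, rows):
    """Render the table as plain text, 'x' for present and '-' for absent."""
    name_width = max(len(name) for name in rows)
    out = [" " * name_width + "  " + " ".join(columns)]
    for name, flags in rows.items():
        cells = [
            ("x" if on else "-").center(len(col))
            for col, on in zip(columns, flags)
        ]
        out.append(name.ljust(name_width) + "  " + " ".join(cells))
    return "\n".join(out)


columns, rows = build_table(DUMPS)
print(render(columns, rows))
```

Sorting the columns keeps the layout stable when new dumps introduce new feature names.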

Let's check Linux

Once I got the QEMU “max” and Apple M2 cpuinfo dumps I thought that I had all the features I needed. Oh, how wrong I was…

I went to the Linux kernel sources and gathered a list of all supported entries. Now a lot of horizontal scrolling is needed, as the table lists 66 CPU features.
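Gathering the names by hand works, but it could also be scripted. The sketch below assumes the kernel keeps the names in a C array of designated initializers shaped like the excerpt; the excerpt is illustrative, not copied from a specific kernel version, and the array and macro names in it are assumptions.

```python
import re

# Illustrative excerpt only: the array name and the KERNEL_HWCAP_* spelling
# are assumptions about how the kernel source lays out the list.
EXCERPT = """
static const char *const hwcap_str[] = {
    [KERNEL_HWCAP_FP]       = "fp",
    [KERNEL_HWCAP_ASIMD]    = "asimd",
    [KERNEL_HWCAP_EVTSTRM]  = "evtstrm",
};
"""

# Matches lines of the form: [KERNEL_HWCAP_NAME] = "name",
ENTRY = re.compile(r'\[\s*KERNEL_HWCAP_\w+\s*\]\s*=\s*"([^"]+)"')


def hwcap_names(source_text):
    """Return feature names in the order they appear in the source text."""
    return ENTRY.findall(source_text)


names = hwcap_names(EXCERPT)
print(len(names), names)
```

Run against the real source file instead of EXCERPT, the length of the returned list would give the number of columns the table needs.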

How to help

At this moment the AArch64 SoC features table lists 29 systems. More are needed to make it useful.

You can help: submit an issue on GitHub and let me take care of the rest. Needed information:

The product name/URL can be used later to check details.

Features listed

There are many features recognized by the Linux kernel. Let me try to group them by architecture level.

Arm v8.0

The base of all AArch64 systems and the most popular one, as it is present in far too many SBC devices. Just a few features are present:

Name     Description
fp       floating point present
asimd    advanced SIMD present
evtstrm  timer event stream generation
cpuid    CPU features can be read

Then come the cryptographic extensions (optional):

Name   Description
aes    AESD and AESE instructions
crc32  CRC32* instructions
pmull  PMULL, PMULL2 instructions
sha1   SHA1* instructions
sha2   SHA256* instructions

Arm v8.1

I was surprised to see just two entries for v8.1:

Name      Description
asimdrdm  Advanced SIMD rounding double multiply accumulate instructions
atomics   Atomic instructions

Arm v8.2

The 8.2 version of the Arm architecture was quite a refresh. New CPU cores were announced and SVE (Scalable Vector Extension) was defined, along with several other calculation extensions. Note that most of them are optional (as usual).

Name      Description
asimddp   Advanced SIMD dot product instructions
asimdfhm  Floating-point half-precision multiplication instructions
asimdhp   Advanced SIMD half-precision floating-point instructions
bf16      AArch64 BFloat16 instructions
dcpodp    DC CVADP instruction
dcpop     DC CVAP instruction
flagm     Flag manipulation instructions
fphp      Half-precision floating-point data processing
i8mm      AArch64 Int8 matrix multiplication instructions
sha3      Advanced SIMD SHA3 instructions
sha512    Advanced SIMD SHA512 instructions
sm3       Advanced SIMD SM3 instructions
sm4       Advanced SIMD SM4 instructions
sve       Scalable Vector Extension
svebf16   AArch64 BFloat16 instructions (SVE)
svef32mm  Single-precision Matrix Multiplication (SVE)
svef64mm  Double-precision Matrix Multiplication (SVE)
svei8mm   AArch64 Int8 matrix multiplication instructions (SVE)
uscat     Unaligned single-copy atomicity and atomic functions with a 16-byte address range aligned to 16-bytes are supported

Arm v8.3

I heard that v8.3 was “interesting experiment which needed fixing”…

Name   Description
fcma   Floating-point complex number instructions
jscvt  JavaScript conversion instructions
lrcpc  Load-Acquire RCpc instructions

Arm v8.4

v8.4 got some fixes for v8.3 features. Also pointer authentication stuff was defined.

It is also the lowest level for (optional) nested virtualization (which was added in v8.3 but needed improvements).

Name    Description
dit     Data Independent Timing instructions
ilrcpc  Load-Acquire RCpc instructions v2
paca    Pointer authentication of addresses (PACIA, AUTIA and related instructions)
pacg    Generic pointer authentication (PACGA instruction)

Arm v8.5

v8.5 was the time of the Spectre, Meltdown etc. vulnerabilities and fixes for them. Several cores at lower architecture levels implemented those fixes too.

Some interesting security features are BTI and MTE.

Name    Description
bti     Branch Target Identification
flagm2  Enhancements to flag manipulation instructions
frint   Floating-point round to integer instructions (FRINT32*, FRINT64*)
mte     Memory Tagging Extension
mte3    MTE Asymmetric Fault Handling
rng     Random number generator

Arm v8.6

This is so far into “Arm fairy tales” that I do not know what to write here.

Name  Description
ecv   Enhanced Counter Virtualization

Arm v8.7

Like above. And “afp” sounds scary…

Name   Description
afp    Alternate floating-point behaviour
rpres  Increased precision of Reciprocal Estimate and Reciprocal Square Root Estimate
wfxt   WFE and WFI instructions with timeout

Arm v9.0

This is a fork of Arm v8.5 with SVE2 on top. I think that it was an attempt at a new start, as a lot of SoCs still used old cores.

Also “Arm v9” gives marketing boost ;D

Name        Description
sve2        Scalable Vector Extension version 2
sveaes      Scalable Vector AES instructions
svebitperm  Scalable Vector Bit Permutes instruction
svepmull    Scalable Vector PMULL instructions
svesha3     Scalable Vector SHA3 instructions
svesm4      Scalable Vector SM4 instructions

Arm v9.2

“v9.2 is the new v8.7” could be a marketing slogan.

Name       Description
ebf16      AArch64 Extended BFloat16 instructions
sme        Scalable Matrix Extension
smeb16f32  SME support for instructions that accumulate BFloat16 outer products into FP32 single-precision floating-point tiles
smef16f32  SME support for instructions that accumulate FP16 half-precision floating-point outer products into FP32 single-precision floating-point tiles
smef32f32  SME support for instructions that accumulate FP32 single-precision floating-point outer products into single-precision floating-point tiles
smef64f64  SME support for instructions that accumulate into FP64 double-precision floating-point elements in the ZA array
smefa64    Full Streaming SVE mode instructions
smei8i32   SME support for instructions that accumulate 8-bit integer outer products into 32-bit integer tiles
smei16i64  SME support for instructions that accumulate into 64-bit integer elements in the ZA array
sveebf16   AArch64 Extended BFloat16 instructions (SVE)
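The grouping above can also be applied to a single system's feature list. The sketch below is only an illustration: the LEVELS mapping follows the grouping used in this post and covers a subset of the entries, and the names LEVELS and group_by_level are made up. Since most features are optional and can appear in cores of an older architecture level, the result describes the features, not the architecture version of the core.

```python
# Subset of the tables above: architecture level -> feature names.
LEVELS = {
    "v8.0": {"fp", "asimd", "evtstrm", "cpuid",
             "aes", "crc32", "pmull", "sha1", "sha2"},
    "v8.1": {"asimdrdm", "atomics"},
    "v8.2": {"asimddp", "asimdfhm", "asimdhp", "dcpop", "dcpodp",
             "fphp", "sve", "uscat"},
    "v8.3": {"fcma", "jscvt", "lrcpc"},
    "v8.4": {"dit", "ilrcpc", "paca", "pacg"},
    "v8.5": {"bti", "flagm2", "frint", "mte", "mte3", "rng"},
    "v9.0": {"sve2", "sveaes", "svebitperm", "svepmull"},
}


def group_by_level(features):
    """Group a set of feature names by the levels listed in LEVELS."""
    grouped = {}
    for level, names in LEVELS.items():
        present = sorted(features & names)
        if present:
            grouped[level] = present
    # Anything not covered by the mapping is reported separately.
    unknown = sorted(features - set().union(*LEVELS.values()))
    if unknown:
        grouped["unknown"] = unknown
    return grouped


# Features line from the cpuinfo dump quoted at the start of this post.
SAMPLE_FEATURES = set(
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp "
    "cpuid asimdrdm lrcpc dcpop asimddp".split()
)

grouped = group_by_level(SAMPLE_FEATURES)
for level, names in grouped.items():
    print(level, " ".join(names))
```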

Will the table be useful?

I hope that the AArch64 SoC features table will be useful. Once populated with data, it will show which SoCs have the features we want to target. Of course there are some problems:

