From a diary of AArch64 porter — Arm CPU features table

This post is part 13 of the "From a diary of AArch64 porter" series.

Last week I had some discussions about the future and about the projects I am involved in. As a kind of break, I started yet another personal project for fun…

AArch64 SoC features table

Let's make a table showing which AArch64 SoCs support which processor features, and how bad the situation is.

Source of data

A Linux system has the /proc/cpuinfo file, which describes the processor: the implementer of the CPU cores, their version and the features they support. For example:

processor       : 0
BogoMIPS        : 26.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd05
CPU revision    : 0

Most of the AArch64 systems I have used show a much shorter list, so I started wondering which SoC has which features listed.
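Reading such a dump programmatically is straightforward. Below is a minimal sketch in Python; it is not the code used for the table, and the function names (parse_cpuinfo, features_of) are made up for illustration. It assumes that each core is a block of "key : value" lines and that blocks are separated by blank lines.

```python
# Sample text copied from the cpuinfo dump shown above.
SAMPLE_CPUINFO = """\
processor       : 0
BogoMIPS        : 26.00
Features        : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant     : 0x1
CPU part        : 0xd05
CPU revision    : 0
"""


def parse_cpuinfo(text):
    """Split cpuinfo text into one dict per core (assumes blank-line separated blocks)."""
    cores = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current core block.
            if current:
                cores.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        cores.append(current)
    return cores


def features_of(core):
    """Return the feature flags of one core as a set of strings."""
    return set(core.get("Features", "").split())


if __name__ == "__main__":
    # On a real AArch64 Linux system, read /proc/cpuinfo instead of the sample.
    for core in parse_cpuinfo(SAMPLE_CPUINFO):
        print(core["processor"], core["CPU part"], sorted(features_of(core)))
```

Keeping the features as a set makes later comparisons between SoCs simple set operations.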

First version

I took some code from my Linux system calls table and started with cpuinfo dumps from my systems. Later I added some entries from the Internet, then a bunch of mobile phones.

The first version of the AArch64 SoC features table was created.

Let's check Linux

Once I got QEMU “max” and Apple M2 cpuinfo dumps I thought that I had all the features I needed. Oh, how wrong I was…

I went to the Linux kernel sources and gathered a list of all supported entries. Now a lot of horizontal scrolling is needed, as the table lists 66 CPU features.
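One possible way to gather such a list automatically is to pull the quoted names out of the kernel's table of hwcap strings. The sketch below is an assumption, not the method used for this table: it assumes the names sit in a designated-initializer string array (similar to the hwcap_str array in arch/arm64/kernel/cpuinfo.c), and the sample text is a short hand-written excerpt in that style rather than a verbatim copy of any kernel version.

```python
import re

# Hand-written excerpt in the style of the kernel's hwcap name table.
# Illustration only; the real array is longer and its layout may differ
# between kernel versions.
SAMPLE_SOURCE = '''
static const char *const hwcap_str[] = {
	[KERNEL_HWCAP_FP]		= "fp",
	[KERNEL_HWCAP_ASIMD]		= "asimd",
	[KERNEL_HWCAP_EVTSTRM]		= "evtstrm",
	[KERNEL_HWCAP_AES]		= "aes",
};
'''

# Matches lines such as:  [KERNEL_HWCAP_FP] = "fp",
ENTRY = re.compile(r'\[KERNEL_HWCAP_\w+\]\s*=\s*"([^"]+)"')


def feature_names(source):
    """Return the feature names in the order they appear in the source text."""
    return ENTRY.findall(source)


if __name__ == "__main__":
    print(feature_names(SAMPLE_SOURCE))
```

To run it against a real kernel tree, read the file contents into a string and pass that to feature_names() instead of the sample.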

How to help

At this moment the AArch64 SoC features table lists 29 systems. More are needed to make it useful.

You can help: submit an issue on GitHub and let me take care of the rest. Needed information:

/proc/cpuinfo file contents
SoC name (like “RK3399”, “M1 Max” etc.)
SoC vendor name (“Rockchip”, “Apple” etc.)
product name (not used in table)
URL of the product page (also not used in table)

The product name and URL may be used later, and they help with checking details.

Features listed

There are many features recognized by the Linux kernel. Let me try to group them by architecture level.

Arm v8.0

This is the base of all AArch64 systems and the most popular level, as it is present in far too many SBC devices. Just a few features are present:

Name	Description
fp	floating point present
asimd	advanced SIMD present
evtstrm	timer event stream generation
cpuid	CPU features can be read

Then come the optional extensions, mostly cryptographic:

Name	Description
aes	AESD and AESE instructions
crc32	CRC32* instructions
pmull	PMULL, PMULL2 instructions
sha1	SHA1* instructions
sha2	SHA256* instructions

Arm v8.1

I was surprised to see just two entries for v8.1:

Name	Description
asimdrdm	Advanced SIMD rounding double multiply accumulate instructions
atomics	Atomic instructions

Arm v8.2

Version 8.2 of the Arm architecture was quite a refresh. New CPU cores were announced, and SVE (Scalable Vector Extension) was defined along with several other calculation extensions. Note that most of them are optional (as usual).

Name	Description
asimddp	Advanced SIMD dot product instructions
asimdfhm	Floating-point half-precision multiplication instructions
asimdhp	Advanced SIMD half-precision floating-point data processing
bf16	AArch64 BFloat16 instructions
dcpodp	DC CVADP instruction
dcpop	DC CVAP instruction
flagm	Flag manipulation instructions v2
fphp	Half-precision floating-point data processing
i8mm	AArch64 Int8 matrix multiplication instructions
sha3	Advanced SIMD SHA3 instructions
sha512	Advanced SIMD SHA512 instructions
sm3	Advanced SIMD SM3 instructions
sm4	Advanced SIMD SM4 instructions
sve	Scalable Vector Extension
svebf16	AArch64 BFloat16 instructions (SVE)
svef32mm	Single-precision Matrix Multiplication (SVE)
svef64mm	Double-precision Matrix Multiplication (SVE)
svei8mm	AArch64 Int8 matrix multiplication instructions (SVE)
uscat	Unaligned single-copy atomicity and atomic functions with a 16-byte address range aligned to 16-bytes are supported

Arm v8.3

I heard that v8.3 was “interesting experiment which needed fixing”…

Name	Description
fcma	Floating-point complex number instructions
jscvt	JavaScript conversion instructions
lrcpc	Load-Acquire RCpc instructions

Arm v8.4

v8.4 got some fixes for v8.3 features. Also pointer authentication stuff was defined.

It is also the lowest level for (optional) nested virtualization (which was added in v8.3 but needed improvements).

Name	Description
dit	Data Independent Timing instructions
ilrcpc	Load-Acquire RCpc instructions v2
paca	Pointer authentication of addresses (PACIA, AUTIA and related instructions)
pacg	Generic pointer authentication (PACGA instruction)

Arm v8.5

v8.5 was the time of Spectre, Meltdown and similar vulnerabilities, and of fixes for them. Several cores at lower architecture levels implemented those fixes too.

Some interesting security features are BTI and MTE.

Name	Description
bti	Branch Target Identification
flagm2	Enhancements to flag manipulation instructions
frint	Floating-point to integer instructions
mte	Memory Tagging Extension
mte3	MTE Asymmetric Fault Handling
rng	Random number generator

Arm v8.6

This is so far into “Arm fairy tales” that I do not know what to write here.

Name	Description
ecv	Enhanced Counter Virtualization

Arm v8.7

Like above. And “afp” sounds scary…

Name	Description
afp	Alternate floating-point behaviour
rpres	Increased precision of Reciprocal Estimate and Reciprocal Square Root Estimate
wfxt	WFE and WFI instructions with timeout

Arm v9.0

This is a fork of Arm v8.5 with SVE2 on top. I think that it was an attempt to have a new start, as a lot of SoCs still used old cores.

Also “Arm v9” gives marketing boost ;D

Name	Description
sve2	Scalable Vector Extension version 2
sveaes	Scalable Vector AES instructions
svebitperm	Scalable Vector Bit Permutes instruction
svepmull	Scalable Vector PMULL instructions
svesha3	Scalable Vector SHA3 instructions
svesm4	Scalable Vector SM4 instructions

Arm v9.2

“v9.2 is the new v8.7” could be a marketing slogan.

Name	Description
ebf16	AArch64 Extended BFloat16 instructions
sme	Scalable Matrix Extension
smeb16f32	SME support for instructions that accumulate BFloat16 outer products into FP32 single-precision floating-point tiles
smef16f32	SME support for instructions that accumulate FP16 half-precision floating-point outer products into FP32 single-precision floating-point tiles
smef32f32	SME support for instructions that accumulate FP32 single-precision floating-point outer products into single-precision floating-point tiles
smef64f64	SME support for instructions that accumulate into FP64 double-precision floating-point elements in the ZA array
smefa64	Full Streaming SVE mode instructions
smei8i32	SME support for instructions that accumulate 8-bit integer outer products into 32-bit integer tiles
smei16i64	SME support for instructions that accumulate into 64-bit integer elements in the ZA array
sveebf16	AArch64 Extended BFloat16 instructions (SVE)
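The grouping above can also be put to work on a cpuinfo dump. The following is a minimal sketch in Python, assuming the level assignments exactly as listed in the tables of this post (they are my grouping, not an official Arm classification); the names LEVELS, FEATURE_LEVEL and group_by_level are made up for illustration.

```python
# Feature groups transcribed from the tables in this post.
LEVELS = {
    "v8.0": "fp asimd evtstrm cpuid aes crc32 pmull sha1 sha2",
    "v8.1": "asimdrdm atomics",
    "v8.2": (
        "asimddp asimdfhm asimdhp bf16 dcpodp dcpop flagm fphp i8mm "
        "sha3 sha512 sm3 sm4 sve svebf16 svef32mm svef64mm svei8mm uscat"
    ),
    "v8.3": "fcma jscvt lrcpc",
    "v8.4": "dit ilrcpc paca pacg",
    "v8.5": "bti flagm2 frint mte mte3 rng",
    "v8.6": "ecv",
    "v8.7": "afp rpres wfxt",
    "v9.0": "sve2 sveaes svebitperm svepmull svesha3 svesm4",
    "v9.2": (
        "ebf16 sme smeb16f32 smef16f32 smef32f32 smef64f64 smefa64 "
        "smei8i32 smei16i64 sveebf16"
    ),
}

# Reverse lookup: feature name -> level it is listed under.
FEATURE_LEVEL = {
    name: level for level, names in LEVELS.items() for name in names.split()
}


def group_by_level(features):
    """Group feature names by level; names not in the tables go to "unknown"."""
    grouped = {}
    for name in features:
        level = FEATURE_LEVEL.get(name, "unknown")
        grouped.setdefault(level, []).append(name)
    return grouped


if __name__ == "__main__":
    # Feature list from the cpuinfo dump at the start of this post.
    sample = (
        "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp "
        "cpuid asimdrdm lrcpc dcpop asimddp"
    ).split()
    for level, names in sorted(group_by_level(sample).items()):
        print(level, " ".join(names))
```

Since most features are optional, a core reporting a feature from a given group is not necessarily a full implementation of that architecture level.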

Will the table be useful?

I hope that the AArch64 SoC features table will be useful. Once populated with data, it will show which SoCs have the features we want to target. Of course there are some problems:

is there any hardware with those features at all?
will we live long enough to see such hardware?
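The kind of lookup the table is meant to enable can be sketched in a few lines of Python. Everything in this example is hypothetical: the SoC names and their feature sets are invented placeholders, not entries from the real table, and socs_with and missing_everywhere are illustrative helper names.

```python
# Hypothetical data: SoC names and feature sets are made up for illustration.
SOC_FEATURES = {
    "example-soc-a": {"fp", "asimd", "aes", "crc32"},
    "example-soc-b": {"fp", "asimd", "aes", "crc32", "atomics", "asimddp"},
    "example-soc-c": {"fp", "asimd", "atomics", "sve"},
}


def socs_with(required, table):
    """Names of SoCs whose feature set contains every required feature."""
    required = set(required)
    return sorted(name for name, feats in table.items() if required <= feats)


def missing_everywhere(required, table):
    """Required features that no SoC in the table reports at all."""
    seen = set().union(*table.values()) if table else set()
    return sorted(set(required) - seen)


if __name__ == "__main__":
    print(socs_with(["atomics", "aes"], SOC_FEATURES))
    print(missing_everywhere(["sve", "sme"], SOC_FEATURES))
```

The second helper answers the first question above for a given data set: a non-empty result means that no listed hardware has those features yet.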