ARMology

When last time I was in Cambridge we had a discussion about ARM processors. Paweł used term “ARMology” then. And with recent announcement of Cortex-A12 cpu core I thought that it may be a good idea to write a blog post about it.

Please note that my knowledge of ARM processors started in 2003 so I can make mistakes in everything older. Tried to understand articles about old times but sometimes they do not keep one version of story.

Ancient times

ARM1 got released in 1985 as CPU add-on to BBC Micro manufactured by Acorn Computers Ltd. as result of few years of research work. They wanted to have new processor to replace ageing 6502 used in BBC Micro and Acorn Electron and none of existing ones did not fit their requirements. Note that it was not market product but rather development tool made available for selected users.

But it was ARM2 which landed in new computers — Acorn Archimedes (1987 year). Had multiply instructions added so new version of instruction set was created: ARMv2. Just 8MHz clock but remember that it was first computer with new CPU

Then ARM3 came — with cache controller integrated and 25MHz clock. ISA was bumped to ARMv2a due to SWP instruction added. And it was released in another Acorn computer: A5000. This was also used in Acorn A4 which was first ARM powered laptop (but term “ARM Powered” was created few years later). I hope that one day I will be able to play with all those old machines…

There was also ARM250 processor with ARMv2a instruction set like in ARM3 but no cache controller. But it is worth mentioning as it can be seen as first SoC due to ARM, MEMC, VIDC, IOC chips integrated in one piece of silicon. This allowed to create budget versions of computers.

ARM Ltd.

In 1990 Acorn, Apple and VLSI co-founded Advanced RISC Machines Ltd. company which took over research and development of ARM processors. Their business model was simple: “we work on cpu cores and other companies pay us license costs to make chips”.

Their first cpu was ARM60 with new instruction set: ARMv3. It had 32bit address space (compared to 26bit in older versions), was endian agnostic (so both big and little endian was possible) and there were other improvements.

Please note lack of ARM4 and ARM5 processors. I heard some rumours about that but will not repeat them here as some of them just do not fit when compared against facts.

ARM610 was powering Apple Newton PDA and first Acorn RiscPC machines where it was replaced by ARM710 (still ARMv3 instruction set but ~30% faster).

First licensees

You can create new processor cores but someone has to buy them and manufacture… In 1992 GEC Plessey and Sharp licensed ARM technology, next year added Cirrus Logic and Texas Instruments, then AKM (Asahi Kasei Microsystems) and Samsung joined in 1994 and then others…

From that list I recognize only Cirrus Logic (used their crazy EP93xx family), TI and Samsung as vendors of processors ;D

Thumb

One of next cpu cores was ARM7TDMI (Thumb+Debug+Multiplier+ICE) which added new instruction set: Thumb.

The Thumb instructions were not only to improve code density, but also to bring the power of the ARM into cheaper devices which may primarily only have a 16 bit datapath on the circuit board (for 32 bit paths are costlier). When in Thumb mode, the processor executes Thumb instructions. While most of these instructions directly map onto normal ARM instructions, the space saving is by reducing the number of options and possibilities available — for example, conditional execution is lost, only branches can be conditional. Fewer registers can be directly accessed in many instructions, etc. However, given all of this, good Thumb code can perform extremely well in a 16 bit world (as each instruction is a 16 bit entity and can be loaded directly).

ARM7TDMI landed nearly everywhere - MP3 players, cell phones, microwaves and any place where microcontroller could be used. I heard that few years ago half of ARM Ltd. income was from license costs of this cpu core…

ARM7

But ARM7 did not ended at ARM7TDMI… There was ARM7EJ-S core which used ARMv5TE instruction set and also ARM720T and ARM740T with ARMv4T. You can run Linux on Cirrus Logic CLPS711x/EP721x/EP731x ones ;)

According to ARM Ltd. page about ARM7 the ARM7 family is the world’s most widely used 32-bit embedded processor family, with more than 170 silicon licensees and over 10 Billion units shipped since its introduction in 1994.

ARM8

I heard that ARM8 is one of those things you should not ask ARM Ltd. people about. Nothing strange when you look at history…

ARM810 processor made use of ARMv4 instruction set and had 72MHz clock. At same time DEC released StrongARM with 200MHz clock… 1996 was definitively year of StrongARM.

In 2004 I bought my first Linux/ARM powered device: Sharp Zaurus SL-5500.

ARM9

Ah ARM9… this was huge family of processor cores…

ARM moved from a von Neumann architecture (Princeton architecture) to a Harvard architecture with separate instruction and data buses (and caches), significantly increasing its potential speed.

There were two different instruction sets used in this family: ARMv4T and ARMv5TE. Also some kind of Java support was added in the latter one but who knows how to use it — ARM keeps details of Jazelle behind doors which can be open only with huge amount of money.

ARMv4T

Here we have ARM9TDMI, ARM920T, ARM922T, ARM925T and ARM940T cores. I mostly saw 920T one in far too many chips.

My collection includes:

ARMv5T

Note: by ARMv5T I mean every cpu never mind which extensions it has built-in (Enhanced DSP, Jazelle etc).

I consider this one to be most popular one (probably after ARM7TDMI). Countless companies had own processors based on those cores (mostly on ARM926EJ-S one). You can get them even in QFP form so hand soldering is possible. CPU frequency goes over 1GHz with Kirkwood cores from Marvell.

In my collection I have:

Had also at91sam9m10, Kirkwood based Sheevaplug and ixp425 based NSLU2 but they found new home.

ARM10

Another quiet moment in ARM history. ARM1020E, ARM1022E, ARM1026EJ-S cores existed but did not looked popular.

UPDATE: Conexant uses ARM10 core in their next generation DSL CPE systems such as bridge/routers, wireless DSL routers and DSL VoIP IADs.

ARM11

Released in 2002 as four new cores: ARM1136J, ARM1156T2, ARM1176JZ and ARM11 MPCore. Several improvements over ARM9 family including optional VFP unit. New instruction set: ARMv6 (and ARMv6K extensions). There was also Thumb2 support in arm1156 core (but I do not know did someone made chips with it). arm1176 core got TrustZone support.

I have:

Currently most popular chip with this family is BCM2835 GPU which got arm1136 cpu core on die because there was some space left and none of Cortex-A processor core fit there.

Cortex

New family of processor cores was announced in 2004 with Cortex-M3 as first cpu. There are three branches:

All of them (with exception of Cortex-M0 which is ARMv6) use new instruction sets: ARMv7 and Thumb-2 (some from R/M lines are Thumb-2 only). Several cpu modules were announced (some with newer cores):

I will not cover R/M lines as did not played with them.

Cortex-A8

Announced in 2006 single core ARMv7a processor core. Released in chips by Texas Instruments, Samsung, Allwinner, Apple, Freescale, Rockchip and probably few others.

Has higher clocks than ARM11 cores and achieves roughly twice the instructions executed per clock cycle due to dual-issue superscalar design.

So far collected:

Cortex-A9

First multiple core design in Cortex family. Allows up to 4 cores in one processor. Announced in 2007. Looks like most of companies which had previous cores licensed also this one but there were also new vendors.

There are also single core Cortex-A9 processors on a market.

I have products based on omap4430 from Texas Instruments and Tegra3 from NVidia.

Cortex-A5

Announced around the end of 2009 (I remember discussion about something new from ARM with someone at ELC/E). Up to 4 cores, mostly for use in all designs where ARM9 and ARM11 cores were used. In other words new low-end cpu with modern instruction set.

Cortex-A15

The fastest (so far) core in ARMv7a part of Cortex family. Up to 4 cores. Announced in 2010 and expanded ARM line with several new things:

I have Chromebook with Exynos5250 cpu and have to admit that it is best device for ARM software development. Fast, portable and hackable.

Cortex-A7

Announced in 2011. Younger brother of Cortex-A15 design. Slower but eats much less power.

Cortex-A12

Announced in 2013 as modern replacement for Cortex-A9 designs. Has everything from Cortex-A15/A7 and is ~40% faster than Cortex-A9 at same clock frequency. No chips on a market yet.

big.LITTLE

That’s interesting part which was announced in 2011. It is not new core but combination of them. Vendor can mix Cortex-A7/12/15 cores to have kind of dual-multicore processor which runs different cores for different needs. For example normal operation on A7 to save energy but go up for A15 when more processing power is needed. And amount of cores in each of them does not even have to match.

It is also possible to make use of all cores all together which may result in 8-core ARM processor scheduling tasks on different cpu cores.

There are few implementations already: ARM TC2 testing platform, HiSilicon K3V3, Samsung Exynos 5 Octa and Renesas Mobile MP6530 were announced. They differ in amount of cores but all (except TC2) use the same amount of A7/A15 cores.

ARMv8

In 2011 ARM announced new 64-bit architecture called AArch64. There will be two cores: Cortex-A53 and Cortex-A57 and big.LITTLE combination will be possible as well.

Lot of things got changed here. VFP and NEON are parts of standard. Lot of work went into making sure that all designs will not be so fragmented like 32-bit architecture is.

I worked on AArch64 bootstrapping in OpenEmbedded build system and did also porting of several applications.

Hope to see hardware in 2014 with possibility to play with it to check how it will play compared to current systems.

Other designs

ARM Ltd. is not the only company which releases new cpu cores. That’s due to fact that there are few types of license you can buy. Most vendors just buy licence for existing core and make use of it in their designs. But some companies (Intel, Marvell, Qualcomm, Microsoft, Apple, Faraday and others) paid for ‘architectural license’ which allows to design own cores.

XScale

Probably oldest one was StrongARM made by DEC, later sold to Intel where it was used as a base for XScale family with ARMv5TEJ instruction set. Later IWMMXT got added in PXA27x line.

In 2006 Intel sold whole ARM line to Marvell which released newer processor lines and later moved to own designs.

There were few lines in this family:

One day I will undust my Sharp Zaurus c760 just to check how recent kernels work on PXA255 ;D

Marvell

Their Feroceon/PJ1/PJ4 cores were independent ARMv5TE implementations. Feroceon was Marvell’s own ARM9 compatible CPU in Kirkwood and others, while PJ1 was based on that and replaced XScale in later PXA chips. PJ4 is the ARMv7 compatible version used in all modern Marvell designs, both the embedded and the PXA side.

Qualcomm

Company known mostly from wireless networks (GSM/CDMA/3G) released first ARM based processors in 2007. First ones were based on ARM11 core (ARMv6 instruction set) and in next year also ARMv7a were available. Their high-end designs (Scorpion and Krait) are similar to Cortex family but have different performance. Company also has Cortex-A5 and A7 in low-end products.

Nexus 4 uses Snapdragon S4 Pro and I also have S4 Plus based Snapdragon development board.

Faraday

Faraday Technology Corporation released own processors which used ARMv4 instruction set (ARMv5TE in newer cores). They were FA510, FA526, FA626 for v4 and FA606TE, FA626TE, FMP626TE and FA726TE for v5te. Note that FMP626TE is dual core!

They also have license for Cortex-A5 and A9 cores.

Project Denver

Quoting Wikipedia article about Project Denver:

Project Denver is an ARM architecture CPU being designed by Nvidia, targeted at personal computers, servers, and supercomputers. The CPU package will include an Nvidia GPU on-chip.

The existence of Project Denver was revealed at the 2011 Consumer Electronics Show. In a March 4, 2011 Q&A article CEO Jen-Hsun Huang revealed that Project Denver is a five year 64-bit ARM architecture CPU development on which hundreds of engineers had already worked for three and half years and which also has 32-bit ARM architecture backward compatibility.

The Project Denver CPU may internally translate the ARM instructions to an internal instruction set, using firmware in the CPU.

X-Gene

AppliedMicro announced that they will release AArch64 processors based on own cores.

Final note

If you spotted any mistakes please write in comments and I will do my best to fix them. If you have something interesting to add also please do a comment.

I used several sources to collect data for this post. Wikipedia articles helped me with details about Acorn products and ARM listings. ARM infocenter provided other information. Dates were taken from Wikipedia or ARM Company Milestones page. Ancient times part based on The ARM Family and The history of the ARM CPU articles. The history of the ARM architecture was interesting and helpful as well.

Please do not copy this article without providing author information. Took me quite long time to finish it.

Changelog

8 June evening

Thanks to notes from Arnd Bergmann I did some changes:

David Alan Gilbert mentioned that ARM1 was not freely available on a market. Added note about it.

aarch64 arm beagleboard chromebook collie development laptop linaro linux nokia nvidia omap openmoko openzaurus pandaboard phone qualcomm ubuntu zaurus