Every processor ever built contains an underlying “architecture,” representing deep-seated characteristics that transcend any single CPU core or physical design. This architecture defines how a processor works, what it can do, how memory is accessed, and much more. A change in processor architecture marks a major milestone, complete with all-new physical hardware designs, instruction sets, and capabilities.
When it comes to smartphones, we’ve been using processors based on Arm’s Armv8 architecture and revisions for the better part of a decade. The arrival of Armv9 will soon be followed by all-new CPU cores destined for next-gen SoCs packed into future smartphones. With that crash course out of the way, let’s talk about Arm’s latest Armv9 architecture.
Armv9 is the first new Arm architecture in a decade and will define the next generation of mobile, server, and other processors over the next 10 years. For starters, Arm boasts that the next two generations of CPU designs will see a 30% improvement over today’s highest performance Cortex-X1 CPU core. That’s not including clock speed and other manufacturing benefits that might help eke out even more performance. The other key takeaways are that Armv9 will be much faster than Armv8 for machine learning workloads and also much more secure to help protect our most sensitive data.
Armv9: Faster machine learning for everyone
Arm is keeping the exact inner workings of Armv9 close to its chest for now. We’ll likely have to wait for the first processors based on the architecture, expected later in 2021, to find out more. But we know quite a bit about the advanced machine learning and security features that make up the bulk of the improvements in Armv9.
Let’s start with the math-crunching improvements, which come from enhanced matrix math capabilities and the second generation of Arm’s Scalable Vector Extension (SVE2). The first-gen SVE was designed for the Fugaku supercomputer, but SVE2 has been distilled down for general-purpose computers. SVE2 builds on the principles of Arm’s NEON SIMD instruction set but is redesigned from scratch for improved data parallelism. Importantly, SVE2 also takes on the kind of work NEON handles today, so it can be used for digital signal processing (DSP) functions.
Like the original SVE, SVE2 allows for flexible rather than fixed vector lengths: an implementation can pick any width from 128 bits up to 2048 bits in 128-bit increments. This gives CPU designers greater control over the number-crunching capabilities of their CPU cores. It also supports new data types and instructions, such as bitwise permute, complex integer multiply-add with rotate, and multi-precision arithmetic instructions for large integer math and cryptography. SVE2 is also designed to accelerate common algorithms used for computer vision, multimedia, LTE baseband processing, web serving, and more.
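To make the flexible vector length idea more concrete, here is a minimal Python sketch. It is not Arm code and does not use real SVE2 instructions; the function names (legal_vector_lengths, vla_add) and the 32-bit element size are assumptions chosen for illustration. It models a loop that never hard-codes the vector width and simply switches off unused lanes on the final pass, loosely mirroring how SVE-style predication handles loop tails.

```python
def legal_vector_lengths():
    """Vector widths the text describes: 128 to 2048 bits in 128-bit steps."""
    return list(range(128, 2048 + 1, 128))


def vla_add(a, b, vector_bits, elem_bits=32):
    """Add two lists element-wise, one simulated vector at a time.

    Returns (result, iterations). The loop body is identical for every
    vector width; only the number of lanes per pass changes.
    """
    if vector_bits not in legal_vector_lengths():
        raise ValueError(f"{vector_bits} is not a multiple of 128 in 128..2048")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    lanes = vector_bits // elem_bits  # elements processed per simulated vector
    out = [0] * len(a)
    iterations = 0
    i = 0
    while i < len(a):
        # Simplified stand-in for a predicate: only lanes that still map to
        # real data are active, so no separate scalar tail loop is needed.
        active = min(lanes, len(a) - i)
        for lane in range(active):
            out[i + lane] = a[i + lane] + b[i + lane]
        i += lanes
        iterations += 1
    return out, iterations


if __name__ == "__main__":
    data = list(range(10))
    ones = [1] * 10
    for width in (128, 512):
        result, passes = vla_add(data, ones, width)
        print(width, passes, result)
```

With ten 32-bit elements, the simulated 128-bit core needs three passes and the 512-bit core needs one, yet both produce the same result from the same loop. That is the appeal for CPU designers: the same code can run on narrow and wide implementations alike.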
SVE2 will greatly accelerate machine learning performance and other DSP workloads directly on the CPU, lessening the need for external DSP and AI processing hardware. The age of heterogeneous compute certainly isn’t over. Still, Arm sees these functions as so essential for the future of computing that every CPU should be capable of performing them efficiently.
Armv9: Improved hardware-based security
The importance of security in modern processors can’t be overstated. I’m sure you all recall the fuss made about exploits like Heartbleed, Spectre, and the like. Preventing memory-leak and overflow problems like these, and avoiding new ones in the future, requires new hardware-based approaches to security. Armv9 includes a couple of important ones: the Memory Tagging Extension (MTE) and the Realm Management Extension (RME), the latter forming the basis of Arm’s Confidential Compute Architecture (CCA).
Tagged memory may sound familiar to those who closely follow Android development, as this feature is already supported by Android 11, as well as openSUSE. Arm debuted memory tagging in Armv8.5, but there aren’t any mobile CPU cores built on this revision. MTE is designed to prevent memory vulnerabilities with a “lock and key” approach to access. A pointer and the memory it refers to are given matching tags when the memory is allocated, and the tags are compared during load/store instructions to ensure the memory is being accessed through the correct pointer. Exceptions are raised on a mismatch, allowing developers to track down potential security problems.
Running memory tagging in hardware on the CPU reduces the performance penalty from this checking process. Likewise, hardware-based checks are much more tamperproof, making it much harder for malicious actors to produce exploits.
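The lock-and-key check described above can be modelled in a few lines of Python. This is a toy software simulation, not how the hardware is implemented. The class and method names (TaggedHeap, malloc, free, load, store, TagMismatch) are hypothetical, and the allocator’s policy of handing out tags 1 to 15 in sequence and resetting freed memory to tag 0 is an assumption made so the example is deterministic. The 4-bit tags, 16-byte granules, and tag storage in the pointer’s top bits follow Arm’s published MTE design.

```python
GRANULE = 16                       # MTE tags memory in 16-byte granules
TAG_SHIFT = 56                     # the 4-bit tag sits in the pointer's top byte
TAG_MASK = 0xF
ADDR_MASK = (1 << TAG_SHIFT) - 1


class TagMismatch(Exception):
    """Stands in for the exception raised when key and lock differ."""


class TaggedHeap:
    def __init__(self, size):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)  # one "lock" per granule
        self.next_addr = 0
        self.next_tag = 1                    # assumption: tag 0 means "free"

    def _retag(self, addr, size, tag):
        first = addr // GRANULE
        last = (addr + size + GRANULE - 1) // GRANULE
        for granule in range(first, last):
            self.tags[granule] = tag

    def malloc(self, size):
        size = -(-size // GRANULE) * GRANULE      # round up to whole granules
        if self.next_addr + size > len(self.data):
            raise MemoryError("toy heap exhausted")
        addr, tag = self.next_addr, self.next_tag
        self.next_addr += size
        self.next_tag = self.next_tag % 15 + 1    # cycle through 1..15
        self._retag(addr, size, tag)              # set the lock
        return (tag << TAG_SHIFT) | addr          # the pointer carries the key

    def free(self, ptr, size):
        # Changing the lock means stale pointers no longer fit.
        self._retag(ptr & ADDR_MASK, size, 0)

    def _check(self, ptr):
        key = (ptr >> TAG_SHIFT) & TAG_MASK
        addr = ptr & ADDR_MASK
        if self.tags[addr // GRANULE] != key:
            raise TagMismatch(f"key {key} does not match lock at {addr:#x}")
        return addr

    def load(self, ptr):
        return self.data[self._check(ptr)]

    def store(self, ptr, value):
        self.data[self._check(ptr)] = value


if __name__ == "__main__":
    heap = TaggedHeap(64)
    a = heap.malloc(16)
    b = heap.malloc(16)
    heap.store(a, 7)
    print(heap.load(a))
    for label, bad_ptr in (("overflow", a + 16), ("use after free", a)):
        if label == "use after free":
            heap.free(a, 16)
        try:
            heap.load(bad_ptr)
        except TagMismatch as err:
            print(label, "caught:", err)
```

In this sketch, running off the end of buffer a into buffer b fails because a’s key does not match b’s lock, and reading a after it has been freed fails because the lock has changed. Those two cases correspond to the overflow and stale-pointer bugs the feature is meant to expose.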
Arm’s Realm Management Extension and CCA are even broader in scope. They build on the ideas of Arm TrustZone, allowing applications to run in their own secure environment isolated from the main operating system and other applications. Unlike hypervisors and virtual machines, which run separate operating systems side by side, Realms also support the secure separation of individual apps and services that share a common OS. You can think of this like Linux containers, only even more secure and built into the hardware.
The idea is simple enough. Each Realm can’t see what the others are doing, greatly reducing the risk of sensitive data leaking to a compromised app or even the operating system. So your banking app’s software and processing resources are securely separated from a game you’re running, which is isolated from Facebook, etc. Hardware-based security features like this are increasingly important to protect sensitive data, like biometric information, stored on our devices.
However, we’ll need to wait to learn more about exactly how Arm accomplishes this, what’s exposed between services, how the OS shares resources, etc. We do know that Realms require major changes throughout operating systems such as Google’s Android. As such, Realms won’t be supported with first-generation Armv9 processors. The feature is expected to appear a little later in the architecture’s lifecycle.
The first Armv9 processors
Arm’s Armv9 architecture will make its way to Arm microcontroller, real-time, and application processors over the coming years. The first will fall under the Cortex-A line destined for smartphone SoCs, followed by server chips. Arm anticipates we’ll see our first Armv9 chipset for mobile phones announced this year, with the first devices landing on the market in 2022.
Tucked away in Arm’s press briefing, there was also a slide on upcoming Mali GPU features. These include variable-rate shading and ray tracing, two features currently turning heads in the game console and high-end graphics card markets. There’s plenty to look forward to from the broader Arm hardware portfolio in the coming years.