CPUs are incredibly complex beasts.

Memory performance is a key factor in the overall performance of modern CPUs, and increasingly the limiting one.


Why is memory speed so important?

CPUs are incredibly fast, with the latest generations running at 5.7GHz when adequately cooled.

That clock speed means 5.7 billion cycles every second.

Main system memory, known as RAM, is also very fast.

Unfortunately, it's only very fast when compared to anything other than the CPU itself.

The absolute latency on modern high-end RAM is on the order of 60 nanoseconds.

At 5.7GHz, that translates to roughly 342 CPU cycles spent waiting.
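To make that figure concrete, the cycle count is just the latency multiplied by the clock rate. A quick sketch using the numbers above (an arithmetic illustration, not a benchmark):

```python
# Convert RAM latency into CPU cycles, using the figures from the text.
clock_ghz = 5.7       # GHz is conveniently also cycles per nanosecond
ram_latency_ns = 60   # typical high-end RAM absolute latency

stall_cycles = ram_latency_ns * clock_ghz
print(round(stall_cycles))  # → 342
```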

To speed up memory access, CPUs include a cache: a small pool of very fast on-die memory that dynamically stores recently accessed data.

Unfortunately, the CPU cache is also a lot smaller than system RAM, generally totalling less than 100MB across all levels.

Still, despite its diminutive size, the tiered CPU cache system massively increases system performance.

Here comes virtual memory to mess everything up

Modern computers utilise a system called virtual memory.

Rather than allocating physical memory addresses to processes, virtual memory addresses are used.

Each process has its own virtual memory address space.

This has two benefits.

Firstly, it provides easy separation between memory that belongs to one process and memory that belongs to another.

Secondly, it hides the physical memory structure from the process.

This allows the computer to gracefully handle scenarios where more RAM is required than is physically present.

This scheme requires a table, known as a page table, to store all the translations of virtual memory addresses to physical memory addresses.

The size of this table directly depends on the amount of memory in use.

It's generally fairly small, at least when compared to the capacity of system RAM.
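As a rough sketch of what such a translation does, the following assumes 4 KiB pages and an invented single-level mapping; real page tables are multi-level structures walked by hardware, not Python dicts:

```python
# Minimal sketch of virtual-to-physical address translation.
# The page-table contents below are made up for illustration.
PAGE_SIZE = 4096  # 4 KiB pages, the common default on x86-64

page_table = {0: 7, 1: 3, 2: 9}  # virtual page number -> physical frame number

def translate(virtual_addr):
    page = virtual_addr // PAGE_SIZE    # which virtual page?
    offset = virtual_addr % PAGE_SIZE   # position within the page
    frame = page_table[page]            # a missing entry would be a page fault
    return frame * PAGE_SIZE + offset

print(hex(translate(0x1ABC)))  # virtual page 1, offset 0xABC → 0x3abc
```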

Keeping that table in main RAM, however, would mean two memory accesses for every request: one to find the physical address to request and then another to actually access that location.

The CPU cache would fit the bill nicely, at least from a speed perspective.

The problem with that, however, is that the CPU cache is tiny, and already heavily utilised.

What's needed instead is a small, dedicated cache just for address translations, and that is exactly what the Translation Lookaside Buffer, or TLB, is.

It's a high-speed cache for recent address translations.

Any memory request goes via the TLB.

If there's a TLB miss, the full page-table lookup has to be performed against main memory.

Note: There are different schemes for TLB eviction.

Some may use a First In, First Out, or FIFO scheme.

Others may use a Least Frequently Used or LFU scheme.
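A toy model of a TLB using the FIFO scheme mentioned above might look like the following; the capacity and page-table contents are invented for illustration, and real TLBs are fixed-function hardware:

```python
from collections import OrderedDict

class FifoTLB:
    """Toy TLB with First In, First Out eviction."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table      # fallback for misses
        self.entries = OrderedDict()      # virtual page -> physical frame
        self.hits = self.misses = 0

    def lookup(self, page):
        if page in self.entries:
            self.hits += 1                # TLB hit: fast path
            return self.entries[page]
        self.misses += 1                  # TLB miss: walk the page table
        frame = self.page_table[page]
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry (FIFO)
        self.entries[page] = frame
        return frame

# Made-up access pattern over three virtual pages with a 2-entry TLB.
tlb = FifoTLB(capacity=2, page_table={0: 7, 1: 3, 2: 9})
for page in [0, 1, 0, 2, 0]:
    tlb.lookup(page)
print(tlb.hits, tlb.misses)  # → 1 4
```

Note that on a hit the entry is deliberately not refreshed; doing so would turn the scheme into Least Recently Used rather than FIFO.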

By caching recent translations, memory latency can be greatly reduced for TLB hits.
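As a back-of-envelope illustration of why hit rate matters so much, assume a 99% TLB hit rate, a 1-cycle hit, and the ~342-cycle RAM round-trip from earlier as the miss penalty (all assumed figures; real costs vary widely):

```python
# Effective translation cost = weighted average of hit and miss costs.
hit_rate = 0.99   # assumed; real TLB hit rates are workload-dependent
hit_cost = 1      # cycles, assumed
miss_cost = 342   # cycles, the RAM round-trip figure used earlier

effective = hit_rate * hit_cost + (1 - hit_rate) * miss_cost
print(round(effective, 2))  # → 4.41 cycles per translation on average
```

Even a 1% miss rate more than quadruples the average cost, which is why TLB reach is a real performance concern.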

Care must be taken to ensure that cached translations are relevant to the currently active process.

As each process has its own virtual address space, cached translations can't be reused across processes.

Failing to strictly enforce this separation was part of the cause behind the Meltdown vulnerability.