CPUs are incredibly complex beasts.
One of the key features of a CPU is the cache.
It's not a flashy feature.
It doesn't advertise as well as the core count or peak boost frequency.
It is critical to performance, though.
Why Cache?
Modern CPUs are incredibly fast.
They perform more than five billion operations every second.
Keeping the CPU fed with data when it operates that fast is difficult.
The RAM has enough capacity to supply the CPU with data.
It can even transfer many gigabytes of data every second, thanks to very high bandwidths.
That's not the problem, though.
The problem is latency.
RAM can respond "very quickly."
The problem is that "very quickly" is a long time when you do five billion things every second.
Even the fastest RAM has a latency above 60 nanoseconds.
Again, 60 nanoseconds sounds like no time at all.
The problem is that if the CPU ran at 1GHz, it would take 1ns to complete a cycle.
With high-end CPUs hitting 5.7GHz, that's one cycle every 175 picoseconds.
How are those 60 nanoseconds of latency looking now?
That's 342 cycles of latency.
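The arithmetic above is easy to verify. A quick sketch, using the figures from the text (a 60ns DRAM latency and a 5.7GHz clock):

```python
# Worked version of the latency arithmetic above (figures from the text).
ram_latency_ns = 60   # DRAM latency in nanoseconds
clock_ghz = 5.7       # high-end boost clock in GHz

cycle_time_ns = 1 / clock_ghz                 # time per cycle, in nanoseconds
stall_cycles = ram_latency_ns / cycle_time_ns # cycles spent waiting on RAM

print(round(cycle_time_ns * 1000))  # ~175 picoseconds per cycle
print(round(stall_cycles))          # ~342 cycles of latency
```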
That sort of latency would be a killer for any CPU performance.
To get around that, a cache is used.
The cache is placed on the CPU die itself.
It's also much smaller than RAM and uses a different structure, SRAM rather than DRAM.
This makes it much quicker to respond than the main system RAM.
Lower tiers are faster but smaller.
L1 can have a latency of four or five clock cycles, much better than 342.
But Some CPUs Mention an L0?
The terminology for L1, L2, and L3 is pretty standard.
The vague understanding of what they mean and do is relatively common, even across CPU vendors.
This is because they're governed by material and electrical physics; not much can change.
You can have a fast cache or a big cache, not both.
It needs to be bigger if you share a cache between multiple cores.
To that end, L1 and L2 tend to be core specific.
The larger L3 cache tends to be shared between some or all cores on the CPU or chiplet.
The L0 name doesn't help you understand what it means, though.
You can probably guess some things, though.
The other name it goes by can help a bit; that's the micro-op cache.
Instead of caching data from memory, or full instructions, L0 caches micro-ops.
As we recently described, a micro-op is a feature of modern CPUs.
Instructions in x86 and other ISAs are big, complex, and challenging to fit efficiently in a pipeline.
You can pipeline them much more efficiently if you break them down into constituent micro-ops.
CPU Architecture ft Micro-Op Cache
To execute an instruction, a modern CPU decodes it.
This involves splitting the instruction into its constituent micro-ops and determining the memory locations that should be referenced.
Software, though, is full of loops, which means the exact same instructions get called again and again.
This then means that the same micro-ops get generated again and again.
And if the same micro-ops are needed repeatedly, they can be cached.
Conclusion
L0 cache is another name for the micro-op cache.
It can be a part of modern CPUs that utilize micro-operations.
It typically holds a few thousand entries and has capacities listed in numbers of entries rather than bytes.
L0 can be accessed faster than L1, typically with a 1- or 0-cycle latency.