1Learning Outcomes¶
Define key components of the memory hierarchy: processor, caches, memory, disk.
Explain how caches leverage temporal and spatial locality.
Trace memory access with caches.
Get familiar with key cache terminology: cache hit, cache miss, and cache line (i.e., block).
🎥 Lecture Video: Locality, Design, and Management
🎥 Lecture Video: Memory Hierarchy
🎥 Lecture Video: Actual CPUs
2The Memory Hierarchy, Revisited¶
We now continue our earlier discussion of the memory hierarchy. Earlier, we assumed there were only two layers in our memory hierarchy: registers (on the CPU) and memory (DRAM, which is close by but on a separate chip).

Figure 3:Great Idea 3: The Principle of Locality / Memory Hierarchy
The mismatch between processor and memory speeds (the “careful tango” described earlier) leads us to add a new level: the memory cache, or cache for short. Caches are usually on the same chip as the CPU and fit into the memory hierarchy as follows:
Size: Smaller than memory, but certainly larger than the 32 registers on our RISC-V processor.
Speed: Use hardware that is much faster than DRAM (used for main memory), but slower than registers.
Cost: Use hardware that is more expensive than DRAM.
There are additional levels lower than main memory: disk is a huge one (literally). Just as the cache contains a copy of a subset of data in main memory, main memory contains copies of data on disk.
Data moves differently between different levels of the memory hierarchy:
Registers and memory: Managed by the compiler. Loads and stores move data in and out.
Cache and memory: Managed by cache controller hardware. We will describe the high-level operation, but leave the implementation to a later course.
Memory and disk: Managed by the operating system and special hardware via virtual memory, a concept that we will discuss later.[1] Additionally managed by the programmer/user via files and file streams.
If useful, revisit Jim Gray’s analogy of data access times for registers, the cache, main memory, and disk.

Figure 2:Great Idea 3: The Principle of Locality / Memory Hierarchy
2.1Multi-Level Caches¶
You may have noticed that the memory hierarchy diagram contains multiple caches labeled Level 1, Level 2, and Level 3. A computer can have multiple caches, where each cache holds a copy of a subset of the data from the level below it in the memory hierarchy.
Consider Apple’s A14 bionic chip, which we introduced earlier:

Figure 2:Apple A14 Bionic Chip (sources: Wikipedia, TechInsights)
The System Level Cache labeled in the diagram is likely a Level 3 cache, shared across multiple CPU cores.[2]
L1 cache (L1$[3]): Usually directly embedded on the CPU, which is why it is not labeled in the above diagram.
L2 cache (L2$): Located on the integrated circuit, often adjacent to the CPU.
Size: Tens or hundreds of MiB.
Hit Time: Few clock cycles
Miss rate: 10-20%
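Hit time and miss rate together determine how fast the cache appears on average. As a back-of-the-envelope sketch using the figures above (the 100-cycle miss penalty is an assumption for illustration, not a number from this text):

```python
# Sketch: average memory access time (AMAT) = hit time + miss rate * miss penalty.
# Hit time and miss rate loosely follow the figures above; the miss penalty
# (cost of going to the next level on a miss) is an assumed value.
hit_time = 3        # cycles: "a few clock cycles"
miss_rate = 0.15    # within the 10-20% range above
miss_penalty = 100  # cycles to fetch from the next level (assumption)

amat = hit_time + miss_rate * miss_penalty
print(f"AMAT = {amat:.1f} cycles")
```

Even a modest miss rate dominates the average: here, misses contribute 15 of the 18 cycles.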
2.2Demo¶
To find out the sizes of different components of the memory hierarchy on a Unix-like machine, we can use df and sysctl. The following commands were run on a macOS machine.
To determine disk size, use df. The default display is in 512-byte blocks; use the -h option for IEC (base-two) prefixes, and the -H option for SI (base-ten) prefixes.
$ df -h
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3s1s1 460Gi 17Gi 38Gi 31% 427k 395M 0% /
devfs 215Ki 215Ki 0Bi 100% 744 0 100% /dev
...
$ df -H
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3s1s1 494G 18G 40G 31% 427k 395M 0% /
devfs 220k 220k 0B 100% 744 0 100% /dev
...
To determine cache size and memory size, use sysctl. Because this command lists all attributes of the system kernel, we pipe the output through grep to get what we want. The default unit is bytes for memory and caches.
$ sysctl -a | grep hw.memsize
hw.memsize: 25769803776
hw.memsize_usable: 25143640064
$ sysctl -a | grep "hw.l.*size"
hw.l1icachesize: 131072
hw.l1dcachesize: 65536
hw.l2cachesize: 4194304
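The raw byte counts that sysctl reports can be converted to IEC units to compare levels of the hierarchy at a glance. A quick sketch (the helper name is illustrative):

```python
# Convert raw byte counts (as reported by sysctl) into IEC units.
def to_iec(n: int) -> str:
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return f"{n:g} {unit}"
        n /= 1024

print(to_iec(131072))       # hw.l1icachesize -> 128 KiB
print(to_iec(65536))        # hw.l1dcachesize -> 64 KiB
print(to_iec(4194304))      # hw.l2cachesize  -> 4 MiB
print(to_iec(25769803776))  # hw.memsize      -> 24 GiB
```

Note the spread: the L1 caches are thousands of times smaller than main memory, which is exactly the size/speed trade-off the hierarchy exploits.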
3Memory Caches¶
Caches are the basis of the memory hierarchy.
How do we create the illusion of a large memory that we can access quickly? From P&H 5.1:
Just as you did not need to access all the books in the library at once with equal probability, a program does not access all of its code or data at once with equal probability. Otherwise, it would be impossible to make most memory accesses fast and still have large memory in computers, just as it would be impossible for you to fit all the library books on your desk and still find what you wanted quickly.
Table 1:Principles of temporal and spatial locality.
| Property | Temporal Locality | Spatial Locality |
|---|---|---|
| Idea | If we use it now, chances are that we’ll want to use it again soon. | If we use a piece of memory, chances are we’ll use the neighboring pieces soon. |
| Library Analogy | We keep a book on the desk while we check out another book. | If we check out volume 1 of a reference book, while we’re at it, we’ll also check out volume 2. Libraries put books on the same topic together on the same shelves to increase spatial locality. |
| Memory | If a memory location is referenced, then it will tend to be referenced again soon. Therefore, keep most recently accessed data items closer to the processor. | If a memory location is referenced, the locations with nearby addresses will tend to be referenced soon. Move lines consisting of contiguous words closer to the processor. |
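The two access orders below make the table concrete. Assuming a C-style row-major layout (rows stored contiguously, an assumption for this sketch), traversing row by row touches consecutive addresses, while traversing column by column strides across memory:

```python
# Sketch: the same matrix, two traversal orders.
# With row-major layout, element (r, c) sits at linear index r * N + c.
N = 4
matrix = [[r * N + c for c in range(N)] for r in range(N)]

# Row-major traversal: visits linear indices 0, 1, 2, 3, ... (contiguous).
row_major = [matrix[r][c] for r in range(N) for c in range(N)]

# Column-major traversal: visits 0, 4, 8, 12, ... (strided by N).
col_major = [matrix[r][c] for c in range(N) for r in range(N)]

print(row_major)  # good spatial locality: neighbors accessed back-to-back
print(col_major)  # poor spatial locality: each access jumps N elements away
```

The row-major order benefits from spatial locality: once a cache line holding several consecutive elements is fetched, the next few accesses hit in the cache.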
4Key Cache Terminology¶
Lines (also called blocks) of data are copied from memory to the cache.
Memory is byte-addressable, meaning each byte in memory has a memory address. This is identical to our concept of memory from earlier.
Consider how memory access works with a cache, as in Figure 1.

Figure 1:Caches in the basic computer layout (from an earlier section).
When a load or store instruction executes, it requests access to memory data. Two situations can occur:
Cache hit: The data you were looking for is in the cache. Retrieve the data from the cache and bring it to the processor.
Cache miss: The data you were looking for is not in the cache. Go to a lower level of the memory hierarchy to find the data and put it in the cache. Then, bring the data to the processor.
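The hit/miss logic above can be sketched as a tiny simulation. This is an illustrative model, not how real cache hardware is organized (it ignores capacity limits, eviction, and associativity): each byte address maps to a line, a lookup in the cache either hits or misses, and a miss fetches the whole line.

```python
# Minimal sketch of cache lookups; names and structure are illustrative.
# Each line holds LINE_SIZE contiguous bytes, so byte address `addr`
# falls in line number addr // LINE_SIZE.
LINE_SIZE = 64
cache = {}          # line number -> cached line (unbounded here, for simplicity)
hits = misses = 0

def access(addr):
    global hits, misses
    line = addr // LINE_SIZE
    if line in cache:
        hits += 1           # cache hit: data is already in the cache
    else:
        misses += 1         # cache miss: fetch the line from the next level
        cache[line] = line
    return cache[line]

# Addresses 0..63 share line 0; 64..127 share line 1.
for addr in (0, 8, 63, 64, 70, 0):
    access(addr)
print(hits, misses)
```

The trace shows both kinds of locality: addresses 8 and 63 hit because address 0 pulled in their whole line (spatial), and the second access to 0 hits because the line is still cached (temporal), giving 4 hits and 2 misses.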
5Storage¶
🎥 Lecture Video: Storage
Written version coming soon!