1 And in Conclusion
Replacement policies:
For direct-mapped caches, each block of memory maps to one specific block in our cache. On a cache miss, if there is data present in that cache block, then we must evict the block to make room for our new data.
For non-direct-mapped caches, we can choose among multiple cache blocks when placing new data. When the relevant set is full, we must decide which block to evict to make space for the new data. Block replacement policies make that decision.
- Least recently used (LRU)
- First-in, first-out (FIFO)
- Random
Table 1: Comparison of cache replacement policies.
| Feature | LRU | FIFO | Random |
|---|---|---|---|
| Description | Replace the entry that has not been used for the longest time | Replace the oldest block in the set (queue) | Replace a random block |
| Implementation | Bit counters for each cache block; see quick check | FIFO queue or similar approximation | Flip a figurative hardware coin |
| Advantage | Temporal locality | Reasonable approximation to LRU | Easy to implement |
| Disadvantage | Complicated hardware to keep track of access history | Ignores recency: may evict a block that is still heavily used | Terrible if workload leverages high temporal locality |
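The three policies in Table 1 can be compared with a toy simulation of a single 4-way set. This is a sketch, not real hardware: the `simulate` helper, the tag values, and the access pattern are all made up for illustration. The pattern reuses block 0 heavily, so LRU (which tracks recency) misses less often than FIFO (which tracks only age).

```python
import random
from collections import OrderedDict, deque

def simulate(policy, accesses, ways=4):
    """Count misses for one cache set under a given replacement policy."""
    misses = 0
    if policy == "LRU":
        set_ = OrderedDict()               # key order tracks recency of use
        for tag in accesses:
            if tag in set_:
                set_.move_to_end(tag)      # hit: mark as most recently used
            else:
                misses += 1
                if len(set_) == ways:
                    set_.popitem(last=False)  # evict least recently used
                set_[tag] = True
    elif policy == "FIFO":
        set_ = deque()                     # front of queue = oldest block
        for tag in accesses:
            if tag not in set_:
                misses += 1
                if len(set_) == ways:
                    set_.popleft()         # evict oldest, regardless of use
                set_.append(tag)
    else:                                  # Random
        set_ = []
        for tag in accesses:
            if tag not in set_:
                misses += 1
                if len(set_) == ways:
                    set_.pop(random.randrange(ways))  # figurative coin flip
                set_.append(tag)
    return misses

# A pattern with high temporal locality: block 0 is reused constantly.
pattern = [0, 1, 0, 2, 0, 3, 0, 4, 0, 5, 0, 6]
print("LRU misses:", simulate("LRU", pattern))    # 7 (only cold misses)
print("FIFO misses:", simulate("FIFO", pattern))  # 8 (evicts hot block 0)
```

FIFO eventually evicts block 0 simply because it is the oldest, even though it is the most useful block, which is exactly the recency-blindness noted in the table.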
Write policies:
Table 2: Comparison of cache write policies.
| Feature | Write-through | Write-back |
|---|---|---|
| Description | Write to the cache and memory at the same time | Write to the cache. Only write back to memory when the block is replaced. |
| Implementation | Easy | More complicated |
| “Synchronized” with memory? | Yes | No. Sometimes the cache will have the most current copy of the data. |
| Optimization | – | Include a dirty bit to only write back modified blocks. |
| Hit time for writes | Each write to a cache block also requires a write to memory. Longer. | Multiple writes within a cache block require only one write to memory (the latter happens only on replacement). Faster. |
| Miss penalty | “Read” misses are fast and never result in a write to memory. Shorter. | A “read” miss has variable time, since a write to memory may also be needed if the block to be replaced is dirty. Longer on average. |
| AMAT | Longer | Shorter |
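The hit-time row of Table 2 comes down to counting memory writes. A minimal sketch, under assumed simplifications (every store hits the same cached block, and a write-back cache flushes the dirty block exactly once when it is replaced); the `memory_writes` helper is invented for illustration:

```python
def memory_writes(writes, policy):
    """Count main-memory writes for a stream of stores to one cache block.

    Toy model: all stores hit the same resident block; a write-back cache
    writes the (dirty) block to memory once, at replacement time.
    """
    if policy == "write-through":
        return len(writes)         # every store also goes to memory
    else:                          # write-back
        return 1 if writes else 0  # single flush when the block is evicted

stores = ["a", "b", "c", "d", "e"]  # five stores to the same block
print(memory_writes(stores, "write-through"))  # 5 memory writes
print(memory_writes(stores, "write-back"))     # 1 memory write
```

Five stores to one block cost five memory writes under write-through but only one under write-back, which is why write-back wins on hit time and AMAT in the tables above.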
2 Textbook Readings
P&H 5.1-5.4, 5.8, 5.9, 5.13
3 Additional References
Amazing Illustrations by Ketrina (Yim) Thompson: CS Illustrated Cache Handouts
4 Exercises
Check your knowledge!
4.1 Short Exercises
Solution to Exercise 1
False.
Solution to Exercise 2
True. A direct-mapped cache needs an index for every block of the cache, whereas a 4-way set-associative cache needs an index only for every set of 4 blocks. With the same total number of blocks, the 4-way set-associative cache will have 2 fewer index bits than the direct-mapped cache.
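The bit counting here can be checked directly: index bits = log2(number of sets), and going from direct-mapped to 4-way divides the set count by 4, removing log2(4) = 2 bits. A quick sketch (the `index_bits` helper and the 64-block size are illustrative, not from the exercise):

```python
import math

def index_bits(num_blocks, ways):
    """Index bits = log2(number of sets), where sets = blocks / ways."""
    sets = num_blocks // ways
    return int(math.log2(sets))

blocks = 64
print(index_bits(blocks, 1))  # direct-mapped: log2(64) = 6 index bits
print(index_bits(blocks, 4))  # 4-way: log2(64 / 4) = 4 index bits, 2 fewer
```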
Solution to Exercise 3
False. Similar to the previous question, the impact depends on the program. If a program iterates through contiguous memory (like an array), having larger block sizes with fewer blocks may be beneficial, as each block contains more contiguous data. For instance, if Cache A has 10 blocks and a block size of 8 bytes while Cache B has 20 blocks and a block size of 4 bytes, and we loop through an array of 80 one-byte characters, Cache A will experience 10 cache misses and 70 hits, while Cache B will have 20 misses and 60 hits.
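The arithmetic in this solution follows from one miss per block touched during a sequential scan (assuming a cold cache and no conflict misses). A sketch that reproduces the numbers; the `misses_for_scan` helper is invented for illustration:

```python
def misses_for_scan(array_bytes, block_size):
    """Sequential scan of a cold cache: one compulsory miss per block
    touched, with all other accesses to that block hitting."""
    return array_bytes // block_size

n = 80  # bytes in the array
for name, block in [("Cache A", 8), ("Cache B", 4)]:
    m = misses_for_scan(n, block)
    print(f"{name}: {m} misses, {n - m} hits")  # A: 10/70, B: 20/60
```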