1 Learning Outcomes

2 Conceptual Layout of a Computer

In order to learn an ISA, we must first understand Figure 1, which shows a conceptual layout of a computer:

"TODO"

Figure 1: Basic computer layout (See: von Neumann architecture).

3 Registers

Importantly, the processor is designed to be fast. For example, if a processor runs at 4 GHz, then it can execute instructions on some data once per cycle, or every 0.25 ns (nanoseconds). To keep up with that pace, this data must also be physically located close to the processor!
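The 0.25 ns figure follows directly from the clock frequency. Here is a minimal sketch of that arithmetic; the function name `cycle_time_ns` is ours, chosen for illustration, and the one-instruction-per-cycle assumption is the simplification used in the text.

```python
def cycle_time_ns(freq_ghz: float) -> float:
    """Return the clock period in nanoseconds for a frequency given in GHz."""
    # 1 GHz = 1e9 cycles per second and 1 ns = 1e-9 s,
    # so the period in nanoseconds is simply 1 / (frequency in GHz).
    return 1.0 / freq_ghz


print(cycle_time_ns(4.0))  # 0.25
```

A slower 2 GHz clock would give a 0.5 ns period: halving the frequency doubles the time budget per cycle.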

Consider the speed of light (approximately $3.0 \times 10^8$ m/s), which sets a physical upper bound on how quickly data can be accessed from a given location. In other words, reaching something about 10 cm away already takes about 0.3 ns, which is longer than a single 0.25 ns cycle at 4 GHz (thankfully most of our integrated chips are much smaller than this distance). Nevertheless, in all modern architectures we have at least two pieces of hardware for data:

"TODO"

Figure 2: Great Idea 3: The Principle of Locality / Memory Hierarchy
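To make the speed-of-light argument in this section concrete, the sketch below computes how long light takes to cover 10 cm and how far it gets in one 0.25 ns cycle. It uses the same approximate value of the speed of light as the text; the constant and function names are ours, and real signals in wires travel slower than light in a vacuum, so these are best-case numbers.

```python
SPEED_OF_LIGHT_M_PER_S = 3.0e8  # approximate value, as used in the text


def light_travel_time_ns(distance_m: float) -> float:
    """Best-case one-way travel time in nanoseconds over distance_m meters."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S * 1e9


def light_distance_cm(time_ns: float) -> float:
    """Distance in centimeters that light covers in time_ns nanoseconds."""
    return SPEED_OF_LIGHT_M_PER_S * (time_ns * 1e-9) * 100.0


print(f"{light_travel_time_ns(0.10):.2f} ns")  # 0.33 ns to cover 10 cm
print(f"{light_distance_cm(0.25):.1f} cm")     # 7.5 cm per 4 GHz cycle
```

Under these assumptions, data that must arrive within one 4 GHz cycle cannot sit more than about 7.5 cm away, even before accounting for any circuit delays.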

4 Memory Hierarchy

Each ISA specifies a predetermined number of hardware registers, defining how each of the registers should be used for instruction execution. RISC-V defines 32 registers; read more in the next section.

Recall the picture of the memory hierarchy in Figure 3:

"TODO"

Figure 3: Great Idea 3: The Principle of Locality / Memory Hierarchy

At the very top, we have the processor core with its registers. On a separate chip, we typically have the main memory, implemented using DRAM (Dynamic Random Access Memory). You might have heard of flavors like DDR3, DDR4, or DDR5, or High Bandwidth Memory (HBM). While DRAM is fast, it is not nearly as fast as registers. It is also reasonably priced: a few tens of dollars buys many gigabytes, which gives DRAM medium capacity within the hierarchy.

Physics dictates that smaller is faster. How big is the gap between registers and memory? In terms of capacity, a processor has only about 128 bytes of total register storage[1], while a laptop might have 2 to 64 gigabytes of DRAM[2], and a server might have a terabyte. In terms of latency, registers are about 50 to 500 times faster than DRAM.
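The capacity gap can be worked out from the numbers above. This is a rough sketch assuming an RV32 register file (32 registers of 4 bytes each, as in the footnote) and treating a "gigabyte" as $2^{30}$ bytes; that unit choice and all names in the code are our assumptions.

```python
NUM_REGISTERS = 32        # RISC-V defines 32 registers
BYTES_PER_REGISTER = 4    # RV32: each register is 32 bits wide
GIB = 2**30               # assumption: 1 "gigabyte" taken as 2^30 bytes

register_file_bytes = NUM_REGISTERS * BYTES_PER_REGISTER


def capacity_ratio(dram_gib: int) -> int:
    """How many times larger a DRAM of dram_gib GiB is than the register file."""
    return (dram_gib * GIB) // register_file_bytes


print(register_file_bytes)  # 128
print(capacity_ratio(2))    # 16777216
print(capacity_ratio(64))   # 536870912
```

So even a small 2 GiB laptop memory holds roughly 16 million times more data than the registers, while being only 50 to 500 times slower to access.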

Let’s go back to Jim Gray’s storage latency analogy[3] in Figure 2. Put another way: if retrieving data from a register in your head takes one minute, then retrieving data from memory (which is 100x slower) would be like driving to Sacramento to get a piece of paper you forgot. If the gap is 500x, that’s like driving to Los Angeles and back. That is a massive penalty just to retrieve one isolated data item!
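The analogy is just a rescaling of latencies to human time. As a small sketch (the function name and the one-minute baseline parameter are ours, taken from the analogy rather than from any measurement):

```python
from typing import Tuple


def scaled_access_time(slowdown: int, register_minutes: int = 1) -> Tuple[int, int]:
    """Return (hours, minutes) for an access that is `slowdown` times slower
    than a register access taking `register_minutes` minutes."""
    total_minutes = slowdown * register_minutes
    return divmod(total_minutes, 60)


print(scaled_access_time(100))  # (1, 40)
print(scaled_access_time(500))  # (8, 20)
```

At 100x the one-minute lookup becomes 1 hour 40 minutes, and at 500x it becomes 8 hours 20 minutes, which is where the road-trip comparisons come from.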

We only have a small number of registers: they are extremely fast and share precious real estate with the processor core, which makes them extremely expensive. Designing an ISA (and an associated architecture) therefore involves a careful tango of performing operations on data in registers where possible, and spacing out limited but hefty trips to memory and disk.
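A toy cost model can illustrate why keeping data in registers matters. The sketch below compares two hypothetical ways of summing `n` values from memory: keeping the running total in a register versus keeping it in memory. The cost constants are illustrative assumptions (memory is taken to be 100x slower, within the 50x to 500x range above), not measurements of any real processor, and the model ignores caches entirely.

```python
REGISTER_OP_COST = 1      # hypothetical: one time unit per register operation
MEMORY_ACCESS_COST = 100  # hypothetical: each memory access is 100x slower


def cost_accumulate_in_register(n: int) -> int:
    """Per element: one memory load, then one add into a register."""
    return n * (MEMORY_ACCESS_COST + REGISTER_OP_COST)


def cost_accumulate_in_memory(n: int) -> int:
    """Per element: load the element, load the running total from memory,
    add them, then store the running total back to memory."""
    return n * (3 * MEMORY_ACCESS_COST + REGISTER_OP_COST)


n = 1000
print(cost_accumulate_in_register(n))  # 101000
print(cost_accumulate_in_memory(n))    # 301000
```

In this simplified model, holding the running total in a register makes the loop roughly three times cheaper, because the only remaining memory trips are the unavoidable loads of the input values.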

Footnotes
  1. 32 × 4 B = 128 B of register data on an RV32 architecture.

  2. 2-64 GB of memory on modern laptops.

  3. At some point, we will go back and write in Great Ideas (our introductory lecture).