1 Learning Outcomes

In our discussion of instruction timing, we computed the fastest possible clock frequency for our single-cycle datapath. Under some assumptions, we found that the shortest clock period, 800 ps, was set by the critical path of load instructions. This gave a maximum clock frequency of 1/(800 ps) = 1.25 GHz, meaning the CPU could complete 1.25 billion instructions per second.
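As a quick check of the arithmetic above, the following minimal sketch converts a clock period into a clock frequency. The function name `max_frequency_hz` is a hypothetical helper chosen for illustration; only the 800 ps figure comes from the text.

```python
# Convert a clock period (in picoseconds) into a clock frequency (in Hz).
PS_PER_SECOND = 10**12  # number of picoseconds in one second


def max_frequency_hz(clock_period_ps: int) -> float:
    """Return the clock frequency in Hz for a period given in picoseconds."""
    return PS_PER_SECOND / clock_period_ps


# 800 ps is the single-cycle critical path (load instructions) from the text.
freq_hz = max_frequency_hz(800)
print(f"{freq_hz / 1e9:.2f} GHz")  # prints "1.25 GHz"
```

Because a single-cycle datapath completes one instruction per clock cycle, this frequency is also its instruction throughput.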

Consider one of the great ideas of this course: performance measurement and improvement. Can we improve our processor’s performance? It is not obvious what this question means, because there are many possible performance measures.

2 Latency, Throughput, and Energy Efficiency

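We consider three performance measures in this course: latency (the time to complete a single task), throughput (the number of tasks completed per unit time), and energy efficiency (the energy consumed per task).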

Depending on the unit of analysis (in the computer architecture context, what we are trying to optimize), our units for these measures may vary. We discuss this tension with a transportation analogy.

3 Transportation Analogy

Consider two vehicles: a sports car and a bus, with the features in Table 1.

Table 1: Two vehicles with different features.

| Category | Sports Car | Bus |
| --- | --- | --- |
| Passenger Capacity | 2 people | 50 people |
| Travel Speed | 200 mph | 50 mph |
| Gas Mileage | 5 mpg | 2 mpg |

To compare which vehicle performs “better,” we must determine a benchmark task, such as the following:

Transport 100 passengers over 50 miles.[1]

Next, let’s consider how the two vehicles fare on this task across several different performance measures in Table 2:

Table 2: Vehicle performance on a specified benchmark task.

| Performance Measure | Sports Car | Bus |
| --- | --- | --- |
| Time per trip | 15 mins | 60 mins |
| Time for 100 passengers | 750 mins (50 2-person trips) | 120 mins (2 50-person trips) |
| Passengers per hour | 8 passengers/hour | 50 passengers/hour |
| Gallons per passenger | 5 gallons | 0.5 gallons |

The sports car has to take many trips, even though each individual trip time is short. By contrast, the bus completes the overall task faster and with less energy because of its higher capacity. While the sports car has better latency per trip, the bus has much better throughput and energy efficiency per passenger.
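The benchmark figures for the two vehicles can be reproduced with a short calculation. The sketch below is illustrative only: the function name `benchmark` and its dictionary keys are hypothetical, and it relies on the footnote's assumptions (50 miles per trip, with negligible loading, refueling, and return delays).

```python
import math


def benchmark(capacity, speed_mph, mpg, passengers=100, distance_miles=50):
    """Compute latency, throughput, and energy measures for one vehicle."""
    trips = math.ceil(passengers / capacity)            # trips needed to move everyone
    minutes_per_trip = distance_miles / speed_mph * 60  # latency of a single trip
    total_minutes = trips * minutes_per_trip            # latency of the whole task
    passengers_per_hour = passengers / (total_minutes / 60)  # throughput
    gallons_per_passenger = (distance_miles / mpg) * trips / passengers  # energy
    return {
        "minutes_per_trip": minutes_per_trip,
        "total_minutes": total_minutes,
        "passengers_per_hour": passengers_per_hour,
        "gallons_per_passenger": gallons_per_passenger,
    }


sports_car = benchmark(capacity=2, speed_mph=200, mpg=5)
bus = benchmark(capacity=50, speed_mph=50, mpg=2)

for name, result in (("Sports car", sports_car), ("Bus", bus)):
    print(name, result)
```

Changing `passengers` or `distance_miles` shows how the comparison depends on the chosen benchmark task: with only two passengers, for instance, the sports car finishes the whole task sooner than the bus.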

Let us translate this analogy back to the computer world in Table 3.

Table 3: Transportation Analogy vs. Performance Measures in Computers.

| Transportation Analogy | Measure | In a Computer |
| --- | --- | --- |
| Time per trip | Latency | Instruction latency |
| Time for 100 passengers | Latency | Program execution time (e.g., time to update display) |
| Passengers/hour | Throughput | Total tasks per unit time (e.g., # of server requests handled per hour) or instruction throughput (e.g., instructions/second) |
| Gallons per passenger | Energy Efficiency | Energy per task (e.g., how many movies you can watch per battery charge) |

4 In This Course

In this unit, we use this discussion of latency and throughput to motivate our RISC-V five-stage pipeline, which increases instruction throughput relative to the single-cycle case. After describing the details of this architecture, we consider how to fix pipeline hazards, which are situations where instructions cannot execute properly.
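To see why a pipeline can raise throughput even if the latency of each instruction does not improve, consider the rough sketch below. The per-stage delays are assumptions made up for illustration (chosen only so that they sum to the 800 ps single-cycle period); the sketch also ignores hazards and pipeline register overhead, so it shows an ideal upper bound rather than a real design.

```python
# Hypothetical stage delays in picoseconds (assumed values, not from the text).
# The stage names follow the classic five-stage pipeline convention.
STAGE_DELAYS_PS = {"IF": 200, "ID": 100, "EX": 200, "MEM": 200, "WB": 100}

PS_PER_SECOND = 10**12


def instructions_per_second(clock_period_ps: int) -> float:
    """Ideal throughput when one instruction completes every clock cycle."""
    return PS_PER_SECOND / clock_period_ps


# Single-cycle: one long cycle must cover every stage.
single_cycle_period_ps = sum(STAGE_DELAYS_PS.values())
# Pipelined: the clock period is set by the slowest stage.
pipelined_period_ps = max(STAGE_DELAYS_PS.values())
# Each instruction still passes through all five stages, one per cycle.
pipelined_latency_ps = pipelined_period_ps * len(STAGE_DELAYS_PS)

print("Single-cycle latency (ps):", single_cycle_period_ps)
print("Pipelined latency (ps):", pipelined_latency_ps)
print("Single-cycle throughput (instr/s):", instructions_per_second(single_cycle_period_ps))
print("Pipelined throughput (instr/s):", instructions_per_second(pipelined_period_ps))
```

Under these assumed delays, the pipelined design plays the role of the bus: each instruction takes longer from start to finish, but more instructions complete per second.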

We then discuss latency with the “iron law of processor performance,” which identifies the high-level components that impact program execution time on a given processor, regardless of implementation.
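As a preview, the iron law is commonly written as: execution time = (instructions per program) × (cycles per instruction) × (time per cycle). The minimal sketch below applies that form; the function name `execution_time_ps` and the one-million-instruction program are hypothetical, while the 800 ps period is the single-cycle figure from earlier.

```python
def execution_time_ps(instruction_count: int,
                      cycles_per_instruction: float,
                      clock_period_ps: int) -> float:
    """Program execution time in picoseconds, per the iron law."""
    return instruction_count * cycles_per_instruction * clock_period_ps


# Hypothetical program of 1,000,000 instructions on the single-cycle datapath:
# every instruction takes exactly one 800 ps cycle, so CPI = 1.
time_ps = execution_time_ps(1_000_000, 1, 800)
print(f"{time_ps / 1e9:.1f} ms")  # prints "0.8 ms"
```

Each of the three factors can be changed independently, which is why the law applies regardless of how the processor is implemented.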

While not the focus of our class, we must also mention real industry metrics like cost over lifetime. The sports car might be $64,000 with a “life” expectancy of 8 years, whereas the bus may be $400,000 with an expectancy of 12 years.

Footnotes
  1. Assume 50 miles round trip, or 50 miles one-way where the vehicle immediately warps back upon completion of each trip. Also assume that passenger loading/unloading, gas refilling, traffic, etc., all incur negligible delay.