1Learning Outcomes¶
Explain what “parallelizing a program” means.
Identify the four different categories of Flynn’s Taxonomy: SISD, SIMD, MISD, and MIMD.
🎥 Lecture Video
Can we be more efficient in writing sequential C code on a serial processor? We discussed some optimizations in an earlier section. We can write assembly code and do better with “micro” optimizations like loop unrolling—but in general, compilers are pretty good nowadays and it’s not that easy to beat them. We can rewrite our programs to make better use of the memory hierarchy, like we explored with our cache blocking exercise earlier.
But how do we leverage hardware improvements? That is the topic of this section.
2Parallelism: Software vs. Hardware¶
Because of the abrupt shift in processor design towards parallelism, there are a LOT of closely related terms when it comes to paralellism. Table 1 is an adaptaion of P&H 6.1 to clarify the terms used in software versus hardware.
Table 1:Software perspective on concurrency vs. hardware perspective on parallelism.
| Software | ||
|---|---|---|
| Hardware | Sequential | Concurrent |
| Serial | Basic matrix multiply running on single-core system | Operating System running on single-core system |
| Parallel | Basic matrix multiply running on modern laptop | Operating System running on modern laptop |
Notes:
Software is inherently sequential or concurrent.
Hardware is serial or parallel.
Concurrent software can run on serial and parallel hardware; similarly, sequential software can run on serial and parallel hardware.
“Parallelizing” a program colloquially means figuring out how to write a software program to run efficiently on parallel hardware. This can involve making naturally sequential software have high performance on parallel hardware, or to make concurrent processors have high performance on multicore systems as the number of cores (processors) increases.
The last of these points is the most important.
3Flynn’s Taxonomy¶
Now let’s shift towards classifying serial and parallel hardware using Flynn’s Taxonomy.[1] There are four entries in Table 2 that classify different levels of parallelism based on data streams and instruction streams.
Table 2:Flynn’s taxonomy: A system for classifying parallel hardware.
| Data Streams | ||
|---|---|---|
| Instruction Streams | Single | Multiple |
| Single | SISD | SIMD |
| Multiple | MISD | MIMD |
SISD (Single Instruction, Single Data, pronounced “SIS-dee”): A serial computer that exploits no parallelism in either the instruction or data streams.
SIMD (Single Instruction, Multiple Data, pronounced “SIM-dee”): Computer that applies a single instruction stream to multiple data streams for operations that may be naturally parallelized.
MISD (Multiple Instruction, Single Data, pronounced “MIZ-dee”): Exploits multiple instruction streams against a single data stream for data operations that can be naturally parallelized.
MIMD (Multiple Instruction, Multiple Data, pronounced “MIM-dee”): Multiple autonomous processors simultaneously executing different instructions on different data.
Toggle the tabs below to discover examples of each type of architecture.

Figure 1:SISD: Single Instruction/Single Data Stream
A sequential processor steps through the instruction pool, matches it with the data in memory, and processes each instruction-data pair one at a time.
Our RISC-V processor (single-cycle or five-stage pipeline)
(out of scope) Superscalar processors because the programming model is sequential

Figure 2:SIMD: Single Instruction/Multiple Data Stream
Issue one instruction (e.g., “add”) that operates on multiple data pairs at the same time. Useful for neural nets, imaging, and scientific applications.
Intel SIMD instruction extensions (see later section on Intel intrinsics)
NVIDIA Graphics Processing Unit (GPU)[2]

Figure 3:MISD: Multiple Instruction/Single Data Stream
None nowadays.
Historical significance, e.g., certain kinds of array processors
Professor Nikolic: “It’s like whether you’d like your eggs scrambled or sunny side up and saying, ‘both’.”

Figure 4:MIMD: Multiple Instruction/Multiple Data Stream
Anything system involving multiple processors operating concurrently.
Multicore processors (e.g., modern laptops)
Warehouse Scale Computers (e.g., datacenters)
Over the next few lectures, we will explore these architectures and write programs that exploit the hardware parallelism available. Stay tuned!!!
For what it’s worth, most programs written today assume a SPMD (single program, multiple data) model. With SPMD, we write a single program that uses multiple degrees of parallelism across processors of a MIMD computer with some cross-processor coordination.