1Learning Outcomes¶
Gain an intuition of how hardware word size can impact memory layout of a compiled C program.
Differentiate between 32-bit and 64-bit architectures.
Read memory layouts of C programs compiled on little endian machines.
Understand how padding and packing can impact the memory layout of members within a C struct.
No lecture video.
2Words¶
What’s in a word? In computer architecture, a hardware word is an important unit of data. The word size determines many aspects of a computer’s structure and operation, from how to load and store data from memory to how the compiler translates a single C arithmetic operation into multiple assembly instructions.
On most modern architectures, the size of the word determines the largest possible address size and therefore the size of a C pointer (see The Address Space). A 32-bit architecture has a word size of 32 bits, or 4 bytes. A 64-bit architecture has a word size of 64 bits, or 8 bytes.
We will cover hardware words in much more detail when we learn about instruction set architectures. For now, we use the notion of a word to remind us that compiled C programs produce memory layouts that are architecture-dependent. We discuss a few architecture-dependent characteristics of compiled programs below.
3The Address Space¶
The address space is the hypothetical range of addressable memory locations on a particular machine. For example, on a 32-bit architecture, a pointer can address 2^32 locations in memory[1]. Because memory is byte-addressable and contiguous, the address space for a program is therefore 2^32 bytes (4 GiB, or “four gibibytes”; we cover this notation later).
A pointer on a 32-bit architecture must be large enough to represent all possible addresses in the address space. The address space of a 32-bit architecture is the 2^32 byte addresses ranging from 0x00000000 to 0xFFFFFFFF. These correspond to bit patterns of 32 bits, so a pointer must be able to store 32 bits of information.
All pointers on a 32-bit architecture must therefore be 4 bytes wide[2], so sizeof(int *), sizeof(char *), and sizeof(int **) are all 4.
4Another View of Address Space¶
Before we discuss our example, we’d like to share a diagram of memory that, while confusing at first glance, will be extremely useful in interpreting the memory layout of any compiled C program.
Recall that memory on a 32-bit architecture is laid out as a very long array of 2^32 bytes. A very long array would not fit on any page, whether horizontally or vertically. Instead, we use a visualization like Table 1, which shows memory as rows of 4 bytes, from low to high addresses:
In the rightmost four columns, “xx” values refer to data (hypothetical or otherwise) at each of four bytes of memory. These four bytes have contiguous memory addresses.
The leftmost column denotes the lowest address of the bytes in that row, i.e., the address of the rightmost byte.
In a given row, the rightmost byte has the lowest address, and the leftmost byte has the highest address. This is explained by the +3, +2, +1, and +0 headers.
Table 1: Memory is a very long array of bytes, but this diagram “wraps” the long array into rows of 4 bytes.
| address | +3 | +2 | +1 | +0 |
|---|---|---|---|---|
| 0x0 | xx | xx | xx | xx |
| 0x4 | xx | xx | xx | xx |
| ... | ... | ... | ... | ... |
| 0xFFFFFFF8 | xx | xx | xx | xx |
| 0xFFFFFFFC | xx | xx | xx | xx |
Example byte addresses
The upper-right “xx” is at address 0x00000000, or 0b0000 0000 ... 0000 0000. This is the lowest possible address in the 32-bit address space.
The upper-left “xx” is at address 0x00000003, or 0b0000 0000 ... 0000 0011.
The bottom-right “xx” is at address 0xFFFFFFFC, or 0b1111 1111 ... 1111 1100.
The bottom-left “xx” is at address 0xFFFFFFFF, or 0b1111 1111 ... 1111 1111. This is the highest possible address in the 32-bit address space.
A notable property of this visualization is that the addresses in the left column are multiples of 4. This layout effectively aligns our memory layout to words, because a 32-bit architecture has 4-byte words.
A confusing property of this visualization is that “lower” addresses are in earlier rows, whereas “higher” addresses are in later rows. This is not great for those of us who value meaningful naming conventions. However, when displaying large ranges of memory, debuggers like gdb often print data starting from the lowest addresses, just like in this visualization.
5Compiled program example¶
Suppose that the following C program (Program 1) is compiled on a 32-bit architecture and produces the memory layout in Table 2.
```c
#include <stdio.h>
#include <stdint.h>

int main(int argc, char *argv[]) {
    int32_t value = 0x12345678;
    char str1[] = "hi!";
    char str2[] = "cs61c";
    int16_t short_val = 0xaabb;
    …
    return 0;
}
```
Table 2: Data layout of program on a 32-bit little endian machine.
| address | +3 | +2 | +1 | +0 |
|---|---|---|---|---|
| 0x0 | xx | xx | xx | xx |
| 0x4 | xx | xx | xx | xx |
| ... | ... | ... | ... | ... |
| 0x7F...FE164 | 0xaa | 0xbb | xx | xx |
| 0x7F...FE168 | 0x12 | 0x34 | 0x56 | 0x78 |
| 0x7F...FE16C | 'i' | 'h' | xx | xx |
| 0x7F...FE170 | 's' | 'c' | '\0' | '!' |
| 0x7F...FE174 | '\0' | 'c' | '1' | '6' |
| 0x7F...FE178 | xx | xx | xx | xx |
| ... | ... | ... | ... | ... |
| 0xFFFFFFFC | xx | xx | xx | xx |
There are two architecture-dependent aspects of this memory layout:
The bytes of value appear to be stored in “reverse” order. The least significant byte, 0x78, has the lowest address!
The word-sized value has address 0x7FFFE168, which is a multiple of four.
Together, these two observations tell us that the architecture is little endian, and that 32-bit integers (and perhaps other word-sized values) are word-aligned.
6Endianness¶
When data occupies multiple contiguous bytes in memory, the computer must determine which of the bytes is stored at the lowest address. This decision is often informed by the hardware architecture and in what order bytes are read from memory.
This property is called endianness. For a given word:
Little endian machines store the least significant byte first, at the lowest address of the word.
Big endian machines store the most significant byte first, at the lowest address of the word.
The choice of endianness is one of convention[3]. Nearly all modern computer architectures are little endian.
Read more about endianness on Wikipedia.
7Alignment¶
7.1Word Alignment¶
One critical operation that the hardware word defines is memory access. As we shall see, many architectures are optimized for word-aligned memory access. This means that it is very fast to access an entire word when that word is located at a memory address that is a multiple of the word size. For a 32-bit architecture, this means reading 4 bytes, where the first byte lies on a 4-byte boundary.
Of the four variables in Program 1, only value is the size of a word. The compiler has therefore aligned value to a word boundary. The other variables str1, str2, and short_val do not have values that are word-sized and therefore do not have such a constraint.
7.2Struct Alignment¶
Let us revisit the idea of a struct and consider how much space each declared struct occupies.
From Wikipedia:
Data structure alignment is the way data is arranged and accessed in computer memory. It consists of three separate but related issues: data alignment, data structure padding, and packing.
Data alignment is the aligning of elements according to their natural alignment. To ensure natural alignment, it may be necessary to insert some padding between structure elements or after the last element of a structure. For example, on a 32-bit machine, a data structure containing a 16-bit value followed by a 32-bit value could have 16 bits of padding between the 16-bit value and the 32-bit value to align the 32-bit value on a 32-bit boundary. Alternatively, one can pack the structure, omitting the padding, which may lead to slower access, but saves 16 bits of memory.
Consider the foo struct:
```c
struct foo {
    int32_t a;
    char b;
    struct foo *c;
};
```
By themselves, a 32-bit integer, a character, and a struct pointer occupy 9 bytes on a 32-bit architecture. However, when they are declared together as members of a struct, C compilers often introduce padding into the struct to align its members. Padding a struct lets operations on its members benefit from the same word-alignment speedups they would get had the members been declared separately.
Table 3: Structs can introduce byte padding. On the 32-bit architecture below, sizeof(struct foo) is 12.
| +3 | +2 | +1 | +0 |
|---|---|---|---|
| AA | AA | AA | AA |
| xx | xx | xx | BB |
| CC | CC | CC | CC |
Explanation
Suppose we declare struct foo s; and compile a program onto a 32-bit architecture. We might see Table 3 as above.
AA denotes the four bytes occupied by s.a. sizeof(s.a) is 4.
BB denotes the single byte occupied by s.b. sizeof(s.b) is 1. The precise alignment of s.b is implementation-specific.
CC denotes the four bytes occupied by s.c. sizeof(s.c) is 4.
Ultimately, the struct declaration is a guideline for how to arrange a bunch of bytes in a bucket. C keeps members in declaration order, but the precise size of a struct and the padding between its members depend on the C compiler, the target architecture, and whether the struct is padded or packed. We recommend you always check sizes with a debugger like gdb.
[1] Logically, not in practice. Some areas of memory are read/write protected, e.g., accessing memory at the address 0 (NULL) causes an error.
[2] Why do we not make 32-bit architecture pointers larger than 4 bytes? The primary reason is storage efficiency; if addresses will never be larger than 4 bytes, there is no reason to waste bytes by allocating extra. The secondary reason is convention; by definition, a 32-bit architecture defines word size, which defines pointer size.