1 Learning Outcomes

We continue our exploration of memory by studying C arrays. On the surface, C arrays seem fairly similar to what you might recognize from Java. In this section, we learn that arrays in C are neither variables nor pointers. When used in C statements, array names often behave like pointer variable names, for reasons we will describe shortly.

To declare an array of two elements without initializing its values, we can use the statement below. This statement reserves a block of memory large enough to hold two contiguous ints. It does not initialize the elements, so we must assume they contain garbage:

int arr_uninitialized[2];

To declare and initialize an array with the two elements 795 and 635, in that order:

int arr2[] = {795, 635};

or equivalently

int arr2[2] = {795, 635};

Square-bracket indexing is one way to access elements of the array. Like many languages, C specifies zero-indexed arrays:

arr2[0]; // 795

2 Array indexing uses pointer arithmetic

Is there another way to access array elements? Yes, otherwise we would not have been so cryptic earlier.

Square-bracket indexing for C arrays is what we call “syntactic sugar”: it exists for human readability, but the C compiler translates it into two operations, pointer arithmetic followed by a dereference.

The expression arr[i] is equivalent to the expression *(arr+i). The latter treats the array name arr as a pointer to the first element, adds i to it (pointer arithmetic advances by i elements, not i bytes), then dereferences the result.

2.1 Example

Suppose that when compiled, Program 1 below produces the memory layout in Figure 1. q is a pointer to a 32-bit unsigned integer, while arr is an array, i.e., a 12-byte contiguous block of three 32-bit unsigned integers.

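#include <stdint.h>
#include <stdio.h>

int main(void) {
  uint32_t arr[] = {50, 60, 70}; // array of three 32-bit unsigned integers
  uint32_t *q = arr;             // q points to arr[0]

  // %u matches unsigned values; each line prints the same location twice.
  printf("    *q: %u is %u\n", *q, q[0]);
  printf("*(q+1): %u is %u\n", *(q+1), q[1]);
  // Out of bounds: reads the word before arr (undefined behavior in general).
  printf("*(q-1): %u is %u\n", *(q-1), q[-1]);
  return 0;
}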

Figure 1: Memory layout for Program 1.

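Because square-bracket indexing is syntactic sugar for pointer arithmetic and dereference, each printf call in Program 1 prints the same value twice: *q and q[0] are both 50, and *(q+1) and q[1] are both 60. The expressions *(q-1) and q[-1] also name the same location, but that location lies before the start of arr, so the value read is whatever happens to be stored there.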

3 Arrays are not pointers

From K&R:

There is one difference between an array name [(such as a)] and a pointer [(such as pa)] that must be kept in mind. A pointer is a variable, so pa=a and pa++ are legal. But an array name is not a variable; constructions like a=pa and a++ are illegal.

Also from K&R:

The name of an array is a synonym for the location of the initial element.

Pointers and arrays therefore differ in how they behave with the address operator, &. Consider Program 2:[1]

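#include <stdint.h>
#include <stdio.h>

// Helper added for this listing: converts an address to an unsigned value
// so that %x is a valid format (keeps only the low bits of wide addresses).
#define ADDR(ptr) ((unsigned)(uintptr_t)(ptr))

int main(void) {
  int *p, *q, x;
  int a[4];
  p = &x;
  q = a + 1;

  *p = 1;
  printf("*p:%d, p:%x, &p:%x\n", *p, ADDR(p), ADDR(&p));

  *q = 2;
  printf("*q:%d, q:%x, &q:%x\n", *q, ADDR(q), ADDR(&q));

  *a = 3;
  printf("*a:%d, a:%x, &a:%x\n", *a, ADDR(a), ADDR(&a));
  return 0;
}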

With the memory layout in Figure 2, the output is:

*p:1, p:108, &p:100
*q:2, q:110, &q:104
*a:3, a:10c, &a:10c

Figure 2: Memory layout for Program 2.

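For the pointers p and q, the value of the pointer and the address of the pointer variable differ: p holds 108 but lives at address 100. For the array a, the two coincide. &a is the address of the array itself, i.e., the start of the contiguous block of ints, which is the same numeric address as a, the location of the first element. The types still differ: a decays to int *, while &a has type int (*)[4].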

4 Array names “decay” with functions

When used with functions, arrays decay to pointers in two ways. We use Program 3 below as an example.

int bar(int arr[], size_t nelems){
   … arr[…] … 
}
int main(void) {
    int a[5], b[10];
    … 
    bar(a, 5);
    …
}

1. When used as formal parameters in function definitions. On Line 1 of Program 3 (the signature of bar), the parameter declaration int arr[] is syntactic sugar for int *arr. We recommend using the latter where possible to avoid confusion.

2. When passed as arguments to function calls. On Line 7 of Program 3 (the call bar(a, 5)), the argument a is an array, but it decays to a pointer when the function is called. This decay effectively passes the address of the first element of a, that is &a[0], as the first argument of bar.

5 sizeof with arrays

We’ve discussed sizeof many times. For arrays, the compile-time operator will evaluate to the size of the array, in bytes.[2] This observation informs the behavior of Program 4:

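#include <stdio.h>

void mystery(short arr[], int len) {
    printf("%d ", len);
    // arr is a pointer here, so this prints sizeof(short *), e.g., 8 on a
    // typical 64-bit machine, not the size of the caller's array.
    printf("%zu\n", sizeof(arr));
}

int main(void) {
    short nums[] = {1, 2, 3, 99, 100};
    // nums is a true array here: 5 elements, so 10 bytes with 2-byte shorts.
    printf("%zu ", sizeof(nums));
    mystery(nums, sizeof(nums)/sizeof(short));
    return 0;
}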

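In Program 4, sizeof(nums) in main is the size of the whole array, while sizeof(arr) in mystery is only the size of a pointer. In practice, C programmers commonly use sizeof(nums)/sizeof(short) to count the number of elements in the array nums. Note that this works only in the scope where nums is declared as an array; once nums has been passed to a function, it has decayed to a pointer and sizeof yields the size of that pointer instead.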

6 Arrays are primitive! Reminders

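Hopefully this section has convinced you that arrays are relatively primitive constructs: an array name mostly behaves like the address of its first element, and the array carries no length information once it decays to a pointer.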

We close with a few final reminders of how this primitive nature begets responsible C practices.

Footnotes
  1. %d prints a signed decimal integer; %x prints an unsigned integer in hexadecimal. See Wikipedia.

  2. We thought long and hard about how to explain &a and sizeof(a) (it involved sitting in a dark room with loud music). Both operations likely boil down to reasonable C design. After all, there must be some way to refer to the address and the size of an array. Instead of erroring, these two expressions are exceptions to treating array names as synonymous with the address of the first element: the C standard states that an array expression is converted to a pointer to its first element except when it is the operand of sizeof or unary &, or is a string literal used to initialize an array. If you, the reader, have a better explanation, we’d love to use it. Submit a pull request!

  3. Take Computer Security to learn more! Wikipedia