
1 Learning Outcomes

As discussed earlier, threads operate under a shared memory model, where different threads can read from and write to the same locations in memory.

Consider the below program. What are possible values of x after running this program with four OpenMP threads?

#include <stdio.h>
#include <omp.h>
int main() {
  int x = 0;                        /* shared variable */
  #pragma omp parallel
  {
    x += 1;
  }
}

2 Data Race

Two memory accesses form a data race if:

- they access the same memory location,
- at least one of them is a write, and
- their order is not enforced by synchronization, so they may occur in either order.

In our multi-thread execution model, instructions from different threads have their execution interleaved in time, thus causing data races.[1]

Let us translate the parallel section of our example code into the three RISC-V instructions below. With four OpenMP threads, there are four sets of these three instructions to execute. Each thread accesses the same variable x in (shared) memory but has its own copy of register t0 used in arithmetic, resulting in a data race.

lw   t0, 0(sp)    # load x (stored at sp) into t0
addi t0, t0, 1    # increment the private copy
sw   t0, 0(sp)    # store t0 back to x

Because of the many possibilities of interleaving the execution of these twelve instructions, the final value of x is not always the same.[2] Consider the cases below.


Case 1: All threads run sequentially.

| Thread | Instruction | x memory access |
| --- | --- | --- |
| 1 | load | read x: 0 |
| 1 | store | write x: 1 |
| 2 | load | read x: 1 |
| 2 | store | write x: 2 |
| 3 | load | read x: 2 |
| 3 | store | write x: 3 |
| 4 | load | read x: 3 |
| 4 | store | write x: 4 |

Final value of x: 4

3 Critical Sections in OpenMP

To enforce multithreaded program correctness, we often need to synchronize threads, i.e., coordinate their execution. Most commonly, we must identify when one thread is finished writing so that it is safe for another to read.

Synchronization can be specified in user-level routines, i.e., in higher-level languages. A critical section is a segment of code that must be executed by a single thread at a time, thereby enforcing synchronization. Once a thread enters a critical section, it can safely execute all code in that critical section, knowing that it is the only thread that can execute that section at that time.

We discuss two OpenMP synchronization constructs: critical and barrier.

Returning to our example code, we can specify a critical section and prevent any data races:

#include <stdio.h>
#include <omp.h>
int main() {
  int x = 0;                        /* shared variable */
  #pragma omp parallel
  {
    #pragma omp critical    
    {
      x += 1;
    }
  }
}

4 Data Race Examples

4.1 OpenMP Hello World: Add

#include <stdio.h>
#include <omp.h>
int main() {
  int x = 0;                        /* shared variable */
  #pragma omp parallel
  {
    int tid = omp_get_thread_num(); /* private variable */
    #pragma omp critical    
    {
      x++;
    }
    printf("Hello World from thread = %d, x = %d\n", tid, x);

    #pragma omp barrier
    if (tid == 0) {
      printf("Number of threads = %d\n", omp_get_num_threads());
    }
  }
}
Footnotes
  1. A data race is not a data hazard. While data hazards result from instruction-level parallelism on a pipelined processor architecture, data races can occur even on single-cycle processors. Data races arise because the instruction execution order between threads is not determined, i.e., which thread accesses memory first is left to chance.

  2. Formally, a multithreaded program is considered correct only if ANY interleaving of its threads yields the same result. Our example code is an incorrect program. For those curious, there are 8!/(2!)^4 = 2520 different possible orders of the load and store instruction pairs (or 105 orders if we consider that all threads are identical).

  3. If caches are in the memory hierarchy, it is likely that the two threads will each have their own copy of the same block of C in their respective cache. So while there is no data race, these copies will be “out of sync.” Read more about cache coherency in a later section.