1Learning Outcomes¶
Understand how the IEEE 754 standard represents zero, infinity, and NaNs
Understand what overflow and underflow mean with floating point numbers
Understand how denormalized numbers implement “gradual” underflow
Convert denormalized numbers into their decimal counterpart
🎥 Lecture Video (overflow and underflow)
Overflow and Underflow, 6:54 - 8:40
🎥 Lecture Video (everything else)
Normalized numbers are only a fraction (heh) of floating point representations. For single-precision (32-bit), IEEE defines the following numbers based on the exponent field (here, the “biased exponent”):
Table 1:Exponent field values for IEEE 754 single-precision.
| Biased Exponent | Significand field | Description |
|---|---|---|
| 0 (00000000) | all zeros | zero |
| 0 (00000000) | nonzero | Denormalized numbers, aka denorms |
| 1 – 254 | anything | Normalized floating point (mantissa has implicit leading 1) |
| 255 (11111111) | all zeros | infinity |
| 255 (11111111) | nonzero | NaNs |
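The classification in Table 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `float_to_bits` and `classify` are hypothetical (not from the lecture), and because Python floats are double-precision, the value is first packed into a 32-bit single-precision pattern with the standard `struct` module.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the 32-bit single-precision pattern of x as an unsigned int.

    Hypothetical helper: Python floats are doubles, so we round to single
    precision by packing with the 'f' format.
    """
    return struct.unpack(">I", struct.pack(">f", x))[0]

def classify(bits: int) -> str:
    """Classify a 32-bit pattern using the rows of Table 1."""
    exponent = (bits >> 23) & 0xFF      # 8-bit biased exponent field
    significand = bits & 0x7FFFFF       # 23-bit significand field
    if exponent == 0:
        return "zero" if significand == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if significand == 0 else "NaN"
    return "normalized"                 # biased exponent 1 through 254

print(classify(float_to_bits(1.5)))  # normalized
print(classify(0x00000000))          # zero
print(classify(0x00000001))          # denormalized
print(classify(0x7F800000))          # infinity
print(classify(0x7FC00000))          # NaN
```

Note that the sign bit plays no role in the classification, which is why each special category comes in a positive and a negative flavor.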
In this section, we will motivate why these “special numbers” exist by considering the pitfalls of overflow and underflow. Then, we’ll define each of the special numbers.
2Overflow and Underflow¶
Coming soon!
3Special Numbers¶
Unlike integer representations, floating point representations such as the IEEE 754 standard can more “gracefully” handle overflow, underflow, and errors by using special numbers. We discuss the four non-normalized categories shown in Table 1.
3.1Zero¶
Just like sign-magnitude integers, IEEE 754 floating point has two zeros (Table 2). Recall that the standard was built for scientific computing! Having two zeros is mathematically useful. Two examples: limits towards zero and computing $\pm\infty$, the latter of which we discuss next.
Table 2:IEEE 754 single-precision: Zero
| value | s | exponent | significand |
|---|---|---|---|
| +0 | 0 | 0000 0000 | 000 0000 0000 0000 0000 0000 |
| -0 | 1 | 0000 0000 | 000 0000 0000 0000 0000 0000 |
If we consider its mathematical representation, zero is our first encounter with a floating point representation that is not normalized. After all, 0.0 in scientific notation has no leading 1!
Floating point hardware often implements zero by reserving the biased exponent value zero 00000000 to signal no normalization, i.e., not to implicitly add 1. If the significand is additionally all zeros, then the hardware knows it is zero. If the significand is nonzero, we represent other non-normalized numbers, which we discuss below as denormalized numbers.
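As a quick check of Table 2, the following sketch prints the single-precision bit patterns of both zeros. The helper name `bits32` is hypothetical, and the packing step assumes Python's standard `struct` module is an acceptable stand-in for inspecting hardware bits.

```python
import math
import struct

def bits32(x: float) -> str:
    """Return the single-precision pattern of x as a 32-character bit string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{raw:032b}"

pos_zero, neg_zero = 0.0, -0.0

print(bits32(pos_zero))  # 00000000000000000000000000000000
print(bits32(neg_zero))  # 10000000000000000000000000000000

# The two zeros compare equal, even though their sign bits differ.
print(pos_zero == neg_zero)          # True

# The sign is still observable, e.g., by copying it onto another value.
print(math.copysign(1.0, neg_zero))  # -1.0
```

The two patterns differ only in the sign bit; the exponent and significand fields are all zeros in both.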
3.2Infinity¶
The IEEE 754 standard defines positive infinity ($+\infty$) and negative infinity ($-\infty$), as shown in Table 3. To represent infinity, we reserve the biased exponent value 11111111 and set the significand to zero.
Table 3:IEEE 754 single-precision: Infinity
| value | s | exponent | significand |
|---|---|---|---|
| $+\infty$ | 0 | 1111 1111 | 000 0000 0000 0000 0000 0000 |
| $-\infty$ | 1 | 1111 1111 | 000 0000 0000 0000 0000 0000 |
Because infinity is such an important concept in mathematics, the standard differentiates infinity from other arithmetic errors (which we discuss next). Importantly, dividing a nonzero number by zero yields $\pm\infty$. Computations involving infinity, such as comparing it against a finite value, should be representable,[1] even if not as actual “numbers.”
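A minimal sketch of how infinity behaves, assuming Python's `math.inf` as the source of the value. The helper `fields` is a hypothetical name, and the value is packed to single precision only to inspect its fields; the arithmetic itself runs in Python's native double precision.

```python
import math
import struct

def fields(x: float) -> tuple[int, int, int]:
    """Return (sign, biased exponent, significand) of x in single precision."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF

print(fields(math.inf))    # (0, 255, 0)
print(fields(-math.inf))   # (1, 255, 0)

# Infinity takes part in comparisons and arithmetic like an ordinary value.
print(math.inf > 3.0e38)            # True
print(-math.inf < -3.0e38)          # True
print(math.inf + 1.0 == math.inf)   # True
```

Both infinities have the all-ones exponent and an all-zeros significand; only the sign bit distinguishes them.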
3.3Not a Number (NaN)¶
What if we try to compute invalid arithmetic, like $\sqrt{-4}$ or $0/0$? For scientific computing, it may be more valuable to “bubble” these errors up to the user, instead of explicitly crashing the program or computing incorrect values due to wrap-around (e.g., in integer overflow).
NaNs (Not a Number) are values of the following form (Table 4):
Table 4:IEEE 754 single-precision: NaNs
| value | s | exponent | significand |
|---|---|---|---|
| NaN | either | 1111 1111 | non-zero |
NaNs share the all-ones biased exponent with infinity, but they are produced by invalid operations rather than by overflow (which yields infinity). Once produced, they contaminate later arithmetic: $\text{op}(\text{NaN}, x) = \text{NaN}$.
Some floating point hardware goes further and uses the significand to encode or identify where errors occur. The meaning of these error codes is not defined in the standard.
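The contaminating behavior of NaNs can be observed directly. The sketch below produces a NaN from $\infty - \infty$, one invalid operation that Python evaluates without raising an exception. The variable names are illustrative, and the exact significand bits of the resulting NaN are platform-dependent, so the code only checks that they are nonzero.

```python
import math
import struct

# An invalid operation: infinity minus infinity has no meaningful value.
nan = math.inf - math.inf
print(math.isnan(nan))          # True

# NaNs contaminate: arithmetic with a NaN operand yields NaN.
print(math.isnan(nan + 1.0))    # True
print(math.isnan(nan * 0.0))    # True

# A NaN is not equal to anything, including itself.
print(nan == nan)               # False

# Inspect the single-precision fields (see Table 4).
(raw,) = struct.unpack(">I", struct.pack(">f", nan))
exponent = (raw >> 23) & 0xFF
significand = raw & 0x7FFFFF
print(exponent == 255)          # True: all-ones biased exponent
print(significand != 0)         # True: nonzero significand
```

Because `nan == nan` is false, code that needs to detect a NaN should use a dedicated check such as `math.isnan` rather than an equality comparison.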
3.4Denormalized Numbers¶
Coming soon!
[1] We defer to math majors.