Fast Inverse Square Root is an algorithm that computes an approximation of , the inverse square root of a 32-bit floating-point number. The algorithm most notably helped optimize graphics calculations in the 1999 video game Quake III Arena.
A unit vector is a vector that has been scaled to have a length of 1. Game engines normalize vectors so they have unit length while preserving direction. Unit vectors make dot products, reflections, and direction computations stable and comparable.
To find a normal vector , we can multiply the vector by the inverse of the vector’s magnitude, which is the square root of the dot product of the vector with itself.
The motivation for this algorithm is to compute this inverse square root quickly. The following C implementation includes the original comments from the Quake III source code:
float Q_rsqrt( float number )
{
long i;
float x2, y;
const float threehalfs = 1.5F;
x2 = number * 0.5F;
y = number;
i = * ( long * ) &y; // evil floating point bit level hacking
i = 0x5f3759df - ( i >> 1 ); // what the fuck?
y = * ( float * ) &i;
y = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration
// y = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed
return y;
}
IEEE 754
The IEEE 754 standard specifies a 32-bit float as having a sign bit, 8 exponent bits, and 23 significand bits. We can ignore the sign bit for now, as the algorithm only works with real numbers, and is only defined for .
The exponent bits are biased with a value of 127 representing zero. Therefore exponents range from −126 to +127 (1 to 254 in the exponent field), since 0 and 255 are reserved for special numbers.
The significand bits have an implicit leading bit with value 1. The remaining 23 bits represent the fractional part of the significand, meaning that the significand is always in the range . Thus, a normalized floating-point number can be represented as:
Bit Hack
Since the algorithm operates on the bit-level representation of floating-point numbers, we need to reinterpret the bits of a float as an long. This is done using pointer casting in C:
i = * ( long * ) &y;
This line takes the address of the float variable y, casts it to a pointer to a long, and then dereferences it to get the bit representation of the float as a long integer.
Logarithm Approximation
The core of the algorithm is treating the float’s exponent as a base-2 logarithm, halving it to approximate a square root, then minimizing the error of the approximation using a constant.
Since , we can approximate , where is an arbitrary constant. Setting makes the approximation exact at and , but leads to larger interior error.
An optimal approximation of can be found using techniques such as least squares or minimax approximation.
Let’s alias a float to an integer to get a rough approximation of its logarithm. Starting with the float :
Using this intermediate approximation and interpreting the float as an integer , i.e., I_x = * ( long * ) &x;, we have:
Now, we can use this approximation and the logarithm identity:
to calculate :
This approximation for is implemented in the line:
i = 0x5f3759df - ( i >> 1 );
where the constant 0x5f3759df is derived from with the author’s chosen optimal . It is not know how this specific value was determined by the original authors.
Newton’s Method
The initial approximation can be refined using Newton’s method, which is an iterative method for finding successively better approximations to the roots of a function. Given an approximation for , then a better approximation is given by , where is the derivative of at .
We want to approximate , which can be reformatted as finding the root of the function . Then we can apply Newton’s method:
An iteration of this method is implemented in the line:
y = y * ( threehalfs - ( x2 * y * y ) );
Example
Let’s approximate the inverse square root of 4 using the Fast Inverse Square Root algorithm. The IEEE 754 representation of 4 is 0x40800000 in hexadecimal. Shift right by one bit to get 0x20400000.
Then subtract this from the magic constant 0x5f3759df to get 0x3ef759df. Reinterpreting 0x3ef759df as a float gives us an initial approximation of .
Performing one iteration of Newton’s method gives us , which is very close to the actual value of .