Loss of Precision

A floating-point number \( x \), whose machine representation we label \( fl(x) \), will therefore always be represented as $$ \begin{equation} fl(x) = x(1\pm \epsilon_x), \tag{6} \end{equation} $$ with \( x \) the exact number and the error bounded by \( |\epsilon_x| \le \epsilon_M \), where \( \epsilon_M \) is the machine precision. A number like \( 1/10 \) has no exact binary representation in either single or double precision. Since the mantissa $$ \left(1.a_{-1}a_{-2}\dots a_{-n}\right)_2 $$ is always truncated at some stage \( n \) due to its limited number of bits, only a finite set of real numbers can be represented exactly in binary. The relative spacing between these representable numbers is given by the chosen machine precision. For a 32-bit word this spacing is approximately \( \epsilon_M \sim 10^{-7} \), and for double precision (64 bits) we have \( \epsilon_M \sim 10^{-16} \); in terms of the binary base these are \( 2^{-23} \) and \( 2^{-52} \) for single and double precision, respectively.
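As a quick check of these numbers, the following minimal Python sketch (assuming NumPy is available; this is an illustration, not code from the notes) prints the machine precision for single and double precision and exposes the rounding error \( \epsilon_x \) hidden in \( fl(1/10) \):

```python
import numpy as np

# Machine precision epsilon_M: the spacing between 1.0 and the next
# representable number in each format.
print(np.finfo(np.float32).eps)  # 2**-23, approximately 1.19e-07 (single)
print(np.finfo(np.float64).eps)  # 2**-52, approximately 2.22e-16 (double)

# 1/10 has no exact binary representation: printing enough decimal
# digits reveals the rounding error in fl(0.1).
print(f"{0.1:.20f}")             # 0.10000000000000000555...
```

Note that the relative error in the last line, roughly \( 5.5\times 10^{-17} \), is indeed below the double-precision \( \epsilon_M \sim 10^{-16} \), as Eq. (6) requires.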