Rounding and Other Errors, Part 1

November 4, 2009

I have just finished reading an interesting, though I think slightly flawed, article at the TechRadar site, entitled “Why Computers Suck at Maths”. (The site is from the UK, so the plural is correct in local usage.) It gives interesting examples of some of the snares for the unwary in doing calculations — particularly floating point calculations — but it unfortunately sometimes misdiagnoses the problem, beginning with the title.   It is not that computers suck at math, but that too many programmers are among the mathematically unwashed.  So I am going to write a couple of notes here as a sort of educational feature, and try to explain some of the different sources of potential error in computer calculations.  Broadly speaking, they fall into a few categories:

  1. Inappropriate use of integer arithmetic
  2. Rounding error (the real thing)
  3. Representational limitations (often, as in the article, mislabeled as rounding error)
  4. Loss of significance
  5. Perverse formulations

I have seen many examples of all of these over the course of almost 40 years of using computers, and I am convinced that it is important that programmers understand them, and the differences between them, because different problems have different potential remedies.  There is a relationship among them, of course, but usually one or another plays the starring role in producing a particular problematic result.

1.  Inappropriate Use of Integer Arithmetic

This is really the simplest problem to explain.  The set of all integers (whole numbers) is closed under the operations of addition, subtraction, and multiplication, but not under division.  This is one of those things that everyone knows; but it can nonetheless trip up the unwary when it is strictly enforced by the computer’s unteger arithmetic.  The following integer expressions all have the same value, namely zero:

0/5   1/5   2/5   3/5   4/5

whereas 5/5 is, of course, 1.  If you want to use something other than strict integer arithmetic, you have a choice:  you can write your own routines (or find someone else’s) that can deal with remainders and modular arithmetic generally, you can figure out some way to re-scale your data to avoid the need for problematic divisions, or you can use floating point, and learn about the other problems on  this list.

2.  Rounding Error (the real thing)

What I am calling “real” rounding error is the kind that can also occur in manual calculations.  You have all probably seen financial data, or the results of surveys, that contain a statement such as: “Totals may not add to 100% due to rounding error.”   Here is an example of what they are talking about.  First, let’s do an addition of three numbers; assume they are exactly correct, to any desired precision:


Now let’s suppose that we are reporting results to one decimal place.  Our addition becomes:


Of course, if we are maintaining full precision in the calculation internally, that last sum will be displayed as 15.9 rather than 15.8, resulting in the canonical disclaimer I mentioned earlier.

This can be especially troublesome in financial applications, where it is considered important to match figures exactly (to the penny, for instance).  The remedy here is simple: make the unit of the calculation the smallest possible unit.  If you are computing in US Dollars, make the unit of measure one cent, not one dollar.  (Of course, if you have one of those special cases where fractions of a cent are meaningful, then this advice needs adjusting.)

3.  Representational Limitations

This is often incorrectly described as rounding error, which description always irritates me, because it seems to suggest that increasing the precision of the calculation (going from single- to double-precision floating point — from float to double in C, for example) will somehow fix things, which is not true.  The problem is, in some sense, an artifact of how we represent numbers.

We all learned, sometime shortly after being introduced to fractions and their decimal representations, that there were some fractions that were easily represented as decimal equivalents:

1/2 = 0.5

3/10 = 0.3

2/5 = 0.4

However, there were other common fractions that had non-terminating decimal representations:

1/3 = 0.33333…….

This is whar I call a representation problem, and is a consequence of the Fundamental Theorem of Arithmetic.  Only fractions which are powers of the prime factors of the number-system base can be written exactly as terminating decimals (in this case).  Since the prime factors of 10 are 2 and 5, fractions such as 1/2, 1/4, 1/5, and 1/10 can be written quite handily.  Fractions which have other prime factors (like 3) don’t have a terminating representation.

Now computers use binary arithmetic, and two is a prime number.  So only fractions that are powers of two have terminating representations in binary notation.  This is particularly unfortunate since humans have more or less settled on base 10 as the most convenient base for calculations generally.  (It’s really a shame that Turing or von Neuman didn’t invent a computer with ten toes.)   In particular, it means that the common fraction 1/10, easily expressible in decimal notation as 0.10, cannot be expressed exactly as a terminating fraction in binary, no matter how high the precision.  So, if you naively add 0.10 to 0.10 in binary floating point, you will almost certainly not get exactly 0.20, but something like 0.19999997 or 0.20000000000001.

This is another problem that tends to come up in financial applications.  One early solution, used by IBM in the System/360, for example, was to provide for hardware and corresponding data types to support base-10 arithmetic directly (IBM called the data format packed decimal).  This, to a first approximation, essentially reproduced manual calculation inside the computer..  Ir worked, but was  horribly inefficient, without specialized hardware support, and somehow reminiscent of the idea of putting an automobile’s engine in the front, because that’s where the horse went.

Another solution is to change the representation of the data.  My first “real” job after grad school was in an investment management organization.  We were trying to put together a daily (!) reporting system for our portfolio managers.   For various reasons that were cogent at the time, but not all that interesting, the development had to be done in FORTRAN.   The idea of doing monetary computations with integer values representing a number of cents was tried; but a 32-bit integer can only hold a value of around two billion, which in cents is $20 million.  Not good, even before subsequent inflation.  But a DOUBLE PRECISION value had sixty-four bits in all, and the significance of the fraction portion was equivalent to about 15 decimal digits. So the remedy we adopted was to store all monetary amounts as a DOUBLE PRECISION number of cents.  That meant that, within a range of amounts (conservatively)  ± $1,000,000,000,000 ($1 trillion), we could make all the numbers balance to the penny.  Now this might still have problems if you are doing national income accounts, but for our purposes it did quite nicely.  The key is that we were using the double-length floating point format to do, essentially, integer arithmetic.

More to come …

In the next post, I’ll talk about the remaining two sources of problems: Loss of Significance, and Perverse Formulations.

Sun Updates Java

November 4, 2009

Sun Microsystems has released a new version of the Java run-time environment (which includes the Java browser plugins), Version 6 Update 17.  This update fixes a number of security vulnerabilities, as well as other bugs.  Further information on the changes is in the Release Notes.  The new version (for Windows, Solaris, and Linux) can be downloaded here.    Mac users need to get updated versions via Apple’s Software Update service.


%d bloggers like this: