The UK magazine PC Pro has an amusing article on what it calls “the world’s ten most calamitous computer cock-ups”. It is of course tongue-in-cheek, and I certainly don’t propose to be one of those amazingly tiresome people who debate whether this is the correct “top ten”, but there are some interesting points in common with ideas we have touched on here.
The first observation is that there is only one entry on the list that, strictly speaking, was a problem with the computer itself (that is, the hardware). That is the (in)famous Intel Pentium floating-point bug from 1994. The new Pentium chips had a logic error in the floating-point processor, and sometimes didn’t give quite the right answer when an FDIV instruction (floating-point divide) was used.
At the heart of the problem were errors and missing tables in the FPU’s on-chip instructions for division, meaning that in certain circumstances sums were miscalculated.
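For the curious, here is a minimal sketch of the division check that circulated widely at the time. The operands are the well-known pair usually quoted for this bug; the function name is my own, purely for illustration. On a correct FPU the residual should be zero, or at worst a vanishingly small rounding error, while the flawed chips were reported to give 256.

```python
# The operand pair most often quoted for the 1994 Pentium FDIV flaw.
X = 4195835.0
Y = 3145727.0


def fdiv_residual(x, y):
    """Divide, multiply back, and return what is left over.

    Python floats use the hardware's double-precision division, so on a
    correct FPU this is zero or a tiny rounding error.
    """
    return x - (x / y) * y


if __name__ == "__main__":
    residual = fdiv_residual(X, Y)
    print("residual:", residual)
    # A residual anywhere near 256 would point to the flawed divider.
    print("FPU looks OK" if abs(residual) < 1e-6 else "suspicious division result")
```

The check works because dividing and then multiplying back should return the original number to within rounding error, so any sizeable leftover has to come from the division itself.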
Intel initially downplayed the problem, but it quickly became an embarrassment. Quite a few jokes circulated on the nascent Internet; there was even a song, with several variants, to be sung to the tune of A Bicycle Built for Two:
Daisy, Daisy, give me your answer, do.
My math’s crazy; can’t divide three by two.
Right answers, I can not see ’em-
They’re stuck in my Pen-tee-um.
I could be fleet,
My answers sweet,
With a workable FPU.
Some of the other entries touch on familiar themes. For example, NASA’s Mars Climate Orbiter burned up in the Martian atmosphere on arrival, apparently because some members of its design team got metric and imperial units mixed up.
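One common defence against this kind of mix-up is to make the unit part of the value, so that a bare number can’t slip through unconverted. The sketch below is only an illustration of that idea, not anything from the actual mission software. The class and function names are hypothetical, and I am assuming the quantity in question is thruster impulse, handed over in pound-force seconds where newton-seconds were expected.

```python
from dataclasses import dataclass

# One pound-force is 4.4482216152605 newtons, so the same factor
# converts pound-force seconds to newton-seconds.
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """An impulse, always stored internally in newton-seconds."""

    newton_seconds: float

    @classmethod
    def from_pound_force_seconds(cls, value):
        # Imperial input is converted exactly once, at the boundary.
        return cls(value * LBF_S_TO_N_S)


def total_impulse(firings):
    """Sum a sequence of Impulse values; plain numbers are rejected."""
    total = 0.0
    for firing in firings:
        if not isinstance(firing, Impulse):
            raise TypeError("expected an Impulse, got a bare number")
        total += firing.newton_seconds
    return Impulse(total)


if __name__ == "__main__":
    firings = [Impulse(10.0), Impulse.from_pound_force_seconds(2.0)]
    print(total_impulse(firings).newton_seconds)
```

The point of the design is that the conversion happens in exactly one place, and passing a raw float where an Impulse is expected fails loudly instead of quietly producing a number that is off by a factor of about 4.45.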
There are other familiar culprits, too. The US power outage in August 2003, and possibly the AT&T phone outage of January 1990, were caused by race conditions in their (complex) control systems. Of course, the most common cause of all, human screw-ups, is well represented.
It’s an amusing article, and it’s worth remembering how small mistakes can have large, expensive, and embarrassing consequences.