How Random is That?

Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.
John von Neumann

The Science Daily Web site has an article describing a new approach to generating random numbers.  The technique itself is interesting, and it potentially has some important applications.  Unfortunately, the article tends to oversell the potential benefits.

A new approach to generating truly random numbers could lead to improved Internet security and better weather forecasts, according to researchers writing in the International Journal of Critical Computer-Based Systems.

Traditionally, people seeking a stream of random numbers have taken one of two broad approaches to getting them:

  1. Some physical process, which is believed to be random (such as radioactive decay, thermal noise, or radio noise), is measured, and then (possibly) adjusted to remove any measurement bias.  Of course, particularly if only a small set of numbers is wanted, lower-tech methods, like rolling dice or flipping coins, can be used.
  2. An algorithmic method is used to generate a sequence of numbers that have a specified set of desired statistical properties.  Note that, in this case, the numbers are not “truly” random, since the algorithm is by definition deterministic.

What the researchers, from the University of Hagen and BTC AG in Germany, have done is to introduce a physical random number device into an algorithmic generation process.  The device uses a circuit element called a “flip-flop”, which can be in either of two states (conventionally, 0 or 1); by initializing it to a metastable state, its final state becomes unpredictable.  This potentially inserts a truly random element into the stream of numbers generated, without fooling around with Geiger counters or dice.

In essence, this addresses the weakness of the algorithmic approach: that if the algorithm is known, it is possible to predict the subsequent sequence of numbers.  There are some uses for random numbers for which this is very important.  Consider, for example, selecting the winning numbers in a daily lottery.  It is obviously undesirable, from the lottery promoter’s point of view, that anyone should be able to predict future winning numbers.  Similarly, in cryptographic applications, security often depends on the unpredictability of the number sequence.  (As, for example, in the one-time pad, the one provably unbreakable cipher.)

There is another class of applications for random numbers, though, where the unpredictability of the sequence is less important, as long as the numbers possess desired statistical properties.  The most prominent example is simulation (sometimes called stochastic or Monte Carlo simulation).  In a simulation application, we are often concerned with modeling the behavior of a real system under a range of possible conditions.  For example, we might want to simulate the returns to an investment strategy under a variety of economic and market conditions.  In this kind of application, the main difficulties arise in making sure that the “random” numbers accurately reflect the probability distribution of outcomes.

At one level, this can be an algorithmic problem.  Let’s suppose that we want to model a variable Q, which has a standard normal distribution.  One approach to generating values for Q, which I have actually seen used, makes use of a set of six uniformly distributed random values U1, U2, … U6 on the interval (0,1).  Then, by appeal to the Central Limit Theorem, the value Q’:

is approximately normally distributed with mean 0 and standard deviation 1.  However, this is, generally speaking, a very bad method for generating Q.  It should be obvious, for starters, that a value of Q’ greater than 3 can never occur.  While it is true that outcomes 3 or more standard deviations from the mean are unlikely, that is not the same as impossible.   This could be a serious problem in our example of simulating an investment strategy, since we probably are interested (or, at least, we should be) in extreme outcomes, especially negative ones.

There are much better algorithms for generating numbers drawn from a normal distribution, or other common distributions.  Of course, determining the true distribution for a real-world random variable is not always entirely straightforward (and assuming everything is normally distributed is very dangerous).

So this is a development with some promise, but it is not a panacea for everything that ails the use of random numbers.

2 Responses to How Random is That?

  1. luckytoilet says:

    I think for many applications of random numbers, the actual statistical randomness matters less than the speed of generation.

    Arithmetic random generation is fine for all except the strictest mathematical and scientific programs, I think.

    • Rich says:

      There are certainly some applications where speed of generation is quite important. Games in which some of the actions are “randomly” selected might be one example. I worked once, quite a few years ago, on a simulation system for air traffic control. It was connected to a flight trainer with a real pilot, so the simulated bits had to be able to keep up. (There were problems with that at first, but fortunately it turned out to be a programming botch, not a real fundamental problem)

      The key thing, really, is to make sure you understand the application well enough to know what properties you need.

%d bloggers like this: