HOME  > >  FLAT EARTH ACADEMY HOME
Bookmark this on Delicious     Recommend toStumbleUpon

## Random Numbers

### Measuring their randomness

You want some random numbers? You won't do much better than a gaming die! (Thank you, WikiCommons. And, as ever Wikipedia has a wonderful article on dice... Once again I was amazed at how much there is to know about even a simple thing.)

Hmm... or is a die so very wonderful?

Assuming it is an "honest" die, you will get, say, a "4" truly randomly. What do I mean by "truly randomly"? Where do we get "un-truly random" numbers? (Why we care to do so is a story for another time. Just believe me: If you want to do cryptography, random numbers will be a huge help to you. And if you want to crack codes, you'll need to understand how the code creators use random numbers. Anyway... hope that some of the following is "just fun", even if not of any apparent great "importance".)

The die gets full marks for randomness... but it falls down on other points.

For a start, it is relatively easy to guess what number will come up.

What? Easy to guess what number will come up?

Tell you what... I'll bet you \$1 that the next number to come up will be a 4. I'll give you a dollar if it isn't... but you have to give me seven dollars if it is. And you have to agree to do the bet with me 100 times. Don't stop reading yet.

I also happen to have a die with 12 faces, lettered "A" to "L". If you want, I'll do a similar bet with you with this die, too. I'll pay you seven dollars if the die come up as "E"; you pay me one dollar if it doesn't. If you want the first bet, you also have to take this bet. Seven-to-one odds in both cases. Once for you; once against you. What could be wrong with that? See one of the reasons a die is weak?

A six sided die has a 1 in 6 chance of turning up a 4. A twelve sided die has a 1 in 12 chance of turning up a 12.

In all that follows, you can "improve" the random number generator, in most respects, if you expand the range of values it can return. We will leave that parameter there, and from now on almost always discuss random number generators which return one of the base ten digits: 0,1,2... 7,8 or 9. In every (almost every?) case, whatever is said below can be adapted to also be true of a generator returning a larger range of results.

You might say that the difference is moot between a generator which returns 0-9 (inclusive. I will always quote ranges inclusively on this page) and one that returns, A-Z... you can "just" "draw" two digits at a time from the 0-9 generator, and use 0 for A, 1 for B, ... 25 for Z. (Not 26. We started from zero, remember?) (And "throw away" digit pairs which come to more than 25.)

Not so. You're going to waste a lot of time generating and throwing away useless digit pairs. Of course, if a computer is encrypting something, the wasted time may not be perceptible... but don't be too casual about such things.

If you can generate 0-9 (nearly) randomly, then it usually takes very little to tweak your algorithm to make it generate a different range of values.

### What Random Numbers Have To Do With Cryptography

A "random" number generator is often at the core of a cipher. Or, a pseudo-random number generator, anyway.

Before we worry about the "pseudo-" part... what do we mean by "random", for a start? The answer to that can (and has) filled books. A second shelf can be devoted to books about how randomness is measured.

Flip a coin 100 times. Write down "H" or "T" each time you get a "heads" or a "tails". Congratulations! You have just generated a random sequence of H's and T's. But the process is hardly convenient. And, while there are many ways you could use it for encrypting a message... encrypting it quite "strongly"... you would have to let "Bob" (the person entitled to decrypt the message) have a copy of your list of H's and T's... and as soon as that list is in transit, your message's security is threatened.

Even the crudest computer programming language comes with a "random" number generator. And it "will do" for many things. But the numbers aren't actually "random". And if used for encrypting, they introduce vulnerabilities.

I've also written a less "theoretical" page which goes into a way to use random numbers for encrypting. On the same page I've explained the elegance of XOR, which is also useful for cryptography. And said a bit about modular arithmetic.

### "Random" generators for 0-9

Most modern computer programming languages provide the user with a "random" number generator. They even sometimes use the more precise term, "pseudo-random number generator". (It's the randomness which is "pseudo", not the generator! "Pseudo": "Pretending to be something it is not".)

Usually something like "RND(10)" will return a number in the range 0-9 (inclusive, remember).

And few humans would see a pattern in the numbers returned, if they looked at, say, 100 such numbers, produced, in sequence, by a program written in any of the languages.

But the numbers are NOT random. "Random enough" for many purposes... but not random.

#### Not random

Here's the first clue:

Usually, you use the "RND" (or whatever it is called) function to return a value, to generate a number. But often you can assign a number to the RND function, or use a separate procedure to "seed" the pseudo-random number generator.

The first number you get from RND after seeding it with, say, 5, will always be the same. And the one that follows, while it may be a different number, will be the same one you got second the last time you seeded the generator with 5.

The fact that you can "seed" the generator, and get the same sequence of numbers out of it afterwards is very useful in de-bugging work.

Seeding can also be used as follows: As the program starts, you "ask" the machine for the current time. From that, you extract how many milliseconds have passed since the last new second began, and you use that number (the millisecond count) to seed the random number generator. Now your program always runs in only 1 of a thousand ways... but which of those thousand is down to chance, unless you have a way to start the program at a precise time.

#### And...

So... if you seed the generator, you get the same string of random numbers each time you run it.

Worse yet... although where you START in the series may change from run to run, depending on chance, or on how you seeded the generator, the sequence of numbers is always the same. (Not the disaster it seems... don't panic yet.)

So... in a Very Bad language, you might see the following sequences, if the generator was only generating a 0, a 1 or a 2. (I've made the series a little seemingly not random, in order for you to see my point... but in fact there is no reason that three 0s couldn't arise in a row, any more than it is impossible for you to throw 8 "heads" in a row flipping an honest coin. Not impossible... only unlikely.

```0002101210012000210121001200021012100120002101210012
-            -            -            -
or
1012100120002101210012000210121001200021012100120002
-            -            -            -
or
0121001200021012100120002101210012000210121001200021
-            -            -            -
```

What I've imagined for my example is a bad "random" number generator which turns out 0002101210012 over and over again. (The hyphens under each string are there to help you find a common "repeats from here" point.) Its only "claim to fame" is that is doesn't always start with the first part of the string of numbers it endlessly repeats.

"Well that's pretty dreadful!", I hear you exclaim.

Actually, that's what all of the commonly available pseudo-random number generators do, as far as I am aware.

But. The number of values returned in the string of values before the generator goes back and goes through them again can be HUGE. So that's (mostly) all right, then.

Furthermore, we can USE the predictability of such generators to our advantage, both as people doing encrypting, and as people trying to decrypt things we're not supposed to. The famous, and very good, "Enigma" machine, important in World War Two used a predictable "random" sequence. The machine wasn't weak. Turing and Co. at Bletchley Park were very good.

And by the way... the heroes who obtained, at great personal risk, an enemy code machine (according to Wikipedia... I recalled it as a codebook, but that's moot), an action terribly helpful to cracking an important code, were not, as depicted in the movie U-571 , from a warship of the USA. The USA was still taking its time about joining the fight at the time of the events which inspired the movie.

### Testing Randomness

You may, at some point, want to create your own random number generator, to make an encryption process more secure. What tests should a generator of the numbers 0-9 pass?

#### Frequency

For a start, your generator should throw up, in 1000 calls, 100 0s, 100 1s, 100 2s... etc. That would be proof of pretty good "randomness", don't you think?

.... think?

Don't scroll down just yet. You think, yes, that means "random" missing? You're (until you think harder) missing something.

Think you've got it? Final answer? Have you thought of all the objections? Scroll down when you are ready to see where they went wrong. You might, if you are a programmer, think a bit about how you would write a program to test a generator for the frequency of the numbers it turns out.

The following passes the "same frequency for 0s, 1s, 2s, etc" test, doesn't it. But I wouldn't call it random!...

```01234567890123456789012345678
90123456789012345678901234567
89012345678901234567890123456
78901234567890123456789012345
67890123456789012345678901234
56789012345678901234567890123
45678901234567890123456789012
34567890123456789012345678901
23456789012345678901234567890
12345678901234567890123456789```

So! More than a uniform distribution is needed. How would you look for "random" sequencing of a uniform distribution of numbers. It is quite a fun challenge. One approach is to generate a table. I've made one up for a NEARLY "perfectly" "random"... on some measures... generator, and shown you the results from 1000 calls of that generator. The table says that in almost every case, 10 instances of one case arose. There were two little bits of non-"random"ness. The generator threw out a 2 following a 0 once more than you would hope, and it threw out a 1 following a 3 once less than you would hope.

```     0   1   2   3   4   5   6   7   8   9
0   10  10  11  10  10  10  10  10  10  10
1   10  10  10  10  10  10  10  10  10  10
2   10  10  10  10  10  10  10  10  10  10
3   10  09  10  10  10  10  10  10  10  10
4   10  10  10  10  10  10  10  10  10  10
5   10  10  10  10  10  10  10  10  10  10
6   10  10  10  10  10  10  10  10  10  10
7   10  10  10  10  10  10  10  10  10  10
8   10  10  10  10  10  10  10  10  10  10
9   10  10  10  10  10  10  10  10  10  10
```

I suspect that an imperfect "random" number generator could "pass" even this test, but I am going to leave the question here for now.

### Patterns in languages

Cryptanalysts... those who try to "break" codes... have, for a long time, made much use of the fact that the distribution of letters in text is not random.

In English, the most common letter is "E", and you can get tables of the frequencies of all the other letters, and of the letters in other languages.

You can also get tables like the big one just a little way up the page, but for the occurrence of pairs of letters.

There is a letter pair in English which is a major gift to cryptanalysts.... can you think of what it would be?

Again... I've created some vertical white space to spare you accidentally seeing the answer too soon.

Yes! "QU". In English, it is exceedingly uncommon for a Q to arise without a following U. (All of the exceptions I can think of are borrowed words.)

But cryptanalysis is a story for another day. I hope this little look at "random" will support your understanding of material to come.

That's about it, for this page. I have done a separate page, showing one way that a sensible random number generator can help someone encrypt a message.

I've also provided a second page relating to things mentioned about. The second page goes into the frequency of letters in English, and in some other languages. And not just the "basic" frequencies, either!

### Enough!

... unless you are ready for another topic, in which case, if you came here from a link on the cryptography topic's main page, just close this window or tab, and you should find it sitting there, waiting for you.

Have you heard of Flattr? Great new idea to make it easy for you to send small thank you\$ to people who provide Good Stuff on the web. If you want to send \$\$erious thank yous, there are better ways, but for a small "tip" here and there, Flattr ticks a lot of boxes which no one else has found a way to do yet. Please at least check out my introduction to Flattr, if you haven't heard of it? "No obligation", as they say!

Search across all my sites with the Google search button at the top of the page the link will take you to.
Or...

Search just this site without using forms,
Or... again to search just this site, use...

The search engine merely looks for the words you type, so....
*!  Spell them properly   !*
Don't bother with "How do I get rich?" That will merely return pages with "how", "do", "I", "get" and "rich".

I have other sites. My Google custom search button will include things from them....
One of my SheepdogGuides pages.
My site at Arunet.