CommonLounge Archive

How Random Number Generation works, with Algorithms and Examples

February 03, 2018

Random number generation is important for lotteries, games and security. In cryptography randomness is important because it removes any reasoning and therefore any predictability. An attacker is usually trying to attain information on a system, when this information is randomly generated there are no clues as to what it maybe and therefore no open opportunities to attack the system.

The difference between True Random and Pseudo-random

Pseudo-random

Most standard libraries for random number generation will produce pseudo-random numbers. These are numbers that satisfy at least one test for randomness but are generated by a deterministic causal process. This ultimately means that they are not true random numbers because the process will produce the same set of ‘random’ numbers in each run. This is often useful in computational modeling but can render cryptographic systems vulnerable by introducing an element of predictability. All that would be needed for an attacker to attain the pseudo-random numbers used in encryption is the algorithm used for generating the numbers and the initial input passed to that algorithm (also called “seed”).

Randomness is used to obscure links between keys and messages and to remove any pattern in choices of secure numbers. The fact that most computerized random number generators are deterministic means that the numbers generated come from a finite pool and will ultimately at some point repeat and follow some pattern. Recognition of this pattern renders the so-called random number generation non-random to those with information on the system. In World War II the Japanese PURPLE cipher machine implemented a cryptographically secure cipher for diplomatic messaging however their pseudo-random number generation was not truly random enabling the allied cryptanalysts to identify patterns in the supposedly random key values and thereby repeatedly break the cipher. If true random numbers had been used this would not have been possible and the cipher would have not been compromised.

True random from nature

True randomness is only observed in natural phenomena. This is the source of true random number generation. In computation the randomness is often taken from measuring thermal noise from a semiconductor resistor. The Linux operating system maintains a pool of true random numbers generated by (a) the speed of key presses while typing and (b) the least significant bit of digit voltage measurements, amongst other non-deterministic and measurable processes.

The problem of using this style of random number generation in cryptography is that one needs to be sure that your adversary cannot measure the same thing and thereby attain the same random number. This leads to a meeting of physical security and mathematical security which is undesirable. Instead cryptographically secure pseudo-random number generating functions are used.

Cryptographically secure Random Number Generators

To be cryptographically secure a pseudo-random number generator must pass the next-bit test and withstand state compromise extensions.

The next-bit test requires that given the first k bits of a random sequence there is no polynomial-time algorithm that offers a greater than 50% probability of correctly predicting the k + 1th bit of the sequence.

The ability to withstand a state compromise extension requires that if the state of the number generating process is revealed at a given point then it is not possible to use this knowledge to reconstruct the stream of pseudo-random numbers already generated.

For example, if the pseudo-random number generator works by picking digits from a randomly chosen point in the natural number e in sequence this will satisfy the next-bit test but will be unable to withstand state compromise extensions since once the current bit in use has been determined all previous pseudo-random output can be determined by reading backwards through e.

Random Number Generating functions

Seed

Given that pseudo-random number generators are deterministic they must be given some input as a starting point. This input determines the sequence of numbers generated and is known as a seed. Seeds can be chosen to optimize performance or for consistency in modeling runs. If the same seed is used twice the same set of pseudo-random numbers will be generated, therefore in cryptography it is important to change seeds often to avoid repeated and predicable series of supposedly random numbers.

Linear Congruential Generators

The simplest and once frequently implemented pseudo-random number generation system is that of linear congruential generators. These generators produce a series of pseudo-random numbers given an initial seed x0 and integer parameters a, b and m by the following congruence:

$$ x_n\equiv ax_{n-1}+b\ (mod\ m) $$

Example:

If we have a seed x0 = 1073 and parameters a = 35, b = 528 and m = 2547 then the series of random numbers will begin as:

$$ \begin{gathered} x_1\equiv35\times1073+528\equiv38083\equiv2425\ (mod\ 2547) \\ \\ x_2\equiv35\times2425+528\equiv85403\equiv1352\ (mod\ 2547) \\ \\ x_3\equiv35\times1352+528\equiv47848\equiv2002\ (mod\ 2547) \\ \\ x_4\equiv35\times2002+528\equiv70598\equiv1829\ (mod\ 2547) \\ \\ x_5\equiv35\times1829+528\equiv64543\equiv868\ (mod\ 2547) \end{gathered} $$

This style of random number generation is often used in experimentation but given its predictability (even if the parameters are not known) and its repletion after a certain point this style of pseudo-random number generation is avoided in cryptographic applications.

Blum-Blum-Shub Generator

Often for cryptographically secure pseudo-random number generation one-way functions are used in the generation of random numbers to ensure that the bits generated are not predictable. One-way functions are easy to compute but hard to reverse engineer. A function y = f(x) is one-way if it is simple to compute y given x but (near) impossible to compute x given y and f.

Suppose that you have a one-way function f and a random seed s. Let x = f(s + j) for j = 1, 2, 3, …. If we let bj be the least significant bit of xj, then the sequence b0, b1, … will be a cryptographically secure sequence of pseudo-random bits.

One of the most popular cryptographically secure pseudo-random bit generators is the Blum-Blum-Shub (BBS) pseudo-random bit generator which builds upon an intractable problem from number theory. This is also known as the quadratic residue generator. The set up for this generator requires two large primes p and q to be chosen such that p ≡ 3 (mod 4) ≡ q. The product of these primes is then calculated to attain the value n. Then an integer value x is chosen such that x is coprime to n. The initial seed of the BBS generator is then given by x0x2 (mod n). The sequence of random bits b0, b1, … is then generated from the following where b­j is the least significant bit of the value x.

$$ x_j\equiv x_{j-1}^2\ (mod\ n) $$

This process is computationally expensive for a long series of random bits. To make it faster, it is possible to let b0 through bk be the k least significant bits of xj provided that k < log2(log2(n)).

Example:

As an example let p = 67 and q = 79 so that n = 67 × 79 = 5293. Let x = 127. Then the initial bits are calculated as follows.

$$ \begin{aligned} 127 ^2\ (mod\ 5293)&\equiv250\longrightarrow11111010_2 \\ \\ 250^2\ (mod\ 5293)&\equiv4277\longrightarrow1000010110101_2 \\ \\ 4277^2\ (mod\ 5293)&\equiv121\longrightarrow1111001_2 \end{aligned} $$

This process can continue as long as is necessary.

Since log2(log2(5293)) = 3.62, we can choose k = 3. Taking the least significant three bits from each outcome we attain the random binary bits 010 101 001.

If we want to generate 4-bit random numbers, i.e. random numbers from the range 0 to 15, then we can split the above sequence of random bits into chunks of 4 bits - 0101 0100. This gives us the numbers 5 and 8 as the first two numbers generated.

Applications

Random numbers are required in cryptography for key generation, nonces (arbitrary single-use numbers often used in authentication and time-stamping), salts (random inputs used in hash functions) and one-time pads (single use encryption mechanisms).

Random numbers are also used more generally to obscure deterministic patterns, pad messages where the length needs to be changed and remove human biases from procedures.


© 2016-2022. All rights reserved.