1111111111 is a random sequence
Biases in our perception of randomness
You are (probably) wrong about randomness.
Let me show you. Let’s do an experiment. Pick up a pen, paper, or your favorite text editor, and manually generate a random sequence of ten characters, composed of ones and zeros.
My guess is, your answer will be something like 0110100110.
Probabilistically speaking, I should be right in only about 1 out of 1024 cases.
Statistically speaking, I am right way more frequently.
There is nothing wrong with 0110100110: it is a perfectly valid result. What’s striking is that it contains no longer subsequences of repeating 0s or 1s. Intuitively, we consider 1111111111 not random, yet it has the same probability as any other ten-character 0-1 sequence!
Don’t believe me? I have even performed an experiment to find out what you think is random.
The result is (not) surprising: your random 0-1 sequences are not as random as you think. Read on to find out why.
Let’s go back to our initial experiment: generating random 0-1 sequences.
This is the thought process going on inside my head:
Ok, I’ll start with 0. I should continue with 1, as 00 is not random enough. On the same note, 01 should continue with 1, since 010 is alternating and thus not random enough. As three 1s in a row is again not random, 0110 is the way to go. But I can’t continue with another 0: that would be two blocks of two repeated characters next to each other. Not random.
And it goes on like that. What is the issue? What I perceive as random is biasing my thinking. Things like
blocks of alternating characters,
blocks of repeated characters,
and blocks of identical blocks
are considered not random, whatever that might mean. Paradoxically, avoiding them produces sequences that are perfectly predictable, like 0110100110.
What is random, then?
Think of the assembly of a random 0-1 sequence as flipping coins. Heads, and we write down 1. Tails, and it’s 0. Each outcome has 1/2 probability, and each toss is independent of each other.
We can model our sequence-generating process with independent, identically distributed Bernoulli random variables.
As the Xᵢ-s are independent, we can compute the probability of a given sequence by multiplying the individual probabilities.
And thus, the probability of obtaining the “random” 0110100110 and the “not random” 1111111111 is the same.
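To make this concrete, here is a minimal Python sketch of the product rule for independent Bernoulli bits (the function name is mine, not from the post):

```python
from functools import reduce

def sequence_probability(seq: str, p: float = 0.5) -> float:
    """Probability of observing `seq` as i.i.d. Bernoulli(p) bits."""
    # Independence lets us multiply per-character probabilities:
    # P(X1 = x1, ..., Xn = xn) = P(X1 = x1) * ... * P(Xn = xn).
    return reduce(lambda acc, c: acc * (p if c == "1" else 1 - p), seq, 1.0)

# Both the "random-looking" and the "non-random-looking" sequence
# have probability (1/2)^10 = 1/1024.
print(sequence_probability("0110100110"))  # 0.0009765625
print(sequence_probability("1111111111"))  # 0.0009765625
```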
This seems paradoxical at first, but there is a clear resolution: true and perceived randomness is not the same.
Surprisingly, one of the telltale signs of true randomness is the presence of runs, that is, subsequences of repeated characters. Let’s talk about them.
…and repeated characters
So, if we randomly generate a ten-character 0-1 sequence, 0111111110 is just as likely as 0110100110. However, the former contains a block of eight ones. There are exactly ten possible ways to cram a block of precisely eight repeated characters into a ten-character sequence.
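You can verify that count of ten by brute force; the sketch below uses only the standard library (helper name is mine):

```python
from itertools import groupby, product

def longest_run(seq) -> int:
    """Length of the longest block of repeated characters."""
    return max(len(list(group)) for _, group in groupby(seq))

# Enumerate all 2^10 = 1024 sequences and keep those whose
# longest run has length exactly eight.
hits = ["".join(bits) for bits in product("01", repeat=10)
        if longest_run(bits) == 8]
print(len(hits))  # 10
```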
This begs the question: what is the frequency of runs?
To make the problem simpler, we’ll only study the longest run of a given sequence. (Recall that a run is simply a subsequence of repeated characters.) For instance, the length of the longest run in 0111110011 is five, as it contains 11111, but no longer run.
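In code, runs are exactly what `itertools.groupby` produces; a quick sketch (the helper name is mine):

```python
from itertools import groupby

def runs(seq: str) -> list[str]:
    """Split a sequence into its maximal blocks of repeated characters."""
    return ["".join(group) for _, group in groupby(seq)]

print(runs("0111110011"))                           # ['0', '11111', '00', '11']
print(max(len(run) for run in runs("0111110011")))  # 5, the longest run
```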
As each 0-1 sequence of a fixed size has equal probability, it is enough to flex our combinatorial muscles and count the sequences whose longest run has length k.
This is not a simple task. There is an explicit formula (see the ancient landmark paper The Distribution Theory of Runs by A. M. Mood), but as I don’t want to shock you (or myself) with towering mathematical expressions, I’ll just generate a million random sequences and count the longest runs by sheer brute force. I know, I am getting old. (If you are interested in the code, here is the Google Colab notebook. Be warned though: writing clean code was not my aim.)
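The brute-force experiment can be sketched in a few lines; this is not the author’s Colab code, just a minimal re-creation (the sample size and seed are arbitrary):

```python
import random
from collections import Counter
from itertools import groupby

def longest_run(seq) -> int:
    return max(len(list(group)) for _, group in groupby(seq))

random.seed(42)      # arbitrary seed, for reproducibility
N_SAMPLES = 100_000  # the post uses a million; fewer samples keep this quick

counts = Counter(
    longest_run(random.choices("01", k=10)) for _ in range(N_SAMPLES)
)
distribution = {k: counts[k] / N_SAMPLES for k in sorted(counts)}
print(distribution)
```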
Here is (a very close approximation of) the distribution.
As you can read from the plot, the probability of having larger runs is much more significant than what we intuitively expect.
(I just found out that you can style matplotlib figures like XKCD comics, and I am in love.)
To help you quantify this, here are the tails of the distribution; that is, the probabilities of a ten-character sequence having at least k-long runs.
Some concrete numbers. The chance of at least one 4-long run is 46%; at least one 5-long run is 21%. The expected length of the maximal run is ~3.661436.
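In fact, for ten-character sequences we don’t even need sampling: enumerating all 1024 of them reproduces these numbers exactly (the exact expected value is 3750/1024 ≈ 3.6621, matching the simulated estimate):

```python
from itertools import groupby, product

def longest_run(seq) -> int:
    return max(len(list(group)) for _, group in groupby(seq))

# Every one of the 2^10 = 1024 sequences has probability 1/1024,
# so exact probabilities are just counts divided by 1024.
lengths = [longest_run(bits) for bits in product("01", repeat=10)]

p_run_ge_4 = sum(length >= 4 for length in lengths) / 1024  # 476/1024 ≈ 0.465
p_run_ge_5 = sum(length >= 5 for length in lengths) / 1024  # 222/1024 ≈ 0.217
expected_max_run = sum(lengths) / 1024                      # 3750/1024 ≈ 3.662

print(p_run_ge_4, p_run_ge_5, expected_max_run)
```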
These are in stark contrast with what we perceive as random. To a human, 0111110101 looks like a pattern, yet runs like this happen randomly 21% of the time.
What do we think is random, then?
Humans are not so random
To find out, I have performed an experiment on Twitter.
I have asked you to give me a 0-1 sequence of ten characters. What I was mostly interested in were the longest runs. I conjectured two things:
the length of the longest run in most sequences will be 2 or 3,
and sequences with runs of length 9 or 10 will be significantly over-represented compared to truly random ones.
Why? Simple. Either you are aware of our biases towards randomness and you want to troll me, or you are biased.
Both of my guesses were confirmed. Check out the result:
To simplify the comparison, here are the distributions for true random vs human-generated sequences, plotted on top of each other.
See the difference?
Let’s do some statistics to see if the difference is real. One way to check that is Pearson’s chi-squared test, which tells us whether a set of categorical observations is coming from the specified distribution or not.
In our case, the categories are the lengths of the longest runs: 1, 2, …, 9, and 10.
The chi-squared test works by calculating the normalized sum of squared differences between the observed and expected distributions.
Our null hypothesis (H₀) is that the observed and expected distributions are the same. If H₀ is true, the test statistic follows a chi-squared distribution with n-1 degrees of freedom, where n is the number of values our distributions can take. (Which is ten in our case.)
The smaller the value of 𝜒² is, the more likely the null hypothesis.
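Here is a sketch of the computation. The observed counts below are hypothetical stand-ins (the real survey data isn’t reproduced here), while the expected probabilities come from enumerating all 1024 ten-character sequences:

```python
# Hypothetical observed counts: how many of 200 responses had a longest
# run of length k, for k = 1, ..., 10. (Illustrative numbers only.)
observed = [1, 60, 90, 25, 8, 3, 1, 1, 1, 10]

# Exact counts of ten-character sequences with longest run k, out of 1024
# (obtainable by enumerating all sequences).
true_counts = [2, 176, 370, 254, 126, 56, 24, 10, 4, 2]
n = sum(observed)
expected = [c / 1024 * n for c in true_counts]

# Pearson's statistic: chi2 = sum_i (O_i - E_i)^2 / E_i
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# 5% critical value of the chi-squared distribution with 9 degrees of
# freedom, taken from a standard table.
CRITICAL_5_PERCENT = 16.919
print(chi2, chi2 > CRITICAL_5_PERCENT)  # a large chi2 rejects H0
```

Note that several expected counts here fall below 5, where the chi-squared approximation gets shaky; in practice, one would merge the rare categories first.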
Pearson’s chi-squared test rejects the null hypothesis with a p-value of about 0.0002: if your sequences were truly random, a discrepancy this large would show up in only about 0.02% of experiments.
So, your random numbers are definitely not random.
(The user-generated data is far from perfect. There are multiple issues, mainly,
as my audience is more knowledgeable about probability and statistics than the average human, many participants were conscious of their biases,
and I was unable to enforce that no random generators were used (some sequences were obtained from quantizing the digits of π or from converting random integers to base two).
This experiment also fails to directly address our bias against runs. An “adversarial” approach would be better for that purpose: participants would be given a list of 0-1 sequences which they have to label “random” or “not random”.
Moreover, as Twitter cut off its free API access, I had to manually copy and paste your answers. Yeah, I know. I suspect that there are more answers than the Twitter UI is showing me, so I’ll redo the plots when I get API access.)
As the sequences grow
What happens as the size of 0-1 sequences grows?
For one, runs can be used to test whether a sequence is indeed random. This is called the Wald-Wolfowitz runs test: if the number of runs significantly deviates from the expected value, the consecutive elements are (probably) not independent; that is, the sequence is not random.
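A minimal sketch of the runs test, using the standard mean and variance of the run count under the null hypothesis of independence (the function name is mine):

```python
from itertools import groupby
from math import sqrt

def runs_test_z(seq: str) -> float:
    """Z-score of the Wald-Wolfowitz runs test for a 0-1 sequence."""
    r = sum(1 for _ in groupby(seq))  # observed number of runs
    n1, n0 = seq.count("1"), seq.count("0")
    n = n1 + n0
    # Mean and variance of the run count under independence:
    mean = 1 + 2 * n1 * n0 / n
    var = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n ** 2 * (n - 1))
    return (r - mean) / sqrt(var)

# |z| < 1.96 is consistent with randomness at the 5% level.
print(runs_test_z("0110100110"))  # ≈ 0.67: no evidence against randomness
```

The normal approximation behind the z-score is meant for long sequences; ten characters is only for illustration.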
So, how long a run should we expect?
Erdős and Rényi, two legendary mathematicians, showed that in an N-long random 0-1 sequence, the probability of finding a log₂N-long run of 1s is asymptotically 1.
You can read the result in its full glory in the paper On a new law of large numbers by Paul Erdős and Alfréd Rényi. It’s technical and convoluted, so allow me to unpack it. Fasten your seatbelts.
Rather than studying runs, Erdős and Rényi were interested in the density of 1-s. These are given by partial and running sums. The n-th partial sum Sₙ is defined by simply summing the sequence up to its n-th member.
Runs can be expressed in terms of running sums, that is, the difference of partial sums.
Running sums are obtained by defining a window, then sliding that window through our sequence and summing up the values inside it.
If a particular running sum equals the size of the window, the window covers a run. The average of the running sums describes the density of 1s.
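These definitions translate directly into code; the example sequence below is arbitrary:

```python
from itertools import accumulate

seq = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]  # arbitrary example sequence

# Partial sums: S[n] = x_1 + ... + x_n, with S[0] = 0.
S = [0] + list(accumulate(seq))

# Running sums over a window of size k are differences of partial sums:
# x_{n+1} + ... + x_{n+k} = S[n + k] - S[n].
k = 3
running = [S[n + k] - S[n] for n in range(len(seq) - k + 1)]

# A running sum equal to k would mean a k-long run of 1s inside the window.
print(running)  # [2, 2, 2, 1, 1, 1, 2, 2]; no window is all 1s
```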
Erdős and Rényi were interested in the maximal running sum of a given sequence.
Just like all the previous quantities, the maximal running sum is a random variable.
Phew. After a lengthy setup, here comes the result. If N (the length of the sequence) goes to infinity, the probability of finding a log₂N -long run goes to 1.
Once more, I have simulated some sequences and estimated the distribution of the maximal running sums over log₂N-sized windows. Here is the result:
I know, this is not the most exciting figure ever, but this is exactly the point: as the length grows, we will find a log-sized run with increasing probability.
Humans are instinctively biased about randomness. We tend to over-represent small blocks of repeating subsequences, and under-represent larger ones. When asked, we tend to generate predictable and boring sequences such as 0110100110.
However, true randomness doesn’t distinguish between 0110100110 and 1111111111: both are equally probable, even though we perceive the latter as less random. In fact, the lack of long runs signals a lack of randomness.