Definitions, axioms, and examples
In order to be taken seriously, we need to be careful about how we collect data, and then how we generalize our findings. For example, you may have observed that some polling companies are more successful than others in their estimates and predictions, and consequently people pay more attention to them. Below is a snapshot of rankings of polling organizations from the well-known website FiveThirtyEight [1], and one can imagine that few people take heed of the polling done by the firms with grades of C or worse. According to the website, the rankings are based on the polling organization’s “historical accuracy and methodology”.
To make estimates as these polling organizations do, to understand the results of a clinical trial, or to answer other questions in which we generalize from our data sample to a larger group, we have to understand the variation in data introduced by randomness in our sampling methods. Each time we poll a different group of voters, for example, we will get a different estimate of the proportion of voters that will vote for Joe Biden in the next election. To understand variation, we first have to understand how probability was used to collect the data.
Since classical probability came out of gambling games played with dice and coins, we can begin our study by thinking about those.
In 17th century France, gamblers would bet on anything. In particular, they would bet on a fair six-sided die landing 6 at least once in four rolls. Antoine Gombaud, aka the Chevalier de Méré, was a gambler who also considered himself something of a mathematician. He computed the chance of getting at least one six in four rolls as 2/3 ($4 \times 1/6 = 4/6$).
The next popular dice game was betting on at least one double six in twenty-four rolls of a pair of dice. De Méré knew that there were 36 possible outcomes when rolling a pair of dice, and therefore the chance of a double six was 1/36. Using this he concluded that the chance of at least one double six in 24 rolls was the same as that of at least one six in four rolls, that is, 2/3 ($24 \times 1/36 = 24/36$).
We will see later how to compute this probability, but for now we can estimate the value by simulating the game many times (1000 times each) and looking at the proportion of times we see at least one six in 4 rolls of a fair die, and do the same with at least one double six in 24 rolls.
Number of simulations = 1000
prop_wins_game_1
1 0.514
prop_wins_game_2
1 0.487
You can see here that the poor Chevalier wasn’t as good a mathematician as he imagined himself to be, and didn’t compute the chances correctly. The simulated probabilities are nowhere close to 4/6 and 2/3, the probabilities that he computed for the first and second game, respectively. 🧐
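A simulation of this kind takes only a few lines. The sketch below is written in Python rather than R, and it is not the code that produced the numbers above: the function names, the seed, and the use of Python's `random` module are all choices made here for illustration. It plays each game 1000 times and reports the proportion of wins.

```python
import random


def at_least_one_six(rng, rolls=4):
    """Game 1: roll one fair die `rolls` times; win if any roll is a 6."""
    return any(rng.randint(1, 6) == 6 for _ in range(rolls))


def at_least_one_double_six(rng, rolls=24):
    """Game 2: roll a pair of fair dice `rolls` times; win on any double six."""
    for _ in range(rolls):
        first, second = rng.randint(1, 6), rng.randint(1, 6)
        if first == 6 and second == 6:
            return True
    return False


def proportion_of_wins(game, n_sims, rng):
    """Play `game` n_sims times and return the fraction of plays that were wins."""
    return sum(game(rng) for _ in range(n_sims)) / n_sims


rng = random.Random(20)  # arbitrary seed, only so that reruns give the same numbers
n_sims = 1000

prop_wins_game_1 = proportion_of_wins(at_least_one_six, n_sims, rng)
prop_wins_game_2 = proportion_of_wins(at_least_one_double_six, n_sims, rng)

print("Number of simulations =", n_sims)
print("prop_wins_game_1", prop_wins_game_1)
print("prop_wins_game_2", prop_wins_game_2)
```

The exact proportions change with the seed, but with 1000 plays of each game both should land near one half rather than near 2/3.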
By the end of this unit, you’ll be able to conduct simulations like these yourself in R! For today, we are going to begin by introducing the conceptual building blocks behind probability.
First, let’s establish some terminology:
Let’s say that there are
Suppose we toss a fair coin, and I ask you what is the chance of the coin landing heads. Like most people, you reply 50%. Why? Well… (you reply) there are two possible things that can happen, and if the coin is fair, then they are both equally likely, so the probability of heads is 1/2 or 50%.
Here, we have thought about an event (the coin landing heads), seen that there is one outcome in that event and two outcomes in the outcome space, so we say the probability of the event is $1/2$.
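Consider rolling a fair six-sided die: six outcomes are possible, so the outcome space is $\{1, 2, 3, 4, 5, 6\}$. Since the die is fair, each outcome has the same probability:
Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Probability | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |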
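When all outcomes are equally likely, the probability of an event is the number of outcomes in the event divided by the number of outcomes in the outcome space. As a minimal sketch in Python (the helper name `prob` and the event "the roll is even" are made up here for illustration, not taken from the notes), this counting rule for the die looks like:

```python
from fractions import Fraction

outcome_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair six-sided die


def prob(event, outcome_space):
    """P(event) when every outcome in outcome_space is equally likely."""
    return Fraction(len(event & outcome_space), len(outcome_space))


rolled_six = {6}
rolled_even = {2, 4, 6}  # hypothetical event, chosen only as an example

print(prob(rolled_six, outcome_space))   # prints 1/6
print(prob(rolled_even, outcome_space))  # prints 1/2
```

Using `Fraction` keeps the answers exact, so they can be compared directly with the table.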
Let
In order to compute the probabilities of events, we need to set some basic mathematical rules called axioms (which are intuitively clear if you think of the probability of an event as the proportion of the outcomes that are in it). There are three basic rules that will help us compute probabilities:
Before we write the third rule, we need some more definitions and notation:
Now we consider events that don’t intersect or overlap at all, that is, they are disjoint from each other, or mutually exclusive:
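If two events $A$ and $B$ have no outcomes in common, so that their intersection $A \cap B$ is empty, then $A$ and $B$ are mutually exclusive.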
For example, if we are playing De Méré’s second game, the event
However, if we roll a die, the event
Here’s another example that might interest soccer fans: The event that Manchester City wins the English Premier League (EPL) in 2024, and the event that Liverpool wins the EPL in 2024 are mutually exclusive, but the events that Manchester City are EPL champions in 2024 and Manchester City are EPL champions in 2023 are not mutually exclusive.
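If we write events as sets of outcomes, checking whether two events are mutually exclusive amounts to checking that the sets share no outcomes. Here is a small Python sketch for one roll of a die; the three events are hypothetical examples chosen for illustration, not the ones used elsewhere in these notes.

```python
# Events for one roll of a fair six-sided die, written as sets of outcomes.
low = {1, 2}       # the roll is 1 or 2
high = {5, 6}      # the roll is 5 or 6
even = {2, 4, 6}   # the roll is even

# Mutually exclusive events have no outcomes in common.
print(low.isdisjoint(high))  # prints True
print(low.isdisjoint(even))  # prints False

# The outcomes two events share form their intersection.
print(low & even)            # prints {2}
```

The events `low` and `even` overlap in the outcome 2, so both can happen on the same roll, while `low` and `high` cannot.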
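Now for the third axiom: if $A$ and $B$ are mutually exclusive events, then $P(A \cup B) = P(A) + P(B)$, where $A \cup B$ is the event that $A$ or $B$ happens.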
For example, consider rolling a fair six-sided die, and the two events
The only outcome in
The complement rule
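Here is an important consequence of axiom 3. Let $A$ be an event, and let $A^C$ be its complement, the event consisting of all the outcomes in the outcome space that are not in $A$. Then $P(A^C) = 1 - P(A)$.
This is because $A$ and $A^C$ are mutually exclusive and together make up the whole outcome space, which has probability 1, so axiom 3 gives $P(A) + P(A^C) = 1$.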
Consider the penguins dataset, which has 344 observations, of which 152 are Adelie penguins and 68 are Chinstrap penguins. Suppose we pick a penguin at random, what is the probability that we would pick an Adelie penguin? What about a Gentoo penguin?
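Let $A$ be the event of picking an Adelie penguin, $C$ the event of picking a Chinstrap penguin, and $G$ the event of picking a Gentoo penguin.
Assuming that all the penguins are equally likely to be picked, we see that $P(A) = 152/344 \approx 0.44$ and $P(C) = 68/344 \approx 0.20$.
Since only one penguin is picked, we see that $A$ and $C$ are mutually exclusive, so $P(A \cup C) = P(A) + P(C) = 220/344$.
Therefore the complement of $A \cup C$ is $G$, since every penguin in the dataset is an Adelie, a Chinstrap, or a Gentoo.
Finally, the complement rule tells us that $P(G) = 1 - P(A \cup C) = 1 - 220/344 = 124/344 \approx 0.36$.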
We use
We often represent events using Venn diagrams. The outcome space
Here is a Venn diagram showing two mutually exclusive events (no overlap):
Suppose we toss a coin twice and record the equally likely outcomes. What is
Solution:
Now, let
Alternatively, we can consider
In this case,
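For two tosses of a fair coin, the outcome space is small enough to list in full. The Python sketch below does so and then computes one probability in two ways, by counting directly and by using the complement rule. The event "at least one head" is an assumption made here for illustration and may not be the event used in the example above.

```python
from fractions import Fraction
from itertools import product

# All ordered results of two tosses: HH, HT, TH, TT (equally likely for a fair coin).
outcome_space = {"".join(tosses) for tosses in product("HT", repeat=2)}

# Assumed event for illustration: at least one toss lands heads.
at_least_one_head = {o for o in outcome_space if "H" in o}
no_heads = outcome_space - at_least_one_head  # the complement of the event

direct = Fraction(len(at_least_one_head), len(outcome_space))
via_complement = 1 - Fraction(len(no_heads), len(outcome_space))

print(sorted(outcome_space))   # prints ['HH', 'HT', 'TH', 'TT']
print(direct, via_complement)  # prints 3/4 3/4
```

The complement contains the single outcome TT, which is why the second route needs less counting.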
Now you try: Let
Let
If
Consider the box above, which has five almost identical tickets. The only difference is the value written on them. Imagine that we shake the box to mix the tickets up, and then draw one ticket without looking, so that all the tickets are equally likely to be drawn [3].
What is the chance of drawing an even number?
Solution:
Let
Suppose I have a coin that is twice as likely to land heads as it is to land tails. This means that I cannot represent
Solution:
In this case, we want to represent equally likely outcomes, and want
Suppose we toss the coin twice. How would we list the outcomes so that they are equally likely? Now we have to be careful, and think about all the things that can happen on the second toss if we have
This is much easier to imagine if we imagine drawing twice from a box of tickets, but putting the first ticket back before drawing the second (to represent the fact that the probabilities of landing
Now, imagine the box of tickets that represents
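From this picture, where we use color to distinguish the two different outcomes of heads and one outcome of tails, we can see that there are 9 possible outcomes that are equally likely, and we get the following probabilities (where, for example, $HT$ means heads on the first toss and tails on the second): $P(HH) = 4/9$, $P(HT) = 2/9$, $P(TH) = 2/9$, and $P(TT) = 1/9$.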
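The count of 9 equally likely outcomes can be checked by listing every pair of draws from the box. This is a minimal Python sketch; the ticket labels `H1` and `H2` stand in for the two differently colored heads tickets and are names invented here.

```python
from fractions import Fraction
from itertools import product

# A box for a coin twice as likely to land heads as tails:
# two heads tickets (labeled so we can tell them apart) and one tails ticket.
box = ["H1", "H2", "T"]

# Two draws with replacement: every ordered pair of tickets is equally likely.
draws = list(product(box, repeat=2))

# Group the pairs by the faces they show, ignoring which heads ticket was drawn.
counts = {}
for first, second in draws:
    faces = first[0] + second[0]  # for example ("H1", "T") becomes "HT"
    counts[faces] = counts.get(faces, 0) + 1

probs = {faces: Fraction(n, len(draws)) for faces, n in counts.items()}

print(len(draws))  # prints 9
for faces in sorted(probs):
    print(faces, probs[faces])
# prints:
# HH 4/9
# HT 2/9
# TH 2/9
# TT 1/9
```

Putting the first ticket back before the second draw is what `product(box, repeat=2)` models, since it allows the same ticket to appear in both positions.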
What box would we use if the coin is not a fair coin, but lands heads
An American roulette wheel has 38 pockets [4], of which 18 are red, 18 are black, and 2 are green. The wheel is spun, and a small ball is thrown on the wheel so that it is equally likely to land in any of the 38 pockets. Players bet on which colored or numbered pocket the ball will come to rest in. If you bet one dollar that the ball will land on red, and it does, you get your dollar back and you win one more dollar, so your net gain is $1. If it doesn’t, and lands in a black or green pocket, you lose your dollar, and your net “gain” is -$1.
What is the chance that we will win one dollar on a single spin of the wheel?
Hint: Write out the chance of the ball landing in a red pocket, and the chance of it not landing in a red pocket.
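Following the hint, the two chances can be written out using the equally likely pockets and the complement rule. This is only a sketch of the arithmetic, with "red" used as shorthand for the event that the ball lands in a red pocket:

$$
\begin{aligned}
P(\text{red}) &= \frac{18}{38} = \frac{9}{19} \approx 0.474, \\
P(\text{not red}) &= 1 - P(\text{red}) = \frac{20}{38} = \frac{10}{19} \approx 0.526.
\end{aligned}
$$

So the chance of winning one dollar on a single spin is a little under one half, because the 2 green pockets count against a bet on red.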
sample() and replicate(), and learned another useful function seq()
[1] This website was begun as a poll aggregation site by the statistician Nate Silver.
[2] The singular is die and the plural is dice. If we use the word “die” without any qualifiers, we will mean a fair, six-sided die.
[3] We call the tickets equally likely when each ticket has the same chance of being drawn. That is, if there are $n$ tickets in the box, each ticket has a chance of $1/n$ of being drawn.
[4] Photo via unsplash.com