Probability Foundations

Definitions, axioms, and examples

Our first step toward simulating experiments is introducing randomness in R. The following three functions are a good start.

sample()

Drawing from a box of tickets is easily simulated in R, since there is a convenient function sample() that does exactly what we need: draw tickets from a “box” (which needs to be a vector).

  • Arguments
    • x: the vector to sample from; this must be specified.
    • size: the number of items to sample; the default value is the length of x.
    • replace: whether we replace a drawn item before we draw again; the default value is FALSE, meaning that we draw without replacement.

Example: one sample of size 2 from a box with tickets from 1 to 6
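
A minimal sketch of this example, assuming the box is stored in a vector named die (the name used later in this section) and the result in a variable named draw (chosen here for illustration):

```r
# The "box": a vector holding the six tickets (the faces of a die)
die <- c(1, 2, 3, 4, 5, 6)

# Draw two tickets; replace defaults to FALSE, so the two values differ
draw <- sample(x = die, size = 2)
draw
```

Because the draw is random, the two numbers printed will usually change from run to run.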

What would happen if we don’t specify values for size and replace?
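
One way to find out is to call sample() with only x. In the sketch below (the variable name shuffled is an illustrative choice), size falls back to the length of die and replace falls back to FALSE, so the result should be all six tickets in a random order:

```r
die <- c(1, 2, 3, 4, 5, 6)

# With size and replace left at their defaults, sample() returns
# a random rearrangement (permutation) of the whole vector
shuffled <- sample(die)
shuffled
```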

What would we do differently if we wanted to simulate two rolls of a die?

Check your answer

We would sample twice from the vector die with replacement:
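
A minimal sketch of that answer, assuming die holds the numbers 1 to 6 as before (the variable name rolls is an illustrative choice):

```r
die <- c(1, 2, 3, 4, 5, 6)

# replace = TRUE puts each ticket back before the next draw,
# so the same face can come up on both rolls
rolls <- sample(die, size = 2, replace = TRUE)
rolls
```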

set.seed()

R's random number generator is a “pseudo-random number generator”: the numbers it produces come from a deterministic algorithm whose starting point is set by a “seed number”. If you provide the same seed, R will generate the same sequence of random numbers. This makes it easier to debug your code and to reproduce your results if needed.

  • Arguments
    • seed: the seed number to use. You can use any integer you like, for example 1 or 31415.

You might have noticed that each time you run sample(), it gives you a different sample. Sometimes we want it to give the same sample so that we can check how the code is working without the sample changing each time. We will use the set.seed() function for this, which ensures that we get the same random sample each time we run the code.

Example: one sample of size 2 from a box with tickets from 1 to 6
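
A sketch of this example, assuming a seed of 1 (an arbitrary choice; any integer works) and the illustrative variable name first_draw:

```r
die <- c(1, 2, 3, 4, 5, 6)

# Fix the starting point of the random number generator
set.seed(1)  # 1 is an arbitrary seed chosen for this sketch

first_draw <- sample(die, size = 2)
first_draw
```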

Example: another sample of size 2 from a box with tickets from 1 to 6
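
The sketch below sets the same seed before each of two draws and compares them. The seed value 1 and the variable names first_draw and second_draw are illustrative assumptions:

```r
die <- c(1, 2, 3, 4, 5, 6)

set.seed(1)                          # arbitrary seed for this sketch
first_draw <- sample(die, size = 2)

set.seed(1)                          # reset to the same seed
second_draw <- sample(die, size = 2)

# The generator restarted from the same point, so the draws match
identical(first_draw, second_draw)   # TRUE
```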

Notice that we get the same sample. You can try to run sample(die) without using set.seed() and see what happens.

Though we used set.seed() twice here to demonstrate its purpose, you will generally only need to run set.seed() once per document. This line of code fits well at the beginning of your work, where you also load libraries and packages.

seq()

Above, we created the vector die using die <- c(1, 2, 3, 4, 5, 6), which is fine, but this method would be tedious if we wanted to simulate a 20-sided die, for instance. The function seq() allows us to create any sequence we like by specifying the starting number, how we want to increment the numbers, and either the ending number or the length of the sequence we want.

  • Arguments
    • from: where to start
    • by: size of jump from number to number (the increment)

You can end a sequence in one of two ways:

  • to: the number at which the sequence should end
  • length.out: how long the sequence should be (this can be abbreviated to length)

Example: sequence with the to argument
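
As a sketch, the faces of the 20-sided die mentioned above could be built with from, to, and by (the variable name d20 is an illustrative choice):

```r
# Start at 1, step by 1, and stop at 20
d20 <- seq(from = 1, to = 20, by = 1)
d20
```

This vector could then be passed to sample() in place of die.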

Example: sequence with the length argument
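
A sketch that fixes the length instead of the end point. The full argument name in seq() is length.out, and the starting value, increment, and variable name odds are chosen here only for illustration:

```r
# Start at 1, step by 2, and stop after five numbers
odds <- seq(from = 1, by = 2, length.out = 5)
odds  # 1 3 5 7 9
```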