Continuous Distributions and Normal Approximations
Connections to boxes, continuous distributions, and a fundamental result
Ideas in code
Useful functions
Uniform\((a, b)\)
dunif
computes the density \(f(x)\) of \(X\) where \(f(x) = \displaystyle \frac{1}{b-a}\), for \(a < x < b\).
- Arguments:
x
: the value of \(x\) in \(f(x)\)min
: the parameter \(a\), the lower end point of the interval for \(X\). The default value ismin = 0
max
: the parameter \(b\), the upper end point of the interval for \(X\). The default value ismax = 1
punif
computes the cdf \(F(x) = P(X \le x)\) of \(X\).
- Arguments:
q
: the value of \(x\) in \(F(x)\)min
: the parameter \(a\), the lower end point of the interval for \(X\). The default value ismin = 0
max
: the parameter \(b\), the upper end point of the interval for \(X\). The default value ismax = 1
runif
generates random numbers from the \(Unif(a,b)\) distribution.
- Arguments:
n
: the size of the sample we wantmin
: the parameter \(a\), the lower end point of the interval for \(X\). The default value ismin = 0
max
: the parameter \(b\), the upper end point of the interval for \(X\). The default value ismax = 1
Normal\((\mu, \sigma^2)\)
dnorm
computes the density \(f(x)\) of \(X \sim N(\mu,\sigma^2)\)
- Arguments:
x
: the value of \(x\) in \(f(x)\)mean
: the parameter \(\mu\), the mean of the distribution. The default value ismean = 0
sd
: the parameter \(\sigma\), the sd of the distribution. The default value issd = 1
pnorm
computes the cdf \(F(x) = P(X \le x)\) of \(X\).
- Arguments:
q
: the value of \(x\) in \(F(x)\)mean
: the parameter \(\mu\), the mean of the distribution. The default value ismean = 0
sd
: the parameter \(\sigma\), the sd of the distribution. The default value issd = 1
rnorm
generates random numbers from the Normal\((\mu, \sigma^2)\) distribution.
- Arguments:
n
: the size of the sample we wantmean
: the parameter \(\mu\), the mean of the distribution. The default value ismean = 0
sd
: the parameter \(\sigma\), the sd of the distribution. The default value issd = 1
Example
Let’s verify the empirical rule for the standard normal random variable:
Note that (for example) \(P(-1 \le X \le 1) = F(1) - F(-1)\):
pnorm(q = 1) - pnorm(q = -1)
[1] 0.6826895
pnorm(q = 2) - pnorm(q = -2)
[1] 0.9544997
pnorm(q = 3) - pnorm(q = -3)
[1] 0.9973002
The argument prob
in the function sample()
We have seen the function sample()
, but so far, have only used it when we were sampling uniformly at random. That is, all the values are equally likely. We can sample according to a weighted probability, though, by putting in a vector of probabilities. Let’s look at the example of net gain while betting on red on a roulette spin. Recall that if we bet a dollar on red, then our net gain is \(+1\) with a probability of \(\displaystyle \frac{18}{38}\) and \(-1\) with a probability of \(\displaystyle \frac{20}{38}\).
<- c(1,-1) # define the gain for a single spin
gain <- c(18/38,20/38) #define the corresponding probabilities
prob_gain <- sum(gain*prob_gain)
exp_gain exp_gain
[1] -0.05263158
set.seed(123)
#simulate gain from 10 spins of the wheel
sample(x = gain, size = 10, prob = prob_gain, replace = TRUE)
[1] -1 1 -1 1 1 -1 1 1 1 -1
#simulate net gain from 10 spins of the wheel which would sum these
sum(sample(x = gain, size = 10, prob = prob_gain, replace = TRUE))
[1] 0
Here is a simulation showing the Central Limit Theorem at work, with the empirical distribution becoming gradually more bell-shaped. Net gain is the sum of \(n\) draws with replacement from the vector gain
defined above using the prob_gain
vector.