Introduction to Simulation Using R

13.1 Analysis versus Computer Simulation
A computer simulation is a computer program that attempts to represent the real world based on a model. The accuracy of the simulation depends on the precision of the model. Suppose that the probability of heads in a coin toss experiment is unknown. We can perform the experiment of tossing the coin n times repeatedly to approximate the probability of heads:

P(H) ≈ (number of times heads is observed) / n
However, for many practical problems it is not possible to determine the probabilities by executing experiments a large number of times. With today's computing power, we only need a high-level language, such as R, that can generate random numbers, to deal with these problems.
In this chapter, we present basic methods for generating random variables and simulating probabilistic systems. The provided algorithms are general and can be implemented in any programming language. However, to have concrete examples, we provide the actual code in R. If you are unfamiliar with R, you should still be able to understand the algorithms.
13.2 Introduction: What is R?
R is a programming language that helps engineers and scientists solve statistical problems with fewer lines of code than traditional programming languages, such as C/C++ or Java, by utilizing built-in statistical functions. There are many built-in statistical functions and add-on packages available in R. It also has high-quality, customizable graphics capabilities. R is available for Unix/Linux, Windows, and Mac. Besides all these features, R is free!
13.3 Discrete and Continuous Random Number Generators

Most programming languages can provide samples from the uniform distribution (in reality, the given values are pseudo-random rather than completely random). The rest of this section shows how to transform uniform random variables into samples from other distributions.

13.3.1 Generating Discrete Random Variables
Example 1. (Bernoulli) Generate a Bernoulli(p = 0.5) random variable.
Solution: We can write:

p = 0.5;
U = runif(1, min = 0, max = 1);
X = (U < p);
Since the runif(1, min = 0, max = 1) command returns a number between 0 and 1, we divide the interval [0, 1] into two parts of lengths p and 1 − p. The value of X is then determined by which part the generated uniform number falls into.
Example 2. (Coin Toss Simulation) Write code to simulate tossing a fair coin and to see how the law of large numbers works.
Solution: You can write:
n = 1000;
U = runif(n, min = 0, max = 1);
toss = U < 0.5;
a = numeric(n + 1);
avg = numeric(n);
for (i in 2:(n + 1)) {
  a[i] = a[i - 1] + toss[i - 1];
  avg[i - 1] = a[i] / (i - 1);
}
plot(1:n, avg[1:n], type = "l", lwd = 5, col = "blue", ylab = "Proportion of Heads",
     xlab = "Coin Toss Number", cex.main = 1.25, cex.lab = 1.5, cex.axis = 1.75)
If you run the above code to compute the proportion of ones in the variable toss, the result will look like Figure 13.1. You can also assume the coin is biased, with probability of heads equal to 0.6, by replacing the third line of the previous code with:

toss = U < 0.6;
Example 3. (Binomial) Generate a Binomial(50, 0.2) random variable.
Solution: A Binomial(n, p) random variable counts the number of successes in n independent Bernoulli(p) trials, so we can write:

p = 0.2;
n = 50;
U = runif(n, min = 0, max = 1);
X = sum(U < p);
In general, suppose we would like to simulate a discrete random variable X that takes values x_0, x_1, x_2, ... with probabilities p_0, p_1, p_2, .... We generate U ~ Uniform(0, 1) and set

X = x_0   if U < p_0
    x_1   if p_0 ≤ U < p_0 + p_1
    ...
    x_j   if Σ_{k=0}^{j−1} p_k ≤ U < Σ_{k=0}^{j} p_k
    ...

In other words,

X = x_j   if   Σ_{k=0}^{j−1} p_k ≤ U < Σ_{k=0}^{j} p_k

so that

P(X = x_j) = P(Σ_{k=0}^{j−1} p_k ≤ U < Σ_{k=0}^{j} p_k) = p_j
Example 4. Give an algorithm to simulate the value of a random variable X such that
P (X = 1) = 0.35
P (X = 2) = 0.15
P (X = 3) = 0.4
P (X = 4) = 0.1
Solution: We divide the interval [0, 1] into subintervals as follows:
A0 = [0, 0.35)
A1 = [0.35, 0.5)
A2 = [0.5, 0.9)
A3 = [0.9, 1)
Subinterval A_i has length p_i. We obtain a uniform number U. If U belongs to A_i, then X = x_i. This works because

P(X = x_i) = P(U ∈ A_i) = p_i
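As a concrete sketch, this subinterval lookup can be written in R using cumulative sums (the use of findInterval is our own choice; any search over the cumulative probabilities works):

```r
# Simulate P(X=1)=0.35, P(X=2)=0.15, P(X=3)=0.4, P(X=4)=0.1.
# U falls in subinterval A_i exactly when it lies between consecutive
# cumulative probabilities, so we locate U among cumsum(p).
p = c(0.35, 0.15, 0.4, 0.1)
x = c(1, 2, 3, 4)
U = runif(1)
X = x[findInterval(U, cumsum(p)) + 1]
```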
13.3.2 Generating Continuous Random Variables: Inverse Transformation Method
At least in principle, there is a way to convert a uniform distribution into any other distribution. Let's see how we can do this. Let U ~ Uniform(0, 1) and let F be a CDF. Also, assume F is continuous and strictly increasing as a function.

Theorem 2. Define X = F⁻¹(U). Then the CDF of X is F.

Proof:

P(X ≤ x) = P(F⁻¹(U) ≤ x)
         = P(U ≤ F(x))   (since F is an increasing function)
         = F(x)

Now, let's see some examples. Note that to generate any continuous random variable X with the continuous CDF F, F⁻¹(U) has to be computed.
Example 5. (Exponential) Generate an Exponential(1) random variable.
Solution: To generate an Exponential random variable with parameter λ = 1, we proceed as follows:

F(x) = 1 − e^(−x),  x > 0
U ~ Uniform(0, 1)
X = F⁻¹(U) = −ln(1 − U)
X ~ F

This formula can be simplified: since 1 − U ~ Uniform(0, 1), we can also write X = −ln(U).
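As a quick numerical check of this simplification (our own sketch), the sample mean of X = −ln(U) should approach 1, the mean of an Exponential(1) random variable:

```r
# Draw many Exponential(1) samples by the inverse transformation method
n = 10000;
U = runif(n, min = 0, max = 1);
X = -log(U);
mean(X)   # close to 1 for large n
```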
Example 6. (Gamma) Generate a Gamma(20, 1) random variable.
Solution: The sum of n independent Exponential(λ) random variables has a Gamma(n, λ) distribution, so we can write:

n = 20;
lambda = 1;
X = (-1/lambda) * sum(log(runif(n, min = 0, max = 1)));
Example 7. (Poisson) Generate a Poisson random variable. Hint: In this example, use the fact that the number of events in the interval [0, t] has a Poisson distribution when the elapsed times between the events are Exponential.
Solution: We want to employ the definition of Poisson processes. Assume N represents the number of events (arrivals) in [0, t]. If the interarrival times are distributed exponentially (with parameter λ) and independently, then the number of arrivals in [0, t], N, has a Poisson distribution with parameter λt (Figure 13.4). Therefore, to solve this problem, we can keep generating Exponential(λ) random variables while their sum is not larger than 1 (choosing t = 1). More specifically, we generate Exponential(λ) random variables

T_i = (−1/λ) ln(U_i)

and set

X = max{ j : (−1/λ) ln(U_1 U_2 ⋯ U_j) ≤ 1 }
Lambda = 2;
i = 0;
U = runif(1, min = 0, max = 1);
Y = -(1/Lambda) * log(U);
sum = Y;
while (sum < 1) {
  U = runif(1, min = 0, max = 1);
  Y = -(1/Lambda) * log(U);
  sum = sum + Y;
  i = i + 1;
}
X = i
Figure 13.4: Poisson random variable. X is the maximum number of Exponential(λ) random variables whose sum does not exceed 1.
To finish this section, let's see how to convert uniform numbers to normal random variables. The normal distribution is extremely important in science because it occurs so commonly.
Theorem 3. (Box-Muller transformation) We can generate a pair of independent normal variables (Z1 , Z2 ) by transforming a pair of independent U nif orm(0, 1) random variables (U1 , U2 )
[1].
Z_1 = √(−2 ln U_1) cos(2π U_2)
Z_2 = √(−2 ln U_1) sin(2π U_2)
Example 8. (Box-Muller) Generate 5000 pairs of normal random variables and plot both
histograms.
Solution: We generate the pairs as vectors:

n = 5000;
U1 = runif(n, min = 0, max = 1);
U2 = runif(n, min = 0, max = 1);
Z1 = sqrt(-2 * log(U1)) * cos(2 * pi * U2);
Z2 = sqrt(-2 * log(U1)) * sin(2 * pi * U2);
hist(Z1, col = "wheat", label = T)
Figure 13.5: Histogram of Z1, a normal random variable generated by Box-Muller transformation
13.4 R Commands for Special Distributions

In this section, we will see some useful commands for commonly employed distributions. Functions whose names begin with p, q, d, and r give the cumulative distribution function (CDF), the inverse CDF (quantile function), the density function (PDF) or mass function (PMF), and random samples from the specified distribution, respectively.
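For example, for a Binomial(10, 0.5) distribution the four prefixes look as follows (a brief sketch of the built-in commands):

```r
# d*: PMF, p*: CDF, q*: inverse CDF (quantile), r*: random samples
dbinom(5, size = 10, prob = 0.5)    # P(X = 5)
pbinom(5, size = 10, prob = 0.5)    # P(X <= 5)
qbinom(0.5, size = 10, prob = 0.5)  # smallest x with P(X <= x) >= 0.5
rbinom(3, size = 10, prob = 0.5)    # three random draws
```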
Distribution        Commands
Binomial            pbinom   qbinom   dbinom   rbinom
Geometric           pgeom    qgeom    dgeom    rgeom
Negative Binomial   pnbinom  qnbinom  dnbinom  rnbinom
Poisson             ppois    qpois    dpois    rpois
Beta                pbeta    qbeta    dbeta    rbeta
Exponential         pexp     qexp     dexp     rexp
Gamma               pgamma   qgamma   dgamma   rgamma
Student t           pt       qt       dt       rt
Uniform             punif    qunif    dunif    runif

13.5 Exercises
1. (Geometric and Negative Binomial) We can generate a Geometric(p) random variable by counting the number of Bernoulli(p) trials up to and including the first success:

K = 1;
p = 0.2;
while (runif(1) > p)
  K = K + 1;
K

Now, we can generate a Geometric random variable i times to obtain a Negative Binomial(i, p) random variable as a sum of i independent Geometric(p) random variables.
K = 1;
p = 0.2;
r = 2;
success = 0;
while (success < r) {
  if (runif(1) > p) {
    K = K + 1;
    print = 0   # Failure
  } else {
    success = success + 1;
    print = 1   # Success
  }
}
K + r - 1   # Number of trials needed to obtain r successes
2. (Poisson) Use the algorithm for generating discrete random variables to obtain a Poisson random variable with parameter λ = 2.
Solution: We know a Poisson random variable takes all nonnegative integer values with probabilities

p_i = P(X = x_i) = e^(−λ) λ^i / i!   for i = 0, 1, 2, ...
To use the general method, we generate U ~ Uniform(0, 1) and set

X = x_0   if U < p_0
    x_1   if p_0 ≤ U < p_0 + p_1
    ...
    x_j   if Σ_{k=0}^{j−1} p_k ≤ U < Σ_{k=0}^{j} p_k
    ...

Here x_i = i − 1, so the algorithm compares U with the accumulated CDF and stops at the first value whose CDF exceeds U:
lambda = 2;
i = 0;
U = runif(1);
cdf = exp(-lambda);
while (U >= cdf) {
  i = i + 1;
  cdf = cdf + exp(-lambda) * lambda^i / gamma(i + 1);
}
X = i;
3. Explain how to generate a random variable with the density

f(x) = (5/2) x^(3/2),  0 < x < 1

Solution: Using the inverse transformation method, we set F_X(X) = X^(5/2) = U, so X = U^(2/5):

U = runif(1);
X = U^(2/5);

We have the desired distribution.
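A quick check of this transformation (our own sketch): the mean of a sample of X = U^(2/5) should approach E[X] = ∫₀¹ x · (5/2) x^(3/2) dx = 5/7:

```r
# Sample from f(x) = (5/2) x^(3/2) on (0, 1) via X = U^(2/5)
n = 10000;
U = runif(n);
X = U^(2/5);
mean(X)   # approaches 5/7 for large n
```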
4. Use the inverse transformation method to generate a random variable having distribution function

F(x) = (x² + x)/2,  0 ≤ x ≤ 1

Solution:

(X² + X)/2 = U
(X + 1/2)² = 2U + 1/4
X + 1/2 = √(2U + 1/4)
X = √(2U + 1/4) − 1/2

(We take the positive square root because 0 ≤ X ≤ 1.)
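In R, the resulting generator is one line (a sketch with our variable names):

```r
# Sample from F(x) = (x^2 + x)/2 on [0, 1] via the inverse CDF
n = 10000;
U = runif(n);
X = sqrt(2 * U + 1/4) - 1/2;
```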
5. Let X be a continuous random variable with CDF

F(x) = (1/π) arctan(x) + 1/2

Assuming you have U ~ Uniform(0, 1), explain how to generate X. Then, use this result
to produce 1000 samples of X and compute the sample mean. Repeat the experiment 100
times. What do you observe and why?
Solution: Using the inverse transformation method:

U = (1/π) arctan(X) + 1/2
U − 1/2 = (1/π) arctan(X)
π(U − 1/2) = arctan(X)
X = tan(π(U − 1/2))
U = numeric(1000);
n = 100;
average = numeric(n);
for (i in 1:n) {
  U = runif(1000);
  X = tan(pi * (U - 0.5));
  average[i] = mean(X);
}
plot(1:n, average[1:n], type = "l", lwd = 2, col = "blue")

The sample means do not converge as the experiment is repeated: X has a Cauchy distribution, whose mean does not exist, so the law of large numbers does not apply.
6. (Rejection Method) Suppose we have an efficient method for generating a random variable having density function g(x). Now, assume you want to generate a random variable having density function f(x). Let c be a constant such that

f(y)/g(y) ≤ c   (for all y)

Show that the following method (the rejection method) generates a random variable with density function f(x):

- Generate Y having density g and generate U ~ Uniform(0, 1).
- If U ≤ f(Y)/(cg(Y)), set X = Y; otherwise, return to the first step.
Solution: The number of times N that the first two steps of the algorithm need to be called is itself a random variable and has a geometric distribution with success probability

p = P(U ≤ f(Y)/(cg(Y)))
  = ∫ [f(y)/(cg(y))] g(y) dy
  = (1/c) ∫ f(y) dy
  = 1/c

Therefore, E(N) = c.
Let F be the desired CDF (the CDF of X). Now, we must show that the conditional distribution of Y given that U ≤ f(Y)/(cg(Y)) is indeed F, i.e. P(Y ≤ y | U ≤ f(Y)/(cg(Y))) = F(y). Let M = {U ≤ f(Y)/(cg(Y))} and K = {Y ≤ y}, and let G denote the CDF of Y. We know P(M) = 1/c. Also,

P(U ≤ f(Y)/(cg(Y)) | Y ≤ y) = P(U ≤ f(Y)/(cg(Y)), Y ≤ y) / G(y)
  = (1/G(y)) ∫_{v≤y} P(U ≤ f(v)/(cg(v)) | Y = v) g(v) dv
  = (1/G(y)) ∫_{v≤y} [f(v)/(cg(v))] g(v) dv
  = (1/(cG(y))) ∫_{v≤y} f(v) dv
  = F(y)/(cG(y))

Thus,

P(K|M) = P(M|K) P(K) / P(M)
  = P(U ≤ f(Y)/(cg(Y)) | Y ≤ y) · G(y) · c
  = [F(y)/(cG(y))] · G(y) · c
  = F(y)
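The two steps just analyzed can be packaged as a small generic R function (the function name and interface are our own sketch):

```r
# Generic rejection sampler: proposals Y drawn by rg() with density g,
# target density f, constant c satisfying f(y)/g(y) <= c for all y.
reject_sample = function(f, g, rg, c) {
  repeat {
    Y = rg()
    U = runif(1)
    if (U <= f(Y) / (c * g(Y))) return(Y)
  }
}

# Example: Beta(2, 4) from Uniform(0, 1) proposals, with c = 135/64
X = reject_sample(function(x) 20 * x * (1 - x)^3,
                  function(x) 1,
                  function() runif(1),
                  135/64)
```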
7. Use the rejection method to generate a random variable having density function Beta(2, 4).
Hint: Assume g(x) = 1 for 0 < x < 1.
Solution:

f(x) = 20x(1 − x)³,  0 < x < 1
g(x) = 1,  0 < x < 1
f(x)/g(x) = 20x(1 − x)³

Differentiating shows that f(x)/g(x) is maximized at x = 1/4, so

c = 20 · (1/4) · (3/4)³ = 135/64

Hence,

f(x)/(cg(x)) = (256/27) x(1 − x)³
n = 1;
while (n == 1) {
  U1 = runif(1);
  U2 = runif(1);
  if (U2 <= (256/27) * U1 * (1 - U1)^3) {
    X = U1;
    n = 0;
  }
}
8. Use the rejection method to generate a random variable having the Gamma(5/2, 1) density function. Hint: Assume g(x) is the pdf of an Exponential random variable with λ = 2/5.
Solution:

f(x) = (4/(3√π)) x^(3/2) e^(−x),  x > 0
g(x) = (2/5) e^(−2x/5),  x > 0
f(x)/g(x) = (10/(3√π)) x^(3/2) e^(−3x/5)

Setting the derivative of f(x)/g(x) to zero gives x = 5/2. Hence,

c = (10/(3√π)) (5/2)^(3/2) e^(−3/2)
f(x)/(cg(x)) = x^(3/2) e^(−3x/5) / ((5/2)^(3/2) e^(−3/2))

We know how to generate an Exponential random variable, so the algorithm is:

- Generate Y ~ Exponential(2/5) and U ~ Uniform(0, 1).
- If U ≤ Y^(3/2) e^(−3Y/5) / ((5/2)^(3/2) e^(−3/2)), set X = Y; otherwise, return to the first step.
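This algorithm translates directly into R in the style of the previous exercise (a sketch; variable names are ours):

```r
# Rejection sampling for Gamma(5/2, 1) using Exponential(rate 2/5)
# proposals: Y = -(5/2) * log(U) has rate 2/5 (mean 5/2).
n = 1;
while (n == 1) {
  Y = -(5/2) * log(runif(1));
  U = runif(1);
  if (U <= Y^(3/2) * exp(-3 * Y / 5) / ((5/2)^(3/2) * exp(-3/2))) {
    X = Y;
    n = 0;
  }
}
```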
9. Use the rejection method to generate a standard normal random variable. Hint: Assume g(x) is the pdf of the exponential distribution with λ = 1.
Solution: We first generate |Z|, the absolute value of a standard normal random variable:

f(x) = (2/√(2π)) e^(−x²/2),  0 < x < ∞
g(x) = e^(−x),  0 < x < ∞   (Exponential density function with mean 1)

Thus,

f(x)/g(x) = √(2/π) e^(x − x²/2)

x = 1 maximizes f(x)/g(x); thus,

c = √(2e/π)
f(x)/(cg(x)) = e^(−(x−1)²/2)

The algorithm is:

- Generate Y ~ Exponential(1) and U ~ Uniform(0, 1).
- If U ≤ e^(−(Y−1)²/2), set |Z| = Y; otherwise, return to the first step.
- Finally, set Z = |Z| or Z = −|Z| with probability 1/2 each.
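A sketch of the full procedure in R (names ours): generate |Z| by rejection from Exponential(1), then attach a random sign:

```r
# Rejection sampling for a standard normal random variable
repeat {
  Y = -log(runif(1));                   # Exponential(1) proposal
  U = runif(1);
  if (U <= exp(-(Y - 1)^2 / 2)) break;  # accept Y as |Z|
}
Z = ifelse(runif(1) < 0.5, Y, -Y)       # random sign
```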
10. Use the rejection method to generate a Gamma(2, 1) random variable conditional on its value being greater than 5, that is,

f(x) = x e^(−x) / ∫₅^∞ x e^(−x) dx = x e^(−x) / (6 e^(−5)),   (x > 5)

Solution: Take g(x) = (1/2) e^(−(x−5)/2) for x > 5, the density of 5 plus an Exponential(1/2) random variable. Then

f(x)/g(x) = (x/3) e^(−(x−5)/2)

is decreasing. Therefore,

c = f(5)/g(5) = 5/3
f(Y)/(cg(Y)) = (Y/5) e^(−(Y−5)/2)

- Generate Y = 5 + Exponential(1/2) and U ~ Uniform(0, 1).
- If U ≤ (Y/5) e^(−(Y−5)/2), set X = Y; otherwise, return to the first step.
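In R, a shifted-exponential proposal makes this a few lines (a sketch; the proposal Y = 5 + Exponential(1/2) and the variable names are our own choices):

```r
# Rejection sampling for Gamma(2, 1) conditioned on being greater than 5:
# f(x) = x * exp(-x) / (6 * exp(-5)) for x > 5.
repeat {
  Y = 5 - 2 * log(runif(1));            # 5 + Exponential(rate 1/2)
  U = runif(1);
  if (U <= (Y / 5) * exp(-(Y - 5) / 2)) break;
}
X = Y
```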
Bibliography

[1] G. E. P. Box and M. E. Muller, A Note on the Generation of Random Normal Deviates. Annals of Mathematical Statistics, 29(2), 1958. http://projecteuclid.org/download/pdf_1/euclid.aoms/1177706645
[2] Michael Baron, Probability and Statistics for Computer Scientists. CRC Press, 2006.
[3] Sheldon M. Ross, Simulation. Academic Press, 2012.