Sardar Patel Institute of Technology, Mumbai

Department of Electronics and Telecommunication Engineering

T.E. Sem-V (2018-2019)

ETL54-Statistical Computational Laboratory

Lab-2: Probability Distributions

Name: Vishal Ramina    Roll No.: 2017120049

Objective: To compute the probability density function (pdf) and cumulative
distribution function (cdf)

Outcomes:

1. To list and describe the well-known probability distributions with their
characteristics.

2. To compute the probability distributions that frequently occur in
statistical study.

System Requirements: Ubuntu OS with R and RStudio installed

Introduction to Probability distribution:

A probability distribution describes how the values of a random variable are
distributed. There are two types of probability distributions: discrete and continuous.

Well-known probability distributions that frequently occur in statistical
study:

Binomial Distribution

Poisson Distribution

Continuous Uniform Distribution

Exponential Distribution

Normal Distribution

Chi-squared Distribution
Student t Distribution

F Distribution
R Functions for Probability Distributions:

Every distribution that R handles has four functions. There is a root name, for
example, the root name for the normal distribution is norm. This root is prefixed by
one of the letters

p for "probability", the cumulative distribution function (c. d. f.)

q for "quantile", the inverse c. d. f.

d for "density", the density function (p. f. or p. d. f.)

r for "random", a random variable having the specified distribution

For the normal distribution, these functions are pnorm, qnorm, dnorm, and rnorm. For
the binomial distribution, these functions are pbinom, qbinom, dbinom, and rbinom.
And so forth.
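
As a quick illustration of the naming rule, the sketch below calls all four functions for the standard normal distribution. The variable names and the seed are illustrative choices, not part of the lab sheet.

```r
# The four function types for the normal distribution (root name "norm")
d_val <- dnorm(0)        # density at x = 0, i.e. 1 / sqrt(2 * pi)
p_val <- pnorm(1.96)     # c.d.f.: P(X <= 1.96)
q_val <- qnorm(0.975)    # quantile: the x for which P(X <= x) = 0.975
set.seed(1)              # arbitrary seed so the draws are reproducible
r_val <- rnorm(5)        # five random draws from N(0, 1)

# "q" inverts "p": the round trip returns the original probability
round_trip <- pnorm(qnorm(0.975))
print(c(d_val, p_val, q_val, round_trip))
```

The same pattern carries over to any other root name, for example dbinom, pbinom, qbinom, and rbinom.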

For a continuous distribution (like the normal), the most useful functions for doing
problems involving probability calculations are the "p" and "q" functions (c. d. f. and
inverse c. d. f.), because the density (p. d. f.) calculated by the "d" function yields
probabilities only through integration. R can integrate numerically with integrate(),
but the "p" function gives the same probability directly.
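
Base R does provide integrate() for numerical integration, so the link between the density and the c. d. f. can be checked directly. The sketch below computes P(-1 <= X <= 1) for a standard normal variable both ways; the interval is an arbitrary choice for illustration.

```r
# P(-1 <= X <= 1) for X ~ N(0, 1), computed two ways
by_cdf <- pnorm(1) - pnorm(-1)                                # difference of c.d.f. values
by_integral <- integrate(dnorm, lower = -1, upper = 1)$value  # numerical integral of the density
print(c(by_cdf, by_integral))
```

Both values agree to within the numerical tolerance of integrate(), which is why the "p" functions are usually the more convenient route.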

For a discrete distribution (like the binomial), the "d" function calculates the density
(p. f.), which in this case is a probability

f(x) = P(X = x)

and hence is useful in calculating probabilities.

R has functions to handle many probability distributions. The table below gives the
names of the functions for each distribution.

Table-1: Probability Distributions

Distribution     p (c.d.f.)   q (quantile)   d (density)   r (random)
Binomial         pbinom       qbinom         dbinom        rbinom
Cauchy           pcauchy      qcauchy        dcauchy       rcauchy
Chi-Square       pchisq       qchisq         dchisq        rchisq
Exponential      pexp         qexp           dexp          rexp
F                pf           qf             df            rf
Gamma            pgamma       qgamma         dgamma        rgamma
Geometric        pgeom        qgeom          dgeom         rgeom
Hypergeometric   phyper       qhyper         dhyper        rhyper
Logistic         plogis       qlogis         dlogis        rlogis
Log Normal       plnorm       qlnorm         dlnorm        rlnorm
Normal           pnorm        qnorm          dnorm         rnorm
Poisson          ppois        qpois          dpois         rpois
Student t        pt           qt             dt            rt
Uniform          punif        qunif          dunif         runif
Weibull          pweibull     qweibull       dweibull      rweibull

Procedure:

1. Open RStudio

2. Go to RConsole (>)

3. Probability distribution in R

> help(rnorm)    # The Normal Distribution

> help(dbinom)   # The Binomial Distribution

Probability Distributions in R:

In R, probability functions take the form

[dpqr]distribution_abbreviation ()

where the first letter refers to the aspect of the distribution returned:

d = density
p = distribution function

q = quantile function

r = random generation (random deviates)

1. Binomial Distribution

The binomial distribution is a discrete probability distribution. It describes the
outcome of n independent trials in an experiment. Each trial is assumed to have only
two outcomes, either success or failure. If the probability of a successful trial is p,
then the probability of having x successful outcomes in an experiment of n
independent trials is as follows.

$$f(x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$$

Problem

Suppose there are twelve multiple choice questions in an English class quiz. Each
question has five possible answers, and only one of them is correct. Find the
probability of having four or fewer correct answers if a student attempts to answer every
question at random.

Example Solution:

Since only one out of five possible answers is correct, the probability of answering a
question correctly at random is 1/5 = 0.2. We can find the probability of having
exactly 4 correct answers by random attempts as follows.

> dbinom(4, size=12, prob=0.2)


[1] 0.1329

To find the probability of having four or fewer correct answers by random attempts, we
apply the function dbinom with x = 0, …, 4 and add the results.

> dbinom(0, size=12, prob=0.2) +


+ dbinom(1, size=12, prob=0.2) +
+ dbinom(2, size=12, prob=0.2) +
+ dbinom(3, size=12, prob=0.2) +
+ dbinom(4, size=12, prob=0.2)
[1] 0.9274

Alternatively, we can use the cumulative probability function for the binomial
distribution, pbinom.
> pbinom(4, size=12, prob=0.2)
[1] 0.92744

Answer: The probability of four or fewer questions answered correctly at random in a
twelve-question multiple choice quiz is 92.7%.
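
The three routes to this answer can be compared side by side. The sketch below is a check under the same assumptions as the problem (n = 12, p = 0.2); the variable names are illustrative. It relies on dbinom being vectorized, so the five-term sum does not need to be typed out.

```r
# P(X <= 4) for X ~ Binomial(n = 12, p = 0.2), three equivalent ways
n <- 12
p <- 0.2
x <- 0:4

by_formula <- sum(choose(n, x) * p^x * (1 - p)^(n - x))  # p.m.f. written out by hand
by_dbinom  <- sum(dbinom(x, size = n, prob = p))         # vectorized dbinom, then summed
by_pbinom  <- pbinom(4, size = n, prob = p)              # cumulative distribution function
print(c(by_formula, by_dbinom, by_pbinom))
```

All three values agree, which confirms that pbinom is simply the running sum of dbinom.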

2. Poisson Distribution

The Poisson distribution is the probability distribution of independent event
occurrences in an interval. If λ is the mean number of occurrences per interval, then the
probability of having x occurrences within a given interval is:

$$f(x) = \frac{\lambda^{x} e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \ldots$$

Problem

If there are twelve cars crossing a bridge per minute on average, find the probability
of having seventeen or more cars crossing the bridge in a particular minute.

Solution

> ppois(16,lambda = 12,lower.tail = FALSE) 


[1] 0.101291 

Answer: If there are twelve cars crossing a bridge per minute on average, the probability
of having seventeen or more cars crossing the bridge in a particular minute is 10.1%.
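
The call uses 16 rather than 17 because lower.tail = FALSE returns P(X > 16), which for a discrete variable equals P(X >= 17). The sketch below checks this against the complement rule and against the probability function summed by hand; the variable names are illustrative.

```r
lambda <- 12                                                   # mean cars per minute
upper_tail    <- ppois(16, lambda = lambda, lower.tail = FALSE) # P(X > 16) = P(X >= 17)
by_complement <- 1 - ppois(16, lambda = lambda)                 # 1 - P(X <= 16)

x <- 0:16
by_formula <- 1 - sum(lambda^x * exp(-lambda) / factorial(x))   # p.f. summed for x = 0..16
print(c(upper_tail, by_complement, by_formula))
```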

3. Continuous Uniform Distribution

The continuous uniform distribution is the probability distribution of random number
selection from the continuous interval between a and b. Its density function is defined
by the following.

$$f(x) = \begin{cases} \dfrac{1}{b - a} & a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

[Figure: density of the continuous uniform distribution with a = 1, b = 3]

Problem

Select ten random numbers between one and three.

Solution

> runif(10,min = 1,max = 3) 


[1] 2.566191 2.110777 2.026792 2.428348 2.039117 1.585373 2.538662 
2.820151 1.608856 1.215307 
> curve(dunif(x, min = 1, max = 3), from = 0, to = 4)   # density of the uniform distribution on [1, 3]

Answer: We use the generation function runif() of the uniform distribution to generate
ten random numbers between one and three.
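
Because runif() returns different numbers on every call, the values printed above will not repeat. A minimal sketch of a reproducible version follows; the seed value and the test points are arbitrary choices for illustration. It also evaluates the density and c. d. f. of the same distribution.

```r
set.seed(42)                       # arbitrary seed, used only to make the draws repeatable
u <- runif(10, min = 1, max = 3)   # ten draws from the uniform distribution on [1, 3]
in_range <- all(u >= 1 & u <= 3)   # TRUE if every draw lies inside the interval

# The density is 1 / (b - a) = 0.5 inside [1, 3] and 0 outside
dens <- dunif(c(0.5, 2, 3.5), min = 1, max = 3)

# P(X <= 2.5) = (2.5 - 1) / (3 - 1) = 0.75
prob <- punif(2.5, min = 1, max = 3)
print(list(in_range = in_range, dens = dens, prob = prob))
```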

4. Exponential Distribution

The exponential distribution describes the arrival time of a randomly recurring
independent event sequence. If μ is the mean waiting time for the next event
recurrence, its probability density function is:

$$f(x) = \begin{cases} \dfrac{1}{\mu} e^{-x/\mu} & x \ge 0 \\ 0 & x < 0 \end{cases}$$

[Figure: density of the exponential distribution with μ = 1]

Problem

Suppose the mean checkout time of a supermarket cashier is three minutes. Find the
probability of a customer checkout being completed by the cashier in less than two
minutes.

Solution

> pexp(2,rate = 1/3,lower.tail = TRUE) 


[1] 0.4865829 
> curve(pexp(x, rate = 1/3), from = 0, to = 15)   # c.d.f. of the checkout time
Answer: The probability of finishing a checkout in under two minutes by the cashier
is 48.7%.
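
Note that pexp() is parameterized by the rate, which is the reciprocal of the mean waiting time, hence rate = 1/3. The sketch below checks the result against the closed-form c. d. f. 1 - exp(-x/μ); the median calculation is an illustrative extra, not part of the lab problem.

```r
mu <- 3                                   # mean checkout time in minutes
by_pexp    <- pexp(2, rate = 1 / mu)      # P(T <= 2) from the built-in c.d.f.
by_formula <- 1 - exp(-2 / mu)            # closed form of the same c.d.f.
median_time <- qexp(0.5, rate = 1 / mu)   # median waiting time, equal to mu * log(2)
print(c(by_pexp, by_formula, median_time))
```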

5. Normal Distribution

The normal distribution is defined by the following probability density function,
where μ is the population mean and σ² is the variance.

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^{2} / (2\sigma^{2})}$$

If a random variable X follows the normal distribution, then we write:

$$X \sim N(\mu, \sigma^{2})$$

In particular, the normal distribution with μ = 0 and σ = 1 is called the standard
normal distribution, and is denoted as N(0,1).

[Figure: density of the standard normal distribution]

The normal distribution is important because of the Central Limit Theorem, which
states that the distribution of the sample mean over all possible samples of size n from
a population with mean μ and variance σ² approaches a normal distribution with mean μ
and variance σ²/n as n approaches infinity.
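
A small simulation can make the theorem concrete. The sketch below is only an illustration under assumed settings: the population is exponential with mean 3 (so its variance is 9), and the sample size, number of repetitions, and seed are arbitrary choices. It draws many samples, records each sample mean, and compares the result with the normal curve the theorem predicts.

```r
set.seed(123)        # arbitrary seed for reproducibility
n <- 50              # size of each sample (illustrative choice)
reps <- 5000         # number of simulated samples (illustrative choice)
mu <- 3              # mean of the exponential population with rate 1/3
sigma2 <- mu^2       # variance of that population

# One sample mean per simulated sample
sample_means <- replicate(reps, mean(rexp(n, rate = 1 / mu)))

# Compare simulated mean and variance with the theoretical mu and sigma2 / n
print(c(mean = mean(sample_means), var = var(sample_means), theory_var = sigma2 / n))

# Histogram of the sample means with the predicted normal density on top
hist(sample_means, probability = TRUE, main = "Sample means, n = 50")
curve(dnorm(x, mean = mu, sd = sqrt(sigma2 / n)), add = TRUE, lwd = 2)
```

Even though the exponential population is strongly skewed, the histogram of sample means is close to the bell-shaped curve.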

Problem

Assume that the test scores of a college entrance exam fit a normal distribution.
Furthermore, the mean test score is 72, and the standard deviation is 15.2. What is the
percentage of students scoring 84 or more in the exam?
Solution

> pnorm(84,mean = 72,sd =15.2, lower.tail = FALSE) 


[1] 0.2149176 
> curve(pnorm(x, mean = 72, sd = 15.2), from = 20, to = 125)   # c.d.f. of the test scores

Answer: The percentage of students scoring 84 or more in the college entrance exam
is 21.5%.
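
The same probability can be obtained by standardizing the score and using the standard normal distribution. The sketch below shows both routes; the top-10% cutoff is an illustrative follow-up question, not part of the lab problem.

```r
mean_score <- 72
sd_score <- 15.2

z <- (84 - mean_score) / sd_score        # standardized score
by_z   <- pnorm(z, lower.tail = FALSE)   # P(Z >= z) under N(0, 1)
direct <- pnorm(84, mean = mean_score, sd = sd_score, lower.tail = FALSE)

# Score a student would need to be in the top 10% (illustrative extra)
cutoff <- qnorm(0.90, mean = mean_score, sd = sd_score)
print(c(z = z, by_z = by_z, direct = direct, cutoff = cutoff))
```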

6. Chi-squared Distribution

If X₁, X₂, …, X_m are m independent random variables having the standard normal
distribution, then the following quantity follows a Chi-Squared distribution with m
degrees of freedom. Its mean is m, and its variance is 2m.

$$V = X_{1}^{2} + X_{2}^{2} + \cdots + X_{m}^{2} \sim \chi^{2}_{(m)}$$

[Figure: density of the Chi-Squared distribution with 7 degrees of freedom]


Problem

Find the 95th percentile of the Chi-Squared distribution with 7 degrees of freedom.

Solution

> qchisq(0.95,df = 7) 


[1] 14.06714 
Answer: The 95th percentile of the Chi-Squared distribution with 7 degrees of freedom
is 14.067.
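
The definition as a sum of squared standard normal variables can be checked by simulation. The sketch below is an illustration under assumed settings (seed and number of repetitions are arbitrary): it builds such sums for m = 7 and compares their mean, variance, and 95th percentile with the theoretical values.

```r
set.seed(7)      # arbitrary seed for reproducibility
m <- 7           # degrees of freedom
reps <- 20000    # number of simulated sums (illustrative choice)

# Each column holds m standard normal draws; square them and sum per column
v <- colSums(matrix(rnorm(m * reps), nrow = m)^2)
print(c(mean = mean(v), var = var(v)))   # should be near m and 2m

q95 <- qchisq(0.95, df = m)              # theoretical 95th percentile
empirical <- mean(v <= q95)              # share of simulated sums below it
round_trip <- pchisq(q95, df = m)        # pchisq inverts qchisq
print(c(q95 = q95, empirical = empirical, round_trip = round_trip))
```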

7. Student t Distribution

Assume that a random variable Z has the standard normal distribution, and another
random variable V has the Chi-Squared distribution with m degrees of freedom.
Assume further that Z and V are independent. Then the following quantity follows a
Student t distribution with m degrees of freedom.

$$t = \frac{Z}{\sqrt{V / m}} \sim t_{(m)}$$

[Figure: density of the Student t distribution with 5 degrees of freedom]


Problem

Find the 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of
freedom.

Solution

> qt(c(0.025,0.975),df = 5) 


[1] -2.570582 2.570582
Answer: The 2.5th and 97.5th percentiles of the Student t distribution with 5 degrees of
freedom are -2.5706 and 2.5706 respectively.
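
The two percentiles are mirror images because the t distribution is symmetric about zero, and together they enclose the central 95% of the distribution. The sketch below checks both facts and compares the result with the corresponding normal quantile; the variable names are illustrative.

```r
m <- 5
q <- qt(c(0.025, 0.975), df = m)                 # the two percentiles from the solution

symmetric <- isTRUE(all.equal(q[1], -q[2]))      # lower quantile is minus the upper one
central <- pt(q[2], df = m) - pt(q[1], df = m)   # probability between them, 0.95

normal_q <- qnorm(0.975)                         # same quantile for N(0, 1)
print(c(t_upper = q[2], normal_upper = normal_q, central = central))
```

The t quantile is larger than the normal one, reflecting the heavier tails of the t distribution at low degrees of freedom.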

8. F Distribution

If V₁ and V₂ are two independent random variables having the Chi-Squared
distribution with m₁ and m₂ degrees of freedom respectively, then the following
quantity follows an F distribution with m₁ numerator degrees of freedom and m₂
denominator degrees of freedom, i.e., (m₁, m₂) degrees of freedom.

$$F = \frac{V_{1} / m_{1}}{V_{2} / m_{2}} \sim F_{(m_{1}, m_{2})}$$

[Figure: density of the F distribution with (5, 2) degrees of freedom]


Problem

Find the 95th percentile of the F distribution with (5, 2) degrees of freedom.

Solution

> qf(0.95,5,2) 
[1] 19.29641 
Answer: The 95th percentile of the F distribution with (5, 2) degrees of freedom is
19.296.
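
Since the F variable is a ratio, its reciprocal follows an F distribution with the degrees of freedom swapped. The sketch below uses that property as a consistency check on the quantile; it is an illustration with made-up variable names, not part of the lab procedure.

```r
m1 <- 5   # numerator degrees of freedom
m2 <- 2   # denominator degrees of freedom

q95 <- qf(0.95, df1 = m1, df2 = m2)          # 95th percentile of F(5, 2)
round_trip <- pf(q95, df1 = m1, df2 = m2)    # pf inverts qf, giving 0.95

# The 5th percentile of F(2, 5) equals 1 / q95
reciprocal <- qf(0.05, df1 = m2, df2 = m1)
print(c(q95 = q95, round_trip = round_trip, check = reciprocal * q95))
```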

Describe the following with respect to probability distributions:

1.

x <- rnorm(1000, mean=100, sd=15)

hist(x, probability=TRUE)

xx <- seq(min(x), max(x), length=100)

lines(xx, dnorm(xx, mean=100, sd=15))

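2. What is P(X > 19) when X has the N(17.46, 375.67) distribution?

The second parameter is the variance, so the standard deviation is sqrt(375.67) ≈ 19.38.
P(X > 19) is the upper tail, so lower.tail = FALSE is required; without it, pnorm
returns P(X <= 19) = 0.531668.

> pnorm(19, mean = 17.46, sd = 19.38, lower.tail = FALSE)
[1] 0.468332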
3. Interpret the following

Interpretation:

In the first line, we are calculating the area to the left of 1.96, while in the second line
we are calculating the area to the right of 1.96
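
The two lines being interpreted are not reproduced in this write-up. As an assumption, a pair of calls consistent with the interpretation above would be pnorm(1.96) and pnorm(1.96, lower.tail = FALSE), as in the following sketch.

```r
# Assumed form of the two lines; the original commands are not shown in the source
left_area  <- pnorm(1.96)                      # area to the left of 1.96 under N(0, 1)
right_area <- pnorm(1.96, lower.tail = FALSE)  # area to the right of 1.96
total <- left_area + right_area                # the two areas make up the whole distribution
print(c(left_area, right_area, total))
```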

4. Run this in RStudio Script editor and explain it from plot

set.seed(3000)
xseq <- seq(-4, 4, .01)
densities <- dnorm(xseq, 0, 1)          # PDF values on the grid
cumulative <- pnorm(xseq, 0, 1)         # CDF values on the grid
randomdeviates <- rnorm(1000, 0, 1)     # 1000 random draws from N(0, 1)
par(mfrow = c(1, 3), mar = c(3, 4, 4, 2))

plot(xseq, densities, col = "darkgreen", xlab = "", ylab = "Density", type = "l", lwd = 2,
     cex = 2, main = "PDF of Standard Normal", cex.axis = .8)

plot(xseq, cumulative, col = "darkorange", xlab = "", ylab = "Cumulative Probability",
     type = "l", lwd = 2, cex = 2, main = "CDF of Standard Normal", cex.axis = .8)

# xlim must span -4 to 4; c(4, 4) would give a zero-width axis
hist(randomdeviates, main = "Random draws from Std Normal", cex.axis = .8,
     xlim = c(-4, 4))

Output:
Explanation:

Let's make up some data, where I add noise by using rnorm(). Here I'm generating
the same amount of random numbers as is the length of the xseq vector, with a mean
of 0 and a standard deviation of 5.5.

xseq <- seq(-4, 4, .01)

y <- 2 * xseq + rnorm(length(xseq), 0, 5.5)

And now we can plot a histogram of y and add a curve() function to the plot using the
mean and standard deviation of y as the parameters:

hist(y, prob = TRUE, ylim = c(0, .06), breaks = 20)

curve(dnorm(x, mean(y), sd(y)), add = TRUE, col = "darkblue", lwd = 2)

Here, the curve() function takes as its first parameter a function itself (or an
expression) that must be written as some function of x. Our function here is dnorm().
The x in the dnorm() function is not an object we have created; rather, it is a
placeholder for the variable that curve() evaluates over the plotting range, and the
evaluation is the normal density with the mean of y and the standard deviation of y.
Make sure to include add = TRUE so that the curve is plotted on the same plot as the
histogram.


Conclusion:
● Model your data with the appropriate distribution according to the underlying
assumptions. There is a tendency to disregard simple distributions, when in
fact they can help the most.
● While the definitions of the Binomial and Poisson distributions are relatively
straightforward, it is not so easy to ascertain 'normality' in the distribution of a
continuous variable. A Box-Cox transformation might be helpful in resolving
distributions skewed to different extents (and signs) into a Normal one.
● There are many other distributions. The prefix rule described above for R
still applies (e.g. p, r).
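
One way to examine normality in practice is the Shapiro-Wilk test, shapiro.test() in base R. The sketch below is an illustration on made-up data, not an analysis from this lab: it draws a right-skewed log-normal sample and applies a log transform, which is the Box-Cox transformation with parameter 0. The seed and sample size are arbitrary choices.

```r
set.seed(2018)                              # arbitrary seed for reproducibility
x <- rlnorm(200, meanlog = 0, sdlog = 1)    # made-up, right-skewed sample

p_raw <- shapiro.test(x)$p.value            # Shapiro-Wilk p-value on the raw data
p_log <- shapiro.test(log(x))$p.value       # p-value after the log transform
print(c(p_raw = p_raw, p_log = p_log))

# Normal Q-Q plots before and after the transform
par(mfrow = c(1, 2))
qqnorm(x, main = "Raw data"); qqline(x)
qqnorm(log(x), main = "Log-transformed"); qqline(log(x))
```

A very small p-value is evidence against normality, so the raw data should be rejected far more strongly than the transformed data.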

Note: Complete your write-up with a conclusion and upload your outputs on Google
Classroom.
