
RANDOM - Discrete and Continuous - VARIABLE

Random variables can be either discrete, taking on countable values, or continuous, taking any value in an interval. The probability distribution of a discrete random variable is a list of the probabilities of each possible value. The probability density function of a continuous random variable describes the relative likelihood of values; it must be greater than or equal to 0 and its integral over the range must equal 1. Key metrics for random variables include the expected value (mean), mode, median, and variance.

Uploaded by

vikalp123123
Copyright
© © All Rights Reserved

Discrete and Continuous Random Variables


• A variable X whose value depends on the outcome of a random process is called a random variable.
ex: X is the outcome of a coin toss
ex: X is the 1st number drawn in the next lottery draw
ex: X is the age of an individual chosen at random from Zagreb population

Discrete Random Variables


• A discrete variable is a variable which can only take a countable number of values.
Thus, a discrete random variable X has possible values x1, x2, x3, ...
• The probabilities of all the possible values of a discrete random variable sum to one.

Example:
A coin is tossed three times
X represents the number of heads that we throw

x          0     1     2     3
P(X = x)   1/8   3/8   3/8   1/8
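
The probabilities for the three coin tosses can be checked by listing all eight equally likely outcomes. The following is a minimal Python sketch, not part of the original notes; the function name `heads_distribution` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def heads_distribution(n_tosses):
    """Return {number of heads: probability} for n fair coin tosses."""
    outcomes = list(product("HT", repeat=n_tosses))      # e.g. ('H', 'T', 'H')
    counts = Counter(seq.count("H") for seq in outcomes)  # outcomes per head count
    return {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}

dist = heads_distribution(3)
for x, p in dist.items():
    print(x, p)

# The probabilities of all possible values add up to one.
print(sum(dist.values()))
```

Exact fractions are used so the printed values can be compared directly with the table.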

Continuous Random Variables


• A continuous random variable is a random variable that can assume any value in an interval.
ex: X is the length of time until the next time you are sick.
ex: X is the weight of someone chosen at random from the Croatian population.

Probability distribution/ function


Discrete Random Variables
• The probability density function (p.d.f.) of X is a function which allocates probabilities. Put simply, it is a
function which tells you the probability of certain events occurring.
• it must satisfy the following conditions:

1. $0 \le P(X = x) \le 1$

2. $\sum_{x} P(X = x) = 1$

3. $P(X = x_n) = 1 - \sum_{k \ne n} P(X = x_k)$
that is, P(event $x_n$ occurs) = 1 − P(any other event occurs)

P(X = x), the probability distribution of X, involves listing $P(x_i)$ for each $x_i$.
Example:
A die is thrown repeatedly until a 6 is obtained. Find the probability density function for the number of times we
throw the die.
Let X be the random variable representing the number of times we throw the die.
P(X = 1) = 1/6 (probability that we get a 6 on our first throw)
P(X = 2) = (5/6) × (1/6) (probability that we get the first 6 on the second throw: the first throw is
something that isn't a 6 (probability 5/6) and the second throw is a 6 (probability 1/6))
etc.
$P(X = x) = \left(\frac{5}{6}\right)^{x-1} \cdot \frac{1}{6}$
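
The formula P(X = x) = (5/6)^(x−1) × (1/6) for the die example can be evaluated directly. The sketch below is illustrative only and assumes Python; `first_six_pmf` is a made-up name, and the cut-off of 100 throws is an arbitrary choice.

```python
from fractions import Fraction

def first_six_pmf(x, p=Fraction(1, 6)):
    """P(X = x): x - 1 throws that are not a 6, followed by one 6."""
    if x < 1:
        return Fraction(0)
    return (1 - p) ** (x - 1) * p

for x in range(1, 5):
    print(x, first_six_pmf(x))

# The partial sums approach 1, as required for a probability distribution.
partial = sum(first_six_pmf(x) for x in range(1, 101))
print(float(partial))
```

The part of the total that is still missing after 100 throws is exactly (5/6)^100, the probability of not having seen a 6 by then.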

Continuous Random Variables: X defined on a ≤ x ≤ b


probability density function (p.d.f.), f(x), describes the relative likelihood for this variable to take on
a given value x
cumulative distribution function (c.d.f.), F(x), is the probability that X will take a value that is less
than or equal to x. It is found by integrating the p.d.f. from the minimum value of X to x.
$F(x) = P(X \le x) = \int_a^x f(t)\,dt$
Definition: For a function f(x) to be a probability density function, it must satisfy the following conditions:
1. $f(x) \ge 0$ for all $x \in (a, b)$
2. $\int_a^b f(x)\,dx = 1$
3. for any $a \le c < d \le b$, $P(c < X < d) = \int_c^d f(x)\,dx$
N.B.

1. a and b could possibly be −∞ and ∞.

2. For a continuous random variable, the probability of any single value is zero:
$P(X = c) = 0 \Rightarrow P(c \le X \le d) = P(c < X < d) = P(c \le X < d)$, etc.
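
The conditions on a p.d.f. can be checked numerically. The sketch below is not from the original notes: it assumes a hypothetical density f(x) = 2x on 0 ≤ x ≤ 1, chosen only because its integrals are easy to verify by hand, and uses a simple midpoint rule in Python.

```python
def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f from lo to hi."""
    width = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * width) for i in range(steps)) * width

# Hypothetical p.d.f. for illustration: f(x) = 2x on 0 <= x <= 1.
a, b = 0.0, 1.0

def f(x):
    return 2.0 * x

total_area = integrate(f, a, b)          # condition 2: should be 1
prob_interval = integrate(f, 0.25, 0.5)  # condition 3: P(0.25 < X < 0.5)
cdf_at_half = integrate(f, a, 0.5)       # F(0.5) = P(X <= 0.5)
print(total_area, prob_interval, cdf_at_half)
```

By hand, F(x) = x² for this density, so P(0.25 < X < 0.5) = 0.25 − 0.0625 = 0.1875 and F(0.5) = 0.25.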

• Expected value (or mean) is a weighted average of the possible values that X can take, each value
being weighted according to the probability of that event occurring. The expected value of X is usually
written as E(X) or μ.

Discrete Random Variables: $E(X) = \mu = \sum_{x} x\,P(X = x)$
Continuous Random Variables: $E(X) = \mu = \int_{-\infty}^{\infty} x\,f(x)\,dx$

In words: the sum of [(each of the possible outcomes) × (the probability of the outcome occurring)].

Example
What is the expected value when we roll a fair die?
There are six possible outcomes: 1, 2, 3, 4, 5, 6. Each of these has a probability of 1/6 of occurring.
Let X represent the outcome of the experiment.
$P(X = 1) = P(X = 2) = \dots = P(X = 6) = \frac{1}{6}$
E(X) = 1 × P(X = 1) + 2 × P(X = 2) + 3 × P(X = 3) + 4 × P(X = 4) + 5 × P(X = 5) + 6 × P(X = 6)

$E(X) = \frac{1}{6} + \frac{2}{6} + \frac{3}{6} + \frac{4}{6} + \frac{5}{6} + \frac{6}{6} = \frac{7}{2} = 3.5$

• Properties of expected value E(X)


1. If a and b are constants, then $E(aX + b) = aE(X) + b$
2. $E(X + Y) = E(X) + E(Y)$

Expected Value of a Function of X


E[g(X)], where g(X) is a function of X, is:

Discrete Random Variables: $E[g(X)] = \sum_{x} g(x)\,P(X = x)$
Continuous Random Variables: $E[g(X)] = \int g(x)\,f(x)\,dx$


Example

For the experiment with the die, calculate E(X²).

g(x) = x²
g(1) = 1, g(2) = 4, g(3) = 9, g(4) = 16, g(5) = 25, g(6) = 36
$P(X = 1) = P(X = 2) = \dots = P(X = 6) = \frac{1}{6}$
So E(X²) = 1/6 + 4/6 + 9/6 + 16/6 + 25/6 + 36/6 = 91/6 ≈ 15.167
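
The two die calculations, E(X) = 7/2 and E(X²) = 91/6, follow the same pattern, so a single helper can cover both. This is a minimal Python sketch; the name `expectation` and the constants a = 2, b = 3 are assumptions made for this example.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x) * P(X = x) over all possible values x."""
    return sum(g(x) * p for x, p in pmf.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}  # fair six-sided die

mean = expectation(die)                           # E(X)
mean_square = expectation(die, lambda x: x * x)   # E(X^2)
shifted = expectation(die, lambda x: 2 * x + 3)   # E(2X + 3), with a = 2, b = 3
print(mean, mean_square, shifted)
```

The last value illustrates the property E(aX + b) = aE(X) + b: 2 × 7/2 + 3 = 10.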

• Mode is the most likely value of X.


Discrete Random Variables: the mode is the value of x with the largest P(X = x), which can be
different from the expected value.
Continuous Random Variables: the mode is the value of x where f(x) is maximum (which may not be unique).

• Median is the value with half the probabilities below and half above the median value
Discrete Random Variables: the median is the middle value when there are an odd number of measurements
listed, and it is the average of the two middle values when there are an even number of measurements listed.
It is the value x such that
$P(X \le x) \ge \frac{1}{2}$ and $P(X \ge x) \ge \frac{1}{2}$

Continuous Random Variables: the median of a random variable X is a number m such that
$\int_a^m f(x)\,dx = \frac{1}{2}$
The median m is the number for which the probability is exactly ½ that the random variable will have a value
greater than m, and ½ that it will have a value less than m.

• Variance
The variance of a random variable tells us something about the spread of the possible values of the variable.
Variance, Var( X ), is defined as the average of the squared differences of X from the mean:

$\mathrm{Var}(X) = \sigma^2 = E[(X - \mu)^2] = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2$, using $E(X) = \mu$.

Discrete Random Variables
$\mathrm{Var}(X) = \sum (x - \mu)^2 P(X = x)$
$= \sum x^2 P(X = x) - 2\mu \sum x\,P(X = x) + \mu^2 \sum P(X = x)$
$= \sum x^2 P(X = x) - 2\mu \cdot \mu + \mu^2$
$\mathrm{Var}(X) = \sum x^2 P(X = x) - \mu^2$

Continuous Random Variables
$\mathrm{Var}(X) = \sigma^2 = \int_a^b (x - \mu)^2 f(x)\,dx$
$= \int_a^b x^2 f(x)\,dx - 2\mu \int_a^b x f(x)\,dx + \mu^2 \int_a^b f(x)\,dx$
$= \int_a^b x^2 f(x)\,dx - 2\mu \cdot \mu + \mu^2$
$\mathrm{Var}(X) = \int_a^b x^2 f(x)\,dx - \mu^2$
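
The definition E[(X − μ)²] and the shortcut E(X²) − μ² should give the same number. The following Python sketch checks this for the fair die used in the earlier examples; it is an illustration, not part of the original notes.

```python
from fractions import Fraction

die = {face: Fraction(1, 6) for face in range(1, 7)}  # fair six-sided die

mu = sum(x * p for x, p in die.items())                           # E(X)
var_definition = sum((x - mu) ** 2 * p for x, p in die.items())   # E[(X - mu)^2]
var_shortcut = sum(x * x * p for x, p in die.items()) - mu ** 2   # E(X^2) - mu^2
print(mu, var_definition, var_shortcut)
```

Both routes give 91/6 − 49/4 = 35/12 for the die.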

• Properties of variance

Note that the variance does not behave in the same way as expectation when we multiply and add constants to
random variables. In fact:

• $\mathrm{Var}[aX + b] = a^2\,\mathrm{Var}(X)$

$\mathrm{Var}(aX + b) = E[(aX + b)^2] - (E[aX + b])^2$

$= E[a^2X^2 + 2abX + b^2] - (aE(X) + b)^2$

$= a^2E[X^2] + 2abE(X) + b^2 - a^2E^2(X) - 2abE(X) - b^2$

$= a^2E[X^2] - a^2E^2(X)$

$= a^2\,\mathrm{Var}(X)$

• $\mathrm{Var}[X + Y] = \mathrm{Var}(X) + \mathrm{Var}(Y)$ (holds when X and Y are independent)


• In general, $\mathrm{Var}[X + Y] \ne \mathrm{Var}(X) + \mathrm{Var}(Y)$ (but the non-independent case is not in IB)

• Standard deviation of X is the square root of Var(X).

$\sigma = \sqrt{\mathrm{Var}(X)}$
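
The rule Var(aX + b) = a² Var(X) can be checked on a concrete distribution. The Python sketch below uses the fair die again; the constants a = 3 and b = 5 and the helper names `mean` and `variance` are arbitrary choices for this illustration.

```python
from fractions import Fraction

def mean(pmf):
    """E(X) for a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2]."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}
a, b = 3, 5  # arbitrary constants chosen for this illustration

# Distribution of Y = aX + b: same probabilities, transformed values.
transformed = {a * x + b: p for x, p in die.items()}

print(variance(transformed), a ** 2 * variance(die))
print(float(variance(die)) ** 0.5)  # standard deviation of X
```

Adding b shifts every value by the same amount, so it changes the mean but not the spread; only the factor a affects the variance.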

Binomial Distribution – Discrete Random Variable


Criteria that must be met in order for a probability distribution to be a binomial distribution:
• The experiment consists of n repeated trials.
• Each trial can result in just two possible outcomes. We call one of these outcomes a success and the
other, a failure.
• The probability of success, denoted by p, is the same on every trial.
• The trials are independent; that is, the outcome on one trial does not affect the outcome on other trials.

If a random variable X has a binomial distribution, we write


• X ~ B(n, p) (~ means ‘has distribution…’).

Probability density function is:
• $P(X = x) = \binom{n}{x} p^x (1 - p)^{n-x}$, x = 0, 1, …, n

If X is a binomial random variable with parameters n and p, then the mean and variance are:

• $E(X) = \mu = np$
• $\mathrm{Var}(X) = \sigma^2 = np(1 - p)$

where
• n is the number of trials
• p is the probability of a success
• (1 – p) is the probability of a failure.

Example
Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?
Solution: This is a binomial experiment in which the number of trials is equal to 5, the number of
successes is equal to 2, and the probability of success on a single trial is 1/6 or about 0.167.
Therefore, the binomial probability is:
n = 5, p = 0.167, x = 2
$P(X = 2) = \binom{5}{2} (0.167)^2 (0.833)^3 \approx 0.161$
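
The binomial formula translates almost directly into code. This is a minimal Python sketch using only the standard library; the function name `binomial_pmf` is an assumption for this example.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) for X ~ B(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 1 / 6
print(round(binomial_pmf(n, p, 2), 3))  # exactly two fours in five tosses
print(n * p, n * p * (1 - p))           # mean np and variance np(1 - p)
```

Using p = 1/6 rather than the rounded 0.167 gives the same answer to three decimal places.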

• Binompdf(# of trials, probability of success, # of specific event)

Computes the probability of obtaining exactly x events/successes in n trials:
P(X = x) = binompdf(n, p, x)

• Binomcdf(# of trials, probability of success, # of specific event)

Computes the cumulative probability that the number of successes is at most the value x
within n trials (sum of all probabilities up to x):
P(X ≤ x) = binomcdf(n, p, x)

Binomial Probability: "Exactly", "At Most", "At Least"


Question: A fair coin is tossed 100 times. What is the probability that:
a. heads will appear exactly 52 times?

• $P(X = 52) = \binom{100}{52} (0.5)^{52} (0.5)^{48}$
binompdf(100,.5,52) = 0.0735270104

b. there will be at most 52 heads?

• $P(X \le 52) = \sum_{x=0}^{52} \binom{100}{x} (0.5)^{x} (0.5)^{100-x}$
binomcdf(100,.5,52) = 0.6913502844

cumulative ≡ at most (up to)
Because this is a "cumulative" function, it will find the sum of all of the probabilities up to, and including,
the given value of 52.

c. there will be at least 48 heads?

There is no built-in calculator command for "at least".


Instead: "at least" 48 is the complement of "at most" 47:
$P(X \ge 48) = 1 - P(X \le 47)$

• $P(X \ge 48) = \sum_{x=48}^{100} \binom{100}{x} (0.5)^{x} (0.5)^{100-x}$
1-binomcdf(100,.5,47) = 0.6913502844
• In general, P(X ≥ r) = 1 − binomcdf(n, p, r − 1).

The fact that this answer is the same as the "at most" answer for
the number 52, is due to the symmetric nature of the distribution
about its mean of 50.

Binomial Distribution Problems


(1) A company owns 400 laptops. Each laptop has an 8% probability of not working. You randomly select 20
laptops for your salespeople.
(a) What is the likelihood that 5 will be broken? (b) What is the likelihood that they will all work?
(c) What is the likelihood that they will all be broken?

(2) A study indicates that 4% of American teenagers have tattoos. You randomly sample 30 teenagers. What is
the likelihood that exactly 3 will have a tattoo?

(3) An XYZ cell phone is made from 55 components. Each component has a .002 probability of being defective.
What is the probability that an XYZ cell phone will not work perfectly?

(4) The ABC Company manufactures toy robots. About 1 toy robot per 100 does not work. You purchase 35
ABC toy robots. What is the probability that exactly 4 do not work?

(5) The LMB Company manufactures tires. They claim that only .007 of LMB tires are defective. What is the
probability of finding 2 defective tires in a random sample of 50 LMB tires?

(6) An HDTV is made from 100 components. Each component has a .005 probability of being defective. What is
the probability that an HDTV will not work perfectly?

Binomial Distribution SOLUTIONS

(1) (a) 20C5 (.08)^5 (.92)^15 = .0145
(b) 20C0 (.08)^0 (.92)^20 = .1887
(c) 20C20 (.08)^20 (.92)^0 ≈ 1.15 × 10^-22 (practically zero)
(2) 30C3 (.04)^3 (.96)^27 = .0863

(3) Probability that it will work (0 defective components): 55C0 (.002)^0 (.998)^55 = .896
Probability that it will not work perfectly is 1 - .896 = .104 or 10.4%
(4) 35C4 (.01)^4 (.99)^31 = .00038
(5) 50C2 (.007)^2 (.993)^48 = .0428
(6) Probability that it will work (0 defective components): 100C0 (.005)^0 (.995)^100 = .606
Probability that it will not work perfectly is 1 - .606 = .394 or 39.4%
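
The binomial solutions can be re-computed in a few lines. This Python sketch is a check added for illustration; the dictionary keys simply mirror the problem numbers, and `binompdf` is a plain-Python stand-in for the calculator command.

```python
from math import comb

def binompdf(n, p, x):
    """P(X = x) for X ~ B(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

answers = {
    "1a": binompdf(20, 0.08, 5),
    "1b": binompdf(20, 0.08, 0),
    "1c": binompdf(20, 0.08, 20),
    "2": binompdf(30, 0.04, 3),
    "3": 1 - binompdf(55, 0.002, 0),   # at least one defective component
    "4": binompdf(35, 0.01, 4),
    "5": binompdf(50, 0.007, 2),
    "6": 1 - binompdf(100, 0.005, 0),  # at least one defective component
}
for label, value in answers.items():
    print(label, f"{value:.4g}")
```

Problems (3) and (6) use the complement: "will not work perfectly" means one or more defective components, which is 1 minus the probability of none.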

Normal distribution – continuous random variable.


Standardization of normal variables.
Data can be "distributed" (spread out) in different ways. It can be spread out more on the left or more on the right,
but there are many cases where the data tends to be around a central value with no bias left or right, and it gets
close to a "Bell Curve". That is called Normal Distribution. Many things closely follow a Normal Distribution:
•heights of people
•size of things produced by machines
•errors in measurements
•blood pressure
•marks on a test (not at North Hills)
▪ Most results are close to the mean,
while few are much left or right of the mean.

A continuous random variable X follows a normal distribution if it has the following probability density function :

• $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, $-\infty < x < \infty$
The integral of f(x) is not analytical, so probabilities are found with a calculator.

Many continuous distributions met in practice are approximately normal: the probability density decreases
the further the value is from the mean. This is particularly true for variables in nature.
The parameters of the distribution are μ and 𝝈𝝈𝟐𝟐 , where μ is the mean (expectation) of the distribution and 𝜎𝜎 2 is
the variance.
𝑿𝑿 ~ 𝑵𝑵(𝝁𝝁, 𝝈𝝈𝟐𝟐 ) means that the random variable X has a normal distribution with parameters μ and 𝜎𝜎 2 .

Properties
▪ The curve is symmetrical about the line x = μ
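▪ $\lim_{x \to \pm\infty} f(x) = 0$
▪ $\int_{-\infty}^{\infty} f(x)\,dx = 1$
▪ f(x) has its maximum at x = μ:
$\frac{d}{dx} f(x) = -\frac{1}{\sigma^2\sqrt{2\pi}} \left(\frac{x-\mu}{\sigma}\right) e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = 0 \Rightarrow$ maximum: $x = \mu$
▪ inflection points at $x = \mu \pm \sigma$:
$\frac{d^2}{dx^2} f(x) = -\frac{1}{\sigma^3\sqrt{2\pi}} \left[1 - \left(\frac{x-\mu}{\sigma}\right)^2\right] e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = 0$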
For a normal curve, standard deviation σ is uniquely
determined as the horizontal distance from the vertical line
of symmetry 𝑥𝑥 = 𝜇𝜇 to the point of inflection.
In a normal distribution, about 68.3% of values lie within
one standard deviation of the mean, about 95.4% of values lie
within two standard deviations of the mean and about 99.7%
of values lie within three standard deviations of the mean.
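
The percentages within one, two and three standard deviations can be computed from the c.d.f. of the standard normal distribution. This is a small Python sketch using the standard library's `statistics.NormalDist`; the helper name `within_k_sigma` is made up for this example.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

def within_k_sigma(k):
    """P(mu - k*sigma < X < mu + k*sigma) for any normal distribution."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within_k_sigma(k), 2))
```

The result does not depend on μ or σ, because every normal distribution standardises to the same Z.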

Example: 95% of students at school are between 1.1m and 1.7m tall.
Assuming this data is normally distributed can you calculate the mean and standard deviation?
The mean is halfway between 1.1m and 1.7m:
Mean = (1.1m + 1.7m) / 2 = 1.4m
95% is approximately 2 standard deviations either side of the mean (a total of 4 standard deviations), so:
1 standard deviation = (1.7m - 1.1m) / 4 = 0.6m / 4 = 0.15m

▪ $P(X \le x_1) = \text{area} = \int_{-\infty}^{x_1} f(x)\,dx$

▪ $P(X \ge x_2) = \text{area} = \int_{x_2}^{\infty} f(x)\,dx$

▪ $P(x_1 \le X \le x_2) = \text{area} = \int_{x_1}^{x_2} f(x)\,dx$

▪ For a continuous random variable, the probability of any single value is zero:
$P(X = x_1) = 0 \Rightarrow P(x_1 \le X \le x_2) = P(x_1 < X < x_2) = P(x_1 \le X < x_2)$, etc.

• Normalpdf(x,μ,σ)
This is not a probability: for a continuous variable, the probability of any single value is zero.
This function returns the y-coordinates of the normal curve. Use it to graph a normal
curve. We don’t do that.

• InvNorm(probability,μ,σ)
Inverse normal probability distribution function.
Given a probability (the area to the left of an x-value), this function returns that x-value.
$P(X \le a) = \Phi(a)$ = area (probability)

μ = 2870; σ = 900
P(X ≤ a) = 0.3409
InvNorm(0.3409,2870,900) ≈ 2500

• Normalcdf(lower bound,upper bound,μ,σ)

Calculates the probability that the random variable X lies in the interval from lower bound to upper bound.
Technically, it returns the fraction of the area under the normal curve from lower bound to
upper bound.
P(x1 ≤ X ≤ x2) = normalcdf(x1, x2, μ, σ)

You can set the lower bound or upper bound to be ±∞:

P(X ≥ x2) = normalcdf(x2, 1E99, μ, σ)
The largest value the calculator can handle is 10^99, so it represents positive infinity.
P(X ≤ x1) = normalcdf(−1E99, x1, μ, σ)
The most negative value the calculator can handle is −10^99, so it represents negative infinity.

EX: Given a normal distribution of values for which the mean is 70 and the standard deviation is 4.5, find:

a) the probability that a value is between 65 and 80, inclusive.
normalcdf(65,80,70,4.5) = 0.8536055925
This finds the probability of the interval from 65 to 80.
The probability is 85.361%.

b) the probability that a value is greater than or equal to 75.
normalcdf(75,1E99,70,4.5) = 0.1332603064
The upper boundary is positive infinity, 1E99.
The probability is 13.326%.

c) the probability that a value is less than 62.
normalcdf(−1E99,62,70,4.5) = 0.0377201305
The lower boundary is negative infinity, −1E99.
The probability is 3.772%.

d) the 90th percentile for this distribution.
invNorm(0.90,70,4.5) = 75.76698205
Given a probability region to the left of a value (i.e., a percentile), determine the value using invNorm.
The x-value is 75.767.
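
The four answers in this example can be reproduced with Python's standard library. The functions below are hypothetical stand-ins that imitate the calculator commands `normalcdf` and `invNorm`; they are a sketch, not the calculator's implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower <= X <= upper) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

def invnorm(probability, mu=0.0, sigma=1.0):
    """The x-value such that P(X <= x) = probability."""
    return NormalDist(mu, sigma).inv_cdf(probability)

BIG = 1e99  # stands in for infinity, as on the calculator

p_between = normalcdf(65, 80, 70, 4.5)
p_above = normalcdf(75, BIG, 70, 4.5)
p_below = normalcdf(-BIG, 62, 70, 4.5)
percentile_90 = invnorm(0.90, 70, 4.5)
print(p_between, p_above, p_below, percentile_90)
```

The defaults mu = 0 and sigma = 1 mean that calling either function with no parameters uses the standard normal distribution.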

NOTE: A mean of zero and a standard deviation of one are the default values
for a standardized normal distribution on the calculator, if you choose not to set these values.
I am not sure how to use the defaults, so I simply enter μ = 0 and σ = 1 manually.

Standard score or z – score is the number of standard deviations from the mean.

Example: In that same school one of your friends is 1.85m tall


How far is 1.85m from the mean?
1.85 - 1.4 = 0.45m
How many standard deviations is that?
0.45m / 0.15m = 3 standard deviations
Your friend's height has a "z-score" of 3.0

Standard normal distribution or Z-distribution


A normal distribution with μ = 0 and σ = 1 is called a standard normal distribution.
If Z ~ N(0, 1), then Z is said to follow a standard normal distribution.
Normal distributions have different means and standard deviations.
The c.d.f. has to be determined by integration, and these integrals are not analytical, so results have to
be looked up in statistical tables.
Standardising: Now, the mean and variance of the normal distribution can be any value and so clearly there can't
be a statistical table for each one. Instead, we convert to the standard normal distribution and we use statistical
tables for the standard normal distribution to find the c.d.f. of any normal distribution.

A value from any normal distribution can be transformed into its corresponding value on a standard
normal distribution using the following formula:
$Z = \frac{X - \mu}{\sigma}$ (z-score)
where Z is the value on the standard normal distribution, X is the value on
the original distribution, μ is the mean of the original distribution, and σ is
the standard deviation of the original distribution.
$X = \mu + Z\sigma$
The z-score tells us how far the value of X is from the mean, in units of σ.

Example: Professor Einstein is marking a test.

Here are the students' results (out of 60 points):
20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17
Most students didn't even get 30 out of 60, and most will fail.
The test must have been really hard, so the Prof decides to standardize all the scores and only fail people
more than 1 standard deviation below the mean.
The mean is 23, and the standard deviation is 6.6, and these are the standard scores:
-0.45, -1.21, 0.45, 1.36, -0.76, 0.76, 1.82, -1.36, 0.45, -0.15, -0.91
Only 2 students will fail (the ones who scored 15 and 14 on the test).
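
The standard scores for the test results can be computed as follows. This Python sketch assumes the standard deviation in the example is the population standard deviation (dividing by n), which is the choice that reproduces the value 6.6.

```python
from statistics import mean, pstdev

scores = [20, 15, 26, 32, 18, 28, 35, 14, 26, 22, 17]

mu = mean(scores)
sigma = pstdev(scores)  # population standard deviation (divides by n)

z_scores = [(x - mu) / sigma for x in scores]
failing = [x for x, z in zip(scores, z_scores) if z < -1]
print(mu, round(sigma, 2), failing)
```

The z-scores computed with the unrounded σ may differ from the listed values in the second decimal place, since the listed values use σ rounded to 6.6.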

Z-scores are expressed in terms of standard deviations from their means.


This makes them useful when comparing results from two or more different normal distributions, since comparing
Z-values takes the mean and standard deviation of each distribution into account.
The standard score does this by converting (in other words, standardizing) scores in a normal distribution to z-scores
in what becomes a standard normal distribution. THIS IS HOW CALCULATORS WORK.
Because the probability of finding certain value(s) of some quantity cannot depend on how we express it, the area that
represents that probability is the same in both representations.

EXAMPLE: If Z is the standard normal distribution, find k such that P(Z > k) = 0.73.
Interpret your result.

P(Z ≤ k) = 0.27
∴ k ≈ −0.613
This means 73% of the Z-distribution values are more than −0.613.

EXAMPLE: A university professor determines that no more than 80% of this year’s History candidates should
pass the final examination. The examination results were approximately normally distributed with mean 62 and
standard deviation 13. Find the lowest score necessary to pass the examination.

Let X denote the final examination result, so X ~ N(62, 13²).

We need to find k such that
P(X ≥ k) = 0.8
P(X < k) = 0.2
∴ k ≈ 51.059
So, the minimum pass mark is 52.
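
Both inverse-normal examples come down to rewriting the condition as an area to the left and then inverting the c.d.f. The Python sketch below uses `statistics.NormalDist.inv_cdf`; rounding the pass mark up to a whole number with `ceil` is an assumption that marks are integers.

```python
from math import ceil
from statistics import NormalDist

# P(Z > k) = 0.73 is the same as P(Z <= k) = 0.27.
k_standard = NormalDist(0, 1).inv_cdf(0.27)

# P(X >= k) = 0.8 is the same as P(X < k) = 0.2 for X ~ N(62, 13^2).
k_exam = NormalDist(62, 13).inv_cdf(0.2)
pass_mark = ceil(k_exam)  # smallest whole mark that is at least k

print(round(k_standard, 3), round(k_exam, 3), pass_mark)
```

Rounding up rather than down keeps the proportion of passing candidates at or below 80%.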

For some questions we must convert to z-scores in order to find the answer.
We always need to convert to z-scores if we are trying to find an unknown mean or standard
deviation.
When a problem gives a probability but not μ and σ, we cannot use
InvNorm(probability,μ,σ) to find the value of x. Instead we use the standard normal distribution:
InvNorm(probability,0,1) gives the Z value, from which x, μ or σ can be found
(depending on the problem).

EXAMPLE: An adult scallop population is known to be normally distributed with a standard deviation of 5.9 g. If
15% of scallops weigh less than 58.2 g, find the mean weight of the population.

Let the mean weight of the population be µ g.


If X g denotes the weight of an adult scallop, then X ~ N(μ, 5.9²).
As we do not know µ we cannot use invNorm directly, but we can convert to z-scores and use
the properties of N(0, 1²).

P(X ≤ 58.2) = 0.15

∴ $P\left(Z \le \frac{58.2 - \mu}{5.9}\right) = 0.15$

Using invNorm for N(0, 1²):

$\frac{58.2 - \mu}{5.9} \approx -1.0364$

μ ≈ 64.3

So, the mean weight is 64.3 g.
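
The scallop calculation can be followed step by step in code: find the z-score for the given probability, then rearrange z = (x − μ)/σ for the unknown mean. This is a minimal Python sketch added for illustration.

```python
from statistics import NormalDist

sigma = 5.9  # known standard deviation (g)
x = 58.2     # weight that 15% of scallops fall below (g)

z = NormalDist(0, 1).inv_cdf(0.15)  # z such that P(Z <= z) = 0.15

# From z = (x - mu) / sigma, rearranged for the unknown mean.
mu = x - z * sigma
print(round(z, 4), round(mu, 1))
```

The same rearrangement works for an unknown σ when μ is given: σ = (x − μ) / z.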


Normal Distribution Problems


1. X is a normally distributed variable with mean μ = 30 and standard deviation σ = 4. Find

a) P(x < 40)


b) P(x > 21)
c) P(30 < x < 35)

2. A radar unit is used to measure speeds of cars on a motorway. The speeds are normally distributed
with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the probability that a car picked
at random is travelling at more than 100 km/hr?

3. For a certain type of computer, the length of time between charges of the battery is normally
distributed with a mean of 50 hours and a standard deviation of 15 hours. John owns one of these
computers and wants to know the probability that the length of time will be between 50 and 70 hours.

4. Entry to a certain University is determined by a national test. The scores on this test are normally
distributed with a mean of 500 and a standard deviation of 100. Tom wants to be admitted to this
university and he knows that he must score better than at least 70% of the students who took the test.
Tom takes the test and scores 585. Will he be admitted to this university?

5. The length of similar components produced by a company are approximated by a normal distribution
model with a mean of 5 cm and a standard deviation of 0.02 cm. If a component is chosen at random

a) what is the probability that the length of this component is between 4.98 and 5.02 cm?
b) what is the probability that the length of this component is between 4.96 and 5.04 cm?

6. The length of life of an instrument produced by a machine has a normal distribution with a mean of 12
months and standard deviation of 2 months. Find the probability that an instrument produced by this
machine will last
a) less than 7 months.
b) between 7 and 12 months.

7. The time taken to assemble a car in a certain plant is a random variable having a normal distribution
with a mean of 20 hours and a standard deviation of 2 hours. What is the probability that a car can be assembled at
this plant in a period of time
a) less than 19.5 hours?
b) between 20 and 22 hours?

8. A large group of students took a test in Physics and the final grades have a mean of 70 and a standard
deviation of 10. If we can approximate the distribution of these grades by a normal distribution, what
percent of the students
a) scored higher than 80?
b) should pass the test (grades≥60)?
c) should fail the test (grades<60)?

9. The annual salaries of employees in a large company are approximately normally distributed with a
mean of $50,000 and a standard deviation of $20,000.

a) What percent of people earn less than $40,000?


b) What percent of people earn between $45,000 and $65,000?
c) What percent of people earn more than $70,000?

Normal Distribution SOLUTIONS


1. a) For x = 40, the z-value z = (40 - 30) / 4 = 2.5
Hence P(x < 40) = P(z < 2.5) = [area to the left of 2.5] = 0.9938
b) For x = 21, z = (21 - 30) / 4 = -2.25
Hence P(x > 21) = P(z > -2.25) = [total area] - [area to the left of -2.25]
= 1 - 0.0122 = 0.9878
c) For x = 30 , z = (30 - 30) / 4 = 0 and for x = 35, z = (35 - 30) / 4 = 1.25
Hence P(30 < x < 35) = P(0 < z < 1.25) = [area to the left of z = 1.25] - [area to the left of 0]
= 0.8944 - 0.5 = 0.3944

2. Let x be the random variable that represents the speed of cars. x has μ = 90 and σ = 10. We have to find
the probability that x is higher than 100 or P(x > 100)
For x = 100 , z = (100 - 90) / 10 = 1
P(x > 100) = P(z > 1) = [total area] - [area to the left of z = 1]
= 1 - 0.8413 = 0.1587
The probability that a car selected at a random has a speed greater than 100 km/hr is equal to 0.1587

3. Let x be the random variable that represents the length of time. It has a mean of 50 and a standard
deviation of 15. We have to find the probability that x is between 50 and 70 or P( 50< x < 70)
For x = 50 , z = (50 - 50) / 15 = 0
For x = 70 , z = (70 - 50) / 15 = 1.33 (rounded to 2 decimal places)
P( 50< x < 70) = P( 0< z < 1.33) = [area to the left of z = 1.33] - [area to the left of z = 0]
= 0.9082 - 0.5 = 0.4082
The probability that John's computer has a length of time between 50 and 70 hours is equal to 0.4082.

4. Let x be the random variable that represents the scores. x is normally distributed with a mean of 500
and a standard deviation of 100. The total area under the normal curve represents the total number of students
who took the test. If we multiply the values of the areas under the curve by 100, we obtain percentages.
For x = 585 , z = (585 - 500) / 100 = 0.85
The proportion P of students who scored below 585 is given by
P = [area to the left of z = 0.85] = 0.8023 = 80.23%
Tom scored better than 80.23% of the students who took the test and he will be admitted to this University.

5. a) P(4.98 < x < 5.02) = P(-1 < z < 1) = 0.6826
b) P(4.96 < x < 5.04) = P(-2 < z < 2) = 0.9544

6. a) P(x < 7) = P(z < -2.5) = 0.0062
b) P(7 < x < 12) = P(-2.5 < z < 0) = 0.4938

7. a) P(x < 19.5) = P(z < -0.25) = 0.4013
b) P(20 < x < 22) = P(0 < z < 1) = 0.3413

8. a) For x = 80, z = 1
Area to the right of (higher than) z = 1 is equal to 0.1587, so 15.87% scored more than 80.
b) For x = 60, z = -1
Area to the right of z = -1 is equal to 0.8413, so 84.13% should pass the test.
c) 100% - 84.13% = 15.87% should fail the test.

9. a) For x = 40000, z = -0.5
Area to the left of (less than) z = -0.5 is equal to 0.3085, so 30.85% earn less than $40,000.
b) For x = 45000, z = -0.25 and for x = 65000, z = 0.75
Area between z = -0.25 and z = 0.75 is equal to 0.3721, so 37.21% earn between $45,000 and $65,000.
c) For x = 70000, z = 1
Area to the right of (higher than) z = 1 is equal to 0.1587, so 15.87% earn more than $70,000.
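
A few of the normal distribution solutions can be spot-checked without tables. The Python sketch below is an illustration covering problems 1a, 2, 4 and 9b only; the helper name `prob_between` is made up for this example.

```python
from statistics import NormalDist

def prob_between(lower, upper, mu, sigma):
    """P(lower < X < upper) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

checks = {
    "1a": NormalDist(30, 4).cdf(40),                      # P(x < 40)
    "2": 1 - NormalDist(90, 10).cdf(100),                 # P(x > 100)
    "4": NormalDist(500, 100).cdf(585),                   # proportion below 585
    "9b": prob_between(45_000, 65_000, 50_000, 20_000),   # between $45,000 and $65,000
}
for label, value in checks.items():
    print(label, round(value, 4))
```

Small differences in the fourth decimal place compared with table-based answers are expected, because tables round z and the areas.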
