RANDOM - Discrete and Continuous - VARIABLE
Example:
A coin is tossed three times
X represents the number of heads that we throw
x          0     1     2     3
P(X = x)   1/8   3/8   3/8   1/8
1. $0 \le P(X = x) \le 1$
2. $\sum_{x} P(X = x) = 1$
3. $P(X = x_n) = 1 - \sum_{k \ne n} P(X = x_k)$
P(X = x), the probability distribution of X, involves listing $P(x_i)$ for each $x_i$.
P(event $x_n$ occurs) = 1 − P(any other event occurs)
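The coin-toss table and the three properties above can be checked by brute force. The following is a minimal sketch, assuming a fair coin; the variable names are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 2^3 equally likely outcomes of three tosses of a fair coin.
outcomes = list(product("HT", repeat=3))

# Tally P(X = x), where X is the number of heads in an outcome.
distribution = {}
for outcome in outcomes:
    heads = outcome.count("H")
    distribution[heads] = distribution.get(heads, Fraction(0)) + Fraction(1, len(outcomes))

for x in sorted(distribution):
    print(x, distribution[x])

# Property 1: every probability lies between 0 and 1.
assert all(0 <= p <= 1 for p in distribution.values())
# Property 2: the probabilities sum to 1.
assert sum(distribution.values()) == 1
# Property 3: P(X = 3) equals 1 minus the sum of all the other probabilities.
assert distribution[3] == 1 - sum(p for x, p in distribution.items() if x != 3)
```

Exact fractions are used so that the sums can be compared with 1 without rounding error.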
Example:
A die is thrown repeatedly until a 6 is obtained. Find the probability distribution of the number of times we
throw the die.
Let X be the random variable representing the number of times we throw the die.
P(X = 1) = 1/6 (probability that we get 6 on our first throw)
P(X = 2) = (5/6) × (1/6) (probability that we get the first 6 on the second throw: the first throw is
something that isn't a 6, with probability 5/6, and the second throw is a 6,
with probability 1/6)
and so on. In general:
$P(X = x) = \left(\frac{5}{6}\right)^{x-1} \cdot \frac{1}{6}, \quad x = 1, 2, 3, \ldots$
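As a quick check of the formula for the number of throws needed to get a 6, the sketch below evaluates it for the first two throws and confirms that the probabilities add up towards 1. The function name is hypothetical, and a fair die is assumed.

```python
from fractions import Fraction

def prob_first_six_on_throw(x):
    """P(X = x): x - 1 throws that are not a 6, followed by a 6."""
    return Fraction(5, 6) ** (x - 1) * Fraction(1, 6)

print(prob_first_six_on_throw(1))  # 1/6
print(prob_first_six_on_throw(2))  # 5/36

# The sum over x = 1..200 falls short of 1 only by the chance of no 6 in 200 throws.
partial_sum = sum(prob_first_six_on_throw(x) for x in range(1, 201))
print(float(1 - partial_sum))
```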
• Expected value (or mean) is a weighted average of the possible values that X can take, each value
being weighted according to the probability of that event occurring. The expected value of X is usually
written as E(X) or μ.
It is calculated as the sum of: [(each of the possible outcomes) × (the probability of the outcome occurring)].
Example
What is the expected value when we roll a fair die?
There are six possible outcomes: 1, 2, 3, 4, 5, 6. Each of these has a probability of 1/6 of occurring.
Let X represent the outcome of the experiment.
$P(X = 1) = P(X = 2) = \cdots = P(X = 6) = \frac{1}{6}$
$E(X) = 1 \times P(X = 1) + 2 \times P(X = 2) + 3 \times P(X = 3) + 4 \times P(X = 4) + 5 \times P(X = 5) + 6 \times P(X = 6)$
$E(X) = \frac{1}{6} + \frac{2}{6} + \frac{3}{6} + \frac{4}{6} + \frac{5}{6} + \frac{6}{6} = \frac{7}{2} = 3.5$
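The same weighted sum can be written as a short function. This is only a sketch: the dictionary representation of a distribution (value mapped to probability) is an assumption made for illustration.

```python
from fractions import Fraction

def expected_value(distribution):
    """Sum of (value × probability) over all possible values."""
    return sum(x * p for x, p in distribution.items())

# A fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
mean = expected_value(die)
print(mean, float(mean))  # 7/2 3.5
```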
• Median is the value with half of the probability below it and half above it.
Discrete Random Variables
The median is the middle value when there are an odd number of measurements listed, and it is the
average of the two middle values when there are an even number of measurements listed. It is the
value of X such that
$P(X \le x) \ge \frac{1}{2}$ and $P(X \ge x) \ge \frac{1}{2}$
Continuous Random Variables
The median of a random variable X is a number m such that
$\int_a^m f(x)\,dx = \frac{1}{2}$
where a is the lower limit of the values X can take. The median m is the number for which
the probability is exactly ½ that the random variable will have a value
greater than m, and ½ that it will have a value less than m.
• Variance
The variance of a random variable tells us something about the spread of the possible values of the variable.
Variance, Var(X), is defined as the average of the squared differences of X from the mean:
$\mathrm{Var}(X) = E\left[(X - \mu)^2\right]$
• Properties of variance
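Note that the variance does not behave in the same way as expectation when we multiply random variables by
constants and add constants to them. In fact, for constants a and b:
$\mathrm{Var}(aX + b) = a^2\,\mathrm{Var}(X)$
The standard deviation is the square root of the variance:
$\sigma = \sqrt{\mathrm{Var}(X)}$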
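To see how variance reacts to multiplying by a constant and adding a constant, the sketch below compares a fair die X with 3X + 2. The constants 3 and 2 and the helper names are arbitrary choices for illustration.

```python
from fractions import Fraction

def expected_value(distribution):
    return sum(x * p for x, p in distribution.items())

def variance(distribution):
    """Average squared difference from the mean, weighted by probability."""
    mu = expected_value(distribution)
    return sum((x - mu) ** 2 * p for x, p in distribution.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}

a, b = 3, 2  # arbitrary constants, chosen only for the demonstration
transformed = {a * x + b: p for x, p in die.items()}  # distribution of aX + b

var_x = variance(die)
var_ax_b = variance(transformed)
print(var_x, var_ax_b)  # 35/12 105/4

# Adding b shifts every value equally and so has no effect on the spread.
assert var_ax_b == a ** 2 * var_x
```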
Example
Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?
Solution: This is a binomial experiment in which the number of trials is equal to 5, the number of
successes is equal to 2, and the probability of success on a single trial is 1/6 or about 0.167.
Therefore, the binomial probability is:
n = 5, p = 0.167, x = 2
$P(X = 2) = \binom{5}{2}\,(0.167)^2\,(0.833)^3 = 0.161$
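The die calculation can be reproduced with a small binomial probability function. This is a sketch using only the standard library; `binomial_pmf` is a name introduced here, not a calculator function.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 2 fours in 5 tosses of a fair die.
p_two_fours = binomial_pmf(2, 5, 1 / 6)
print(round(p_two_fours, 3))  # 0.161
```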
• $P(X \le 52) = \sum_{x=0}^{52} \binom{100}{x}\, 0.5^{x}\, 0.5^{100-x}$
binomcdf(100,.5,52) = 0.6913502844
• $P(X \ge 48) = \sum_{x=48}^{100} \binom{100}{x}\, 0.5^{x}\, 0.5^{100-x}$
1-binomcdf(100,.5,47) = 0.6913502844
• $P(X \ge r) = 1 - \mathrm{binomcdf}(n, p, r - 1)$
The fact that this answer is the same as the "at most" answer for
the number 52, is due to the symmetric nature of the distribution
about its mean of 50.
(2) A study indicates that 4% of American teenagers have tattoos. You randomly sample 30 teenagers. What is
the likelihood that exactly 3 will have a tattoo?
(3) An XYZ cell phone is made from 55 components. Each component has a .002 probability of being defective.
What is the probability that an XYZ cell phone will not work perfectly?
(4) The ABC Company manufactures toy robots. About 1 toy robot per 100 does not work. You purchase 35
ABC toy robots. What is the probability that exactly 4 do not work?
(5) The LMB Company manufactures tires. They claim that only .007 of LMB tires are defective. What is the
probability of finding 2 defective tires in a random sample of 50 LMB tires?
(6) An HDTV is made from 100 components. Each component has a .005 probability of being defective. What is
the probability that an HDTV will not work perfectly?
(3) Probability that it will work (0 defective components): 55C0 (.002)^0 (.998)^55 = .896
Probability that it will not work perfectly is 1 - .896 = .104 or 10.4%
(4) 35C4 (.01)^4 (.99)^31 = .00038
(5) 50C2 (.007)^2 (.993)^48 = .0428
(6) Probability that it will work (0 defective components): 100C0 (.005)^0 (.995)^100 = .606
Probability that it will not work perfectly is 1 - .606 = .394 or 39.4%
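The answers above can be recomputed with the binomial formula. The sketch below also evaluates exercise (2), for which no answer is listed in these notes; the helper name and the dictionary labels are illustrative only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

answers = {
    "(2) exactly 3 of 30 teenagers have a tattoo": binomial_pmf(3, 30, 0.04),
    "(3) phone does not work perfectly": 1 - binomial_pmf(0, 55, 0.002),
    "(4) exactly 4 of 35 robots do not work": binomial_pmf(4, 35, 0.01),
    "(5) exactly 2 of 50 tires are defective": binomial_pmf(2, 50, 0.007),
    "(6) HDTV does not work perfectly": 1 - binomial_pmf(0, 100, 0.005),
}

for label, value in answers.items():
    print(f"{label}: {value:.5f}")
```

For (3) and (6), "does not work perfectly" is treated as the complement of "no defective components", as in the worked answers.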
A continuous random variable X follows a normal distribution if it has the following probability density function:
• $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$
Many continuous variables, particularly variables in nature, are well modelled by a normal distribution, in which
the probability density decreases the farther the value is from the mean.
The parameters of the distribution are μ and σ², where μ is the mean (expectation) of the distribution and σ² is
the variance.
X ~ N(μ, σ²) means that the random variable X has a normal distribution with parameters μ and σ².
Properties
▪ The curve is symmetrical about the line x = μ
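▪ $\lim_{x \to \pm\infty} f(x) = 0$
▪ $\int_{-\infty}^{\infty} f(x)\,dx = 1$
▪ The maximum of f(x) occurs at x = μ:
$\frac{d}{dx} f(x) = -\frac{1}{\sigma^2\sqrt{2\pi}} \left(\frac{x-\mu}{\sigma}\right) e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = 0 \;\Rightarrow\;$ maximum: x = μ
▪ The points of inflection are at x = μ ± σ:
$\frac{d^2}{dx^2} f(x) = -\frac{1}{\sigma^3\sqrt{2\pi}} \left[1 - \left(\frac{x-\mu}{\sigma}\right)^2\right] e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} = 0$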
For a normal curve, standard deviation σ is uniquely
determined as the horizontal distance from the vertical line
of symmetry 𝑥𝑥 = 𝜇𝜇 to the point of inflection.
In a normal distribution, about 68.27% of values lie within
one standard deviation of the mean, 95.45% of values lie
within two standard deviations of the mean and 99.73%
of values lie within three standard deviations of the mean.
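These percentages come from the area under the standard normal curve between −k and k. A minimal sketch with Python's `statistics.NormalDist` follows; figures read from printed tables may differ slightly in the last digit because of rounding.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: standard.cdf(k) - standard.cdf(-k) for k in (1, 2, 3)}

for k, share in within.items():
    print(f"within {k} standard deviation(s) of the mean: {share:.2%}")
```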
Example: 95% of students at school are between 1.1m and 1.7m tall.
Assuming this data is normally distributed can you calculate the mean and standard deviation?
The mean is halfway between 1.1m and 1.7m:
Mean = (1.1m + 1.7m) / 2 = 1.4m
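95% is 2 standard deviations either side of the mean, so the range from 1.1m to 1.7m spans 4 standard deviations:
Standard deviation = (1.7m − 1.1m) / 4 = 0.15m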
▪ $P(X \le x_1) = \text{area} = \int_{-\infty}^{x_1} f(x)\,dx$
▪ $P(X \ge x_2) = \text{area} = \int_{x_2}^{\infty} f(x)\,dx$
▪ $P(x_1 \le X \le x_2) = \text{area} = \int_{x_1}^{x_2} f(x)\,dx$
▪ For a continuous random variable, the probability of any single value is zero:
$P(X = x_1) = 0 \;\Rightarrow\; P(x_1 \le X \le x_2) = P(x_1 < X < x_2) = P(x_1 \le X < x_2)$ etc.
• Normalpdf(x,μ,σ)
This does not return a probability: the probability of any single value is zero.
The function returns the y-coordinates of the normal curve, so it is used to graph a normal
curve. We don't do that.
• InvNorm(probability,μ,σ)
Inverse normal probability distribution function
Given the probability (the area of the region to the left of an x-value), this function returns that x-value.
$P(X \le a) = \Phi(a)$ = area (probability)
μ = 2870; σ = 900
P(X ≤ a) = 0.3409
InvNorm(0.3409,2870,900) ≈ 2500
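Outside the calculator, the same inverse lookup can be sketched with `statistics.NormalDist.inv_cdf` from Python's standard library. The wrapper name `inv_norm` is an assumption chosen to echo the calculator syntax.

```python
from statistics import NormalDist

def inv_norm(probability, mu=0.0, sigma=1.0):
    """Return a such that P(X <= a) = probability for X ~ N(mu, sigma^2)."""
    return NormalDist(mu, sigma).inv_cdf(probability)

a = inv_norm(0.3409, 2870, 900)
print(round(a, 1))

# Feeding the result back into the cdf recovers the probability.
print(round(NormalDist(2870, 900).cdf(a), 4))  # 0.3409
```

The result is close to, but not exactly, 2500, because 0.3409 is itself a rounded probability.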
EX: A normal distribution of values has a mean of 70 and a standard deviation of 4.5. Find:
b) the probability that a value is greater than or equal to 75.
normalcdf(75, 1E99, 70, 4.5) = 0.1332603064
The upper boundary is positive infinity, entered as 1E99.
The probability is 13.326%.
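A rough Python counterpart of normalcdf(lower, upper, μ, σ) is sketched below. `normal_cdf` is a hypothetical helper built on `statistics.NormalDist`; like the calculator example, it stands in for infinity with a very large upper bound.

```python
from statistics import NormalDist

def normal_cdf(lower, upper, mu=0.0, sigma=1.0):
    """P(lower <= X <= upper) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

# Probability that a value is at least 75 when the mean is 70 and sigma is 4.5.
p_at_least_75 = normal_cdf(75, 1e99, 70, 4.5)
print(round(p_at_least_75, 5))  # 0.13326
```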
NOTE: On the calculator, a mean of zero and a standard deviation of one (the standardized normal
distribution) are the default values if you choose not to set these values.
I am not sure how to rely on these defaults, so I simply enter μ = 0 and σ = 1 manually.
Standard score or z – score is the number of standard deviations from the mean.
A value from any normal distribution can be transformed into its corresponding value on a standard
normal distribution using the following formula:
$Z = \frac{X - \mu}{\sigma}$ (z-score)
where Z is the value on the standard
normal distribution, X is the value on
the original distribution, μ is the mean
of the original distribution, and σ is
the standard deviation of the original distribution.
Rearranging gives $X = \mu + Z\sigma$.
The z-score tells us where the value of X lies relative to the mean, measured in units of σ.
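The conversion in both directions takes only two small functions. The sketch below reuses the figures from the earlier example (mean 70, standard deviation 4.5, value 75); the function names are illustrative.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

def from_z_score(z, mu, sigma):
    """Inverse conversion: X = mu + Z * sigma."""
    return mu + z * sigma

z = z_score(75, 70, 4.5)
print(round(z, 3))  # 1.111
x_back = from_z_score(z, 70, 4.5)
print(round(x_back, 6))  # 75.0
```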
EXAMPLE: A university professor determines that no more than 80% of this year’s History candidates should
pass the final examination. The examination results were approximately normally distributed with mean 62 and
standard deviation 13. Find the lowest score necessary to pass the examination.
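No worked solution is given for this example in the notes, so the following is only one possible approach, under the assumption that exactly 80% of candidates pass. In that case the 20% who fail lie below the pass mark, which makes the pass mark the value with area 0.20 to its left.

```python
from statistics import NormalDist

scores = NormalDist(mu=62, sigma=13)
pass_rate = 0.80  # assumption: exactly 80% of candidates pass

# P(X <= pass mark) = 1 - 0.80 = 0.20, so use the inverse normal with area 0.20.
lowest_pass = scores.inv_cdf(1 - pass_rate)
print(round(lowest_pass, 1))  # 51.1
```

On the calculator this corresponds to InvNorm(0.20, 62, 13).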
For some questions we must convert to z-scores in order to find the answer.
We always need to convert to z-scores if we are trying to find an unknown mean or standard
deviation.
When a problem gives a probability but does not give both μ and σ, we cannot use
InvNorm(probability,μ,σ) to find the value of x. Instead we use the standard normal distribution,
InvNorm(probability,0,1), which returns the Z value; from Z we can then find x,
μ or σ (depending on the problem).
EXAMPLE: An adult scallop population is known to be normally distributed with a standard deviation of 5.9 g. If
15% of scallops weigh less than 58.2 g, find the mean weight of the population.
$\therefore P\left(Z \le \frac{58.2 - \mu}{5.9}\right) = 0.15$
$\frac{58.2 - \mu}{5.9} \approx -1.0364$
$\mu \approx 64.3$
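The scallop calculation can be followed step by step in code. This is a minimal sketch: the standard normal inverse plays the role of InvNorm(0.15,0,1), and the z-score equation is then rearranged for μ.

```python
from statistics import NormalDist

sigma = 5.9        # standard deviation in grams
threshold = 58.2   # 15% of scallops weigh less than this

# Step 1: z value with area 0.15 to its left on the standard normal curve.
z = NormalDist(0, 1).inv_cdf(0.15)
print(round(z, 4))  # -1.0364

# Step 2: (threshold - mu) / sigma = z, so mu = threshold - z * sigma.
mu = threshold - z * sigma
print(round(mu, 1))  # 64.3
```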
EXAMPLE:
2. A radar unit is used to measure speeds of cars on a motorway. The speeds are normally distributed
with a mean of 90 km/hr and a standard deviation of 10 km/hr. What is the probability that a car picked
at random is travelling at more than 100 km/hr?
3. For a certain type of computer, the length of time between charges of the battery is normally
distributed with a mean of 50 hours and a standard deviation of 15 hours. John owns one of these
computers and wants to know the probability that the length of time will be between 50 and 70 hours.
4. Entry to a certain University is determined by a national test. The scores on this test are normally
distributed with a mean of 500 and a standard deviation of 100. Tom wants to be admitted to this
university and he knows that he must score better than at least 70% of the students who took the test.
Tom takes the test and scores 585. Will he be admitted to this university?
5. The lengths of similar components produced by a company are approximated by a normal distribution
model with a mean of 5 cm and a standard deviation of 0.02 cm. If a component is chosen at random
a) what is the probability that the length of this component is between 4.98 and 5.02 cm?
b) what is the probability that the length of this component is between 4.96 and 5.04 cm?
6. The length of life of an instrument produced by a machine has a normal distribution with a mean of 12
months and standard deviation of 2 months. Find the probability that an instrument produced by this
machine will last
a) less than 7 months.
b) between 7 and 12 months.
7. The time taken to assemble a car in a certain plant is a random variable having a normal distribution
with a mean of 20 hours and a standard deviation of 2 hours. What is the probability that a car can be assembled at
this plant in a period of time
a) less than 19.5 hours?
b) between 20 and 22 hours?
8. A large group of students took a test in Physics and the final grades have a mean of 70 and a standard
deviation of 10. If we can approximate the distribution of these grades by a normal distribution, what
percent of the students
a) scored higher than 80?
b) should pass the test (grades≥60)?
c) should fail the test (grades<60)?
9. The annual salaries of employees in a large company are approximately normally distributed with a
mean of $50,000 and a standard deviation of $20,000.
2. Let x be the random variable that represents the speed of cars. x has μ = 90 and σ = 10. We have to find
the probability that x is higher than 100 or P(x > 100)
For x = 100 , z = (100 - 90) / 10 = 1
P(x > 100) = P(z > 1) = [total area] - [area to the left of z = 1]
= 1 - 0.8413 = 0.1587
The probability that a car selected at random has a speed greater than 100 km/hr is equal to 0.1587.
3. Let x be the random variable that represents the length of time. It has a mean of 50 and a standard
deviation of 15. We have to find the probability that x is between 50 and 70 or P( 50< x < 70)
For x = 50 , z = (50 - 50) / 15 = 0
For x = 70 , z = (70 - 50) / 15 = 1.33 (rounded to 2 decimal places)
P( 50< x < 70) = P( 0< z < 1.33) = [area to the left of z = 1.33] - [area to the left of z = 0]
= 0.9082 - 0.5 = 0.4082
The probability that John's computer has a length of time between 50 and 70 hours is equal to 0.4082.
4. Let x be the random variable that represents the scores. x is normally distributed with a mean of 500
and a standard deviation of 100. The total area under the normal curve represents the total number of students
who took the test. If we multiply the values of the areas under the curve by 100, we obtain percentages.
For x = 585 , z = (585 - 500) / 100 = 0.85
The proportion P of students who scored below 585 is given by
P = [area to the left of z = 0.85] = 0.8023 = 80.23%
Tom scored better than 80.23% of the students who took the test and he will be admitted to this University.
9. For x = 70,000, z = (70,000 - 50,000) / 20,000 = 1. The area to the right of z = 1 is equal to
0.1587, so 15.87% of employees earn more than $70,000.
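The table-based answers to problems 2, 3 and 4, together with the closing salary figure, can be cross-checked without a z-table. The sketch below uses `statistics.NormalDist`; small differences from the table values are expected because the tables round z to two decimal places.

```python
from statistics import NormalDist

# Problem 2: speeds ~ N(90, 10^2), P(X > 100)
p_over_100 = 1 - NormalDist(90, 10).cdf(100)

# Problem 3: time between charges ~ N(50, 15^2), P(50 < X < 70)
battery = NormalDist(50, 15)
p_50_to_70 = battery.cdf(70) - battery.cdf(50)

# Problem 4: scores ~ N(500, 100^2), proportion scoring below Tom's 585
below_tom = NormalDist(500, 100).cdf(585)

# Salaries ~ N(50000, 20000^2), P(X > 70000)
p_over_70k = 1 - NormalDist(50_000, 20_000).cdf(70_000)

print(round(p_over_100, 4))  # 0.1587
print(round(p_50_to_70, 4))
print(round(below_tom, 4))   # 0.8023
print(round(p_over_70k, 4))  # 0.1587
```

For problem 3 the exact z is 1.333..., so the computed value is slightly higher than the 0.4082 obtained with z rounded to 1.33.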