WEEK - 01 to 12 NOTES

Statistics for Data Science - 2

Week 1 Important formulas


Basic Probability

1. Experiment: Process or phenomenon that we wish to study statistically.


Example: Tossing a fair coin.

2. Outcome: Result of the experiment.


Example: head is an outcome on tossing a fair coin.

3. Sample space: A sample space is a set that contains all outcomes of an experiment.
• The sample space of an experiment is a set, typically denoted S.
• Example: Toss a coin: S = { heads, tails }

4. Event: An event is a subset of the sample space.

• Toss a coin: S = { heads, tails }


– Events: empty set, {heads}, {tails}, { heads, tails }
– 4 events
• An event is said to have “occurred” if the actual outcome of the experiment
belongs to the event.
• One event can be contained in another, i.e. A ⊆ B
• Complement of an event A, denoted A^c = { outcomes in S not in A } = (S \ A).
• Since events are subsets, one can do complements, unions, intersections.

5. Disjoint events: Two events with an empty intersection are said to be disjoint events.

• Throw a die: even number, odd number are disjoint.


• Multiple events: E1 , E2 , E3 , ... are disjoint if, for any i ≠ j, Ei ∩ Ej = empty set.

6. De Morgan’s laws: For any two events A and B,


(A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c.

7. Probability: “Probability” is a function P that assigns to each event a real number


between 0 and 1 and satisfies the following two axioms:

(i) P (S) = 1 (probability of the entire sample space equals 1).


(ii) If E1 , E2 , E3 , ... are disjoint events ( Could be infinitely many),

P (E1 ∪ E2 ∪ E3 ∪ ...) = P (E1 ) + P (E2 ) + P (E3 ) + ...

• Probability function: assigns a value that represents the chance of occurrence of the event.

• A higher probability value for an event means a higher chance of that event occurring.
• 0 means event cannot occur and 1 means event always occurs.

8. Probability of the empty set (denoted φ) equals 0, that is,

P (φ) = 0

9. Let E^c be the complement of event E. Then,

P (E^c) = 1 − P (E)

10. If event E is the subset of event F , that is E ⊆ F , then

P (F ) = P (E) + P (F \ E)

⇒ P (E) ≤ P (F )

11. If E and F are events, then

P (E) = P (E ∩ F ) + P (E \ F )

P (F ) = P (E ∩ F ) + P (F \ E)

12. If E and F are events, then

P (E ∪ F ) = P (E) + P (F ) − P (E ∩ F )

13. Equally likely events: assign the same probability to each outcome.

14. If sample space S contains equally likely outcomes, then

• P (one outcome) = 1 / (Number of outcomes in S)
• P (event) = (Number of outcomes in event) / (Number of outcomes in S)

15. Conditional probability space: Consider a probability space (S, E, P ), where S
represents the sample space, E represents the collection of events, and P represents
the probability function.

• Let B be an event in S with P (B) > 0. The conditional probability space given
B is defined as follows: for any event A in the original probability space (S, E, P ),
the conditional probability of A given B is P (A ∩ B)/P (B).
• It is denoted by P (A | B). And

P (A ∩ B) = P (B)P (A | B)

16. Law of total probability:

• If the events B and B^c partition the sample space S such that P (B), P (B^c) ≠ 0,
then for any event A of S,

P (A) = P (A | B)P (B) + P (A | B^c)P (B^c).

• In general, if we have k events B1 , B2 , · · · , Bk that partition S, then for any event A in S,

P (A) = Σ_{i=1}^{k} P (Bi ∩ A) = Σ_{i=1}^{k} P (A | Bi )P (Bi ).

17. Bayes’ theorem: Let A and B be two events such that P (A) > 0, P (B) > 0.

P (A ∩ B) = P (B)P (A | B) = P (A)P (B | A)

⇒ P (B | A) = P (B)P (A | B) / P (A)

In general, if the events B1 , B2 , · · · , Bk partition S such that P (Bi ) ≠ 0 for i = 1, 2, · · · , k, then for any event A in S such that P (A) ≠ 0,

P (Br | A) = P (Br )P (A | Br ) / Σ_{i=1}^{k} P (Bi )P (A | Bi )

for r = 1, 2, · · · , k.

18. Independence of two events: Two events A and B are independent iff

P (A ∩ B) = P (A)P (B)

• A and B independent ⇒ P (A | B) = P (A) and P (B | A) = P (B) for P (A), P (B) > 0.

• Disjoint events are never independent.
• A and B independent ⇒ A and B^c are independent.
• A and B independent ⇒ A^c and B^c are independent.
19. Mutual independence of three events: Events A, B, and C are mutually indepen-
dent if
(a) P (A ∩ B) = P (A)P (B)
(b) P (A ∩ C) = P (A)P (C)
(c) P (B ∩ C) = P (B)P (C)
(d) P (A ∩ B ∩ C) = P (A)P (B)P (C)
20. Mutual independence of multiple events: Events A1 , A2 , · · · , An are mutually independent if, ∀i1 , i2 , · · · , ik ,
P (Ai1 ∩ Ai2 ∩ · · · ∩ Aik ) = P (Ai1 )P (Ai2 ) · · · P (Aik )
n events are mutually independent ⇒ any subset with or without complementing are
independent as well.
21. Occurrence of event A in a sample space is considered as success.
22. Non-occurrence of event A in a sample space is considered as failure.
23. Repeated independent trials:
(a) Bernoulli trials
• Single Bernoulli trial:
– Sample space is {success, failure} with P(success) = p.
– We can also write the sample space S as {0, 1}, where 0 denotes the
failure and 1 denotes the success with P (1) = p, P (0) = 1 − p.
This kind of distribution is denoted by Bernoulli(p).
• Repeated Bernoulli trials:
– Repeat a Bernoulli trial multiple times independently.
– For each of the trial, the outcome will be either 0 or 1.
(b) Binomial distribution: Perform n independent Bernoulli(p) trials.
• It models the number of successes in n independent Bernoulli trials.
• Denoted by B(n, p).
• Sample space is {0, 1, · · · , n}.
• Probability distribution is given by
P (B(n, p) = k) = nCk p^k (1 − p)^(n−k),
where n represents the total number of trials and k represents the number of successes in n trials.

• P (B = 0) + P (B = 1) + · · · + P (B = n) = 1
⇒ (1 − p)^n + nC1 p (1 − p)^(n−1) + · · · + p^n = 1.
(c) Geometric distribution: It models the number of trials needed to get the first success.
• Outcomes: Number of trials needed for first success and is denoted by G(p).
• Sample space: {1, 2, 3, 4, · · · }
• P (G = k) = P (first k − 1 trials result in 0 and the kth trial results in 1) = (1 − p)^(k−1) p.
• Identity: P (G ≤ k) = 1 − (1 − p)^k.
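
A quick numerical check of these formulas (a sketch using scipy.stats; the values n = 10, p = 0.3, k = 4 are illustrative, not from the notes):

    # Sketch: verifying the Binomial and Geometric formulas with scipy.stats.
    # The numbers n=10, p=0.3, k=4 are illustrative choices.
    from math import comb
    from scipy.stats import binom, geom

    n, p, k = 10, 0.3, 4

    # Binomial PMF: P(B(n, p) = k) = nCk * p^k * (1-p)^(n-k)
    manual_binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(manual_binom, binom.pmf(k, n, p))        # should match

    # Geometric PMF and the identity P(G <= k) = 1 - (1-p)^k
    manual_geom_pmf = (1 - p)**(k - 1) * p
    print(manual_geom_pmf, geom.pmf(k, p))         # should match
    print(1 - (1 - p)**k, geom.cdf(k, p))          # should match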

Statistics for Data Science - 2

Week 2 Important formulas

1. Random variable: A random variable is a function with domain as the sample space
of an experiment and range as the real numbers, i.e. a function from the sample space
to the real line.

• Toss a coin, Sample space = {H, T }


– Random variable X : X(H) = 0, X(T ) = 1

2. Random variables and events: If X is a random variable,


(X < x) = {s ∈ S : X(s) < x} is an event for all real x.
So, (X > x), (X = x), (X ≤ x), (X ≥ x) are all events.

• Throw a die, Sample space = {1, 2, 3, 4, 5, 6}


– E =0: event {1, 3, 5}
– E =1: event {2, 4, 6}
– E <0: null event
– E ≤1: event {1, 2, 3, 4, 5, 6}

3. Range of a random variable: The range of a random variable is the set of values
taken by it. Range is a subset of the real line.

• Throw a die, E = 0 if number is odd, E = 1 if number is even


– Range = {0, 1}

4. Discrete random variable: A random variable is said to be discrete if its range is a


discrete set.

5. Probability Mass Function (PMF): The probability mass function (PMF) of a dis-
crete random variable (r.v.) X with range set T is the function fX : T → [0, 1] defined
as
fX (t) = P (X = t) for t ∈ T .

6. Properties of PMF:

• 0 ≤ fX (t) ≤ 1

• Σ_{t∈T} fX (t) = 1

7. Uniform random variable: X ∼ Uniform(T ), where T is some finite set.

• Range: Finite set T
• PMF: fX (t) = 1/|T | for all t ∈ T

8. Bernoulli random variable: X ∼ Bernoulli(p), where 0 ≤ p ≤ 1.


• Range: {0, 1}
• PMF: fX (0) = 1 − p, fX (1) = p
9. Binomial random variable: X ∼ Binomial(n, p), where n: positive integer, 0 ≤ p ≤ 1.
• Range: {0, 1, 2, . . . , n}
• PMF: fX (k) = nCk p^k (1 − p)^(n−k)
10. Geometric random variable: X ∼ Geometric(p), where 0 < p ≤ 1.
• Range: {1, 2, 3, . . .}
• PMF: fX (k) = (1 − p)^(k−1) p
11. Negative Binomial random variable: X ∼ Negative Binomial(r, p), where r: posi-
tive integer, 0 < p ≤ 1.
• Range: {r, r + 1, r + 2, . . . .}
• PMF: fX (k) = (k−1)C(r−1) (1 − p)^(k−r) p^r
12. Poisson random variable: X ∼ Poisson(λ), where λ > 0.
• Range: {0, 1, 2, 3, . . . .}
• PMF: fX (k) = e^(−λ) λ^k / k!
13. Hypergeometric random variable: X ∼ HyperGeo(N, r, m), where N, r, m: positive
integers
• Range: {max(0, m − (N − r)), . . . , min(r, m)}
• PMF: fX (k) = rCk · (N−r)C(m−k) / NCm

14. Functions of a random variable: X : random variable with PMF fX (t).


f (X) : random variable whose PMF is given as follows.

f_f(X) (a) = P (f (X) = a) = P (X ∈ {t : f (t) = a}) = Σ_{t: f (t)=a} fX (t)

• PMF of f (X) can be found using PMF of X.
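
As a small illustration of this rule (a sketch with an assumed PMF, not one from the notes), the PMF of f(X) = X^2 is built by summing fX over all t that map to the same value:

    # Sketch: PMF of f(X) from the PMF of X, grouping values t with the same f(t).
    # The PMF below (X uniform on {-1, 0, 1, 2}) is an assumed example.
    from collections import defaultdict

    f_X = {-1: 0.25, 0: 0.25, 1: 0.25, 2: 0.25}   # assumed PMF of X

    def pmf_of_function(f_X, f):
        """Return the PMF of f(X): sum f_X(t) over all t with f(t) = a."""
        f_Y = defaultdict(float)
        for t, prob in f_X.items():
            f_Y[f(t)] += prob
        return dict(f_Y)

    print(pmf_of_function(f_X, lambda t: t**2))
    # {1: 0.5, 0: 0.25, 4: 0.25}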

Statistics for Data Science - 2
Week 3 Notes
Multiple Random Variables

1. Joint probability mass function: Suppose X and Y are discrete random variables
defined in the same probability space. Let the range of X and Y be TX and TY ,
respectively. The joint PMF of X and Y , denoted fXY , is a function from TX × TY
to [0, 1] defined as

fXY (t1 , t2 ) = P (X = t1 and Y = t2 ), t1 ∈ TX , t2 ∈ TY

• Joint PMF is usually written as table or a matrix.


• P (X = t1 and Y = t2 ) is denoted P (X = t1 , Y = t2 )

2. Marginal PMF: Suppose X and Y are jointly distributed discrete random variables
with joint PMF fXY . The PMF of the individual random variables X and Y are called
as marginal PMFs. It can be shown that
fX (t1 ) = P (X = t1 ) = Σ_{t2 ∈TY} fXY (t1 , t2 )

fY (t2 ) = P (Y = t2 ) = Σ_{t1 ∈TX} fXY (t1 , t2 )

Note: Given the joint PMF, the marginal is unique.

3. Conditional distribution given an event: Suppose X is a discrete random variable


with range TX , and A is an event in the same probability space. The conditional PMF
of X given A is defined as the PMF

fX|A (t) = P (X = t|A)

where t ∈ TX
We will denote the conditional random variable by X|A. (Note that X|A is a valid
random variable with PMF fX|A ).

• fX|A (t) = P ((X = t) ∩ A) / P (A)
• Range of (X|A) can be different from TX and will depend on A.

4. Conditional distribution of one random variable given another:
Suppose X and Y are jointly distributed discrete random variables with joint PMF
fXY . The conditional PMF of Y given X = t is defined as the PMF

fY |X=x (y) = P (X = x, Y = y) / P (X = x) = fXY (x, y) / fX (x)

We will denote the conditional random variable by Y |(X = x). (Note that Y |(X = x)
is a valid random variable with PMF fY |(X=x) .

• Range of (Y |X = t) can be different from TY and will depend on t.


• fXY (x, y) = fY |X=x (y) · fX (x) = fX|Y =y (x) · fY (y)

• Σ_{y∈TY} fY |X=x (y) = 1

5. Joint PMF of more than two discrete random variables:


Suppose X1 , X2 , . . . , Xn are discrete random variables defined in the same probability
space. Let the range of Xi be TXi . The joint PMF of Xi , denoted by fX1 X2 ...Xn , is a
function from TX1 × TX2 × . . . × TXn to [0, 1] defined as

fX1 X2 ...Xn (t1 , t2 , . . . , tn ) = P (X1 = t1 , X2 = t2 , . . . , Xn = tn ); ti ∈ TXi

6. Marginal PMF in case of more than two discrete random variables:


Suppose X1 , X2 , . . . , Xn are jointly distributed discrete random variables with joint
PMF fX1 X2 ...Xn . The PMF of the individual random variables X1 , X2 , . . . , Xn are
called as marginal PMFs. It can be shown that
fX1 (t1 ) = P (X1 = t1 ) = Σ_{t2 ∈TX2 , t3 ∈TX3 , ..., tn ∈TXn} fX1 X2 ...Xn (t1 , t2 , . . . , tn )

fX2 (t2 ) = P (X2 = t2 ) = Σ_{t1 ∈TX1 , t3 ∈TX3 , ..., tn ∈TXn} fX1 X2 ...Xn (t1 , t2 , . . . , tn )

...

fXn (tn ) = P (Xn = tn ) = Σ_{t1 ∈TX1 , t2 ∈TX2 , ..., tn−1 ∈TXn−1} fX1 X2 ...Xn (t1 , t2 , . . . , tn )

7. Marginalisation: Suppose X1 , X2 , . . . , Xn are jointly distributed discrete random


variables with joint PMF fX1 X2 ...Xn . The joint PMF of the random variables Xi1 , Xi2 , . . . Xik ,
denoted by fXi1 Xi2 ...Xik is given by
fXi1 Xi2 ...Xik (ti1 , ti2 , . . . , tik ) = Σ_{tj : j ∉ {i1 ,...,ik }} fX1 X2 ...Xn (t1 , t2 , . . . , tn )

• Sum over everything you don’t want.

8. Conditioning with multiple discrete random variables:

• A wide variety of conditioning is possible when there are many random variables.
Some examples are:
• Suppose X1 , X2 , X3 , X4 ∼ fX1 X2 X3 X4 and xi ∈ TXi , then
– fX1 |X2 =x2 (x1 ) = fX1 X2 (x1 , x2 ) / fX2 (x2 )
– fX1 ,X2 |X3 =x3 (x1 , x2 ) = fX1 X2 X3 (x1 , x2 , x3 ) / fX3 (x3 )
– fX1 |X2 =x2 ,X3 =x3 (x1 ) = fX1 X2 X3 (x1 , x2 , x3 ) / fX2 X3 (x2 , x3 )
– fX1 X4 |X2 =x2 ,X3 =x3 (x1 , x4 ) = fX1 X2 X3 X4 (x1 , x2 , x3 , x4 ) / fX2 X3 (x2 , x3 )
9. Conditioning and factors of the joint PMF:
Let X1 , X2 , X3 , X4 ∼ fX1 X2 X3 X4 , Xi ∈ TXi .
fX1 X2 X3 X4 (t1 , t2 , t3 , t4 ) =P (X1 = t1 and (X2 = t2 , X3 = t3 , X4 = t4 ))
=fX1 |X2 =t2 ,X3 =t3 ,X4 =t4 (t1 )P (X2 = t2 and (X3 = t3 , X4 = t4 ))
=fX1 |X2 =t2 ,X3 =t3 ,X4 =t4 (t1 )fX2 |X3 =t3 ,X4 =t4 (t2 )P (X3 = t3 and X4 = t4 )
=fX1 |X2 =t2 ,X3 =t3 ,X4 =t4 (t1 )fX2 |X3 =t3 ,X4 =t4 (t2 )fX3 |X4 =t4 (t3 )fX4 (t4 ).
• Factoring can be done in any sequence.
10. Independence of two random variables:
Let X and Y be two random variables defined in a probability space with ranges TX
and TY , respectively. X and Y are said to be independent if any event defined using
X alone is independent of any event defined using Y alone. Equivalently, if the joint
PMF of X and Y is fXY , X and Y are independent if
fXY (x, y) = fX (x)fY (y)
for x ∈ TX and y ∈ TY

• X and Y are independent if


fX|Y =y (x) = fX (x)
fY |X=x (y) = fY (y)
for x ∈ TX and y ∈ TY

• To show X and Y independent, verify


fXY (x, y) = fX (x)fY (y)
for all x ∈ TX and y ∈ TY

• To show X and Y dependent, verify
fXY (x, y) 6= fX (x)fY (y)
for some x ∈ TX and y ∈ TY
– Special case: fXY (t1 , t2 ) = 0 when fX (t1 ) ≠ 0, fY (t2 ) ≠ 0.
11. Independence of multiple random variables:
Let X1 , X2 , . . . , Xn be random variables defined in a probability space with range of
Xi denoted TXi . X1 , X2 , . . . , Xn are said to be independent if events defined using
different Xi are mutually independent. Equivalently, X1 , X2 , . . . , Xn are independent
iff
fX1 X2 ...Xn (t1 , t2 , . . . , tn ) = fX1 (t1 )fX2 (t2 ) . . . fXn (tn )
for all ti ∈ TXi
• All subsets of independent random variables are independent.
12. Independent and Identically Distributed (i.i.d.) random variables:
Random variables X1 , X2 , . . . , Xn are said to be independent and identically distributed
(i.i.d.), if
(i) they are independent.
(ii) the marginal PMFs fXi are identical.
Examples:
• Repeated trials of an experiment creates i.i.d. sequence of random variables
– Toss a coin multiple times.
– Throw a die multiple times.
• Let X1 , X2 , . . . , Xn ∼ i.i.d. X, where X ∼ Geometric(p).
X takes values in {1, 2, . . .} with P (X = k) = (1 − p)^(k−1) p.

Since Xi ’s are independent and identically distributed, we can write


P (X1 > j, X2 > j, . . . , Xn > j) = P (X1 > j)P (X2 > j) . . . P (Xn > j) = [P (X > j)]^n

P (X > j) = Σ_{k=j+1}^{∞} (1 − p)^(k−1) p
          = (1 − p)^j p + (1 − p)^(j+1) p + (1 − p)^(j+2) p + . . .
          = (1 − p)^j p [1 + (1 − p) + (1 − p)^2 + . . .]
          = (1 − p)^j p · 1/(1 − (1 − p))
          = (1 − p)^j

⇒ P (X1 > j, X2 > j, . . . , Xn > j) = [P (X > j)]^n = (1 − p)^(jn)

13. Function of random variables (g(X1 , X2 , . . . , Xn )):
Suppose X1 , X2 , . . . , Xn have joint PMF fX1 X2 ...Xn with TXi denoting the range of Xi .
Let g : TX1 × TX2 × . . . × TXn → R be a function with range Tg . The PMF of
X = g(X1 , X2 ..., Xn ) is given by
fX (t) = P (g(X1 , X2 , ..., Xn ) = t) = Σ_{(t1 ,...,tn ): g(t1 ,...,tn )=t} fX1 X2 ...Xn (t1 , t2 , . . . , tn )

• Sum of two random variables taking integer values:


X, Y ∼ fXY , Z = X + Y.
Let z be some integer,

P (Z = z) = P (X + Y = z)
          = Σ_{x=−∞}^{∞} P (X = x, Y = z − x)
          = Σ_{x=−∞}^{∞} fXY (x, z − x)
          = Σ_{y=−∞}^{∞} fXY (z − y, y)

• Convolution: If X and Y are independent, fX+Y (z) = Σ_{x=−∞}^{∞} fX (x)fY (z − x)

• Let X ∼ Poisson(λ1 ), Y ∼ Poisson(λ2 )


– X and Y are independent.
– Z = X + Y , z ∈ {0, 1, 2, . . .}
Z ∼ Poisson(λ1 + λ2 )
(X | Z = n) ∼ Binomial(n, λ1 /(λ1 + λ2 )), (Y | Z = n) ∼ Binomial(n, λ2 /(λ1 + λ2 ))
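
A small numerical check of these two facts (a sketch using scipy.stats; λ1 = 2, λ2 = 3 and the evaluation points are illustrative):

    # Sketch: sum of independent Poissons via convolution, checked against Poisson(lam1 + lam2),
    # plus the conditional Binomial fact. lam1, lam2, z, n, k are illustrative values.
    from scipy.stats import poisson, binom

    lam1, lam2 = 2.0, 3.0

    def pmf_sum(z):
        # f_{X+Y}(z) = sum_{x=0}^{z} f_X(x) * f_Y(z - x)  (Y >= 0, so x runs up to z)
        return sum(poisson.pmf(x, lam1) * poisson.pmf(z - x, lam2) for x in range(z + 1))

    print(pmf_sum(4), poisson.pmf(4, lam1 + lam2))          # nearly equal

    # Conditional: P(X = k | Z = n) equals the Binomial(n, lam1/(lam1+lam2)) PMF at k
    n, k = 6, 2
    cond = poisson.pmf(k, lam1) * poisson.pmf(n - k, lam2) / poisson.pmf(n, lam1 + lam2)
    print(cond, binom.pmf(k, n, lam1 / (lam1 + lam2)))      # nearly equal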
14. CDF of a random variable:
Cumulative distribution function of a random variable X is a function FX : R → [0, 1]
defined as
FX (x) = P (X ≤ x)

15. Minimum of two random variables:


Let X, Y ∼ fXY and let Z = min{X, Y }, then

fZ (z) = P (Z = z) = P (min{X, Y } = z)
       = P (X = z, Y = z) + P (X = z, Y > z) + P (X > z, Y = z)
       = fXY (z, z) + Σ_{t2 >z} fXY (z, t2 ) + Σ_{t1 >z} fXY (t1 , z)


FZ (z) = P (Z ≤ z) = P (min{X, Y } ≤ z)
= 1 − P (min{X, Y } > z)
= 1 − [P (X > z, Y > z)]

16. Maximum of two random variables:


Let X, Y ∼ fXY and let Z = max{X, Y }, then

fZ (z) = P (Z = z) = P (max{X, Y } = z)
       = P (X = z, Y = z) + P (X = z, Y < z) + P (X < z, Y = z)
       = fXY (z, z) + Σ_{t2 <z} fXY (z, t2 ) + Σ_{t1 <z} fXY (t1 , z)

FZ (z) = P (Z ≤ z) = P (max{X, Y } ≤ z)
= [P (X ≤ z, Y ≤ z)]

17. Maximum and Minimum of n i.i.d. random variables

• Let X ∼ Geometric(p), Y ∼ Geometric(q)


X and Y are independent.
Z = min(X, Y )
Z ∼ Geometric(1 − (1 − p)(1 − q))
• Maximum of 2 independent geometric random variables is not geometric.

Important Points:

1. Let N ∼ Poisson(λ) and X|N = n ∼ Binomial(n, p), then X ∼ Poisson(λp) (see the simulation sketch after this list).

2. Memory less property of Geometric(p)


If X ∼ Geometric(p), then

P (X > m + n|X > m) = P (X > n)

3. Sum of n independent Bernoulli(p) trials is Binomial(n, p).

4. Sum of 2 independent Uniform random variables is not Uniform.

5. Sum of independent Binomial(n, p) and Binomial(m, p) is Binomial(n + m, p).

6. Sum of r i.i.d. Geometric(p) is Negative-Binomial(r, p).

7. Sum of independent Negative-Binomial(r, p) and Negative-Binomial(s, p) is Negative-


Binomial(r + s, p)

8. If X and Y are independent, then g(X) and h(Y ) are also independent.
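
A simulation sketch of the first result above (the values λ = 5, p = 0.3 and the number of repetitions are assumed; this is a rough check, not a proof):

    # Sketch: simulate N ~ Poisson(lam), then X | N=n ~ Binomial(n, p),
    # and compare the empirical behaviour of X with Poisson(lam * p).
    import numpy as np

    rng = np.random.default_rng(0)
    lam, p, reps = 5.0, 0.3, 200_000

    N = rng.poisson(lam, size=reps)
    X = rng.binomial(N, p)                 # one Binomial(N_i, p) draw per N_i

    print(X.mean(), X.var())               # both should be close to lam * p = 1.5
    print((X == 0).mean(), np.exp(-lam * p))   # empirical P(X = 0) vs Poisson(lam*p) value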

Statistics for Data Science - 2
Week 4 Notes
Expected value

• Expected value of a random variable


Definition: Suppose X is a discrete random variable with range TX and PMF fX . The
expected value of X, denoted E[X], is defined as
E[X] = Σ_{t∈TX} t P (X = t)

assuming the above sum exists.


Expected value represents “center” of a random variable.

1. Consider a constant c as a random variable X with


P (X = c) = 1.
E[c] = c × 1 = c
2. If X takes only non-negative values, i.e. P (X ≥ 0) = 1. Then,

E[X] ≥ 0

• Expected value of a function of random variables


Suppose X1 . . . Xn have joint PMF fX1 ...Xn with range of Xi denoted as TXi . Let

g : TX1 × . . . × TXn → R

be a function, and let Y = g(X1 , . . . , Xn ) have range TY and PMF fY . Then,


E[g(X1 , . . . , Xn )] = Σ_{t∈TY} t fY (t) = Σ_{ti ∈TXi} g(t1 , . . . , tn ) fX1 ...Xn (t1 , . . . , tn )

• Linearity of Expected value:

1. E[cX] = cE[X] for a random variable X and a constant c.


2. E[X + Y ] = E[X] + E[Y ] for any two random variables X, Y .

• Zero mean Random variable:


A random variable X with E[X] = 0 is said to be a zero-mean random variable.

• Variance and Standard deviation:


Definition: The variance of a random variable X, denoted by Var(X), is defined as

Var(X) = E[(X − E[X])2 ]

Variance measures the spread about the expected value.
Variance of random variable X is also given by Var(X) = E[X 2 ] − E[X]2

The standard deviation of X, denoted by SD(X), is defined as


SD(X) = +√Var(X)

Units of SD(X) are same as units of X.

• Properties: Scaling and translation


Let X be a random variable. Let a be a constant real number.

1. Var(aX) = a2 Var(X)
2. SD(aX) =| a | SD(X)
3. Var(X + a) = Var(X)
4. SD(X + a) = SD(X)

• Sum and product of independent random variables

1. For any two random variables X and Y (independent or dependent), E[X + Y ] =


E[X] + E[Y ].
2. If X and Y are independent random variables,
(a) E[XY ] = E[X]E[Y ]
(b) Var(X + Y ) = Var(X) + Var(Y )

• Standardised random variables:

1. Definition: A random variable X is said to be standardised if E[X] = 0, Var(X) =


1.
2. Let X be a random variable. Then, Y = (X − E[X]) / SD(X) is a standardised random
variable.

• Covariance:
Definition: Suppose X and Y are random variables on the same probability space. The
covariance of X and Y , denoted as Cov(X, Y ), is defined as

Cov(X, Y ) = E[(X − E[X])(Y − E[Y ])]

It summarizes the relationship between two random variables.


Properties:

1. Cov(X, X) = Var(X)
2. Cov(X, Y ) = E[XY ] − E[X]E[Y ]

3. Covariance is symmetric: Cov(X, Y ) = Cov(Y, X)
4. Covariance is a “linear” quantity.
(a) Cov(X, aY + bZ) = aCov(X, Y ) + bCov(X, Z)
(b) Cov(aX + bY, Z) = aCov(X, Z) + bCov(Y, Z)
5. Independence: If X and Y are independent, then X and Y are uncorrelated, i.e.
Cov(X, Y ) = 0
6. If X and Y are uncorrelated, they may still be dependent (uncorrelated does not imply independent).
• Correlation coefficient:
Definition: The correlation coefficient or correlation of two random variables X and Y
, denoted by ρ(X, Y ), is defined as
ρ(X, Y ) = Cov(X, Y ) / (SD(X) SD(Y ))
1. −1 ≤ ρ(X, Y ) ≤ 1.
2. ρ(X, Y ) summarizes the trend between random variables.
3. ρ(X, Y ) is a dimensionless quantity.
4. If ρ(X, Y ) is close to zero, there is no clear linear trend between X and Y .
5. If ρ(X, Y ) = 1 or ρ(X, Y ) = −1, Y is a linear function of X.
6. If | ρ(X, Y ) | is close to one, X and Y are strongly correlated.
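
A sketch computing Cov(X, Y) and ρ(X, Y) directly from a small joint PMF (the PMF values below are assumed for illustration):

    # Sketch: covariance and correlation from an assumed joint PMF of (X, Y).
    import math

    f_XY = {(0, 0): 0.2, (0, 1): 0.3, (1, 0): 0.1, (1, 1): 0.4}   # assumed joint PMF

    def E(g):
        # expectation of g(X, Y) under the joint PMF
        return sum(p * g(x, y) for (x, y), p in f_XY.items())

    EX, EY = E(lambda x, y: x), E(lambda x, y: y)
    var_X = E(lambda x, y: x**2) - EX**2
    var_Y = E(lambda x, y: y**2) - EY**2
    cov = E(lambda x, y: x * y) - EX * EY                 # Cov = E[XY] - E[X]E[Y]
    rho = cov / (math.sqrt(var_X) * math.sqrt(var_Y))
    print(cov, rho)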
• Bounds on probabilities using mean and variance
1. Markov’s inequality: Let X be a discrete random variable taking non-negative
values with a finite mean µ. Then,
P (X ≥ c) ≤ µ/c
Mean µ, through Markov’s inequality: bounds the probability that a non-negative
random variable takes values much larger than the mean.
2. Chebyshev’s inequality: Let X be a discrete random variable with a finite mean
µ and a finite variance σ 2 . Then,
P (| X − µ | ≥ kσ) ≤ 1/k^2
Other forms:
(a) P (| X − µ | ≥ c) ≤ σ^2/c^2 ,  P ((X − µ)^2 > k^2 σ^2 ) ≤ 1/k^2
(b) P (µ − kσ < X < µ + kσ) ≥ 1 − 1/k^2
Mean µ and standard deviation σ, through Chebyshev’s inequality: bound the
probability that X is away from µ by kσ.
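
A sketch comparing both bounds with an exact tail probability for an assumed Binomial(20, 0.5) variable:

    # Sketch: Markov and Chebyshev bounds vs the exact tail probability
    # for X ~ Binomial(20, 0.5) (an assumed example distribution).
    import math
    from scipy.stats import binom

    n, p = 20, 0.5
    mu, sigma = n * p, math.sqrt(n * p * (1 - p))

    c = 15
    exact = binom.sf(c - 1, n, p)            # P(X >= 15); sf(k) = P(X > k)
    markov = mu / c                          # Markov: P(X >= c) <= mu/c
    k = (c - mu) / sigma
    cheby = 1 / k**2                         # Chebyshev bound on P(|X - mu| >= k*sigma)
    print(exact, markov, cheby)              # the bounds are valid but loose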

Statistics for Data Science - 2
Week 5 Notes
Continuous Random Variables

1. Cumulative distribution function:


A function F : R → [0, 1] is said to be a Cumulative Distribution Function (CDF) if
(i) F is a non-decreasing function taking values between 0 and 1.
(ii) As x → −∞, F → 0
(iii) As x → ∞, F → 1
(iv) Technical: F is continuous from the right.

2. CDF of a random variable:


Cumulative distribution function of a random variable X is a function FX : R → [0, 1]
defined as
FX (x) = P (X ≤ x)
Properties of CDF

• FX (b) − FX (a) = P (a < X ≤ b)


• FX is a non-decreasing function of x.
• FX takes non-negative values.
• As x → −∞, FX (x) → 0
• As x → ∞, FX (x) → 1

3. Theorem: Random variable with CDF F(x)


Given a valid CDF F (x), there exists a random variable X taking values in R such
that
P (X ≤ x) = F (x)

• If F is not continuous at x and F rises from F1 to F2 at x (a jump at x), then

P (X = x) = F2 − F1

• If F is continuous at x, then
P (X = x) = 0

4. Continuous random variable:


A random variable X with CDF FX (x) is said to be a continuous random variable if
FX (x) is continuous at every x.
Properties of continuous random variables

• CDF has no jumps or steps.


• P (X = x) = 0 for all x.

• Probability of X falling in an interval can be nonzero:

P (a < X ≤ b) = F (b) − F (a)

• Since P (X = a) = 0 and P (X = b) = 0, we have

P (a ≤ X ≤ b) = P (a < X ≤ b) = P (a ≤ X < b) = P (a < X < b)

5. Probability density function (PDF):


A continuous random variable X with CDF FX (x) is said to have a PDF fX (x) if, for all x0 ,

FX (x0 ) = ∫_{−∞}^{x0} fX (x) dx

• CDF is the integral of the PDF.


• Derivative of the CDF (wherever it exists) is usually taken as the PDF.
• Value of PDF around fX (x0 ) is related to X taking a value around x0 .
• Higher the PDF, higher the chance that X lies there.

6. For a random variable X with PDF fX , an event A is a subset of the real line and its probability is computed as

P (A) = ∫_A fX (x) dx

• P (a < X < b) = FX (b) − FX (a) = ∫_a^b fX (x) dx

7. Density function:
A function f : R → R is said to be a density function if
(i) f (x) ≥ 0
(ii) ∫_{−∞}^{∞} f (x) dx = 1
(iii) f (x) is piece-wise continuous

8. Given a density function f , there is a continuous random variable X with PDF as f .

9. Support of random variable X


Support of the random variable X with PDF fX is

supp(X) = {x : fX (x) > 0}

• supp(X) contains intervals in which X can fall with positive probability.

10. Continuous Uniform distribution:
• X ∼ Uniform[a, b]
• PDF:
fX (x) = 1/(b − a) for a < x < b; 0 otherwise
• CDF:
FX (x) = 0 for x ≤ a; (x − a)/(b − a) for a < x < b; 1 for x ≥ b

11. Exponential distribution:


• X ∼ Exp(λ)
• PDF:
fX (x) = λe^(−λx) for x > 0; 0 otherwise
• CDF:
FX (x) = 0 for x ≤ 0; 1 − e^(−λx) for x > 0

12. Normal distribution:


• X ∼ Normal(µ, σ^2)
• PDF:
fX (x) = (1/(σ√(2π))) exp(−(x − µ)^2/(2σ^2)), −∞ < x < ∞
• CDF:
FX (x) = ∫_{−∞}^{x} fX (u) du
• CDF has no closed form expression.
• Standard normal: Z ∼ Normal(0, 1)
– PDF: fZ (z) = (1/√(2π)) exp(−z^2/2), −∞ < z < ∞
13. Standardization:
If X ∼ Normal(µ, σ 2 ), then
Z = (X − µ)/σ ∼ Normal(0, 1)
14. To compute the probabilities of the normal distribution, convert probability computa-
tion to that of a standard normal.
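
A sketch of this conversion for an assumed X ∼ Normal(100, 15^2), computing P(90 < X ≤ 120):

    # Sketch: P(90 < X <= 120) for X ~ Normal(100, 15^2) via standardization.
    # mu, sigma and the interval endpoints are assumed example numbers.
    from scipy.stats import norm

    mu, sigma = 100, 15
    a, b = 90, 120

    # Standardize: P(a < X <= b) = Phi((b - mu)/sigma) - Phi((a - mu)/sigma)
    z_a, z_b = (a - mu) / sigma, (b - mu) / sigma
    via_standard = norm.cdf(z_b) - norm.cdf(z_a)

    # Direct computation with the same distribution, for comparison
    direct = norm.cdf(b, loc=mu, scale=sigma) - norm.cdf(a, loc=mu, scale=sigma)
    print(via_standard, direct)    # identical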

15. Functions of continuous random variable:
Suppose X is a continuous random variable with CDF FX and PDF fX and suppose
g : R → R is a (reasonable) function. Then, Y = g(X) is a random variable with CDF
FY determined as follows:
• FY (y) = P (Y ≤ y) = P (g(X) ≤ y) = P (X ∈ {x : g(x) ≤ y})
• To evaluate the above probability
– Convert the subset Ay = {x : g(x) ≤ y} into intervals in real line.
– Find the probability that X falls in those intervals.
– FY (y) = P (X ∈ Ay ) = ∫_{Ay} fX (x) dx
• If FY has no jumps, you may be able to differentiate and find a PDF.
16. Theorem: Monotonic differentiable function
Suppose X is a continuous random variable with PDF fX . Let g(x) be monotonic for x ∈ supp(X) with derivative g′(x) = dg(x)/dx. Then, the PDF of Y = g(X) is

fY (y) = fX (g^(−1)(y)) / |g′(g^(−1)(y))|

• Translation: Y = X + a
fY (y) = fX (y − a)
• Scaling: Y = aX
fY (y) = (1/|a|) fX (y/a)
• Affine: Y = aX + b
fY (y) = (1/|a|) fX ((y − b)/a)
• Affine transformation of a normal random variable is normal.
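
A sketch checking the affine rule for an assumed X ∼ Exp(λ = 2) and Y = 3X + 1, against scipy's shifted and scaled exponential:

    # Sketch: PDF of Y = aX + b for X ~ Exp(lam), using f_Y(y) = (1/|a|) f_X((y-b)/a).
    # Checked against scipy's location-scale exponential (scale = a/lam, loc = b).
    # lam, a, b and the evaluation point y are assumed example values.
    import math
    from scipy.stats import expon

    lam, a, b, y = 2.0, 3.0, 1.0, 4.0

    def f_X(x):                      # Exp(lam) density
        return lam * math.exp(-lam * x) if x > 0 else 0.0

    f_Y_formula = f_X((y - b) / a) / abs(a)
    f_Y_scipy = expon.pdf(y, loc=b, scale=a / lam)   # Y = aX + b, shifted/scaled exponential
    print(f_Y_formula, f_Y_scipy)    # should match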
17. Expected value of function of continuous random variable:
Let X be a continuous random variable with density fX (x). Let g : R → R be a
function. The expected value of g(X), denoted E[g(X)], is given by
E[g(X)] = ∫_{−∞}^{∞} g(x) fX (x) dx

whenever the above integral exists.


• The integral may diverge to ±∞ or may not exist in some cases.
18. Expected value (mean) of a continuous random variable:
Mean, denoted E[X] or µX or simply µ is given by
E[X] = ∫_{−∞}^{∞} x fX (x) dx

19. Variance of a continuous random variable:
Variance, denoted Var(X) or σX^2 or simply σ^2, is given by

Var(X) = E[(X − E[X])^2] = ∫_{−∞}^{∞} (x − µ)^2 fX (x) dx

• Variance is a measure of spread of X about its mean.


• Var(X) = E[X 2 ] − E[X]2

X               E[X]        Var(X)
Uniform[a, b]   (a + b)/2   (b − a)^2/12
Exp(λ)          1/λ         1/λ^2
Normal(µ, σ^2)  µ           σ^2

20. Markov’s inequality:


If X is a continuous random variable with mean µ and non-negative supp(X) (i.e.
P (X < 0) = 0), then
P (X > c) ≤ µ/c
21. Chebyshev’s inequality:
If X is a continuous random variable with mean µ and variance σ 2 , then
P (|X − µ| ≥ kσ) ≤ 1/k^2

Statistics for Data Science - 2

Week 6 Notes

1. Marginal density: Let (X, Y ) be jointly distributed where X is discrete with range
TX and PMF pX (x).
For each x ∈ TX , we have a continuous random variable Yx with density fYx (y).
fYx (y) : conditional density of Y given X = x, denoted fY |X=x (y).

• Marginal density of Y
– fY (y) = Σ_{x∈TX} pX (x) fY |X=x (y)

2. Conditional probability of discrete given continuous: Suppose X and Y are


jointly distributed with X ∈ TX being discrete with PMF pX (x) and conditional densi-
ties fY |X=x (y) for x ∈ TX . The conditional probability of X given Y = y0 ∈ supp(Y ) is
defined as

• P (X = x | Y = y0 ) = pX (x) fY |X=x (y0 ) / fY (y0 )
3. Joint density: A function f (x, y) is said to be a joint density function if

• f (x, y) ≥ 0, i.e. f is non-negative.


• ∫_{−∞}^{∞} ∫_{−∞}^{∞} f (x, y) dx dy = 1

4. 2D uniform distribution: Fix some (reasonable) region D in R2 with total area |D|.
We say that (X, Y ) ∼ Uniform(D) if they have the joint density
fXY (x, y) = 1/|D| for (x, y) ∈ D; 0 otherwise

5. Marginal density: Suppose (X, Y ) have joint density fXY (x, y). Then,
• X has the marginal density fX (x) = ∫_{y=−∞}^{∞} fXY (x, y) dy.
• Y has the marginal density fY (y) = ∫_{x=−∞}^{∞} fXY (x, y) dx.

– In general the marginals do not determine joint density.

6. Independence: (X, Y ) with joint density fXY (x, y) are independent if

• fXY (x, y) = fX (x)fY (y)
– If independent, the marginals determine the joint density.

7. Conditional density: Let (X, Y ) be random variables with joint density fXY (x, y).
Let fX (x) and fY (y) be the marginal densities.

• For a such that fX (a) > 0, the conditional density of Y given X = a, denoted as
fY |X=a (y), is defined as

fY |X=a (y) = fXY (a, y) / fX (a)

• For b such that fY (b) > 0, the conditional density of X given Y = b, denoted as
fX|Y =b (x), is defined as

fX|Y =b (x) = fXY (x, b) / fY (b)

8. Properties of conditional density: Joint = Marginal × Conditional, for x = a and


y = b such that fX (a) > 0 and fY (b) > 0.

• fXY (a, b) = fX (a)fY |X=a (b) = fY (b)fX|Y =b (a)
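
A numerical sketch of these marginal and conditional densities for an assumed (X, Y) ∼ Uniform(D) on the triangle D = {(x, y): 0 < y < x < 1}, so |D| = 1/2 (the region and grid are illustrative):

    # Sketch: marginals/conditionals of (X, Y) ~ Uniform(D), D = {0 < y < x < 1}, |D| = 1/2.
    # Integrals are approximated with a simple Riemann sum on an assumed grid.
    import numpy as np

    def f_XY(x, y):
        return 2.0 if 0 < y < x < 1 else 0.0     # joint density: 1/|D| on D, 0 outside

    ys = np.linspace(0.0, 1.0, 2001)
    dy = ys[1] - ys[0]
    a = 0.6

    # Marginal f_X(a) = integral over y of f_XY(a, y); exact value here is 2a = 1.2
    f_X_a = sum(f_XY(a, y) for y in ys) * dy
    print(f_X_a)

    # Conditional f_{Y|X=a}(y) = f_XY(a, y) / f_X(a) should integrate to 1
    print(sum(f_XY(a, y) / f_X_a for y in ys) * dy)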

Statistics for Data Science - 2

Important results

Discrete random variables:

Uniform(A), A = {a, a + 1, . . . , b}, n = b − a + 1:
• PMF: fX (k) = 1/n, k = a, a + 1, . . . , b
• CDF: FX (x) = 0 for x < a; (k − a + 1)/n for k ≤ x < k + 1, k = a, . . . , b − 1; 1 for x ≥ b
• E[X] = (a + b)/2, Var(X) = (n^2 − 1)/12

Bernoulli(p):
• PMF: fX (1) = p, fX (0) = 1 − p
• CDF: FX (x) = 0 for x < 0; 1 − p for 0 ≤ x < 1; 1 for x ≥ 1
• E[X] = p, Var(X) = p(1 − p)

Binomial(n, p):
• PMF: fX (k) = nCk p^k (1 − p)^(n−k), k = 0, 1, . . . , n
• CDF: FX (x) = 0 for x < 0; Σ_{i=0}^{k} nCi p^i (1 − p)^(n−i) for k ≤ x < k + 1, k = 0, 1, . . . , n − 1; 1 for x ≥ n
• E[X] = np, Var(X) = np(1 − p)

Geometric(p):
• PMF: fX (k) = (1 − p)^(k−1) p, k = 1, 2, . . .
• CDF: FX (x) = 0 for x < 1; 1 − (1 − p)^k for k ≤ x < k + 1, k = 1, 2, . . .
• E[X] = 1/p, Var(X) = (1 − p)/p^2

Poisson(λ):
• PMF: fX (k) = e^(−λ) λ^k/k!, k = 0, 1, 2, . . .
• CDF: FX (x) = 0 for x < 0; Σ_{i=0}^{k} e^(−λ) λ^i/i! for k ≤ x < k + 1, k = 0, 1, 2, . . .
• E[X] = λ, Var(X) = λ

Continuous random variables:

Uniform[a, b]:
• PDF: fX (x) = 1/(b − a), a ≤ x ≤ b
• CDF: FX (x) = 0 for x ≤ a; (x − a)/(b − a) for a < x < b; 1 for x ≥ b
• E[X] = (a + b)/2, Var(X) = (b − a)^2/12

Exp(λ):
• PDF: fX (x) = λe^(−λx), x > 0
• CDF: FX (x) = 0 for x ≤ 0; 1 − e^(−λx) for x > 0
• E[X] = 1/λ, Var(X) = 1/λ^2

Normal(µ, σ^2):
• PDF: fX (x) = (1/(σ√(2π))) exp(−(x − µ)^2/(2σ^2)), −∞ < x < ∞
• CDF: no closed form
• E[X] = µ, Var(X) = σ^2

Gamma(α, β):
• PDF: fX (x) = (β^α/Γ(α)) x^(α−1) e^(−βx), x > 0
• E[X] = α/β, Var(X) = α/β^2

Beta(α, β):
• PDF: fX (x) = (Γ(α + β)/(Γ(α)Γ(β))) x^(α−1) (1 − x)^(β−1), 0 < x < 1
• E[X] = α/(α + β), Var(X) = αβ/((α + β)^2 (α + β + 1))

1. Markov’s inequality: Let X be a discrete random variable taking non-negative values with a finite mean µ. Then,

P (X ≥ c) ≤ µ/c

2. Chebyshev’s inequality: Let X be a discrete random variable with a finite mean µ and a finite variance σ^2. Then,

P (| X − µ | ≥ kσ) ≤ 1/k^2

3. Weak Law of Large numbers: Let X1 , X2 , . . . , Xn ∼ iid X with E[X] = µ, Var(X) = σ^2.
Define sample mean X = (X1 + X2 + . . . + Xn )/n. Then,

P (|X − µ| > δ) ≤ σ^2/(nδ^2)

4. Using CLT to approximate probability: Let X1 , X2 , . . . , Xn ∼ iid X with E[X] = µ, Var(X) = σ^2.
Define Y = X1 + X2 + . . . + Xn . Then,

(Y − nµ)/(σ√n) ≈ Normal(0, 1).
• Test for mean

Case (1): When population variance σ^2 is known (z-test)
– right-tailed: H0 : µ = µ0 , HA : µ > µ0 ; test statistic T = X with Z = (X − µ0 )/(σ/√n); rejection region X > c
– left-tailed: H0 : µ = µ0 , HA : µ < µ0 ; test statistic T = X with Z = (X − µ0 )/(σ/√n); rejection region X < c
– two-tailed: H0 : µ = µ0 , HA : µ ≠ µ0 ; test statistic T = X with Z = (X − µ0 )/(σ/√n); rejection region |X − µ0 | > c

Case (2): When population variance σ^2 is unknown (t-test)
– right-tailed: H0 : µ = µ0 , HA : µ > µ0 ; test statistic T = X with tn−1 = (X − µ0 )/(S/√n); rejection region X > c
– left-tailed: H0 : µ = µ0 , HA : µ < µ0 ; test statistic T = X with tn−1 = (X − µ0 )/(S/√n); rejection region X < c
– two-tailed: H0 : µ = µ0 , HA : µ ≠ µ0 ; test statistic T = X with tn−1 = (X − µ0 )/(S/√n); rejection region |X − µ0 | > c

• χ^2-test for variance:
– right-tailed: H0 : σ = σ0 , HA : σ > σ0 ; test statistic T = (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}; rejection region S^2 > c^2
– left-tailed: H0 : σ = σ0 , HA : σ < σ0 ; test statistic T = (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}; rejection region S^2 < c^2
– two-tailed: H0 : σ = σ0 , HA : σ ≠ σ0 ; test statistic T = (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}; rejection region S^2 > cR^2 where α/2 = P (S^2 > cR^2), or S^2 < cL^2 where α/2 = P (S^2 < cL^2)

• Two samples z-test for means:
– right-tailed: H0 : µ1 = µ2 , HA : µ1 > µ2 ; test statistic T = X − Y , with X − Y ∼ Normal(0, σ1^2/n1 + σ2^2/n2 ) if H0 is true; rejection region X − Y > c
– left-tailed: H0 : µ1 = µ2 , HA : µ1 < µ2 ; test statistic T = Y − X, with Y − X ∼ Normal(0, σ2^2/n2 + σ1^2/n1 ) if H0 is true; rejection region Y − X > c
– two-tailed: H0 : µ1 = µ2 , HA : µ1 ≠ µ2 ; test statistic T = X − Y , with X − Y ∼ Normal(0, σ1^2/n1 + σ2^2/n2 ) if H0 is true; rejection region |X − Y | > c

• Two samples F-test for variances:
– one-tailed: H0 : σ1 = σ2 , HA : σ1 > σ2 ; test statistic T = S1^2/S2^2 ∼ F(n1 −1, n2 −1); rejection region S1^2/S2^2 > 1 + c
– one-tailed: H0 : σ1 = σ2 , HA : σ1 < σ2 ; test statistic T = S1^2/S2^2 ∼ F(n1 −1, n2 −1); rejection region S1^2/S2^2 < 1 − c
– two-tailed: H0 : σ1 = σ2 , HA : σ1 ≠ σ2 ; test statistic T = S1^2/S2^2 ∼ F(n1 −1, n2 −1); rejection region S1^2/S2^2 > 1 + cR where α/2 = P (T > 1 + cR ), or S1^2/S2^2 < 1 − cL where α/2 = P (T < 1 − cL )

• χ^2-test for goodness of fit:
H0 : Samples are i.i.d. X, HA : Samples are not i.i.d. X

Test statistic: T = Σ_{i=1}^{k} (yi − npi )^2/(npi ) = Σ_{i=1}^{k} (observed value − expected value)^2/(expected value) ∼ χ^2_{k−1}

Test: Reject H0 if T > c.

• Test for independence:
H0 : Joint PMF is product of marginals, HA : Joint PMF is not product of marginals

Test statistic: T = Σ_{i,j} (yij − npij )^2/(npij ) = Σ (observed value − expected value)^2/(expected value) ∼ χ^2_dof

where dof = (number of rows − 1) × (number of columns − 1),
yij = observed count in cell (i, j),
pij = product of the marginal proportions for (i, j), so npij is the expected count if independent.

Test: Reject H0 if T > c.

Statistics for Data Science - 2
Week 7 Notes
Statistics from samples and Limit theorems

1. Empirical distribution:
Let X1 , X2 , . . . , Xn ∼ X be i.i.d. samples. Let #(Xi = t) denote the number of times
t occurs in the samples. The empirical distribution is the discrete distribution with
PMF
p(t) = #(Xi = t)/n
• The empirical distribution is random because it depends on the actual sample
instances.
• Descriptive statistics: Properties of empirical distribution. Examples :
– Mean of the distribution
– Variance of the distribution
– Probability of an event
• As number of samples increases, the properties of empirical distribution should
become close to that of the original distribution.

2. Sample mean:
Let X1 , X2 , . . . , Xn ∼ X be i.i.d. samples. The sample mean, denoted X, is defined to
be the random variable
X = (X1 + X2 + . . . + Xn )/n

• Given a sampling x1 , . . . , xn the value taken by the sample mean X is x = (x1 + x2 + . . . + xn )/n. Often, X and x are both called sample mean.

3. Expected value and variance of sample mean:


Let X1 , X2 , . . . , Xn be i.i.d. samples whose distribution has a finite mean µ and variance
σ 2 . The sample mean X has expected value and variance given by

E[X] = µ, Var(X) = σ^2/n
• Expected value of sample mean equals the expected value or mean of the distri-
bution.
• Variance of sample mean decreases with n.

4. Sample variance:
Let X1 , X2 , . . . , Xn ∼ X be i.i.d. samples. The sample variance, denoted S 2 , is defined
to be the random variable
S^2 = [(X1 − X)^2 + (X2 − X)^2 + . . . + (Xn − X)^2]/(n − 1),

where X is the sample mean.

5. Expected value of sample variance:


Let X1 , X2 , . . . , Xn be i.i.d. samples whose distribution has a finite variance σ^2. The sample variance S^2 = [(X1 − X)^2 + . . . + (Xn − X)^2]/(n − 1) has expected value given by

E[S^2] = σ^2

• Values of sample variance, on average, give the variance of distribution.


• Variance of sample variance will decrease with number of samples (in most cases).
• As n increases, sample variance takes values close to distribution variance.

6. Sample proportion:
The sample proportion of A, denoted S(A), is defined as

S(A) = (number of Xi for which A is true)/n
• As n increases, values of S(A) will be close to P (A).
• Mean of S(A) equals P (A).
• Variance of S(A) tends to 0.

7. Weak law of large numbers:


Let X1 , X2 , . . . , Xn ∼ iid X with E[X] = µ, Var(X) = σ^2.
Define sample mean X = (X1 + X2 + . . . + Xn )/n. Then,

P (|X − µ| > δ) ≤ σ^2/(nδ^2)

8. Chernoff inequality:
Let X be a random variable such that E[X] = 0, then

P (X > t) ≤ E[e^(λX)]/e^(λt) , λ > 0

9. Moment generating function (MGF):
Let X be a zero-mean random variable (E[X] = 0). The MGF of X, denoted MX (λ),
is a function from R to R defined as

MX (λ) = E[e^(λX)]

MX (λ) = E[e^(λX)]
        = E[1 + λX + λ^2 X^2/2! + λ^3 X^3/3! + . . .]
        = 1 + λE[X] + (λ^2/2!)E[X^2] + (λ^3/3!)E[X^3] + . . .

That is, the coefficient of λ^k/k! in the MGF of X gives the kth moment of X.

• If X ∼ Normal(0, σ^2) then MX (λ) = e^(λ^2 σ^2/2)

• Let X1 , X2 , . . . , Xn ∼ i.i.d. X and let S = X1 + X2 + . . . + Xn , then

MS (λ) = (E[eλX ])n = [MX (λ)]n

It implies that MGF of sum of independent random variables is product of the


individual MGFs.

10. Central limit theorem: Let X1 , X2 , . . . , Xn ∼ iid X with E[X] = µ, Var(X) = σ^2.
Define Y = X1 + X2 + . . . + Xn . Then,

(Y − nµ)/(σ√n) ≈ Normal(0, 1).
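
A sketch using this approximation for the sum of n = 48 assumed i.i.d. Uniform[0, 1] samples, compared with a Monte Carlo estimate:

    # Sketch: CLT approximation of P(Y <= 26) for Y = sum of 48 iid Uniform[0,1] samples,
    # checked against simulation. n, the threshold and the seed are assumed values.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    n = 48
    mu, sigma2 = 0.5, 1 / 12                     # mean and variance of Uniform[0, 1]

    y0 = 26
    clt_approx = norm.cdf((y0 - n * mu) / np.sqrt(n * sigma2))

    sims = rng.random((200_000, n)).sum(axis=1)  # Monte Carlo draws of Y
    print(clt_approx, (sims <= y0).mean())       # should be close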

11. Gamma distribution:


X ∼ Gamma(α, β) if PDF fX (x) ∝ x^(α−1) e^(−βx), x > 0

• α > 0 is a shape parameter.
• β > 0 is a rate parameter.
• θ = 1/β is a scale parameter.
• Mean, E[X] = α/β
• Variance, Var(X) = α/β^2

12. Beta distribution:
X ∼ Beta(α, β) if PDF fX (x) ∝ x^(α−1) (1 − x)^(β−1), 0 < x < 1

• α > 0, β > 0 are the shape parameters.
• Mean, E[X] = α/(α + β)
• Variance, Var(X) = αβ/((α + β)^2 (α + β + 1))

13. Cauchy distribution:


X ∼ Cauchy(θ, α^2) if PDF fX (x) ∝ (1/π) · α/(α^2 + (x − θ)^2)

• θ is a location parameter.
• α > 0 is a scale parameter.
• Mean and variance are undefined.

14. Some important results:

• Let Xi ∼ Normal(µi , σi^2) be independent and let Y = a1 X1 + a2 X2 + . . . + an Xn , then

Y ∼ Normal(µ, σ^2)

where µ = a1 µ1 + a2 µ2 + . . . + an µn and σ^2 = a1^2 σ1^2 + a2^2 σ2^2 + . . . + an^2 σn^2.
That is, a linear combination of independent normal random variables is again normal.

• Sum of n i.i.d. Exp(β) is Gamma(n, β).

• Square of Normal(0, σ^2) is Gamma(1/2, 1/(2σ^2)).

• Suppose X, Y ∼ i.i.d. Normal(0, σ^2). Then, X/Y ∼ Cauchy(0, 1).

• Suppose X ∼ Gamma(α, k), Y ∼ Gamma(β, k) are independent random variables, then X/(X + Y ) ∼ Beta(α, β).

• Sum of n independent Gamma(α, β) is Gamma(nα, β).

• If X1 , X2 , . . . , Xn ∼ i.i.d. Normal(0, σ^2), then X1^2 + X2^2 + . . . + Xn^2 ∼ Gamma(n/2, 1/(2σ^2)).

 
• Gamma(n/2, 1/2) is called the Chi-square distribution with n degrees of freedom, denoted χ^2_n.

• Suppose X1 , X2 , . . . , Xn ∼ i.i.d. Normal(µ, σ^2). Suppose that X and S^2 denote the sample mean and sample variance, respectively, then
(i) (n − 1)S^2/σ^2 ∼ χ^2_{n−1}
(ii) X and S^2 are independent.

Statistics for Data Science - 2
Week 8 notes

• Let X1 , . . . , Xn ∼ i.i.d.X, where X has the distribution described by parameters


θ1 , θ2 , . . ..

– The parameters θi are unknown but a fixed constant.


– Define the estimator for θ as the function of the samples: θ̂(X1 , . . . , Xn ).

Note:

1. θ is an unknown parameter.
2. θ̂ is a function of n random variables.

Remark: Infinite number of estimators are possible for a parameter of a distribution.

• Estimation error: θ̂(X1 , . . . , Xn ) − θ is a random variable.

– We expect the estimator random variable θ̂(X1 , . . . , Xn ) to take values around


the actual value of the parameter θ. So, the random variable ‘Error’ should take
values close to 0.
– Mathematically, it is expressed as P (| Error |> δ) should be small.
– Chebyshev bound on error: P (| Error − E[Error] | > δ) ≤ Var(Error)/δ^2.
– Good design: P (| Error |> δ) will fall with n.

• Good design principles:

1. Error should be close to or equal to 0.


2. Var(Error) → 0 with n.

• Bias: The bias of the estimator θ̂ for a parameter θ, denoted Bias(θ̂, θ), is defined as

Bias(θ̂, θ) = E[θ̂ − θ] = E[θ̂] − θ

1. Bias is the expected value of Error.


2. An estimator with bias equal to 0 is said to be an unbiased estimator.

• Risk: The (squared-error) risk of the estimator θ̂ for a parameter θ, denoted Risk(θ̂, θ),
is defined as
Risk(θ̂, θ) = E[(θ̂ − θ)2 ]

1. Risk is the expected value of “squared error” and is also called mean squared error
(MSE) often.
2. Squared-error risk is the second moment of Error.

• Variance of estimator:
Variance(θ̂) = E[(θ̂ − E[θ̂])^2]
Var(Error) = Var(θ̂)

• Bias-Variance tradeoff: The risk of the estimator satisfies the following relationship:

Risk(θ̂, θ) = Bias(θ̂, θ)2 + Variance(θ̂)

• Estimator design approach:

1. Method of moments
1P n
(a) Sample moments: Mk (X1 , . . . , Xn ) = Xk
n i=1 i
(b) Mk is a random variable, and mk is the value taken by it in one sampling
instance. We expect that Mk will take values around E[X k ]
(c) Procedure:
– Equate sample moments to expression for moments in terms of unknown
parameters.
– Solve for the unknown parameters.
(d) One parameter θ usually needs one moment
– Sample moment: m1
– Distribution moment: E[X] = f (θ)
– Solve for θ from f (θ) = m1 in terms of m1 .
– θ̂: replace m1 by M1 in above solution.
(e) Two parameters θ1 , θ2 usually needs two moments.
– Sample moments: m1 , m2
– Distribution moment: E[X] = f (θ1 , θ2 ), E[X 2 ] = g(θ1 , θ2 )
– Solve for θ1 , θ2 from f (θ1 , θ2 ) = m1 , g(θ1 , θ2 ) = m2 in terms of m1 , m2 .
– θ̂: replace m1 by M1 and m2 by M2 in above solution.
2. Maximum Likelihood estimators
(a) Likelihood of i.i.d. samples: Likelihood of a sampling x1 , x2 , . . . , xn , denoted
L(x1 , x2 , . . . , xn )
L(x1 , x2 , . . . , xn ) = Π_{i=1}^{n} fX (xi ; θ1 , θ2 , . . .)

– Likelihood L(x1 , x2 , . . . , xn ) is a function of parameters.

– Maximum likelihood (ML) estimation
θ1*, θ2*, . . . = arg max_{θ1 ,θ2 ,...} Π_{i=1}^{n} fX (xi ; θ1 , θ2 , . . .)

We find parameters that maximize likelihood for a given set of samples.
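
A sketch of both estimator designs for assumed Exp(λ) samples; for this distribution both the method-of-moments and the ML estimate reduce to 1/(sample mean), and a numerical maximisation of the log-likelihood agrees:

    # Sketch: method of moments and maximum likelihood for assumed Exp(lambda) samples.
    # Data are simulated with an assumed true lambda = 2.
    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(2)
    x = rng.exponential(scale=1 / 2.0, size=500)

    lam_mom = 1 / x.mean()                            # method of moments: solve E[X] = 1/lambda = m1

    def neg_log_lik(lam):
        # negative log-likelihood of Exp(lambda): -(n*log(lam) - lam*sum(x))
        return -(len(x) * np.log(lam) - lam * x.sum())

    lam_mle = minimize_scalar(neg_log_lik, bounds=(1e-6, 100), method="bounded").x
    print(lam_mom, lam_mle)                           # essentially equal, near 2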

• Properties of estimators:

1. Consistency of estimators: If an estimator satisfies the following requirement, it


is said to be consistent. Technically, it is called convergence in probability.
P (| Error |> δ) → 0 as n → ∞ for any δ > 0.
2. To compare the estimators, use mean squared error (MSE).

• Confidence interval:

X1 , . . . , Xn ∼ iid X, µ = E[X]
Estimator: µ̂ = (X1 + . . . + Xn )/n
– Suppose P (| µ̂ − µ |< α) = β, where α is a small fraction and β is a large fraction.
– µ̂ in one sampling instance: estimate with margin of error (100α)% at confidence
level (100β)%.

1. Normal samples with known variance: X1 , . . . , Xn ∼ iid Normal(µ, σ^2), σ^2 known.
Estimator: µ̂ = (X1 + . . . + Xn )/n
µ̂ ∼ Normal(µ, σ^2/n), Z = (µ̂ − µ)/(σ/√n) ∼ Normal(0, 1)

P (| µ̂ − µ | < α) = β
⇒ P (| (µ̂ − µ)/(σ/√n) | < α/(σ/√n)) = β
⇒ P (| Normal(0, 1) | < α/(σ/√n)) = β

2. Normal samples with unknown variance: X1 , . . . , Xn ∼ iid Normal(µ, σ^2), σ^2 unknown.
Sampling instance: x1 , . . . , xn .
Estimated mean and variance: x̄ = (1/n) Σ_{i=1}^{n} xi , σ̂^2 = (1/(n − 1)) Σ_{i=1}^{n} (xi − x̄)^2
µ̂ ∼ Normal(µ, σ^2/n), T = (µ̂ − µ)/(S/√n) ∼ tn−1

P (| µ̂ − µ | < α) = β
⇒ P (| (µ̂ − µ)/(S/√n) | < α/(σ̂/√n)) = β
⇒ P (| tn−1 | < α/(σ̂/√n)) = β

3. If samples are not normal: Use CLT to argue that sample mean will have a normal
distribution
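
A sketch computing a 95% confidence interval for µ from assumed normal samples, using the t-quantile as in case 2 above:

    # Sketch: 95% confidence interval for the mean from assumed Normal samples,
    # using the t_{n-1} quantile since the variance is estimated from the data.
    import numpy as np
    from scipy.stats import t

    rng = np.random.default_rng(3)
    x = rng.normal(loc=10.0, scale=2.0, size=25)     # assumed data, true mu = 10

    n, xbar, s = len(x), x.mean(), x.std(ddof=1)     # ddof=1 gives the (n-1)-denominator S
    beta = 0.95
    margin = t.ppf((1 + beta) / 2, df=n - 1) * s / np.sqrt(n)
    print(xbar - margin, xbar + margin)              # interval covering mu ~95% of the time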

Statistics for Data Science - 2

Week 9 Notes

1. Parameter estimation: Let X1 , . . . , Xn ∼ iid X, parameter Θ


Prior distribution of Θ: Θ ∼ fΘ (θ)
Samples: x1 , . . . , xn , notation S = (X1 = x1 , . . . Xn = xn )
Bayes’ rule: posterior ∝ likelihood × prior

P (Θ = θ | S) = P (S | Θ = θ)fΘ (θ)/P (S)


In case of discrete Θ: P (S) = Σ_θ P (S | Θ = θ)fΘ (θ)
In case of continuous Θ: P (S) = ∫ P (S | Θ = θ)fΘ (θ) dθ
Posterior mode: θ̂ = arg maxθ P (S | Θ = θ)fΘ (θ)
Posterior mean: E[Θ | S], mean of posterior distribution.

2. Bernoulli(p) samples with uniform prior: X1 , . . . , Xn ∼ iid Bernoulli(p)


Prior p ∼ Uniform[0, 1]
Samples: x1 , . . . , xn
Posterior: p| (X1 = x1 , . . . Xn = xn )
Posterior density ∝ P (X1 = x1 , . . . Xn = xn | p = p) × fp (p)
Posterior density ∝ p^w (1 − p)^(n−w), where w = x1 + . . . + xn
⇒ Posterior density: Beta(w + 1, n − w + 1)
Posterior mean: p̂ = (X1 + X2 + . . . + Xn + 1)/(n + 2)
3. Bernoulli(p) samples with beta prior: X1 , . . . , Xn ∼ iid Bernoulli(p)
Prior p ∼ Beta(α, β)
⇒ fp (p) ∝ p^(α−1) (1 − p)^(β−1)
Samples: x1 , . . . , xn
Posterior: p | (X1 = x1 , . . . , Xn = xn )
Posterior density ∝ P (X1 = x1 , . . . , Xn = xn | p = p) × fp (p)
Posterior density ∝ p^(w+α−1) (1 − p)^(n−w+β−1)

⇒ Posterior density: Beta(w + α, n − w + β)

Posterior mean: p̂ = (X1 + X2 + . . . + Xn + α)/(n + α + β)
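
A sketch of this conjugate update for assumed data (n = 20 Bernoulli samples containing w = 13 ones) and an assumed Beta(2, 2) prior:

    # Sketch: Beta prior + Bernoulli likelihood => Beta posterior.
    # The prior (alpha=2, beta=2) and the data summary (n=20, w=13 ones) are assumed.
    from scipy.stats import beta

    alpha0, beta0 = 2, 2
    n, w = 20, 13

    posterior = beta(alpha0 + w, beta0 + n - w)       # Beta(w + alpha, n - w + beta)
    post_mean = (w + alpha0) / (n + alpha0 + beta0)   # posterior mean formula from the notes
    print(post_mean, posterior.mean())                # identical
    print(posterior.interval(0.95))                   # a 95% credible interval for p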

4. Normal samples with unknown mean and known variance: X1 , . . . , Xn ∼ iid


Normal(M, σ 2 )

Prior M ∼ Normal(µ0 , σ0^2)
⇒ fM (µ) = (1/(σ0 √(2π))) exp(−(µ − µ0 )^2/(2σ0^2))
Samples: x1 , . . . , xn , Sample mean: x = (x1 + . . . + xn )/n
Posterior: M | (X1 = x1 , . . . , Xn = xn )
Posterior density ∝ f (X1 = x1 , . . . , Xn = xn | M = µ) × fM (µ)
Posterior density ∝ exp(−((x1 − µ)^2 + . . . + (xn − µ)^2)/(2σ^2)) exp(−(µ − µ0 )^2/(2σ0^2))
⇒ Posterior density: Normal
Posterior mean: µ̂ = [(X1 + X2 + . . . + Xn )/n] · nσ0^2/(nσ0^2 + σ^2) + µ0 · σ^2/(nσ0^2 + σ^2)

5. Geometric(p) samples with Uniform[0, 1] prior: X1 , . . . , Xn ∼ iid Geometric(p)


Prior p ∼ Uniform[0, 1]
Samples: x1 , . . . , xn
Posterior: p| (X1 = x1 , . . . Xn = xn )
Posterior density ∝ P (X1 = x1 , . . . Xn = xn | p = p) × fp (p)
Posterior density ∝ p^n (1 − p)^(x1 +...+xn −n)

⇒ Posterior density: Beta(n + 1, x1 + . . . + xn − n + 1)

Posterior mean: p̂ = (n + 1)/(X1 + . . . + Xn + 2)

6. Poisson(λ) samples with gamma prior: X1 , . . . , Xn ∼ iid Poisson(Λ)


Prior Λ ∼ Gamma(α, β)
⇒ fΛ (λ) ∝ λ^(α−1) e^(−βλ)
Samples: x1 , . . . , xn
Posterior: Λ | (X1 = x1 , . . . , Xn = xn )
Posterior density ∝ P (X1 = x1 , . . . , Xn = xn | Λ = λ) × fΛ (λ)
Posterior density ∝ e^(−nλ) λ^(x1 +...+xn ) λ^(α−1) e^(−βλ)

⇒ Posterior density: Gamma(x1 + . . . + xn + α, β + n)

Posterior mean: λ̂ = (X1 + X2 + . . . + Xn + α)/(n + β)

Statistics for Data Science - 2
Week 10 Notes
Hypothesis testing

1. Null hypothesis:
The null hypothesis is a statement about the population parameter whose validity is tested against the given experimental data. It is denoted by H0 . The null hypothesis is the default hypothesis that is assumed to hold unless the data provide evidence against it.
2. Alternative hypothesis:
The alternative hypothesis is the statement tested against the null hypothesis in a statistical inference experiment. It is contradictory to the null hypothesis and is denoted by HA or H1 .
3. Test statistic:
A test statistic is a numerical quantity computed from the sample values and used in statistical hypothesis testing.
4. Type I error:
A type I error is a kind of fault that occurs during the hypothesis testing process when
a null hypothesis is rejected, even though it is true.
5. Type II error:
A type II error is a kind of fault that occurs during the hypothesis testing process when
a null hypothesis is accepted, even though it is not true (HA is true).
6. Significance level (Size):
Significance level (also called size) of a test, denoted α, is the probability of type I
error.
α = P (Type I error)

7. β = P (Type II error)
8. Power of a test:
Power = 1 − β
9. Types of hypothesis:
(a) Simple hypothesis: A hypothesis that completely specifies the distribution of
the samples is called a simple hypothesis.
(b) Composite hypothesis: A hypothesis that does not completely specify the
distribution of the samples is called a composite hypothesis.
10. Standard testing method: z-test:
Consider a sample X1 , X2 , . . . , Xn ∼ i.i.d. X.

• Test statistic, denoted T , is some function of the samples. For example: sample
mean X
• Acceptance and rejection regions are specified through T .

(a) Right-tailed z-test:


• H0 : µ = µ0 , HA : µ > µ0
• Test: reject H0 if T > c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (T > c|H0 )
• Fix α and find c.
(b) Left-tailed z-test:
• H0 : µ = µ0 , HA : µ < µ0
• Test: reject H0 if T < c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (T < c|H0 )
• Fix α and find c.
(c) two-tailed z-test:
• H0 : µ = µ0 , HA : µ ≠ µ0
• Test: reject H0 if |T | > c.
• Significance level α depends on c and the distribution of T |H0 .
• α = P (|T | > c|H0 )
• Fix α and find c.

Note: In the test for mean (σ^2 known), T = X and, when the null is true, (X − µ0 )/(σ/√n) ∼ Normal(0, 1).

11. P -value:
Suppose the test statistic T = t in one sampling. The lowest significance level α at
which the null will be rejected for T = t is said to be the P -value of the sampling.
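
A sketch of a right-tailed z-test at α = 0.05 with assumed numbers (µ0 = 50, σ = 10, n = 36, observed sample mean 53.2), finding the critical value c and the P-value:

    # Sketch: right-tailed z-test for the mean with sigma known.
    # mu0, sigma, n and the observed sample mean are assumed example numbers.
    import math
    from scipy.stats import norm

    mu0, sigma, n = 50.0, 10.0, 36
    xbar, alpha = 53.2, 0.05

    se = sigma / math.sqrt(n)
    c = mu0 + norm.ppf(1 - alpha) * se         # reject H0 if xbar > c
    z = (xbar - mu0) / se                      # observed test statistic
    p_value = 1 - norm.cdf(z)                  # lowest alpha at which H0 is rejected
    print(c, z, p_value, xbar > c)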

Statistics for Data Science - 2
Week 11 Notes
t-test, χ2 -test, two samples z/F -test

1. Normal samples and statistics: Consider the samples X1 , . . . , Xn ∼ iid Normal(µ, σ 2 ).


The sample mean, X = (X1 + . . . + Xn )/n
The sample variance, S^2 = [(X1 − X)^2 + . . . + (Xn − X)^2]/(n − 1)
E[X] = µ, E[S^2] = σ^2

• X ∼ Normal(µ, σ^2/n)
• (n − 1)S^2/σ^2 ∼ χ^2_{n−1}, the chi-squared distribution with n − 1 degrees of freedom.
• (X − µ)/(S/√n) ∼ tn−1 , the t-distribution with n − 1 degrees of freedom.
2. t-test for mean (Variance unknown)
Consider the samples X1 , . . . , Xn ∼ iid Normal(µ, σ 2 ), σ 2 unknown. Following are the
three different possibilities:

• The null and alternative hypothesis are:

H0 : µ = µ0

HA : µ > µ0
Test Statistic: T = X
Test: Reject H0 , if T > c
Given H0 , (X − µ0 )/(S/√n) ∼ tn−1

α = P (reject H0 | H0 is true)
  = P (T > c | µ = µ0 )
  = P (tn−1 > (c − µ0 )/(s/√n)) = 1 − Ftn−1 ((c − µ0 )/(s/√n))
⇒ c = (s/√n) · Ftn−1^(−1)(1 − α) + µ0
Note: Ftn−1 is the CDF of t-distribution with n − 1 degrees of freedom.
• The null and alternative hypothesis are:

H0 : µ = µ0

HA : µ < µ0
Test Statistic: T = X
Test: Reject H0 , if T < c
Given H0 , (X − µ0 )/(S/√n) ∼ tn−1

α = P (reject H0 | H0 is true)
  = P (T < c | µ = µ0 )
  = P (tn−1 < (c − µ0 )/(s/√n)) = Ftn−1 ((c − µ0 )/(s/√n))
⇒ c = (s/√n) · Ftn−1^(−1)(α) + µ0

Note: Ftn−1 is the CDF of t-distribution with n − 1 degrees of freedom.


• The null and alternative hypothesis are:

H0 : µ = µ0

HA : µ ≠ µ0
Test Statistic: T = X − µ0
Test: Reject H0 , if | X − µ0 | > c
Given H0 , (X − µ0 )/(S/√n) ∼ tn−1

α = P (reject H0 | H0 is true)
  = P (| X − µ0 | > c | µ = µ0 )
  = P (| tn−1 | > c/(s/√n)) = 2Ftn−1 (−c/(s/√n))
⇒ c = −(s/√n) · Ftn−1^(−1)(α/2)

Note: Ftn−1 is the CDF of t-distribution with n − 1 degrees of freedom.
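
A sketch of the two-tailed t-test with assumed data, done once with the formulas above and once with scipy's ttest_1samp for comparison:

    # Sketch: two-tailed t-test for H0: mu = mu0 with sigma unknown.
    # The data array and mu0 = 12 are assumed example values.
    import numpy as np
    from scipy.stats import t, ttest_1samp

    x = np.array([12.9, 11.4, 13.1, 12.7, 11.8, 12.3, 13.4, 12.0])
    mu0, alpha = 12.0, 0.05

    n, xbar, s = len(x), x.mean(), x.std(ddof=1)
    t_stat = (xbar - mu0) / (s / np.sqrt(n))
    c = -t.ppf(alpha / 2, df=n - 1) * s / np.sqrt(n)    # reject if |xbar - mu0| > c
    p_manual = 2 * t.cdf(-abs(t_stat), df=n - 1)

    print(abs(xbar - mu0) > c, p_manual)
    print(ttest_1samp(x, popmean=mu0))                   # same t statistic and p-value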

3. χ2 -test for variance


Consider the samples X1 , . . . , Xn ∼ iid Normal(µ, σ 2 ), σ 2 unknown. Following are the
three different possibilities:

• The null and alternative hypothesis are:

H0 : σ = σ0

HA : σ > σ0

Test Statistic: S^2
Test: Reject H0 , if S^2 > c^2
Given H0 , (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}

α = P (reject H0 | H0 is true)
  = P (S^2 > c^2 | σ = σ0 )
  = P (χ^2_{n−1} > (n − 1)c^2/σ0^2) = 1 − Fχ2_{n−1} ((n − 1)c^2/σ0^2)
Note: Fχ2_{n−1} is the CDF of the chi-square distribution with n − 1 degrees of freedom.
• The null and alternative hypothesis are:
H0 : σ = σ0
HA : σ < σ0
Test Statistic: S^2
Test: Reject H0 , if S^2 < c^2
Given H0 , (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}

α = P (reject H0 | H0 is true)
  = P (S^2 < c^2 | σ = σ0 )
  = P (χ^2_{n−1} < (n − 1)c^2/σ0^2) = Fχ2_{n−1} ((n − 1)c^2/σ0^2)
Note: Fχ2_{n−1} is the CDF of the chi-square distribution with n − 1 degrees of freedom.
• The null and alternative hypothesis are:
H0 : σ = σ0
HA : σ ≠ σ0
Test Statistic: S^2
Test: Reject H0 , if S^2 < cL^2 or S^2 > cR^2
Given H0 , (n − 1)S^2/σ0^2 ∼ χ^2_{n−1}

α/2 = P (S^2 < cL^2 | H0 ) = P (S^2 > cR^2 | H0 )

Note: Fχ2_{n−1} is the CDF of the chi-square distribution with n − 1 degrees of freedom.

4. Two samples z-test (known variances)

Let X1 , . . . , Xn1 ∼ iid Normal(µ1 , σ12 )


and Y1 , . . . , Yn2 ∼ iid Normal(µ2 , σ22 )
Following are the three different possibilities:

• The null and alternative hypothesis are:

H0 : µ1 = µ2

HA : µ1 ≠ µ2
Test Statistic: T = X − Y
Test: Reject H0 , if | T | > c
Given H0 , T ∼ Normal(0, σT^2), where σT^2 = σ1^2/n1 + σ2^2/n2

α = P (reject H0 | H0 is true)
  = P (| T | > c | µ1 = µ2 )
  = 2FZ (−c/σT )

• The null and alternative hypothesis are:

H0 : µ1 = µ2

HA : µ1 > µ2
Test Statistic: T = X − Y
Test: Reject H0 , if X − Y > c
Given H0 , T ∼ Normal(0, σT^2), where σT^2 = σ1^2/n1 + σ2^2/n2

α = P (reject H0 | H0 is true)
  = P (X − Y > c | µ1 = µ2 )
  = 1 − FZ (c/σT )

• The null and alternative hypothesis are:

H0 : µ1 = µ2

HA : µ1 < µ2
Test Statistic: T = X − Y
Test: Reject H0 , if Y − X > c
Given H0 , T ∼ Normal(0, σT^2), where σT^2 = σ1^2/n1 + σ2^2/n2

α = P (reject H0 | H0 is true)
  = P (Y − X > c | µ1 = µ2 )
  = 1 − FZ (c/σT )

5. Two samples F -test (for variances)

Let X1 , . . . , Xn1 ∼ iid Normal(µ1 , σ12 )


and Y1 , . . . , Yn2 ∼ iid Normal(µ2 , σ22 )
Following are the three different possibilities:
• The null and alternative hypothesis are:
H0 : σ1 = σ2
HA : σ1 > σ2
Test Statistic: T = S1^2/S2^2
Test: Reject H0 , if T > 1 + c
Given H0 , T ∼ F (n1 − 1, n2 − 1)
α = P (reject H0 | H0 is true)
  = P (T > 1 + c | σ1 = σ2 )
  = 1 − FF (n1 −1,n2 −1) (1 + c)

• The null and alternative hypothesis are:


H0 : σ1 = σ2
HA : σ1 < σ2
Test Statistic: T = S1^2/S2^2
Test: Reject H0 , if T < 1 − c
Given H0 , T ∼ F (n1 − 1, n2 − 1)
α = P (reject H0 | H0 is true)
  = P (T < 1 − c | σ1 = σ2 )
  = FF (n1 −1,n2 −1) (1 − c)

• The null and alternative hypothesis are:


H0 : σ1 = σ2
HA : σ1 ≠ σ2
Test Statistic: T = S1^2/S2^2
Test: Reject H0 , if T > 1 + cR or T < 1 − cL
Given H0 , T ∼ F (n1 − 1, n2 − 1)
α/2 = P (T > 1 + cR | H0 ) = P (T < 1 − cL | H0 )

6. Likelihood Ratio test:
For a simple null and a simple alternative hypothesis, the likelihood ratio test suffices.

X1 , . . . , X n ∼ P

Consider the simple null and alternative hypothesis:

H0 : P = fX

HA : P = gX
Likelihood ratio: L(X1 , . . . , Xn ) = Π_{i=1}^{n} gX (Xi ) / Π_{i=1}^{n} fX (Xi )
Likelihood ratio test: Reject H0 , if T = L(X1 , . . . , Xn ) > c

7. χ2 -test for goodness of fit:


H0 : Samples are i.i.d X, HA : Samples are not i.i.d X

Test statistic: T = Σ_{i=1}^{k} (yi − npi )^2/(npi ) = Σ_{i=1}^{k} (observed value − expected value)^2/(expected value) ∼ χ^2_{k−1}

Test: Reject H0 if T > c.


Significance level: α = P (T > c | H0 ) ≈ 1 − Fχ2k−1 (c)
Note: In case of continuous distribution, convert continuous to discrete by binning.

8. Test for independence:


H0 : Joint PMF is product of marginals, HA : Joint PMF is not product of marginals

Test statistic: T = Σ_{i,j} (yij − npij )^2/(npij ) = Σ (observed value − expected value)^2/(expected value) ∼ χ^2_dof

where dof = (number of rows − 1) × (number of columns − 1),
yij = observed count in cell (i, j),
pij = product of the marginal proportions for (i, j), so npij is the expected count if independent.

Test: Reject H0 if T > c.
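
A sketch of the independence test on an assumed 2×3 table of counts, computed from the formula above and checked against scipy.stats.chi2_contingency:

    # Sketch: chi-square test of independence on an assumed 2x3 contingency table.
    import numpy as np
    from scipy.stats import chi2, chi2_contingency

    obs = np.array([[20, 30, 25],
                    [30, 20, 25]])                    # assumed observed counts y_ij
    n = obs.sum()
    p_ij = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / n**2   # product of marginal proportions
    expected = n * p_ij

    T = ((obs - expected)**2 / expected).sum()
    dof = (obs.shape[0] - 1) * (obs.shape[1] - 1)
    p_value = 1 - chi2.cdf(T, df=dof)
    print(T, dof, p_value)

    stat, pval, dof2, exp = chi2_contingency(obs, correction=False)
    print(stat, pval)                                  # same statistic and p-value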

