revision_concepts

The document outlines fundamental concepts in probability, including axioms, random variables, and various probability distributions. It covers bivariate distributions, expectations, linear transformations, and limit theorems, emphasizing key properties and theorems such as Bayes' Theorem and the Central Limit Theorem. Additionally, it discusses the multivariate normal distribution and its characteristics, including density functions and conditional distributions.

1 Basic Probability

Let A, B ⊆ Ω and let {A_1, . . . , A_m} form a partition of A.


1. The axioms of probability are positivity: P(A) ≥ 0; finiteness: P(Ω) = 1; and additivity: P(A ∪ B) = P(A) + P(B) if A ∩ B = ∅.
2. The partition law (extends additivity): P(∪_{i=1}^m A_i) = Σ_{i=1}^m P(A_i).

3. The addition law: P (A ∪ B) = P (A) + P (B) − P (A ∩ B).


4. A and B are independent events if and only if P (A ∩ B) = P (A) P (B).
5. The law of total probability: P(A ∩ B) = Σ_{i=1}^m P(A_i ∩ B).

6. Conditional probability: P (A|B) = P (A ∩ B) / P (B).


7. Bayes' Theorem: P(A|B) = P(A) P(B|A) / P(B).
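As a worked illustration of items 6 and 7, the sketch below computes P(A|B) for a hypothetical diagnostic-test scenario; the scenario and all its numbers are invented for illustration.

```python
# Hypothetical diagnostic-test illustration of Bayes' Theorem (invented numbers).
p_a = 0.01              # P(A): prior probability of having the disease
p_b_given_a = 0.95      # P(B|A): probability of a positive test if diseased
p_b_given_not_a = 0.05  # P(B|A^c): false-positive rate

# Law of total probability: P(B) = P(B|A) P(A) + P(B|A^c) P(A^c).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' Theorem: P(A|B) = P(A) P(B|A) / P(B).
p_a_given_b = p_a * p_b_given_a / p_b
print(f"P(A|B) = {p_a_given_b:.3f}")  # ~0.161: still unlikely despite the positive test
```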

2 Random variables
1. A random variable (rv) is a map from Ω to R.
2. The cumulative distribution function (cdf) of any rv, X, is F_X(x) = P(X ≤ x); x can be any real number.
3. Discrete random variables can take at most a countably infinite number of values. The probability mass function (pmf) of a discrete rv is p_X(x) = P(X = x).
4. Continuous random variables can take a continuum of values. The probability density function (pdf) of a continuous rv is f_X(x) = (d/dx) F_X(x). Conversely, F_X(x) = ∫_{-∞}^x f_X(t) dt.
5. The quantile function of a continuous rv, X, evaluated at some p ∈ (0, 1], is the smallest value x such that F_X(x) = p.
6. For some real-valued function, g, the expectation is E[g(X)] = Σ_{i=-∞}^∞ p_X(i) g(i) if X is discrete, and E[g(X)] = ∫_{-∞}^∞ f_X(t) g(t) dt if X is continuous.
7. Expectation is linear: E[a g(X) + b h(X)] = a E[g(X)] + b E[h(X)].
8. The variance of any rv, X, is Var[X] = E[(X − E[X])²] = E[X²] − E[X]². The standard deviation is StdDev[X] = √Var[X].
9. Var[aX + b] = a² Var[X].
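The sketch below numerically checks several of these definitions (items 2, 4, 5 and 8) on a standard normal rv, assuming numpy and scipy are available; the choice of distribution is arbitrary.

```python
import numpy as np
from scipy import stats

X = stats.norm(loc=0, scale=1)  # example rv: standard normal

# The quantile function is the inverse of the cdf: F_X(F_X^{-1}(p)) = p.
p = 0.9
x = X.ppf(p)                    # quantile function (scipy calls it ppf)
print(X.cdf(x))                 # ~0.9

# The pdf is the derivative of the cdf (central finite-difference check).
h = 1e-6
print((X.cdf(x + h) - X.cdf(x - h)) / (2 * h), X.pdf(x))  # nearly equal

# Var[X] = E[X²] − E[X]², checked by simulation.
xs = X.rvs(size=100_000, random_state=np.random.default_rng(0))
print(np.mean(xs**2) - np.mean(xs)**2, X.var())  # both ~1
```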

3 Summary of probability distributions
Name               Abbrev.          Disc/Cts   Values
Discrete uniform   Unif(0, m)       D          0, 1, . . . , m
Bernoulli          Bern(θ)          D          0, 1
Binomial           Bin(n, θ)        D          0, 1, . . . , n
Geometric          Geom(θ)          D          0, 1, 2, 3, . . .
Poisson            Poisson(λ)       D          0, 1, 2, 3, . . .
Uniform            Unif(a, b)       C          [a, b]
Exponential        Exp(β)           C          [0, ∞)
Gamma              Gam(α, β)        C          [0, ∞)
Beta               Beta(α_1, α_2)   C          [0, 1]
Weibull            Weib(α, β)       C          [0, ∞)
Normal             N(µ, σ²)         C          (−∞, ∞)
Cauchy             Cauchy           C          (−∞, ∞)
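All of these distributions can be sampled with numpy's Generator API, as sketched below; the parameter values are arbitrary, and the comments flag where numpy's conventions differ from the table (geometric support starting at 1, and scale rather than rate parameters).

```python
import numpy as np

rng = np.random.default_rng(42)
k = 5  # number of draws per distribution (arbitrary)

samples = {
    "Unif(0, m)":     rng.integers(0, 10 + 1, k),   # m = 10; upper bound exclusive
    "Bern(θ)":        rng.binomial(1, 0.3, k),
    "Bin(n, θ)":      rng.binomial(20, 0.3, k),
    "Geom(θ)":        rng.geometric(0.3, k) - 1,    # numpy counts from 1; shift to 0, 1, 2, ...
    "Poisson(λ)":     rng.poisson(2.5, k),
    "Unif(a, b)":     rng.uniform(-1.0, 1.0, k),
    "Exp(β)":         rng.exponential(1 / 2.0, k),  # scale = 1/β, assuming β is a rate
    "Gam(α, β)":      rng.gamma(3.0, 1 / 2.0, k),   # shape α, scale 1/β
    "Beta(α_1, α_2)": rng.beta(2.0, 5.0, k),
    "Weib(α, β)":     2.0 * rng.weibull(1.5, k),    # numpy's Weibull has scale 1
    "N(µ, σ²)":       rng.normal(0.0, 1.0, k),      # numpy takes σ, not σ²
    "Cauchy":         rng.standard_cauchy(k),
}
for name, draws in samples.items():
    print(f"{name:14s} {np.round(draws, 3)}")
```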

4 Univariate Transformations
Let X be a rv and let Y = g(X) where g is a real-valued function.
1. If X is discrete then so is Y, and p_Y(y) = Σ_{x: g(x)=y} p_X(x).

2. Distribution function (cdf) method: F_Y(y) = P(Y ≤ y) = P(g(X) ≤ y). The right-hand side must be evaluated from knowledge of X. If X is continuous then differentiation gives f_Y(y).

3. Density function (pdf) method: if X is continuous and g is 1-1 then Y is continuous and f_Y(y) = f_X(x) |dx/dy|, where the right-hand side is evaluated at x = g⁻¹(y).

4. Be careful to also specify the range of Y .

5. PIT: if a continuous rv, V, has cdf F then F(V) ∼ Unif(0, 1); if U ∼ Unif(0, 1) and F is the cdf of a continuous rv then F⁻¹(U) is a rv whose cdf is F.
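A minimal sketch of both directions of the PIT, using Exp(β), whose cdf F(x) = 1 − e^{−βx} has the closed-form inverse F⁻¹(u) = −log(1 − u)/β; β = 2 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 2.0  # arbitrary rate parameter for Exp(beta)

# Second half of the PIT (inverse transform sampling):
# if U ~ Unif(0, 1), then F^{-1}(U) = -log(1 - U)/beta ~ Exp(beta).
u = rng.uniform(0.0, 1.0, 100_000)
x = -np.log(1.0 - u) / beta
print(x.mean(), x.var())  # ~1/beta = 0.5 and ~1/beta² = 0.25

# First half: F(X) should be Unif(0, 1), which has mean 1/2 and variance 1/12.
f_of_x = 1.0 - np.exp(-beta * x)
print(f_of_x.mean(), f_of_x.var())  # ~0.5 and ~0.0833
```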

5 Bivariate Distributions
Let (X, Y ) be a bivariate rv.

1. The joint cdf is F_{X,Y}(x, y) = P(X ≤ x, Y ≤ y). F_X(x) = F_{X,Y}(x, ∞).

2. For a discrete rv, the joint pmf is p_{X,Y}(x, y) = P(X = x, Y = y).
3. For a continuous rv, the joint pdf is f_{X,Y}(x, y) = (∂²/∂x∂y) F_{X,Y}(x, y).

4. For discrete rvs X and Y, the marginal pmf of X is p_X(x) = Σ_{j=-∞}^∞ p_{X,Y}(x, j), and the conditional pmf of X given Y = y is p_{X|Y}(x|y) = p_{X,Y}(x, y) / p_Y(y).
5. For continuous rvs X and Y, the marginal pdf of X is f_X(x) = ∫_{-∞}^∞ f_{X,Y}(x, t) dt, and the conditional pdf of X given Y = y is f_{X|Y}(x|y) = f_{X,Y}(x, y) / f_Y(y).

6. X and Y are independent if and only if the events {X ∈ A} and {Y ∈ B} are independent for all sets A and B: P(X ∈ A, Y ∈ B) = P(X ∈ A) P(Y ∈ B) for all A, B.

7. An equivalent, but easier to check, condition for independence (of discrete or continuous rvs) is F_{X,Y}(x, y) = F_X(x) F_Y(y). For discrete rvs, independence is also equivalent to p_{X,Y}(x, y) = p_X(x) p_Y(y), whereas for continuous rvs it is equivalent to f_{X,Y}(x, y) = f_X(x) f_Y(y). When just checking factorisation within the range where the pmf or pdf is non-zero, variational independence (the range of each rv must not depend on the value of the other) must also be verified.

8. Lack of independence can be shown using the two-point method: showing that f_{X,Y}(x_1, y_1) f_{X,Y}(x_2, y_2) ≠ f_{X,Y}(x_1, y_2) f_{X,Y}(x_2, y_1) for some x_1, x_2, y_1, y_2. Alternatively, show that f_{X|Y}(x|y) ≠ f_X(x) for some x and y.
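The sketch below applies the factorisation test of item 7 to joint pmfs stored as 2-D arrays; both example tables are invented, with the second deliberately failing independence.

```python
import numpy as np

def is_independent(joint, tol=1e-12):
    """Check p_{X,Y}(x, y) = p_X(x) p_Y(y) for a joint pmf stored as a 2-D array."""
    p_x = joint.sum(axis=1)  # marginal pmf of X (rows)
    p_y = joint.sum(axis=0)  # marginal pmf of Y (columns)
    return np.allclose(joint, np.outer(p_x, p_y), atol=tol)

# Independent example: built as an outer product of two marginals.
indep = np.outer([0.2, 0.8], [0.5, 0.3, 0.2])
print(is_independent(indep))  # True

# Dependent example: the same kind of table, but it does not factorise.
dep = np.array([[0.3, 0.1],
                [0.1, 0.5]])
print(is_independent(dep))    # False
```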

6 Bivariate Expectations
Let (X, Y) be a bivariate rv and X = (X_1, . . . , X_n)ᵗ and Y = (Y_1, . . . , Y_m)ᵗ be vector rvs.

1. Bivariate expectation:
For a discrete rv, E[g(X, Y)] = Σ_{i=-∞}^∞ Σ_{j=-∞}^∞ p_{X,Y}(i, j) g(i, j).
For a continuous rv, E[g(X, Y)] = ∫_{-∞}^∞ ∫_{-∞}^∞ f_{X,Y}(s, t) g(s, t) ds dt.

2. Linearity: E [ag(X, Y ) + bh(X, Y )] = a E [g(X, Y )] + b E [h(X, Y )].

3. If X and Y are independent then E[g(X) h(Y)] = E[g(X)] E[h(Y)].

4. The conditional expectation of X given Y = y is E[g(X)|Y = y] = Σ_{i=-∞}^∞ p_{X|Y}(i|y) g(i) if X is a discrete rv, and E[g(X)|Y = y] = ∫_{-∞}^∞ f_{X|Y}(t|y) g(t) dt if X is continuous.

5. The conditional variance of X given Y = y is Var[X|Y = y] = E[X²|Y = y] − E[X|Y = y]².

6. Tower: E[X|Y] is a function of Y and hence a random variable; E[E[X|Y]] = E[X].
 
7. The moment generating function, M_X(t) = E[e^{tX}], uniquely determines the distribution of X. For integer k, E[X^k] = M_X^(k)(0), the k-th derivative of M_X evaluated at 0.

8. If X and Y are independent then M_{X+Y}(t) = M_X(t) M_Y(t).
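A simulation sketch of items 7 and 8: for independent X, Y ∼ Exp(1) the mgf is M_X(t) = 1/(1 − t) for t < 1, so M_{X+Y}(t) should match M_X(t) M_Y(t) = 1/(1 − t)²; Exp(1) and t = 0.3 are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
t = 0.3  # arbitrary point with t < 1 so the mgf exists

x = rng.exponential(1.0, n)  # X ~ Exp(1)
y = rng.exponential(1.0, n)  # Y ~ Exp(1), independent of X

m_x = np.mean(np.exp(t * x))          # Monte Carlo estimate of M_X(t)
m_y = np.mean(np.exp(t * y))          # Monte Carlo estimate of M_Y(t)
m_sum = np.mean(np.exp(t * (x + y)))  # Monte Carlo estimate of M_{X+Y}(t)

print(m_sum, m_x * m_y, 1 / (1 - t) ** 2)  # all ~2.04
```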

7 Linear Transformations
Let (X, Y) be a bivariate rv and X = (X_1, . . . , X_n)ᵗ and Y = (Y_1, . . . , Y_m)ᵗ be vector rvs.
1. The covariance of X and Y is Cov [X, Y ] = E [(X − E [X])(Y − E [Y ])] =
E [XY ] − E [X] E [Y ]. Var [X] = Cov [X, X].

2. Bilinearity: Cov[aX, Y] = a Cov[X, Y] and Cov[W + X, Y] = Cov[W, Y] + Cov[X, Y].
3. The correlation between X and Y is Corr[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).

4. Vector/matrix forms: E[X] = (E[X_1], . . . , E[X_n])ᵗ. Var[X] is the matrix with (i, j)-th element Cov[X_i, X_j].

5. Summaries for linear transformations: E[AX] = A E[X] and Var[AX] = A Var[X] Aᵗ, so E[aᵗX] = aᵗ E[X], Var[aᵗX] = aᵗ Var[X] a, and Cov[a_1ᵗX, a_2ᵗX] = a_1ᵗ Var[X] a_2.
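A numerical sketch of item 5, checking Var[AX] = A Var[X] Aᵗ by simulation; the matrix A, the covariance matrix, and the use of a Gaussian X are all arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500_000

sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])  # Var[X] (arbitrary)
A = np.array([[1.0, -1.0, 0.0],
              [0.5,  0.5, 2.0]])     # arbitrary 2 x 3 matrix

x = rng.multivariate_normal(np.zeros(3), sigma, size=n)  # rows are draws of X
ax = x @ A.T                                             # rows are draws of AX

print(np.cov(ax, rowvar=False))  # empirical Var[AX]
print(A @ sigma @ A.T)           # theoretical A Var[X] Aᵗ -- nearly equal
```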

8 Bivariate Transformations
Let (X, Y ) be a continuous bivariate rv that is mapped by a 1-1 transformation
to another continuous rv, (S, T ).
1. f_{S,T}(s, t) = f_{X,Y}(x, y) |det(J)|, where J = ∂(x, y)/∂(s, t). Relabelling and re-arranging: f_{S,T}(s, t) = f_{X,Y}(x, y) |det(J)|⁻¹, where J = ∂(s, t)/∂(x, y).

2. Be careful to define the joint range of S and T .

3. If interest lies in S = g_1(X, Y) then define a dummy variable T = g_2(X, Y) such that the transformation (X, Y) → (S, T) is 1-1.
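A symbolic sketch of item 1 using sympy, for the example transformation S = X + Y, T = X/(X + Y) applied to independent Exp(1) rvs; the transformation and the distribution are illustrative choices, not from the notes.

```python
import sympy as sp

s, t = sp.symbols("s t", positive=True)

# Inverse of S = X + Y, T = X/(X + Y):  X = S T,  Y = S (1 - T).
x = s * t
y = s * (1 - t)

# J = d(x, y)/d(s, t) and its determinant.
J = sp.Matrix([x, y]).jacobian([s, t])
det_J = sp.simplify(J.det())
print(det_J)  # -s, so |det(J)| = s

# If X, Y are independent Exp(1), f_{X,Y}(x, y) = exp(-(x + y)), so
# f_{S,T}(s, t) = f_{X,Y}(x, y) |det(J)| = s exp(-s) on s > 0, 0 < t < 1.
f_xy = sp.exp(-(x + y))
f_st = sp.simplify(f_xy * sp.Abs(det_J))
print(f_st)   # s*exp(-s)
```

Note how item 2 applies: the joint range is s > 0 with 0 < t < 1, and the density factorises, so S ∼ Gam(2, 1) and T ∼ Unif(0, 1) independently.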

9 Limit Theorems
Let Y be an rv with Var[Y] < ∞, and let X_1, X_2, . . . be a sequence of independent and identically distributed rvs with E[X_i] = µ and Var[X_i] = σ² < ∞. Let S_n = Σ_{i=1}^n X_i and X̄_n = S_n / n.

1. Markov’s Inequality: P(Y > c) ≤ E[Y]/c provided Y ≥ 0 and c > 0.

2. Chebyshev’s Inequality: P(|Y − E[Y]| > c) ≤ Var[Y]/c².


3. Weak Law of Large Numbers (WLLN): P(|X̄_n − µ| > ϵ) ≤ σ²/(nϵ²) → 0 as n → ∞.
4. Central Limit Theorem (CLT): P(√n (X̄_n − µ)/σ < a) → Φ(a) as n → ∞.

5. Approximations arising from the CLT: X̄_n ∼ N(µ, σ²/n) and S_n ∼ N(nµ, nσ²).

6. Monte Carlo: let x_1, . . . , x_n be independent realisations of a random variable of interest, X. Then E[g(X)] ≈ (1/n) Σ_{i=1}^n g(x_i). In particular, P(X ∈ A) ≈ (1/n) Σ_{i=1}^n 1(x_i ∈ A), the fraction of times the event A occurs.
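A minimal sketch of item 6, estimating E[X²] and P(X > 1) for X ∼ N(0, 1); the choice of g and of the event A is arbitrary, and exact values are printed for comparison.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 100_000

x = rng.normal(0.0, 1.0, n)  # independent realisations of X ~ N(0, 1)

# E[g(X)] ≈ (1/n) Σ g(x_i), here with g(x) = x².
print(np.mean(x**2), 1.0)                         # exact E[X²] = 1

# P(X ∈ A) ≈ fraction of realisations landing in A, here A = (1, ∞).
print(np.mean(x > 1.0), 1 - stats.norm.cdf(1.0))  # exact ≈ 0.1587
```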

10 Multivariate Normal Distribution


Let X = (X_1, . . . , X_n)ᵗ have a multivariate normal (Gaussian) distribution with variance matrix Σ and expectation vector µ. Let B be an m × n matrix.

1. The density is

   f(x) = (2π)^{−n/2} |det(Σ)|^{−1/2} exp( −(1/2) (x − µ)ᵗ Σ⁻¹ (x − µ) ).

2. Decomposition: X = µ + AZ, where Z is a vector of n independent standard normal random variables, and AAᵗ = Σ (see the sketch after this list).

3. Linear transformation: BX has a multivariate normal distribution.

4. Marginals: X_i ∼ N(µ_i, Σ_{i,i}), and all pairwise marginals (e.g. of (X_1, X_2)ᵗ) are bivariate normal.

5. Conditionals: the conditionals of a bivariate Gaussian are Gaussian: e.g. X_1|X_2 = x_2 is Gaussian. This is true more generally for a multivariate Gaussian: (X_1, X_2)|(X_3 = x_3, X_5 = x_5, X_7 = x_7) is bivariate Gaussian.
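A sketch of item 2: sampling a multivariate normal via X = µ + AZ, taking A to be the Cholesky factor of Σ (one valid choice satisfying AAᵗ = Σ); µ and Σ are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(5)
n_draws = 200_000

mu = np.array([1.0, -2.0])      # arbitrary expectation vector
sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])  # arbitrary variance matrix

A = np.linalg.cholesky(sigma)          # lower-triangular A with A Aᵗ = Σ
z = rng.standard_normal((n_draws, 2))  # rows: independent standard normals Z
x = mu + z @ A.T                       # rows: draws of X = µ + AZ

print(x.mean(axis=0))           # ~µ
print(np.cov(x, rowvar=False))  # ~Σ
```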
