Probability Theory - Cheat Sheet
Listed below are some useful properties and rules of probability, for quick reference. It is highly
recommended that the student understand the reasoning and/or derivation behind these, instead
of simply memorizing them.
1. Pr[Ē] = 1 − Pr[E], where Ē denotes the complement of event E.
2. The probability space is said to be uniformly distributed if and only if every outcome is
equally likely. That is, Pr[a] = 1/|Ω| for every outcome a ∈ Ω, the sample space.
4. Inclusion-Exclusion Principle. For events E1, E2, · · · , En,
Pr[∪i Ei] = Σi Pr[Ei] − Σ(i1<i2) Pr[Ei1 ∩ Ei2] + · · · + (−1)^(k+1) Σ(i1<i2<···<ik) Pr[Ei1 ∩ Ei2 ∩ · · · ∩ Eik] + · · · .
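As a quick sanity check of the formula, here is a minimal brute-force sketch over a fair die roll (the three events are made up for illustration):

    # Verify inclusion-exclusion for three concrete events on a fair die roll.
    from fractions import Fraction
    omega = set(range(1, 7))
    E1 = {x for x in omega if x % 2 == 0}   # even roll
    E2 = {x for x in omega if x <= 3}       # roll at most 3
    E3 = {x for x in omega if x % 3 == 0}   # multiple of 3
    def pr(S):
        return Fraction(len(S), len(omega))
    lhs = pr(E1 | E2 | E3)
    rhs = (pr(E1) + pr(E2) + pr(E3)
           - pr(E1 & E2) - pr(E1 & E3) - pr(E2 & E3)
           + pr(E1 & E2 & E3))
    assert lhs == rhs == Fraction(5, 6)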
5. Union bound. A consequence of the inclusion-exclusion principle and much easier to use.
Pr[∪i Ei] ≤ Σi Pr[Ei].
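For the three die-roll events above, Pr[E1 ∪ E2 ∪ E3] = 5/6, while the union bound gives Pr[E1] + Pr[E2] + Pr[E3] = 3/6 + 3/6 + 2/6 = 4/3. Here the bound exceeds 1 and is trivial; it is most useful when the individual probabilities are small.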
6. The conditional probability for event E occurring, given that event F has already occurred,
is given by
Pr[E | F] = Pr[E ∩ F] / Pr[F].
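For example, for a fair die roll with E = "the roll is 6" and F = "the roll is even", Pr[E | F] = (1/6) / (1/2) = 1/3.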
7. Independent events. We say that events E and F are mutually independent if and only if
Pr[E | F] = Pr[E] (or equivalently, Pr[F | E] = Pr[F]). That is, the probability of occurrence of
E is not affected by the information that F has already occurred.
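For example, in two independent fair coin flips, E = "first flip is heads" and F = "second flip is heads" are independent: Pr[E | F] = Pr[E] = 1/2. In contrast, the die-roll events E and F from the previous rule are not: Pr[E | F] = 1/3 ≠ 1/6 = Pr[E].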
8. For n events E1, E2, · · · , En,
Pr[∩i Ei] = Pr[E1] · Pr[∩(i≥2) Ei | E1]
= Pr[E1] · Pr[E2 | E1] · Pr[∩(i≥3) Ei | E1 ∩ E2]
= Pr[E1] · Pr[E2 | E1] · Pr[E3 | E1 ∩ E2] · Pr[∩(i≥4) Ei | E1 ∩ E2 ∩ E3]
...
= Π(i=1..n) Pr[Ei | E1 ∩ E2 ∩ · · · ∩ Ei−1].
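For example, the probability that the first three cards dealt from a shuffled 52-card deck are all aces is Pr[E1] · Pr[E2 | E1] · Pr[E3 | E1 ∩ E2] = (4/52) · (3/51) · (2/50) = 1/5525.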
9. Consequently, if the events in the previous rule are mutually independent, then
Pr[∩i Ei] = Πi Pr[Ei].
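For example, the probability that n independent fair coin flips all come up heads is (1/2)^n.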
10. Bayes’ Law. If E1, E2, · · · , En are mutually disjoint events (that is, Ei ∩ Ej = ∅ for all
distinct i, j ∈ [n]) such that ∪i Ei = Ω, the sample space, then
(a) Pr[F] = Σi Pr[F ∩ Ei] = Σi Pr[F | Ei] · Pr[Ei], and
(b) Pr[Ei | F] = Pr[Ei ∩ F] / Pr[F] = Pr[F | Ei] · Pr[Ei] / (Σj Pr[F | Ej] · Pr[Ej]).
Rule (a) here is often referred to as the Law of Total Probability.
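A worked instance (all numbers made up for illustration): a test for a condition with a 1% base rate, a 99% detection rate, and a 5% false-positive rate.

    # Bayes' Law with illustrative (made-up) numbers.
    p_pos_given_c = 0.99    # Pr[positive | condition]
    p_pos_given_nc = 0.05   # Pr[positive | no condition]
    p_c = 0.01              # Pr[condition]
    # Rule (a): Law of Total Probability.
    p_pos = p_pos_given_c * p_c + p_pos_given_nc * (1 - p_c)
    # Rule (b): Bayes' Law.
    p_c_given_pos = p_pos_given_c * p_c / p_pos
    print(p_c_given_pos)    # ~0.167: a positive test gives only ~17% certainty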
11. A random variable is a function that assigns a real value to every outcome of a random
experiment.
12. If X is a random variable, then we can treat (aX + b)^d also as a random variable, where a, b, d
can be random variables themselves or constants with respect to the random experiment.
13. An indicator random variable is a random variable whose range is the set {0, 1}. Such variables
are also commonly called Bernoulli or 0-1 random variables. Many random variables can be
expressed as a sum of indicator random variables; doing this often simplifies analysis, as in the sketch below.
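For instance, the number of heads in n fair coin flips is a sum of n indicators, one per flip. A minimal simulation sketch (the parameters are arbitrary):

    # Number of heads in n fair coin flips, written as a sum of indicators.
    import random
    n, trials = 10, 100_000
    total = 0
    for _ in range(trials):
        indicators = [1 if random.random() < 0.5 else 0 for _ in range(n)]
        total += sum(indicators)     # X = X1 + X2 + ... + Xn
    print(total / trials)            # ~5.0, i.e. n/2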
17. Conditional expectation. What is the expected value of a random variable X given that
event E has already occurred?
E[X | E] = Σi i · Pr[X = i | E].
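For example, for a fair die roll X and E = "the roll is even": E[X | E] = 2 · (1/3) + 4 · (1/3) + 6 · (1/3) = 4.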
18. Method of Conditional Expectation. For any random variables X, Y, we have
E[X] = Σi Pr[Y = i] · E[X | Y = i].
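Continuing the die example, with Y indicating whether the roll is even: E[X] = Pr[Y = 1] · E[X | Y = 1] + Pr[Y = 0] · E[X | Y = 0] = (1/2) · 4 + (1/2) · 3 = 7/2.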
19. Markov’s Inequality. Let X be a random variable that only takes non-negative values. Then,
for all a, b > 0,
Pr[X ≥ a] ≤ E[X] / a, or equivalently, Pr[X ≥ b · E[X]] ≤ 1 / b.
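For example, for a fair die roll X (so E[X] = 7/2), Markov gives Pr[X ≥ 6] ≤ (7/2)/6 = 7/12, while the true value is 1/6; the bound is valid but often loose.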
20. If X1, X2, · · · , Xn are mutually independent random variables, then
E[Πi Xi] = Πi E[Xi].
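For example, for two independent fair die rolls X and Y, E[XY] = E[X] · E[Y] = (7/2)^2 = 49/4.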
21. The kth moment of a random variable X is E[X^k] = Σi i^k · Pr[X = i], where i goes over the
range of X. The kth central moment of a random variable X is E[(X − E[X])^k]. In particular,
the second central moment is the variance: Var[X] = E[(X − E[X])^2] = E[X^2] − E[X]^2.
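For example, for a fair die roll X, the second moment is E[X^2] = (1 + 4 + 9 + 16 + 25 + 36)/6 = 91/6, and the variance is Var[X] = 91/6 − (7/2)^2 = 35/12.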
23. If X1, X2, · · · , Xn are pairwise independent random variables, then
Var[Σi Xi] = Σi Var[Xi].
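For example, the sum of n independent fair die rolls has variance n · 35/12.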
24. Chebyshev’s Inequality. For any random variable X and all a, b > 0,
Pr[|X − E[X]| ≥ a] ≤ Var[X] / a^2, or equivalently, Pr[|X − E[X]| ≥ b · σ] ≤ 1 / b^2,
where σ = √Var[X] is the standard deviation of X.
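Continuing the die example (E[X] = 7/2, Var[X] = 35/12), Chebyshev gives Pr[|X − 7/2| ≥ 5/2] ≤ (35/12) / (25/4) = 7/15, while the true value is Pr[X ∈ {1, 6}] = 1/3.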
25. If an experiment with success probability 1/n is independently repeated k times, then the
probability that none of these repetitions succeeds is at most
(1 − 1/n)^k.
For k = n, we have
(1 − 1/n)^n ≈ 1/e.
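A quick numeric check of the approximation (a minimal sketch):

    # (1 - 1/n)^n approaches 1/e as n grows.
    import math
    for n in (10, 100, 1000, 10**6):
        print(n, (1 - 1/n) ** n)
    print("1/e =", 1 / math.e)      # ~0.3679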