Exam 2 Formulas
1. Conditional Probability: $P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$
4. Cumulative Distribution Function: $F(x) = P(X \le x) = \sum_{k=0}^{x} \Pr(X = k)$
5. Expected Value: $E(X) = \begin{cases} \sum_x x\,p(x), & \text{for discrete variables} \\ \int x\,p(x)\,dx, & \text{for continuous variables} \end{cases}$
9. Permutation: $P_k^N = \dfrac{N!}{(N-k)!}$
10. Combination: $C_k^N = \binom{N}{k} = \dfrac{N!}{k!\,(N-k)!}$
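The permutation and combination counts can be checked numerically. The following is a minimal sketch in Python (not R), with the values N = 5 and k = 2 chosen purely for illustration; it compares the factorial formulas with the standard library's `math.perm` and `math.comb` (Python 3.8+).

```python
import math

N, k = 5, 2  # illustrative values, not from the formula sheet

# Permutations: ordered selections of k items from N, N! / (N - k)!
perms = math.factorial(N) // math.factorial(N - k)

# Combinations: unordered selections, N! / (k! (N - k)!)
combs = math.factorial(N) // (math.factorial(k) * math.factorial(N - k))

# The standard library provides the same quantities directly.
assert perms == math.perm(N, k)
assert combs == math.comb(N, k)

print(perms, combs)  # 20 10
```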
3. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$
4. $\mathrm{SD}(cX) = |c|\sqrt{\mathrm{Var}(X)}$
If $X_1, X_2, \ldots, X_n$ are independent:
1. $\mathrm{Var}(X_1 + X_2 + \cdots + X_n) = \mathrm{Var}(X_1) + \cdots + \mathrm{Var}(X_n)$
2. $\mathrm{Var}(X_i - X_j) = \mathrm{Var}(X_i) + \mathrm{Var}(X_j)$ for $i \ne j$
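The variance rules can be verified exactly on a small example. The sketch below (Python, not R) assumes two independent rolls of a fair six-sided die, an example chosen here for illustration; the helper `var` and the constant `c = -3` are hypothetical names and values, not part of the formula sheet.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die


def var(values_probs):
    """Variance of a discrete distribution given (value, probability) pairs."""
    mean = sum(v * q for v, q in values_probs)
    return sum((v - mean) ** 2 * q for v, q in values_probs)


var_x = var([(x, p) for x in faces])  # 35/12 for a fair die

# Var(cX) = c^2 Var(X), with an arbitrary constant c
c = -3
var_cx = var([(c * x, p) for x in faces])

# Two independent rolls: each ordered pair has probability 1/36
pairs = list(product(faces, faces))
var_sum = var([(x + y, p * p) for x, y in pairs])
var_diff = var([(x - y, p * p) for x, y in pairs])

print(var_x, var_cx, var_sum, var_diff)
```

Both the sum and the difference come out to twice the single-roll variance, which is the point of rule 2: subtracting an independent variable still adds its variance.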
Bernoulli Distribution
$p(x) = \Pr(X = x) = \pi^x (1 - \pi)^{1-x}, \quad x = 0, 1$
$E(X) = \pi$
$V(X) = \pi(1 - \pi)$
Binomial pdf
$P(X = x) = \binom{N}{x} \pi^x (1 - \pi)^{N-x}, \quad x = 0, 1, \ldots, N$
$E(X) = N\pi$
$V(X) = N\pi(1 - \pi)$
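As a sanity check on the binomial formulas, the sketch below (Python, not R) sums the probabilities directly and compares the resulting mean and variance with $N\pi$ and $N\pi(1-\pi)$. The function name `binom_pmf` and the values N = 10, π = 0.3 are illustrative assumptions.

```python
import math


def binom_pmf(x, N, pi):
    """P(X = x) for a Binomial(N, pi) variable."""
    return math.comb(N, x) * pi ** x * (1 - pi) ** (N - x)


N, pi = 10, 0.3  # illustrative parameters
pmf = [binom_pmf(x, N, pi) for x in range(N + 1)]

total = sum(pmf)                                   # should be 1
mean = sum(x * p for x, p in enumerate(pmf))       # should equal N * pi
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))  # N * pi * (1 - pi)

print(round(total, 6), round(mean, 6), round(var, 6))
```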
Geometric Distribution
$P(X = x) = \pi(1 - \pi)^x, \quad x = 0, 1, 2, \ldots$
$E(X) = (1 - \pi)/\pi$
$V(X) = (1 - \pi)/\pi^2$
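Because the support starts at 0, this form of the geometric distribution counts the failures before the first success. A rough numerical check (Python, not R) is sketched below; it truncates the infinite sum at 2000 terms, which is an approximation, and uses π = 0.25 as an illustrative value.

```python
pi = 0.25      # success probability (illustrative)
terms = 2000   # truncation point; the neglected tail is negligible here

pmf = [pi * (1 - pi) ** x for x in range(terms)]

total = sum(pmf)
mean = sum(x * p for x, p in enumerate(pmf))
var = sum((x - mean) ** 2 * p for x, p in enumerate(pmf))

# Closed forms from the formula sheet
mean_formula = (1 - pi) / pi
var_formula = (1 - pi) / pi ** 2

print(mean, mean_formula)
print(var, var_formula)
```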
Hypergeometric Distribution
$P(X = x) = \dfrac{\binom{M}{x}\binom{N}{K-x}}{\binom{M+N}{K}}, \quad x = 0, 1, \ldots, M$
$E(X) = K\,\dfrac{M}{M+N}$
$V(X) = K \times \dfrac{M}{M+N} \times \dfrac{N}{M+N} \times \dfrac{M+N-K}{M+N-1}$
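The hypergeometric mean and variance can be confirmed exactly with rational arithmetic. In the sketch below (Python, not R), reading M as the number of "success" items, N as the number of "failure" items, and K as the number drawn is an interpretation assumed from the form of the formula; the values M = 5, N = 7, K = 4 and the name `hyper_pmf` are illustrative. The loop only covers x values where both binomial coefficients are defined and nonzero.

```python
from fractions import Fraction
from math import comb


def hyper_pmf(x, M, N, K):
    """P(X = x): x of the K draws come from the M items, K - x from the N items."""
    return Fraction(comb(M, x) * comb(N, K - x), comb(M + N, K))


M, N, K = 5, 7, 4  # illustrative values

# Values of x with nonzero probability
support = range(max(0, K - N), min(K, M) + 1)
pmf = {x: hyper_pmf(x, M, N, K) for x in support}

total = sum(pmf.values())
mean = sum(x * p for x, p in pmf.items())
var = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Closed forms from the formula sheet
mean_formula = K * Fraction(M, M + N)
var_formula = (K * Fraction(M, M + N) * Fraction(N, M + N)
               * Fraction(M + N - K, M + N - 1))

print(mean, var)
```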
Useful R Functions
1. T-test statistic for mean $\mu$ under $H_0 : \mu = \mu_0$:
$T = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}} \sim t_{n-1}$
where $s$ is the sample standard deviation and $n$ is the sample size
2. Two-sided 100(1 − α)% confidence interval for mean $\mu$ when the unknown variance $\sigma^2$ is estimated by $s^2$:
$\left(\bar{x} - t_{1-\alpha/2,\,df=n-1}\, s/\sqrt{n},\;\; \bar{x} + t_{1-\alpha/2,\,df=n-1}\, s/\sqrt{n}\right)$
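3. One-sided 100(1 − α)% confidence interval for mean $\mu$ when the unknown variance $\sigma^2$ is estimated by $s^2$:
$\left(-\infty,\; \bar{x} + t_{1-\alpha,\,df=n-1}\, s/\sqrt{n}\right)$ or $\left(\bar{x} - t_{1-\alpha,\,df=n-1}\, s/\sqrt{n},\; \infty\right)$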
4. Unpaired t-test statistic (unequal variances): $\dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
6. Paired t-test statistic: $\dfrac{\bar{x}_1 - \bar{x}_2}{s_d/\sqrt{n}}$
where $s_d$ is the standard deviation of the differences between paired samples
7. Standard error for proportion $x/n$ is $\sqrt{\pi(1 - \pi)/n}$
8. Chi-square test statistic for an R × C table (R rows, C columns) is
$\chi^2 = \sum \dfrac{(X_{ij} - E(X_{ij}))^2}{E(X_{ij})} \sim \chi^2_{(R-1)(C-1)}$
where the sum is over all the cells of the table, the degrees of freedom are $(R - 1)(C - 1)$, and
$E(X_{ij}) = R_i \times C_j / N$, where $R_i$ is the total count in row $i$, $C_j$ is the total count in column $j$, and $N$ is the total count in the table
9. Degrees of freedom for χ2 test of goodness of fit
DF = #bins of data − #parameters estimated − 1
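The three t statistics in the list above can be sketched in a few lines. The code below is Python rather than R and uses only the standard library; the function names and the small data sets are hypothetical, chosen so the results are easy to verify by hand. It computes the statistics only; the t quantiles needed for the confidence intervals are not computed here and would still come from a table or from R.

```python
import math
from statistics import mean, stdev  # stdev is the sample standard deviation


def one_sample_t(x, mu0):
    """T = (xbar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (mean(x) - mu0) / (stdev(x) / math.sqrt(len(x)))


def unpaired_t(x1, x2):
    """Unequal-variance statistic: (xbar1 - xbar2) / sqrt(s1^2/n1 + s2^2/n2)."""
    se = math.sqrt(stdev(x1) ** 2 / len(x1) + stdev(x2) ** 2 / len(x2))
    return (mean(x1) - mean(x2)) / se


def paired_t(x1, x2):
    """Paired statistic: mean difference over s_d / sqrt(n)."""
    d = [a - b for a, b in zip(x1, x2)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


x1 = [1, 2, 3, 4, 5]   # hypothetical sample 1
x2 = [2, 3, 4, 5, 6]   # hypothetical sample 2
before = [5.1, 4.9, 5.6, 5.8, 6.0]  # hypothetical paired measurements
after = [4.8, 4.7, 5.5, 5.2, 5.9]

t_one = one_sample_t(x1, mu0=2)   # sqrt(2), about 1.414
t_unpaired = unpaired_t(x1, x2)   # -1.0
t_paired = paired_t(before, after)

print(t_one, t_unpaired, t_paired)
```

The paired statistic is the one-sample statistic applied to the differences with $\mu_0 = 0$, which is why it uses $s_d$ rather than the two separate sample standard deviations.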
Standard Normal Quantiles [pnorm(z)]
z P(Z ≤ z) z P(Z ≤ z)
-3.0 0.001 0.1 0.540
-2.9 0.002 0.2 0.579
-2.8 0.003 0.3 0.618
-2.7 0.003 0.4 0.655
-2.6 0.005 0.5 0.691
-2.5 0.006 0.6 0.726
-2.4 0.008 0.7 0.758
-2.3 0.011 0.8 0.788
-2.2 0.014 0.9 0.816
-2.1 0.018 1.0 0.841
-2.0 0.023 1.1 0.864
-1.9 0.029 1.2 0.885
-1.8 0.036 1.3 0.903
-1.7 0.045 1.4 0.919
-1.6 0.055 1.5 0.933
-1.5 0.067 1.6 0.945
-1.4 0.081 1.7 0.955
-1.3 0.097 1.8 0.964
-1.2 0.115 1.9 0.971
-1.1 0.136 2.0 0.977
-1.0 0.159 2.1 0.982
-0.9 0.184 2.2 0.986
-0.8 0.212 2.3 0.989
-0.7 0.242 2.4 0.992
-0.6 0.274 2.5 0.994
-0.5 0.309 2.6 0.995
-0.4 0.345 2.7 0.997
-0.3 0.382 2.8 0.997
-0.2 0.421 2.9 0.998
-0.1 0.460 3.0 0.999
0.0 0.500
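Entries in this table can be reproduced without R. The sketch below uses Python's standard-library `statistics.NormalDist` as a stand-in for `pnorm` (via `cdf`) and for the inverse lookup (via `inv_cdf`); the particular z values are picked from the table for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# cdf(z) plays the role of pnorm(z): P(Z <= z)
for z in (-1.9, 0.0, 1.0, 2.0):
    print(f"{z:5.1f}  {Z.cdf(z):.3f}")

# inv_cdf(p) goes the other way: the z with P(Z <= z) = p
z_975 = Z.inv_cdf(0.975)
print(round(z_975, 2))
```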
Tail area probability
DF 0.20 0.10 0.05 0.025 0.02 0.01 0.005 0.002 0.001
1 1.642 2.706 3.841 5.024 5.412 6.635 7.879 9.550 10.828
2 3.219 4.605 5.991 7.378 7.824 9.210 10.597 12.429 13.816
3 4.642 6.251 7.815 9.348 9.837 11.345 12.838 14.796 16.266
4 5.989 7.779 9.488 11.143 11.668 13.277 14.860 16.924 18.467
5 7.289 9.236 11.070 12.833 13.388 15.086 16.750 18.907 20.515
6 8.558 10.645 12.592 14.449 15.033 16.812 18.548 20.791 22.458
7 9.803 12.017 14.067 16.013 16.622 18.475 20.278 22.601 24.322
8 11.030 13.362 15.507 17.535 18.168 20.090 21.955 24.352 26.124
9 12.242 14.684 16.919 19.023 19.679 21.666 23.589 26.056 27.877
10 13.442 15.987 18.307 20.483 21.161 23.209 25.188 27.722 29.588
Quantiles of tail areas of the Chi-squared distribution. Each cell entry corresponds to the R command
qchisq(p, df, lower.tail = F) = qchisq(1 − p, df, lower.tail = T), i.e. the quantile of the chi-square
distribution corresponding to the tail area probability p given by the column, with the number of degrees of freedom df given by the row.
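To show how the table is used, the sketch below (Python, not R) computes the chi-square statistic for a hypothetical 2 × 2 table of counts, using expected counts of row total × column total / grand total, and compares it with the DF = 1, tail area 0.05 entry (3.841). The helper `tail_area` is an assumption of this sketch: it relies on closed forms that hold only for 1 and 2 degrees of freedom and is not a general replacement for R's chi-square functions.

```python
import math


def chi_square_stat(table):
    """Chi-square statistic and degrees of freedom for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def tail_area(x, df):
    """Upper tail area P(chi2_df > x); closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("closed form implemented only for df = 1 or df = 2")


table = [[20, 30], [30, 20]]  # hypothetical counts
stat, df = chi_square_stat(table)

critical = 3.841  # table entry for DF = 1, tail area 0.05
print(stat, df, stat > critical, round(tail_area(stat, df), 4))

# Spot-check two table entries against the closed forms
print(round(tail_area(3.841, 1), 3), round(tail_area(5.991, 2), 3))
```

Here every expected count is 25, so the statistic is 4.0 with 1 degree of freedom, which exceeds 3.841 and therefore has a tail area below 0.05.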