FormulaSheet FinalExam

Uploaded by

Dark lord

STAT4003 Formula Sheet Final Exam

6:00 pm – 9:00 pm on Monday, August 14, 2023

Estimating One Population Mean When the Population Standard Deviation is Unknown

Under the confidence level $1 - \alpha$, the confidence interval for the population mean is

$$\bar{x} \pm t^*_{n-1}\, SE(\bar{x}), \qquad SE(\bar{x}) = \frac{s}{\sqrt{n}}$$

where $t^*_{n-1}$ is the critical value with $n - 1$ degrees of freedom under the confidence level $1 - \alpha$.

The sample size is determined by the required margin of error $ME$:

$$ME = t^*_{n-1}\, SE(\bar{x}) = t^*_{n-1} \frac{s}{\sqrt{n}}, \qquad n = \left( t^*_{n-1} \frac{s}{ME} \right)^2$$
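As a rough illustration of the interval and sample-size formulas above, here is a minimal Python sketch. The function names and the `t_crit` argument are assumptions made for this example: the critical value $t^*_{n-1}$ is supplied by the caller (for instance from a t-table or Excel's T.INV.2T) rather than computed here.

```python
import math
import statistics

def mean_confidence_interval(data, t_crit):
    """Return (lower, upper) for x_bar +/- t* * s / sqrt(n)."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)        # sample standard deviation (n - 1 divisor)
    me = t_crit * s / math.sqrt(n)    # margin of error
    return x_bar - me, x_bar + me

def required_sample_size(t_crit, s, me):
    """n = (t* * s / ME)^2, rounded up to a whole observation."""
    return math.ceil((t_crit * s / me) ** 2)

# Hypothetical data; 2.776 is the 95% critical value for df = 4.
low, high = mean_confidence_interval([1, 2, 3, 4, 5], t_crit=2.776)
print(round(low, 3), round(high, 3))
print(required_sample_size(t_crit=2.0, s=10, me=5))
```

Because $t^*_{n-1}$ itself depends on $n$, the sample-size result is usually treated as a first estimate.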

Testing One Population Mean When the Population Standard Deviation is Unknown

$$\text{Test statistic} \quad t = \frac{\bar{x} - \mu}{s / \sqrt{n}}, \qquad df = n - 1$$
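The one-sample test statistic can be sketched in a few lines of Python. The function name `one_sample_t` and the argument `mu0` (the hypothesized mean $\mu$) are illustrative names, not part of the formula sheet.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t, df) with t = (x_bar - mu0) / (s / sqrt(n)) and df = n - 1."""
    n = len(data)
    x_bar = statistics.mean(data)
    s = statistics.stdev(data)   # sample standard deviation
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical sample tested against mu0 = 2.
t, df = one_sample_t([1, 2, 3, 4, 5], mu0=2)
print(round(t, 4), df)
```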

Estimating the Difference between Two Means Based on Independent Samples

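Under the confidence level $1 - \alpha$, when $\sigma_1^2 \neq \sigma_2^2$, the confidence interval of $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \qquad SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$df = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$
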
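Under the confidence level $1 - \alpha$, when $\sigma_1^2 = \sigma_2^2$, the confidence interval of $\mu_1 - \mu_2$ is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}, \qquad df = n_1 + n_2 - 2, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

$$SE_{pooled}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
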
Testing the Difference between Two Means Based on Independent Samples
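Test statistic for $\mu_1 - \mu_2$ when $\sigma_1^2 \neq \sigma_2^2$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad df = \frac{\left( s_1^2/n_1 + s_2^2/n_2 \right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$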

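(Pooled t-test) Test statistic for $\mu_1 - \mu_2$ when $\sigma_1^2 = \sigma_2^2$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}, \qquad df = n_1 + n_2 - 2, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
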
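To make the two independent-sample statistics concrete, the sketch below computes both from summary statistics. The function names and the `diff0` argument (the hypothesized value of $\mu_1 - \mu_2$) are assumptions for this example only.

```python
import math

def welch_t(x1_bar, s1, n1, x2_bar, s2, n2, diff0=0.0):
    """Unequal-variance case: returns (t, df) using the df formula above."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t = ((x1_bar - x2_bar) - diff0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def pooled_t(x1_bar, s1, n1, x2_bar, s2, n2, diff0=0.0):
    """Equal-variance case: returns (t, df) with df = n1 + n2 - 2."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df   # pooled variance
    t = ((x1_bar - x2_bar) - diff0) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df

# Hypothetical summary statistics with equal spreads and equal sample sizes.
print(welch_t(5, 2, 10, 3, 2, 10))
print(pooled_t(5, 2, 10, 3, 2, 10))
```

With equal sample sizes and equal sample standard deviations the two versions give the same $t$ and the same degrees of freedom, which makes a convenient sanity check.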
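Estimating the Difference between Two Means Based on Paired Data

The confidence interval for the mean difference $\mu_D$ is

$$\bar{x}_D \pm t_{\alpha/2} \frac{s_D}{\sqrt{n_D}}, \qquad df = n_D - 1, \qquad SE(\bar{x}_D) = \frac{s_D}{\sqrt{n_D}}$$

where $\bar{x}_D$ and $s_D$ are the mean and standard deviation of the $n_D$ paired differences.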

Testing the Difference between Two Means Based on Paired Data

$$H_0: \mu_D = 0$$

$$H_1: \mu_D \neq 0, \quad \text{or} \quad H_1: \mu_D > 0, \quad \text{or} \quad H_1: \mu_D < 0$$

$$\text{Test statistic} \quad t = \frac{\bar{x}_D - \mu_D}{s_D / \sqrt{n_D}}, \qquad df = n_D - 1$$
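A paired test reduces to a one-sample test on the differences. The following sketch assumes two equal-length lists of measurements on the same subjects; the names `before`, `after` and `mu_d0` are hypothetical.

```python
import math
import statistics

def paired_t(before, after, mu_d0=0.0):
    """Return (t, df) for the paired differences d = after - before."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)          # sample standard deviation of the differences
    t = (d_bar - mu_d0) / (s_d / math.sqrt(n))
    return t, n - 1

# Hypothetical paired measurements on four subjects.
t, df = paired_t(before=[10, 12, 9, 11], after=[12, 15, 10, 15])
print(round(t, 4), df)
```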

One-Factor Analysis of Variance

$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$

$$H_1: \text{At least one mean is different}$$

$$\text{Test statistic} \quad F_{k-1,\,N-k} = \frac{MST}{MSE}$$

Reject the null hypothesis when $F_{k-1,\,N-k}$ is too large.

$$MST = \frac{SST}{k - 1}, \qquad MSE = \frac{SSE}{N - k}$$

Average size of the error standard deviation: $s_p = \sqrt{MSE}$

Variation between the groups

$$\text{Treatment Sum of Squares} \quad SST = \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x})^2, \qquad df = k - 1$$

where $\bar{x}_i$ is the mean of group $i$ and $\bar{x}$ is the overall mean of all observations.

Variation within a group

$$\text{Error Sum of Squares} \quad SSE = \sum_{i=1}^{k} (n_i - 1) s_i^2, \qquad df = N - k$$

where $s_i^2$ is the sample variance of group $i$.

$$SSTotal = SST + SSE$$
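The one-factor sums of squares can be computed directly from the group data. This is a small sketch under the assumption that each group is given as a list of numbers; the function name and the dictionary keys are illustrative.

```python
import statistics

def one_way_anova(groups):
    """Compute SST, SSE, MST, MSE and F for a list of groups."""
    k = len(groups)
    N = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / N
    # Between-group variation: n_i * (group mean - overall mean)^2
    sst = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: (n_i - 1) * sample variance of group i
    sse = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    mst = sst / (k - 1)
    mse = sse / (N - k)
    return {"SST": sst, "SSE": sse, "MST": mst, "MSE": mse,
            "F": mst / mse, "df": (k - 1, N - k)}

# Hypothetical data: three groups of three observations each.
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

The F value would then be compared with a critical value or converted to a p-value, for example with Excel's F.DIST.RT.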

ANOVA Table of One-Factor Analysis

Source                df     Sum of Squares   Mean Square   F-Ratio   Prob > F
Treatment (Between)   k-1    SST              MST           MST/MSE   p-value
Error (Within)        N-k    SSE              MSE
Total                 N-1    SSTotal

Two-Factor Analysis of Variance

$x_{ij}$ represents the observation at the $i$th level of the first factor and the $j$th level of the second factor. The first factor A has $a$ levels and the second factor B has $b$ levels.

$$\text{Treatment Sum of Squares for the first factor} \quad SSA = \sum_{j=1}^{b} \sum_{i=1}^{a} (\bar{x}_i - \bar{\bar{x}})^2$$

where $\bar{x}_i$ is the mean of all subjects assigned to level $i$ of factor A and $\bar{\bar{x}}$ is the overall mean of all observations.

$$MSA = \frac{SSA}{a - 1}$$

$$\text{Treatment Sum of Squares for the second factor} \quad SSB = \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}_j - \bar{\bar{x}})^2$$

where $\bar{x}_j$ is the mean of all subjects assigned to level $j$ of factor B and $\bar{\bar{x}}$ is the overall mean of all observations.

$$MSB = \frac{SSB}{b - 1}$$

$$SSE = SSTotal - (SSA + SSB)$$

$$SSTotal = \sum_{i=1}^{a} \sum_{j=1}^{b} (x_{ij} - \bar{\bar{x}})^2$$

$$\text{Mean Square for Error} \quad MSE = \frac{SSE}{N - (a + b - 1)}, \qquad \text{where } N = a \times b$$

Test statistic

$$F_{a-1,\,N-(a+b-1)} = \frac{MSA}{MSE}$$

$$F_{b-1,\,N-(a+b-1)} = \frac{MSB}{MSE}$$
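The two-factor formulas above can be sketched as follows, assuming one observation per cell (so that $N = a \times b$, as the formulas imply) and a table stored as a list of rows, where row $i$ is level $i$ of factor A. The function name and layout are assumptions for illustration.

```python
def two_factor_anova(table):
    """table[i][j] is the single observation at level i of A and level j of B."""
    a, b = len(table), len(table[0])
    N = a * b
    grand = sum(sum(row) for row in table) / N
    row_means = [sum(row) / b for row in table]                 # means for levels of A
    col_means = [sum(row[j] for row in table) / a for j in range(b)]  # means for levels of B
    # The double sums collapse to b * sum over i and a * sum over j.
    ssa = b * sum((m - grand) ** 2 for m in row_means)
    ssb = a * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in table for x in row)
    sse = ss_total - (ssa + ssb)
    mse = sse / (N - (a + b - 1))
    msa, msb = ssa / (a - 1), ssb / (b - 1)
    return {"SSA": ssa, "SSB": ssb, "SSE": sse, "MSE": mse,
            "F_A": msa / mse, "F_B": msb / mse}

# Hypothetical 2 x 2 layout.
print(two_factor_anova([[1, 2], [3, 5]]))
```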

ANOVA Table of Two-Factor Analysis

Source              df           Sum of Squares   Mean Square   F-Ratio    Prob > F
Factor A            a-1          SSA              MSA           MSA/MSE    p-value
Factor B            b-1          SSB              MSB           MSB/MSE    p-value
Interaction         (a-1)(b-1)   SSAB             MSAB          MSAB/MSE   p-value
Error               N-k          SSE              MSE
Total (Corrected)   N-1          SSTotal

Chi-Square Test ---- Goodness of Fit Test

$$\text{Residual} = Obs - Exp = \text{Observed Counts} - \text{Expected Counts}$$

$$\text{Standardized Residual} = \frac{Obs - Exp}{\sqrt{Exp}}$$

$$\text{Chi-Square Statistic:} \quad \chi^2 = \sum \frac{(Obs - Exp)^2}{Exp}$$

$$\text{Degrees of Freedom} = \text{Number of cells} - 1$$
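A goodness-of-fit calculation might look like the sketch below. It assumes the expected counts have already been worked out from the hypothesized proportions; the function name and return layout are illustrative.

```python
import math

def chi_square_gof(observed, expected):
    """Return (chi2, df, standardized residuals) for matching lists of counts."""
    std_resid = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]
    chi2 = sum(r ** 2 for r in std_resid)   # same as sum of (Obs - Exp)^2 / Exp
    df = len(observed) - 1                  # number of cells minus 1
    return chi2, df, std_resid

# Hypothetical counts in three cells with equal expected counts.
chi2, df, resid = chi_square_gof([18, 22, 20], [20, 20, 20])
print(round(chi2, 4), df)
```

Summing the squared standardized residuals gives the same statistic as the formula, so the residuals also show which cells contribute most.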

Chi-Square Test ---- Homogeneity

$$Exp_{ij} = \frac{Total_{row\ i} \times Total_{col\ j}}{\text{Table Total}}$$

$$\text{Residual} = Obs - Exp = \text{Observed Counts} - \text{Expected Counts}$$

$$\text{Standardized Residual} = \frac{Obs - Exp}{\sqrt{Exp}}$$

$$\text{Chi-Square Statistic:} \quad \chi^2 = \sum \frac{(Obs - Exp)^2}{Exp}$$

$$\text{Degrees of Freedom} = (R - 1)(C - 1)$$

where $R$ is the number of rows and $C$ is the number of columns.
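For a two-way table, the expected counts come from the row and column totals. The sketch below assumes the observed counts are given as a list of rows; the function name is hypothetical.

```python
def chi_square_table(observed):
    """Return (chi2, df, expected) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Exp_ij = (row i total * column j total) / table total
    expected = [[rt * ct / total for ct in col_totals] for rt in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected

# Hypothetical 2 x 2 table of counts.
chi2, df, expected = chi_square_table([[10, 20], [20, 10]])
print(round(chi2, 4), df, expected)
```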

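Standardized Value of a Variable

$$z_x = \frac{x - \bar{x}}{s_x}, \qquad z_y = \frac{y - \bar{y}}{s_y}$$

$$\text{Correlation Coefficient:} \quad r = \frac{\sum z_x z_y}{n - 1}$$

It can also be expressed by the formulas below:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}} = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1) s_x s_y}, \qquad s_x = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}, \qquad s_y = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$$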
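The z-score form of the correlation coefficient translates almost line for line into code. This is a minimal sketch with a hypothetical function name and made-up data.

```python
import statistics

def correlation(xs, ys):
    """r = sum(z_x * z_y) / (n - 1), using sample standard deviations."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    z_products = [((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                  for x, y in zip(xs, ys)]
    return sum(z_products) / (n - 1)

# Hypothetical paired data.
print(round(correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 4))
```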

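Linear Regression

$$\text{Predicted straight line} \quad \hat{y} = b_0 + b_1 x, \qquad b_1 = r \frac{s_y}{s_x}, \qquad b_0 = \bar{y} - b_1 \bar{x}$$

$$\text{Residual} \quad e = y - \hat{y}$$

$$\text{Standard deviation of residuals} \quad s_e = \sqrt{\frac{\sum e^2}{n - 2}}$$

$R^2$: the square of the correlation between $y$ and $x$.

$$\text{Sum of Squared Errors:} \quad SSE = \sum (y - \hat{y})^2$$

$$\text{Total Sum of Squares:} \quad SST = \sum (y - \bar{y})^2$$

$$R^2 = 1 - \frac{SSE}{SST}$$
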
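Putting the regression formulas together, a least-squares line can be fitted from the summary quantities alone. The sketch below uses a hypothetical function name and made-up data, and returns the coefficients together with $s_e$ and $R^2$.

```python
import math
import statistics

def fit_line(xs, ys):
    """Fit y_hat = b0 + b1 * x using b1 = r * s_y / s_x and b0 = y_bar - b1 * x_bar."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(e ** 2 for e in residuals)          # sum of squared errors
    sst = sum((y - y_bar) ** 2 for y in ys)       # total sum of squares
    s_e = math.sqrt(sse / (n - 2))
    return {"b0": b0, "b1": b1, "s_e": s_e, "R2": 1 - sse / sst}

# Hypothetical paired data.
print(fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]))
```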
The t-Test for the Regression Slope

$$H_0: \beta_1 = 0$$

$$H_1: \beta_1 \neq 0$$

Test statistic for $\beta_1$

$$t = \frac{b_1 - \beta_1}{SE(b_1)}, \qquad df = v = n - 2, \qquad SE(b_1) = \frac{s_e}{s_x \sqrt{n - 1}}$$

$$\text{Confidence interval estimator of } \beta_1: \quad b_1 \pm t_{\alpha/2,\,v}\, SE(b_1), \qquad v = n - 2$$

where $t_{\alpha/2,\,v}$ is the $t$ critical value with significance level $\alpha/2$ and degrees of freedom $v = n - 2$.


The t-Test for the Correlation Coefficient

$$H_0: \rho = 0$$

$$H_1: \rho \neq 0$$

Test statistic for $\rho$

$$t = r \sqrt{\frac{n - 2}{1 - r^2}}, \qquad df = v = n - 2$$
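In simple linear regression, the slope test with $\beta_1 = 0$ and the correlation test give the same $t$ value. The sketch below checks this numerically from summary statistics. It relies on $SSE = (1 - r^2)(n - 1)s_y^2$, which follows from $R^2 = 1 - SSE/SST$ with $R^2 = r^2$ and $SST = (n - 1)s_y^2$; the function names and inputs are assumptions for illustration.

```python
import math

def slope_t(r, s_x, s_y, n, beta1_0=0.0):
    """Return (t, SE(b1)) for the regression slope from summary statistics."""
    b1 = r * s_y / s_x
    sse = (1 - r ** 2) * (n - 1) * s_y ** 2      # derived from R^2 = 1 - SSE/SST
    s_e = math.sqrt(sse / (n - 2))
    se_b1 = s_e / (s_x * math.sqrt(n - 1))
    return (b1 - beta1_0) / se_b1, se_b1

def correlation_t(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical summary statistics.
t_slope, se_b1 = slope_t(r=0.6, s_x=2, s_y=3, n=27)
print(round(t_slope, 4), round(correlation_t(0.6, 27), 4))
```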

Confidence Interval for the Predicted Mean Value $\mu_v$ at a Value $x_v$

$$\hat{y}_v \pm t^*_{n-2}\, SE(\hat{\mu}_v), \qquad SE(\hat{\mu}_v) = \sqrt{SE^2(b_1)(x_v - \bar{x})^2 + \frac{s_e^2}{n}}$$

The Prediction Interval for an Individual Value at $x_v$

$$\hat{y}_v \pm t^*_{n-2}\, SE(\hat{y}_v), \qquad SE(\hat{y}_v) = \sqrt{SE^2(b_1)(x_v - \bar{x})^2 + \frac{s_e^2}{n} + s_e^2}$$
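The two standard errors differ only by the extra $s_e^2$ term, which the sketch below makes explicit. All function and argument names are hypothetical, and the critical value `t_crit` ($t^*_{n-2}$) is supplied by the caller.

```python
import math

def se_mean_response(se_b1, s_e, n, x_v, x_bar):
    """SE for the predicted mean value at x_v."""
    return math.sqrt(se_b1 ** 2 * (x_v - x_bar) ** 2 + s_e ** 2 / n)

def se_prediction(se_b1, s_e, n, x_v, x_bar):
    """SE for an individual value at x_v: the mean-response SE plus s_e^2."""
    return math.sqrt(se_b1 ** 2 * (x_v - x_bar) ** 2 + s_e ** 2 / n + s_e ** 2)

def interval(y_hat, t_crit, se):
    """Return (lower, upper) for y_hat +/- t* * SE."""
    return y_hat - t_crit * se, y_hat + t_crit * se

# Hypothetical regression summary values.
print(se_mean_response(0.5, 2, 16, x_v=5, x_bar=3))
print(se_prediction(0.5, 2, 16, x_v=5, x_bar=3))
print(interval(10, 2, 1.5))
```

The prediction interval is always wider than the confidence interval for the mean at the same $x_v$.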

Multiple Regression Model

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$

Standard Error of Estimate

$$s_E = \sqrt{\frac{SSE}{n - k - 1}}$$

Coefficient of Determination

$$R^2 = 1 - \frac{SSE}{\sum (y_i - \bar{y})^2}$$

Coefficient of Determination Adjusted for Degrees of Freedom

$$\text{Adjusted } R^2 = 1 - \frac{SSE/(n - k - 1)}{\sum (y_i - \bar{y})^2 / (n - 1)} = 1 - \frac{MSE}{s_y^2}$$

The ANOVA table for regression is laid out as follows:

Source of Variation   Degrees of Freedom   Sums of Squares   Mean Squares        F-Statistic
Regression            k                    SSR               MSR = SSR/k         F = MSR/MSE
Residual              n-k-1                SSE               MSE = SSE/(n-k-1)
Total                 n-1                  SSTotal

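$$SSTotal = \text{Sum of Squares, Total} = \sum (y - \bar{y})^2$$

$$SSR = \text{Sum of Squares, Regression} = \sum (\hat{y} - \bar{y})^2$$

$$SSE = \text{Sum of Squares, Errors} = \sum (\hat{y} - y)^2$$

$$SSTotal = SSR + SSE$$

$$\text{Mean Square, regression (explained)} = MSR = \frac{SSR}{k}$$

$$\text{Mean Square, errors (residuals, unexplained)} = MSE = \frac{SSE}{n - k - 1}$$

$$F = \frac{MSR}{MSE}$$

$$R^2 = \frac{SSR}{SSTotal} = \frac{\text{Explained}}{\text{Total}}$$

$$R^2_{adj} = 1 - \frac{SSE/(n - k - 1)}{SSTotal/(n - 1)}$$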

$$\text{Multiple linear regression model:} \quad y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

Hypotheses using F-test: $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$, $H_1$: at least one $\beta \neq 0$
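Given SSE, SSTotal, the sample size $n$ and the number of predictors $k$, the remaining entries of the regression ANOVA table follow by arithmetic. The sketch below uses a hypothetical function name and made-up inputs.

```python
import math

def regression_anova(sse, ss_total, n, k):
    """Fill in the regression ANOVA quantities from SSE and SSTotal."""
    ssr = ss_total - sse                 # SSTotal = SSR + SSE
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {
        "SSR": ssr,
        "MSR": msr,
        "MSE": mse,
        "F": msr / mse,
        "s_E": math.sqrt(mse),           # standard error of estimate
        "R2": ssr / ss_total,
        "R2_adj": 1 - mse / (ss_total / (n - 1)),
    }

# Hypothetical values: 23 observations and 2 predictors.
print(regression_anova(sse=20, ss_total=100, n=23, k=2))
```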

The relationship among $s_E$, $R^2$, and $F$

SSE     s_E     R^2          F       Assessment of Model
0       0       1                    Perfect
small   small   close to 1   large   Good
large   large   close to 0   small   Poor
                0            0       Useless

List of Excel Functions

NORM.S.DIST(z, 1) gives the value of $P(X < z)$ if X is a standard normal variable.

NORM.DIST(z, µ, σ, 1) gives the value of $P(X < z)$ if X is a normal variable with mean µ and standard deviation σ.

NORM.S.INV(p) gives the value of z if X is a standard normal variable and $P(X < z) = p$.

NORM.INV(p, µ, σ) gives the value of z if X is a normal variable with mean µ and standard deviation σ and $P(X < z) = p$.

T.DIST(x, degrees of freedom, 1) gives the probability to the left of x.

T.DIST.2T(x, degrees of freedom) gives the probability in the two tails outside the interval (-x, x).

T.DIST.RT(x, degrees of freedom) gives the probability to the right of x.

T.INV(left-side probability, degrees of freedom) gives the t-value with that probability to its left.

T.INV.2T(significance level, degrees of freedom) gives the two-tailed critical t-value.

F.DIST(x, df1, df2, cumulative) gives the probability to the left of x when cumulative is TRUE.

F.INV(probability, df1, df2) gives the critical value with the given probability to its left.

F.DIST.RT(x, df1, df2) gives the probability to the right of x.

F.INV.RT(probability, df1, df2) gives the critical value with the given probability to its right.

CHISQ.DIST(x, df, 1) gives the probability to the left of x under the Chi-Square distribution.

CHISQ.DIST.RT(x, df) gives the probability to the right of x under the Chi-Square distribution (used to get the p-value of a Chi-Square test).

CONFIDENCE.NORM(alpha, σ, n) gives the margin of error (radius) of the confidence interval with confidence level “1 – alpha”, population standard deviation σ and sample size n.

CONFIDENCE.T(alpha, s, n) gives the margin of error (radius) of the confidence interval with confidence level “1 – alpha”, sample standard deviation s and sample size n.

AVERAGE(range of data in Excel) gives the mean of the data

SUM(range of data in Excel) gives the total value of the data

STDEV.P(range of data in Excel) gives the population standard deviation

STDEV.S(range of data in Excel) gives the sample standard deviation

VAR.P(range of data in Excel) gives the population variance

VAR.S(range of data in Excel) gives the sample variance
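For readers checking Excel results outside Excel, several of the functions listed above have close counterparts in Python's standard `statistics` module. The pairing below is a sketch of approximate equivalents, not an official mapping; the t, F and Chi-Square functions have no standard-library counterpart and are not shown. The helper `confidence_norm` is a hypothetical name.

```python
import math
import statistics
from statistics import NormalDist

std_normal = NormalDist()                       # mean 0, standard deviation 1
print(std_normal.cdf(1.96))                     # like NORM.S.DIST(1.96, 1)
print(std_normal.inv_cdf(0.975))                # like NORM.S.INV(0.975)
print(NormalDist(100, 15).cdf(115))             # like NORM.DIST(115, 100, 15, 1)
print(NormalDist(100, 15).inv_cdf(0.5))         # like NORM.INV(0.5, 100, 15)

def confidence_norm(alpha, sigma, n):
    """Margin of error z * sigma / sqrt(n), analogous to CONFIDENCE.NORM."""
    return std_normal.inv_cdf(1 - alpha / 2) * sigma / math.sqrt(n)

data = [2, 4, 4, 4, 5, 5, 7, 9]                 # hypothetical data
print(statistics.mean(data))                    # like AVERAGE
print(sum(data))                                # like SUM
print(statistics.pstdev(data))                  # like STDEV.P
print(statistics.stdev(data))                   # like STDEV.S
print(statistics.pvariance(data))               # like VAR.P
print(statistics.variance(data))                # like VAR.S
print(confidence_norm(0.05, 10, 100))
```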

Instructions for adding the Data Analysis tool, since it is not enabled by default in Excel. All temporary settings on the campus computers are cleared every day, so you may need to add it again.

Excel has these analysis programs built in, but they do not show up in the default settings when you open Excel. Do the following steps in Excel to make the functions available:

➢ Click the “File” button in the top left corner of Excel;
➢ Click “Options” at the bottom of the drop-down menu;
➢ Choose “Add-Ins”, second from the bottom of the list on the left side of the pop-up window;
➢ Choose “Analysis ToolPak” in the middle;
➢ Click “Go” at the bottom of the window;
➢ Check “Analysis ToolPak” at the top of the list in the pop-up window;
➢ Click OK.

Now when you go back to the Excel interface and click the “Data” tab, you will see “Data Analysis” on the right side of the ribbon. Click it and you will be able to perform different kinds of z-tests, t-tests and other analyses.
