OLS Assumptions
MULTICOLLINEARITY
TYPES
Y = β1 + β2X1 + β3X2 + β4X3 + u
• Perfect collinearity
• Imperfect collinearity
■ Multicollinearity, as we have defined it, refers only to linear
relationships among the X variables. It does not rule out nonlinear
relationships among them. For example, consider the following
regression model:
Y = β1 + β2X1 + β3X1² + β4X1³ + u
Here the regressors are functionally related, but the relationship is
nonlinear, so the no-multicollinearity assumption is not violated in the
strict sense.
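A common diagnostic for imperfect collinearity, not listed on the slide, is the variance inflation factor (VIF). A minimal sketch with simulated data; all numbers are illustrative assumptions:

```python
# Sketch: detecting imperfect collinearity with variance inflation factors.
# Simulated data for illustration only.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=n)  # imperfectly collinear with x1
x3 = rng.normal(size=n)                        # unrelated regressor

X = sm.add_constant(np.column_stack([x1, x2, x3]))
for i, name in zip(range(1, 4), ["X1", "X2", "X3"]):
    print(name, round(variance_inflation_factor(X, i), 2))
# Rule of thumb: a VIF above about 10 signals troublesome collinearity;
# here X1 and X2 will show inflated values, X3 will not.
```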
REMEDIES
■ Dropping a variable(s).
■ Transformation of variables.
For example, consider the first-difference transformation. If the model
Yt = β1 + β2X2t + β3X3t + ut
holds at time t, it also holds at time t−1:
Yt−1 = β1 + β2X2,t−1 + β3X3,t−1 + ut−1
Subtracting the second equation from the first gives
Yt − Yt−1 = β2(X2t − X2,t−1) + β3(X3t − X3,t−1) + vt
where vt = ut − ut−1. Differencing often reduces the severity of
multicollinearity, because the levels of economic variables tend to move
together more closely than their changes do.
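A minimal sketch of this first-difference transformation, assuming a time-ordered pandas DataFrame with hypothetical columns Y, X2, and X3:

```python
# Sketch: estimating the first-differenced model. The intercept β1 cancels
# when differencing, so the regression is fit without a constant.
import pandas as pd
import statsmodels.formula.api as smf

def first_difference_fit(df: pd.DataFrame):
    d = df[["Y", "X2", "X3"]].diff().dropna()        # Y_t - Y_{t-1}, etc.
    return smf.ols("Y ~ X2 + X3 - 1", data=d).fit()  # "- 1" drops the intercept
```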
HETEROSCEDASTICITY
Meaning
■ Heteroscedasticity means that the conditional variance of the error term
is not constant across observations: Var(ui | Xi) = σi², rather than the
constant σ² assumed under homoscedasticity.
Graphical representation of heteroscedasticity
• The graph (not reproduced here) shows heteroscedastic data in which
the conditional variance of Y varies as X changes.
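Because the graph itself is not reproduced, the following sketch simulates such data and verifies the pattern numerically; all parameter values are illustrative assumptions:

```python
# Sketch: heteroscedastic data where the error spread grows with X,
# so the conditional variance of Y changes along X.
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.uniform(1, 10, size=n)
y = 2.0 + 0.5 * x + rng.normal(scale=0.3 * x)  # error sd proportional to x

low, high = x < 4, x > 7
print("var(Y | small X):", round(float(y[low].var()), 2))   # small spread
print("var(Y | large X):", round(float(y[high].var()), 2))  # much larger spread
```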
Sources
1. The chosen regression model may not suit the data: the functional
form or the data transformation may be incorrect.
2. An important variable may have been omitted from the model. This
violates the correct-specification assumption of the CLRM.
3. Heteroscedasticity can also arise from the presence of outliers.
4. Skewness in the distribution of one or more regressors included in
the model may also lead to heteroscedasticity. Variables such as
education, income, and wealth, whose distributions are generally
uneven, tend to produce heteroscedastic data.
Consequences
1. Heteroscedasticity does not alter the unbiasedness and consistency
properties of OLS estimators.
2. But OLS estimators are no longer of minimum variance or
efficient. That is, they are not best linear unbiased estimators
(BLUE); they are simply linear unbiased estimators (LUE).
3. As a result, the t and F tests based on the standard assumptions of
the CLRM may not be reliable, leading to erroneous conclusions
about the statistical significance of the estimated regression
coefficients.
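A common practical response to point 3, not covered on the slide, is to keep the OLS point estimates but replace the usual standard errors with heteroscedasticity-consistent (White) ones. A minimal sketch, assuming y and X are already-prepared arrays:

```python
# Sketch: OLS with White (HC1) robust standard errors, so the reported
# t statistics remain usable when the error variance is not constant.
import statsmodels.api as sm

def fit_with_robust_se(y, X):
    X = sm.add_constant(X)
    res = sm.OLS(y, X).fit(cov_type="HC1")  # robust covariance matrix
    return res  # res.summary() then reports robust standard errors
```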
Detection: Graphical
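The slide's graphs are not reproduced; the standard graphical check is to plot the residuals against the fitted values and look for a fanning-out (funnel) pattern. A minimal sketch, assuming res is a fitted statsmodels OLS results object:

```python
# Sketch: residuals-vs-fitted plot for spotting heteroscedasticity.
import matplotlib.pyplot as plt

def residual_plot(res):
    plt.scatter(res.fittedvalues, res.resid, s=10)
    plt.axhline(0, linestyle="--")
    plt.xlabel("Fitted values")
    plt.ylabel("Residuals")
    plt.title("Residuals vs fitted: a funnel shape suggests heteroscedasticity")
    plt.show()
```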
Detection: Mathematical
▰ Park test
▰ Glejser test
▰ Spearman’s rank correlation test
▰ Goldfeld-Quandt test
▰ Breusch-Pagan-Godfrey test
▰ White’s general heteroscedasticity test
▰ Koenker-Bassett test
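Two of these tests are available directly in statsmodels. A minimal sketch, assuming res is a fitted OLS results object:

```python
# Sketch: Breusch-Pagan-Godfrey and White tests for heteroscedasticity.
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

def heteroscedasticity_tests(res):
    lm, lm_p, f, f_p = het_breuschpagan(res.resid, res.model.exog)
    print(f"Breusch-Pagan LM p-value: {lm_p:.4f}")
    lm, lm_p, f, f_p = het_white(res.resid, res.model.exog)
    print(f"White LM p-value:         {lm_p:.4f}")
    # A small p-value rejects the null hypothesis of homoscedasticity.
```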
An Example
Factors that determine the abortion rate across the 50 states in the USA:
▰ State = name of the state (50 US states).
▰ ABR=Abortion rate, number of abortions per thousand women
aged 15–44 in 1992.
▰ Religion = the percent of a state’s population that is Catholic,
Southern Baptist, Evangelical, or Mormon.
▰ Price = the average price charged in 1993 in non-hospital facilities
for an abortion at 10 weeks with local anesthesia (weighted by the
number of abortions performed in 1992).
▰ Laws = a variable that takes the value of 1 if a state enforces a law
that restricts a minor’s access to abortion, 0 otherwise.
▰ Funds = a variable that takes the value of 1 if state funds are
available for use to pay for an abortion under most circumstances,
0 otherwise.
▰ Educ = the percent of a state’s population that is 25 years or older
with a high school degree (or equivalent), 1990.
▰ Income = disposable income per capita, 1992.
▰ Picket = the percentage of respondents that reported experiencing
picketing with physical contact or blocking of patients.
The Econometric Model
Abortioni = β1 + β2Religioni + β3Pricei + β4Lawsi + β5Fundsi + β6Educi +
β7Incomei + β8Picketi + ui
where i = 1, 2, …, 50.
We would expect ABR to be negatively related to Religion, Price, Laws,
Picket, and Educ, and positively related to Funds and Income. We assume
the error term satisfies the standard classical assumptions, including the
assumption of homoscedasticity.
Of course, we will do a post-estimation analysis to see if this assumption
holds in the present case.
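A sketch of this estimation with the statsmodels formula API; the file name abortion.csv is a hypothetical placeholder, and the columns are assumed to match the variable names defined above:

```python
# Sketch: estimating the abortion-rate model and printing the results.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("abortion.csv")  # hypothetical data file
res = smf.ols(
    "ABR ~ Religion + Price + Laws + Funds + Educ + Income + Picket",
    data=df,
).fit()
print(res.summary())
# Post-estimation: check homoscedasticity with the Breusch-Pagan/White
# sketch shown earlier before trusting the t statistics.
```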
AUTOCORRELATION
Introduction
■ Autocorrelation exists when the error terms are correlated across
observations: E(uiuj) ≠ 0 for i ≠ j.
■ First-order autocorrelation: ut = ρut−1 + εt, where −1 < ρ < 1 and εt
is a well-behaved error term.
Patterns of positive and negative autocorrelation
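The pattern graphs are not reproduced here; the sketch below simulates first-order autocorrelated errors with positive and negative ρ so the two patterns can be inspected or plotted:

```python
# Sketch: simulating AR(1) errors u_t = rho * u_{t-1} + e_t.
import numpy as np

def ar1_errors(rho, n=100, seed=0):
    rng = np.random.default_rng(seed)
    e = rng.normal(size=n)
    u = np.zeros(n)
    for t in range(1, n):
        u[t] = rho * u[t - 1] + e[t]
    return u

u_pos = ar1_errors(0.8)   # positive autocorrelation: long runs of one sign
u_neg = ar1_errors(-0.8)  # negative autocorrelation: rapid sign switching
print(round(float(np.corrcoef(u_pos[1:], u_pos[:-1])[0, 1]), 2))  # near +0.8
print(round(float(np.corrcoef(u_neg[1:], u_neg[:-1])[0, 1]), 2))  # near -0.8
```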
Causes
■ Inertia
■ Specification bias
■ Cobweb phenomenon
■ Lags or autoregression
■ Manipulation of data
■ Nonstationarity
■ Data transformation
Consequences
■ Residual variance is underestimated
■ R² is overestimated
■ F and t test results are misleading
Overall, the results are unreliable
Tests to detect
■ Graphical method
■ Durbin-Watson d test
■ Runs test
■ Breusch-Godfrey (BG) test
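The Durbin-Watson and Breusch-Godfrey tests can be run directly in statsmodels (the graphical and runs methods are described below). A minimal sketch, assuming res is a fitted OLS results object:

```python
# Sketch: two autocorrelation tests from the list above.
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

def autocorrelation_tests(res, nlags=1):
    dw = durbin_watson(res.resid)  # d near 2 suggests no first-order autocorrelation
    print(f"Durbin-Watson d: {dw:.3f}")
    lm, lm_p, f, f_p = acorr_breusch_godfrey(res, nlags=nlags)
    print(f"Breusch-Godfrey LM p-value: {lm_p:.4f}")
```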
Graphical method
1. A lag-1 plot graphs the residuals ût against their one-period lagged
values ût−1; a systematic pattern (rather than a random scatter)
suggests autocorrelation, as in the sketch below.
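A minimal sketch of such a lag-1 plot, assuming resid is an array of OLS residuals:

```python
# Sketch: lag-1 residual plot; an upward drift suggests positive
# autocorrelation, a downward drift suggests negative autocorrelation.
import matplotlib.pyplot as plt

def lag1_plot(resid):
    plt.scatter(resid[:-1], resid[1:], s=10)
    plt.xlabel(r"$\hat{u}_{t-1}$")
    plt.ylabel(r"$\hat{u}_t$")
    plt.title("Lag-1 residual plot")
    plt.show()
```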
Runs test
■ Interpretation of results
If R, the number of runs, lies within the confidence interval, we do not
reject the null hypothesis of randomness. However, if R lies outside the
confidence interval, we reject the null hypothesis.
Example
(---------) (+++++++++++++++++++++) (----------)
Here R = 3, which lies outside the confidence interval. Therefore, we
reject the null hypothesis and conclude that the residuals exhibit
autocorrelation.
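A minimal sketch of the runs test computed from scratch on the residual signs, using the usual normal approximation for the number of runs R:

```python
# Sketch: runs test. Under the null of random residual signs,
# R is approximately normal with the mean and variance below.
import numpy as np
from scipy.stats import norm

def runs_test(resid):
    signs = np.sign(resid)
    signs = signs[signs != 0]                # drop exact zeros
    n1 = int(np.sum(signs > 0))              # number of positive residuals
    n2 = int(np.sum(signs < 0))              # number of negative residuals
    n = n1 + n2
    R = 1 + int(np.sum(signs[1:] != signs[:-1]))  # number of runs
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n**2 * (n - 1))
    z = (R - mean) / np.sqrt(var)
    return R, 2 * norm.sf(abs(z))            # runs and two-sided p-value
```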
Remedial measures