Chapter 3 Heteroscedasticity
It is assumed that $Var(\mu_i) = \sigma^2$ for all $i$ (homoscedasticity, or constant variance of $\mu_i$).
Causes of heteroscedasticity
(a) Misspecification: Some economic variables such as the consumer price index
(CPI) or GDP, tend to increase linearly or exponentially. If such variables are omitted
from the regression, they will be absorbed in the error term 𝜇𝑡 which will then exhibit
changing variance. For example, suppose the model $Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \mu_t$ is wrongly specified as $Y_t = \beta_0 + \beta_1 X_{1t} + v_t$. If $X_{2t}$ is increasing with time, so will $v_t$ (since $v_t = \beta_2 X_{2t} + \mu_t$).
(b) Data collection procedures: Sampling procedures such as cluster sampling can
easily generate unequal variances.
(c) Stratification: Different economic units or populations are hardly homogeneous.
Data from two different groups of the population can exhibit unequal variances for
many reasons. For example, income figures for low- and high-income groups generally
show different variability or spread of values.
(d) Presence of outliers: The inclusion of an outlier, especially if the sample size is
small, substantially alters the results of regression analysis. Heteroscedasticity can
arise as a result of the presence of outliers.
(e) Data treatment: Data manipulation, such as aggregation and grouping techniques,
tends to produce marked heterogeneity. The use of indices and the choice or change
of base year can also cause heteroscedasticity.
OLS estimation in the presence of Heteroscedasticity
What happens to the OLS estimators and their variances if we introduce
heteroscedasticity, i.e. $E(\mu_i^2) = \sigma_i^2$? Consider the model
𝑌𝑖 = 𝛽1 + 𝛽2 𝑋𝑖 + 𝜇𝑖
$$\hat{\beta}_2=\frac{n\sum_{i=1}^{n}X_iY_i-\sum_{i=1}^{n}X_i\sum_{i=1}^{n}Y_i}{n\sum_{i=1}^{n}X_i^2-\left(\sum_{i=1}^{n}X_i\right)^2} \qquad (1)$$

$$Var(\hat{\beta}_2)=\frac{\sum_{i=1}^{n}X_i^2\sigma_i^2}{\left(\sum_{i=1}^{n}X_i^2\right)^2} \qquad (2)$$

which is different from the usual formula obtained under the assumption of homoscedasticity,

$$Var(\hat{\beta}_2)=\frac{\sigma^2}{\sum_{i=1}^{n}X_i^2} \qquad (3)$$
• From (1), $\hat{\beta}_2$ is still linear, unbiased and consistent, but it is no longer best (minimum variance); its variance is now given by Equation (2) rather than the usual Equation (3).
• Confidence intervals based on (2) will be unnecessarily large. As a result, the t and F tests are likely to give inaccurate results, since $Var(\hat{\beta}_2)$ is overly large.
• The model coefficients are imprecisely estimated.
• The estimated model has low predictive power.
Heteroscedasticity does not destroy the unbiasedness and consistency properties of
the OLS estimators, but these estimators are no longer minimum variance or efficient;
that is, they are not BLUE.
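As a hedged illustration of these consequences, here is a minimal Monte Carlo sketch in Python; the sample size, coefficient values, and the variance pattern $\sigma_i^2=\sigma^2X_i^2$ are assumptions made for the demonstration, not part of the notes. It shows that the OLS slope stays close to its true value (unbiasedness) while the usual homoscedastic formula (3) misstates its true sampling variability.

```python
import numpy as np

# Minimal Monte Carlo sketch (assumed values: n, beta1, beta2, sigma, and the
# pattern sigma_i^2 = sigma^2 * X_i^2 are illustrative, not from the notes).
rng = np.random.default_rng(0)
n, beta1, beta2, sigma = 50, 2.0, 0.5, 1.0
X = np.linspace(1, 10, n)

b2_draws, usual_se_draws = [], []
for _ in range(5000):
    u = rng.normal(0.0, sigma * X)                # Var(u_i) = sigma^2 * X_i^2
    Y = beta1 + beta2 * X + u
    x, y = X - X.mean(), Y - Y.mean()             # deviations from the means
    b2 = (x @ y) / (x @ x)                        # OLS slope, equivalent to Equation (1)
    resid = y - b2 * x
    s2 = resid @ resid / (n - 2)                  # usual estimator of sigma^2
    b2_draws.append(b2)
    usual_se_draws.append(np.sqrt(s2 / (x @ x)))  # usual formula, Equation (3)

print("mean of beta2_hat        :", np.mean(b2_draws))        # close to 0.5: unbiased
print("true sampling sd of b2   :", np.std(b2_draws))
print("mean of the usual OLS se :", np.mean(usual_se_draws))  # misstates the sd above
```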
Testing for Heteroscedasticity (Formal tests)
• The Goldfeld-Quandt Test,
• Breusch-Pagan-Godfrey Test
• The White Test
The Goldfeld-Quandt Test
Purpose:
The Goldfeld-Quandt test is used to check for heteroscedasticity in a regression model.
Assumptions:
• The error terms 𝜇𝑖 are uncorrelated and normally distributed.
• Heteroscedasticity is assumed to be positively related to one of the explanatory
variables in the regression model, i.e. $\sigma_i^2 = \sigma^2 X_i^2$.
Hypotheses:
$H_0$: Homoscedasticity
$H_1$: Heteroscedasticity
Test Statistic:
To compute it, order the observations by the values of $X_i$, omit $c$ central observations, and fit the regression separately to the first $(n-c)/2$ and the last $(n-c)/2$ observations. The test statistic is given by

$$\lambda=\frac{RSS_{\max}/[(n-c-2p)/2]}{RSS_{\min}/[(n-c-2p)/2]},$$

where $RSS_{\max}$ and $RSS_{\min}$ are the larger and smaller of the two residual sums of squares and $p$ is the number of parameters estimated in each regression.
Decision Rule:
Reject the null hypothesis of homoscedasticity if $\lambda$ ($F_{calculated}$) $> F_{cv}$, the critical value of the F distribution with $(n-c-2p)/2$ degrees of freedom in both the numerator and the denominator.
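As a hedged illustration, the test can be run in Python with statsmodels (an assumed tool choice; the data below are made up and are not those of Table 3.1):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Illustrative (made-up) data, not Table 3.1: variance grows with income X.
rng = np.random.default_rng(1)
X = np.sort(rng.uniform(80, 300, size=30))        # observations ordered by X
Y = 10 + 0.6 * X + rng.normal(0, 0.05 * X)        # sigma_i proportional to X_i

exog = sm.add_constant(X)
# drop=0.2 omits the central 20% of observations (the c middle observations);
# with alternative='increasing' the returned F compares the high-X sub-sample
# with the low-X one, i.e. it equals the lambda above when variance grows with X.
f_stat, p_value, _ = het_goldfeldquandt(Y, exog, drop=0.2, alternative="increasing")
print(f"Goldfeld-Quandt F = {f_stat:.3f}, p-value = {p_value:.4f}")
# Reject H0 (homoscedasticity) when the p-value is below the chosen significance level.
```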
Example:
Table 3.1: Hypothetical data on consumption expenditure 𝑌/𝑅 and income 𝑋/𝑅 to
illustrate the Goldfeld-Quandt test
Breusch-Pagan-Godfrey Test
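A minimal sketch of running the Breusch-Pagan test in Python with statsmodels (an assumed tool choice; the data below are made up):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

# Illustrative (made-up) data with variance increasing in X.
rng = np.random.default_rng(2)
X = rng.uniform(1, 10, size=100)
Y = 3 + 1.5 * X + rng.normal(0, 0.4 * X)

exog = sm.add_constant(X)
resid = sm.OLS(Y, exog).fit().resid
# het_breuschpagan regresses the squared OLS residuals on exog and reports an LM test.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(resid, exog)
print(f"Breusch-Pagan LM = {lm_stat:.3f}, p-value = {lm_pvalue:.4f}")
# A small p-value leads to rejecting H0: homoscedasticity.
```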
The White test using SAS: H/W
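The homework asks for SAS; purely as an assumed Python cross-check (not the assigned SAS solution), statsmodels also provides het_white:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

# Illustrative (made-up) data; het_white builds the auxiliary regression from the
# regressors plus their squares and cross-products, so no variance pattern is assumed.
rng = np.random.default_rng(3)
X = rng.uniform(1, 10, size=100)
Y = 3 + 1.5 * X + rng.normal(0, 0.4 * X)

exog = sm.add_constant(X)
resid = sm.OLS(Y, exog).fit().resid
lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, exog)
print(f"White LM = {lm_stat:.3f}, p-value = {lm_pvalue:.4f}")
```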
Remedies for Heteroscedasticity
If the pattern of heteroscedasticity can be established, an appropriate transformation
of the original model/data is applied so that the transformed model/data no longer
exhibit heteroscedasticity. This amounts to using some type of generalized least
squares (GLS), e.g. weighted least squares (WLS); a short WLS sketch follows the worked example below.
Proposition 4.1:
Let the relationship between the error variance and the explanatory variable be given by
$\sigma_t^2 = f(X_t)$; then the transformation of the original model consists of dividing
the original relationship through by the square root of $f(X_t)$.
Example:
Assume that the form of heteroscedasticity is $E(\mu_i^2)=\sigma_{\mu_i}^2=\sigma^2X_i^2$ in the model $Y_i=\beta_1+\beta_2X_i+\mu_i$.
Solution:
Dividing the model through by $X_i$ gives

$$\frac{Y_i}{X_i}=\frac{\beta_1}{X_i}+\beta_2+\frac{\mu_i}{X_i},$$

i.e.

$$Y_i^{*}=\beta_1\frac{1}{X_i}+\beta_2+v_i,$$

where $v_i=\mu_i/X_i$ is the transformed error term. Its variance is now constant:

$$E(v_i^2)=E\!\left(\frac{\mu_i}{X_i}\right)^{2}=\frac{1}{X_i^2}E(\mu_i^2)=\sigma^2.$$
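A minimal weighted-least-squares sketch in Python (assumed data and tooling, for illustration only): when $E(\mu_i^2)=\sigma^2X_i^2$, WLS with weights $1/X_i^2$ reproduces OLS on the transformed model above.

```python
import numpy as np
import statsmodels.api as sm

# Assumed (made-up) data: when E(u_i^2) = sigma^2 * X_i^2, weighted least squares
# with weights 1/X_i^2 is numerically the same as OLS on the transformed model.
rng = np.random.default_rng(4)
X = rng.uniform(1, 10, size=100)
Y = 2 + 0.8 * X + rng.normal(0, 0.3 * X)

# WLS on the original model with weights 1/X_i^2
wls = sm.WLS(Y, sm.add_constant(X), weights=1.0 / X**2).fit()

# OLS on the transformed model: Y_i/X_i regressed on a constant and 1/X_i
ols_transformed = sm.OLS(Y / X, sm.add_constant(1.0 / X)).fit()

print(wls.params)               # [beta1_hat, beta2_hat]
print(ols_transformed.params)   # [beta2_hat, beta1_hat] -- same values, reordered
```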
Assumptions About the Patterns of Heteroscedasticity
Assumption 1: The error variance is proportional to 𝑋𝑖2 .
𝐸(𝜇𝑖2 ) = 𝜎 2 𝑋𝑖2 .
𝐸(𝑣𝑖2 ) = 𝜎 2 (Verify)
Assumption 3: The error variance is proportional to the square of the mean value of
𝑌.
𝐸(𝜇𝑖2 ) = 𝜎 2 [𝐸(𝑌𝑖 )]2 .
Now, 𝐸(𝑌𝑖 ) = 𝛽1 + 𝛽2 𝑋𝑖 .
Therefore, if we transform the original equation as follows,
$$\frac{Y_i}{E(Y_i)}=\frac{\beta_1}{E(Y_i)}+\beta_2\frac{X_i}{E(Y_i)}+\frac{\mu_i}{E(Y_i)}=\beta_1\frac{1}{E(Y_i)}+\beta_2\frac{X_i}{E(Y_i)}+v_i,$$

where $v_i=\dfrac{\mu_i}{E(Y_i)}$.
𝐸(𝑣𝑖2 ) = 𝜎 2 (Verify)
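Since $E(Y_i)$ is non-stochastic it can be taken outside the expectation, so the verification is one line:

$$E(v_i^2)=E\!\left(\frac{\mu_i}{E(Y_i)}\right)^{2}=\frac{E(\mu_i^2)}{[E(Y_i)]^{2}}=\frac{\sigma^{2}[E(Y_i)]^{2}}{[E(Y_i)]^{2}}=\sigma^{2}.$$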