Econometrics Module
Distance Education
Mekelle, Tigray
January 2022
Outline
Introduction: Review of Econometrics (High attention)
Simple Linear Regression Analysis (High attention)
Multiple Linear Regression Analysis (High attention)
Violations of OLS Assumptions (High attention)
Simultaneous Equation Model and Instrumental Variables
Limited Dependent Variable Analysis (High attention)
Time Series Analysis (High attention)
Panel Data
Model Selection (High attention)
Examples (High attention)
Introduction: Review of Econometrics
• Research in economics, finance, management, marketing, and many other disciplines is becoming
increasingly quantitative.
• It involves estimation of different parameters or functions and quantification of different qualitative
information and hypotheses.
• These are achieved with the aid of econometric tools and techniques
Definition of Econometrics
What is Econometrics?
• The term econometrics is formed from two Greek words:
• oikonomia, which means ‘economy’, and
• metron, which means ‘measure’.
• Simply put, econometrics means economic measurement.
• Thus, econometrics is defined as a science which deals with the measurement of economic relationships.
• It is a combination of;
• Economic theory,
• Mathematical economics and
• Statistics
• Thus, econometrics may be defined as a subject matter in which the tools of economic theory, mathematics,
and statistical inference are applied to the analysis of economic phenomena.
• Economics provides the economic theory
• Mathematics expresses what economic theory states verbally in mathematical form (a model)
• Statistics provides the methods for estimating the coefficients of economic relationships
• But, Econometrics is completely different from any one of these disciplines.
Econometrics Vs. Mathematical Economics
• Mathematical economics states economic theory in terms of mathematical symbols.
There is no essential difference between mathematical economics and economic theory:
• Both state the same relationships, but while economic theory uses verbal exposition, mathematical economics
expresses the relationships in terms of mathematical symbols.
• Both express economic relationships in an exact or deterministic form.
• Neither mathematical economics nor economic theory allows for random elements which might affect the
relationship and make it stochastic.
• Econometrics does not assume exact or deterministic relationships; it accounts for random disturbances.
Econometrics Vs. Statistics
• Econometrics differs from economic statistics. Economic statistics is mainly a descriptive aspect of
economics.
• Economic statistics does not provide explanations of the development of the various variables.
• Economic statistics does not provide measurements of the coefficients of economic relationships.
• Econometric methods are adjusted so that they become appropriate for the measurement of economic relationships
which are stochastic.
Economic Models Vs. Econometric Models
• Economic Model is an organized set of relationships that describes the functioning of an economic entity
under a set of simplifying assumptions.
• Econometric models contain a random element which is ignored by mathematical economic models.
Example: Economic theory postulates that demand for a good depends on its price, on the prices of other related
commodities, on consumers’ income and on tastes.
• Economic model (exact relationship): Q = f(P, P0, Y, t)
• Econometric model (with a random disturbance): Q = f(P, P0, Y, t) + u
(where P is the good’s own price, P0 the prices of related commodities, Y consumers’ income, t tastes and u the random disturbance)
Methodology of Econometrics
• Starting with the postulated theoretical relationships among economic variables, econometric research or
inquiry generally proceeds along the following lines/stages.
• Specification of the model
• Estimation of the model
• Evaluation of the estimates
• Evaluation of the forecasting power of the estimated model
Specification of the model
• In this step the econometrician has to express the relationships between economic variables in mathematical
form. This involves determining:
1. The dependent and independent (explanatory) variables which will be included in the model.
Dependent variable = f (independent variables)
2. A priori theoretical expectations about the size and sign of the parameters of the function.
3. The mathematical form of the model (number of equations, specific functional form of the equations (linear or
non-linear), etc.)
• The most common errors of specification are:
a) Omissions of some important variables from the function.
b) The omissions of some equations (for example, in simultaneous equations model).
c) The mistaken mathematical form of the functions.
Estimation of the model
• This stage includes;
• Gathering of the data on the variables included in the model.
• Examination of the identification conditions of the function (especially for simultaneous
equations models).
• Checking for existence of any problems in the data such as aggregations problems.
• Examination of the degree of correlation between the explanatory variables (i.e. examination
of the problem of multicollinearity).
• Choice of the appropriate econometric technique for estimation, i.e. deciding on a specific econometric method to be applied
in estimation, such as OLS, logit, probit, fixed effects, random effects and so on.
Evaluation of the estimates
• This stage consists of deciding whether the estimates of the parameters are theoretically meaningful and
statistically satisfactory.
Evaluation criteria involves:
• Economic a priori criteria: refer to the size and sign of the parameters
• Statistical criteria (first-order tests): evaluate the statistical reliability of the estimates of the parameters
• Econometric criteria (second-order tests): Econometric criteria aim at the detection of the violation or
validity of the assumptions of the various econometric techniques.
Evaluation of the forecasting power of the model
• The model may be economically meaningful and statistically and econometrically correct; yet it may not be
suitable for forecasting due to various factors (reasons).
• Therefore, this stage involves the investigation of the stability of the estimates and their sensitivity to
changes in the size of the sample.
• Consequently, we must establish whether the estimated function performs adequately outside the sample of
data, i.e. we must test the extra-sample (out-of-sample) performance of the model.
Goals of Econometrics
• Three main goals of Econometrics are identified:
– Analysis i.e. testing economic theory
– Policy making i.e. Obtaining numerical estimates of the coefficients of economic relationships for
policy simulations.
– Forecasting i.e. using the numerical estimates of the coefficients in order to forecast the future values
of economic magnitudes.
The Sources, Types and Nature of Data
• There are several types of data with which economic/econometric analysis can be done.
– Cross-sectional data: collected from different parties or entities at a given point in time.
– Time series data: consists of observations on a variable or several variables over time.
– Panel or longitudinal data: consists of a time series for each cross-sectional member in the data set.
The basic linear regression model describing the relationship between Y and X is given by:

Yi = β0 + β1Xi + εi ,   i = 1, 2, ..., n   (1.1)

Equation (1.1) represents the classical linear regression model with a single regressor. Y is the dependent
(explained) variable and X is the independent (explanatory) variable. The constants β0 and β1 are the
parameters or coefficients of the model: β0 is the intercept and β1 is the slope. The slope β1 is the change in
Y associated with a one-unit change in X. The intercept β0 is the value of the regression line (Y) when
X = 0, the point at which the regression line intersects the Y axis. The random component of the model
(referred to more commonly as the error term), which represents the other factors (variables) not included
in the model but that determine the dependent variable (Y), is represented by ε. The subscript i indexes the
unit of observation. The coefficients describe both the direction and magnitude of the relationship between
the dependent variable, Y, and the independent variable, X.
Table: income–consumption data for eight individuals

Individual (i):          1     2     3     4     5     6     7     8
Consumption (Birr), Y:   4     3     3.5   2     3     3.5   2.5   2.5
Income (Birr), X:        21    15    15    9     12    18    6     12
Figure 1.1: Representation of income–consumption data by different lines
As you can see, many straight lines can be drawn (chosen) to fit the points, say in this case L1, L2
and L3. But, if we want to represent (fit) all the points by a single line, which line fits best to the
scattered points? There exists a procedure to get the line of best fit involving deviations and
squared deviations. The line of best fit is the line that minimizes the sum of squared deviations of
the points of the graph from the points of the straight line. The deviations are measured by the
vertical distance between the straight line and the scattered points of the graph. The process of
fitting the line that fits best the scattered points using the principle of minimum sum of squared
deviations is called the Ordinary Least Squares (OLS).
Figure 1.2: Deviations of the scattered points (from Line 3)
The coefficients β0 and β1 are unknown, but we can use the Least Squares procedure to estimate
them. Figure 1.3 shows how the parameters of the linear regression model are estimated using the
Least Squares method. In order to do this, we must use the data to estimate the unknown
slope (β1) and intercept (β0) of the regression line.
Figure 1.3: Constructing the regression line using Least Squares (the scatter of observed points around the fitted line Ŷ = β̂0 + β̂1X, with the deviation Yi − Ŷi shown for a typical observation Xi)
To estimate the coefficients through minimizing the sum of squared deviations, we go through
the following process, where Ŷi = β̂0 + β̂1Xi represents the equation of the fitted straight line with
intercept β̂0 and slope β̂1. In this notation, Yi is the actual value of Y for observation i and Xi is the
corresponding value of X for that observation, while n is the number of observations. Ŷi, called the
fitted (predicted or estimated) value, is the value of Y on the straight line associated with observation
Xi. The deviation (residual) is calculated by subtracting the fitted value (Ŷi) from the actual value (Yi).
Thus, for each observation on X there is a corresponding deviation of the fitted value from the actual
value of Y. Our objective is to minimize the sum of squared deviations, and we can use elementary
calculus to derive the coefficient estimates: we take partial derivatives with respect to β̂0 and β̂1, set
each first-order condition equal to zero, and solve the resulting simultaneous equations for the
coefficients.
Minimize  Σ (Yi − Ŷi)² = Σ (Yi − β̂0 − β̂1Xi)²   (1.2)

∂/∂β̂0 Σ (Yi − β̂0 − β̂1Xi)² = −2 Σ (Yi − β̂0 − β̂1Xi) = 0   (1.3)

∂/∂β̂1 Σ (Yi − β̂0 − β̂1Xi)² = −2 Σ (Yi − β̂0 − β̂1Xi) Xi = 0   (1.4)

(all sums running over i = 1, ..., n)
By rearranging equations (1.3) and (1.4) we obtain a pair of simultaneous equations, called
Normal Equations, given below.
Σ Yi = n β̂0 + β̂1 Σ Xi

Σ XiYi = β̂0 Σ Xi + β̂1 Σ Xi²

Now we can solve for β̂0 and β̂1 simultaneously (for example, by multiplying the first normal equation by
Σ Xi and the second by n and subtracting), which gives

β̂1 = [n Σ XiYi − Σ Xi Σ Yi] / [n Σ Xi² − (Σ Xi)²]

β̂0 = (Σ Yi − β̂1 Σ Xi) / n = Ȳ − β̂1X̄
Having obtained the coefficients of the model ( b 0 and b1 ), they need to be incorporated into the
model and the standard errors of the respective coefficients have to be estimated as well and
fitted within the model as follows:
To calculate the standard errors of each coefficient, we first need the estimated
population variance (σ̂²)1. The estimate of the population variance is calculated as follows:

σ̂² = Σ êi² / (n − 2) = Σ (Yi − β̂0 − β̂1Xi)² / (n − 2)   (1.5)

The standard errors of the estimated coefficients are then

s.e.(β̂1) = sqrt[ σ̂² / Σ(Xi − X̄)² ]    and    s.e.(β̂0) = sqrt[ σ̂² ΣXi² / (n Σ(Xi − X̄)²) ]
Notice that the standard errors of the coefficients (s.e.(β̂1) and s.e.(β̂0)) measure the dispersion of the
estimates about their means. How do they differ from s = √σ̂²? Remember that s is the standard
error of the regression, which measures the dispersion of the error term around the
regression line.
The formulae will be less complicated if we write the Least Square estimates in terms of
variables that are expressed as deviations from their respective sample means. Hence, we
transform the data to deviations form by expressing each observation of X and Y in terms of
deviations from their respective means. The deviations form is represented by lower-case letters:

xi = Xi − X̄   and   yi = Yi − Ȳ
Summing the population regression function, Yi = b 0 + b1 X i + e i over all n observations and
dividing it by n , we find that
1 Notice that the square root of σ̂² (that is, s) is what Stata reports as the Root MSE in its regression output.
Example 1.1: manual estimation of coefficients using OLS

Obs (i)    Yi     Xi     Yi − Ȳ    Xi − X̄    Xi²     XiYi
1          4      21      1         7.5      441      84
2          3      15      0         1.5      225      45
3          3.5    15      0.5       1.5      225      52.5
4          2       9     -1        -4.5       81      18
5          3      12      0        -1.5      144      36
6          3.5    18      0.5       4.5      324      63
7          2.5     6     -0.5      -7.5       36      15
8          2.5    12     -0.5      -1.5      144      30
Sum        24    108      0         0       1620     343.5

(with Ȳ = 24/8 = 3 and X̄ = 108/8 = 13.5)

1. β̂1 = [n ΣXiYi − ΣXi ΣYi] / [n ΣXi² − (ΣXi)²]
      = [8(343.5) − (24)(108)] / [8(1620) − (108)²]
      = (2748 − 2592) / (12960 − 11664) = 156/1296 ≈ 0.12

2. β̂0 = (ΣYi − β̂1 ΣXi) / n = (24 − 0.12 × 108) / 8 = (24 − 12.96) / 8 = 1.38

So the fitted (estimated) regression line of the income–consumption data is written as Ŷ = 1.38 + 0.12X, with the
standard errors of the estimated coefficients reported in parentheses beneath them (the slope’s standard error is
0.026). That is how a fitted regression line is normally presented.
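The same estimates can be checked in Stata. The following minimal sketch simply types in the eight observations from the table above and runs OLS; the reported values should match the manual calculation up to rounding:

clear
input Y X
4 21
3 15
3.5 15
2 9
3 12
3.5 18
2.5 6
2.5 12
end
regress Y X
* the slope on X should be about 0.12 and the constant about 1.38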
Continuing from before Example 1.1 (summing the population regression function over all n observations and dividing by n):

(1/n) Σ Yi = β0 + β1 (1/n) Σ Xi + (1/n) Σ εi   ⇒   Ȳ = β0 + β1X̄ + ε̄

Subtracting this averaged equation from the population regression function and combining like terms gives

Yi − Ȳ = β1(Xi − X̄) + (εi − ε̄)   ⇒   yi = β1xi + (εi − ε̄)   (1.6)
Equation (1.6) represents the regression function in deviations form; notice that the intercept,
β0, drops out. From equation (1.6), the estimated slope of the regression line is given by

β̂1 = Σ xiyi / Σ xi²
If you calculate the value of b1 in Example 1.1 using this formula, the result would be exactly the
same.
1. The model is linear in parameters, without regard to the linearity of the dependent and
independent variables. The relationship between X and Y is linear.
2. The error term, ε, is a random variable (taking real values):
i. With zero mean: the expected value of the error term is zero given any value of the
explanatory variable, X.
E(εi) = 0   ⇒   E(εi | Xi) = 0   for all i
2 Assumption is a statement that is assumed (considered to be true) and taken for granted without
proof and from which a conclusion can be drawn.
Thus, observing a high or a low value of X does not imply a high or a low value of ε. This
effectively means X and ε are uncorrelated. The implication is that changes in X are not
associated with changes in ε in any particular direction. Hence, the associated changes in Y can
be attributed to the effect of X. This assumption allows us to interpret the estimated coefficients
as reflecting causal effects of X on Y.
ii. With constant variance (homoskedastic or homoscedastic distribution)
Var(εi) = E[εi − E(εi)]² = E(εi²) = σ²
3. The error terms from any two observations are uncorrelated with each other, which implies
there is no autocorrelation (no serial correlation). When the observations are drawn
sequentially over time (time series data), we say that there is no serial correlation or no
autocorrelation. When the observations are cross sectional (survey data) we say that we have
no spatial correlation.
Cov(εi, εj) = 0   for all i, j, i ≠ j
4. The error term ( e ) is independent of the independent variables, X ’s. It follows that there is
no correlation between the random variable ( e ) and the explanatory variable ( X ). If two
variables are unrelated, then their covariance is zero.
Cov(εi, Xi) = 0   for all i
5. The variance of the independent variable X must be non-zero.
Var( X i ) > 0
This is a crucial requirement. To identify the effect of X on Y , it must be that we observe
situations with different values of X . In the absence of such variability, there is no information
about the effect of X on Y . It means that the values of the independent variable ( X ) should not
be constant.
6. The error term has a normal distribution with mean zero and constant variance.
εi ~ N(0, σ²)
If conditions 1–6 hold, OLS gives the Best Linear Unbiased Estimator (BLUE). We have an
unbiased estimator when, over repeated samples, the estimator on average gives us the true population
parameter. We have an efficient (best) estimator when we have an unbiased estimator that yields the
smallest variance for the coefficients, the β̂'s. We have a consistent estimator when the OLS estimate
approaches the true population parameter as the sample size grows.
A good starting point for this section is to notice that the Least Square estimates result from a
specific sample of observations of dependent and independent variables. The coefficients
produced from Least Squares are based on a single sample. It follows that the estimates may vary
from sample to sample. Remember also that the estimates of Least Square ( bˆ 0 and bˆ1 ) refer not
only to regression estimates from a specific sample but are also used to make inferences about
the population from which the sample is drawn (i.e. the estimator or formula which is also used
to compute the estimates from many different samples).
a) Unbiasedness
We want our estimator to be unbiased. Remember that there actually exist true values of the
coefficients for population regression function, which of course we do not know about. These
reflect the true underlying relationship between Y and X . We want to use a technique to
estimate these true coefficients. Our results will only be approximations to reality. An unbiased
estimator is such that the average of the estimates, across an infinite set of different samples of
the same size n, is equal to the TRUE value. This means that on average the estimator β̂ is
correct, even though any single estimate of β̂ for a particular sample of data may not equal β.
Mathematically, an unbiased estimator satisfies

E(β̂) = β   ⇔   E(β̂ − β) = 0

In other words, the average or expected value of β̂ is equal to the true value of β.
Table 1.2: illustration of an unbiased estimator
+--------------------------------------------------+
| Samples                        b̂0          b̂1      |
|--------------------------------------------------|
| Sample 1 1.21851 1.584188 |
| Sample 2 .82502 2.5564 |
| Sample 3 1.375252 1.32566 |
| Sample 4 .9216356 2.106887 |
| Sample 5 1.056685 2.11987 |
|--------------------------------------------------|
| Sample 6 1.048275 1.818525 |
| Sample 7 .9140797 1.657301 |
| Sample 8 .7885023 2.957194 |
| Sample 9 .658188 2.293599 |
| Sample 10 1.085249 2.345555 |
|--------------------------------------------------|
| Average across 10 samples .9891397 2.076518 |
| Average across 500 samples .9899374 2.004986 |
+--------------------------------------------------+
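Table 1.2 can be mimicked with a small simulation. The sketch below repeatedly draws samples, estimates the coefficients by OLS and averages them; the averages should be close to the true values, which is what unbiasedness means. The true values β0 = 1 and β1 = 2, the sample size of 50 and the 500 replications are arbitrary choices for illustration.

capture program drop onedraw
program define onedraw, rclass
    drop _all
    set obs 50
    gen x = 10*runiform()
    gen y = 1 + 2*x + rnormal()     // true model with b0 = 1, b1 = 2
    regress y x
    return scalar b0 = _b[_cons]
    return scalar b1 = _b[x]
end
simulate b0=r(b0) b1=r(b1), reps(500) nodots: onedraw
summarize b0 b1                      // the means should be close to 1 and 2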
An estimator is efficient if within the set of assumptions that we make, it provides the most
precise estimates in the sense that the variance is the lowest possible in the class of estimators we
are considering. How do we choose between the OLS estimator and any other unbiased
estimator? Our criterion is efficiency.
Var(β̂) ≤ Var(β̃)   for any other linear unbiased estimator β̃
The variance of an estimator is an inverse measure of its statistical precision, i.e., of its
dispersion or spread around its mean. The smaller the variance of an estimator, the more
statistically precise it is. A minimum variance estimator is therefore the statistically most precise
estimator of an unknown population parameter, although it may be biased or unbiased. Among
all the linear unbiased estimators, which one has the smallest variance? It is OLS. Thus,
efficiency which includes unbiasedness and minimum variance characteristics is also another
desirable property of an estimator.
iii. Consistency
An estimator is said to be consistent if β̂ approaches its true value β as the sample size gets
larger and larger (approaches infinity). More formally, β̂ is a consistent estimator of β if the
probability limit of β̂ is β. Given assumptions 1–6, the Ordinary Least Squares (OLS) estimator
is the Best Linear Unbiased Estimator (BLUE). This means that the OLS estimator is the
most efficient (least-variance) estimator in the class of linear unbiased estimators. This is known
as the Gauss-Markov Theorem.
The t-test analyzes the significance of each coefficient. Apart from the t-test, we can also test whether the
overall model is good, in the sense that the independent variables jointly explain the variation in the dependent
variable (stated otherwise, whether the coefficients of the model are jointly equal to zero). The
F-test enables us to perform this kind of test:

F(K−1, n−K) = Explained variance / Unexplained variance = [ESS/(K−1)] / [RSS/(n−K)] = [R²/(K−1)] / [(1−R²)/(n−K)]

which, in the two-variable model, reduces to β̂1² Σxi² / σ̂².

The multiple linear regression model is given by:

Yi = β0 + β1X1i + β2X2i + β3X3i + ... + βkXki + εi   (1.8)
where Y is the dependent variable, the X ' s are the independent variables and e is the error term.
This model is an extension of the two-variable model and the mechanics of the method of least
squares through minimizing the sum of squared deviations works in the same way. The multiple
regression model has to satisfy additional assumptions, in addition to the 6 assumptions specified for the
two-variable model. These additional assumptions, and the problems that arise when they fail, are discussed below.
1.6.1 Multicollinearity
One of the assumptions of the multiple regression models is that there is no exact linear
relationship between any of the independent variables in the model. If such exact linear
relationship exists, we say that the independent variables are perfectly collinear. Explanatory
variables are rarely uncorrelated with each other suggesting that multicollinearity is only a matter
of degree. As an illustration, let us assume that a dependent variable Y (say, grade point average)
is explained by the following independent variables:
Yi = b 0 + b1 X 1i + b 2 X 2i + b3 X 3i + e i
X 1 = family income, thousands of birr per year
X 2 = average hours of study per day
X 3 = average hours of study per week
It is easy to see that variables X 2 and X 3 are perfectly collinear because X 3 =7 X 2 for each
observation unit. When such an exact relationship exists among independent variables, it will be
impossible to calculate the least square estimates. The term multicollinearity is broadly used to
include the case of (very) high collinearity among two or more independent variables. In
practice, we are often faced with the more difficult problem of having independent variables with
a high degree of multicollinearity. It is unusual for there to be an exact relationship among the
explanatory variables in a regression. When this occurs, it is typically because there is a logical
error in the specification. In practical applications in general, multicollinearity is said to occur
when two or more independent variables are highly (but not perfectly) correlated with each
other. If there is no perfect collinearity, estimation of the coefficients is possible, but the
interpretation of the coefficients would be very difficult. A given change in one of the highly
correlated variables is likely to be accompanied by a change in the other independent variable in a
predictably similar fashion. It is then difficult to attribute the change in the dependent variable to either of the
highly correlated variables, hence the difficulty of interpreting the coefficients of
these variables. That is why it is difficult to separate the effect of each independent variable.
When multicollinearity is present, the estimator remains unbiased but the estimated variances of the
coefficients (the β̂'s) are very large. This implies that the reliance we can place on any one of them will
be small. Large variances mean large standard errors, and the confidence intervals tend to be much
wider, leading to the acceptance of the null hypothesis when it is in fact false. Large standard
errors can result in very small values of computed t - ratio resulting in statistically insignificant
coefficients when tested individually. Despite the small t - ratios and statistically insignificant
coefficients, R 2 measuring goodness of fit of the model can be (very) high. Another effect is that
in the presence of multicollinearity, OLS estimators and their standard errors can be sensitive to
small change (even the slightest change) in the data.
Indications of multicollinearity
An estimated model with high standard errors and low t - ratios could be an indicative of
multicollinearity, but it could alternatively suggest that the underlying model is poorly specified.
How can we tell (test) the presence of multicollinearity?
1. A relatively high R 2 in a regression model with few significant t - ratios is an indicator
of multicollinearity. In fact, it is possible that the F - statistic for the regression model is
highly significant and none of the individual coefficients is significant.
2. Relatively high pair-wise correlations between two or more of the independent variables
may indicate multicollinearity. Testing the presence of multicollinearity using solely pair-
wise correlations should be made carefully.
3. Formal tests such as VIF (Variance Inflation Factor) and TOL (Tolerance): although these
tests are not universally accepted, they may help in identifying the presence of
multicollinearity. VIF shows how much the variance of an estimator is inflated by the presence
of multicollinearity. The rule of thumb is that if VIF is larger than 10, multicollinearity is
a serious problem (the variable is said to be highly collinear); a check along these lines is sketched below.
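As a sketch of such a check, using Stata's built-in auto dataset (where weight, length and displacement are strongly correlated), the VIFs can be inspected right after the regression:

sysuse auto, clear
regress mpg weight length displacement
estat vif        // VIF values above 10 would flag seriously collinear regressors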
As partly remedial measures, increasing the sample size may reduce the problem of
multicollinearity as covariance of two independent variables is inversely related to the sample
size. Dropping one of the collinear variables may also reduce the problem; however we have to
be careful not to drop an important variable because it can result in specification error or biased
coefficients. Including a new variable may also reduce the problem of multicollinearity.
Combining the two collinear variables (whose unit of measurement is the same) into one index
can also reduce the problem of multicollinearity.
The problem of omitted variable bias arises as a result of excluding important and relevant
variables from a model. Omitting an independent variable which has an impact on the dependent
variable and is correlated with the included independent variables leads to omitted variable bias.
Omitting a relevant variable thus results in biased estimates. We may be forced to omit a relevant
variable for the following reasons: ignorance, lack of data about the specific variable, incomplete
observations, multicollinearity problems and etc.
Now suppose that we omit a variable that actually belongs in the true (or population) model. This
is often called the problem of excluding a relevant variable or under-specifying the model (or,
misspecification). Imagine that the true relationship between Y and X ’s is
Yi = β0 + β1X1i + β2X2i + εi

But suppose one estimates the following model instead (i.e., the relevant independent variable X2 is excluded
from the model):

Yi = β0 + β1X1i + εi*   (1.9)

The variable X2 as a result goes into the error term (random component) of the model, where
εi* = β2X2i + εi, and when this is so

E(εi*) = E(β2X2i + εi) = E(β2X2i) + E(εi) = β2X2i ≠ 0
It follows that excluding a relevant variable X 2 results in biased estimates for the included
variable, X 1 (i.e. b1 is biased).
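A quick illustration of the direction of the problem, again on Stata's built-in auto data (treating the regression with both variables as the "true" specification purely for the purpose of the example): dropping a regressor that is correlated with the included one visibly shifts the included coefficient.

sysuse auto, clear
regress price weight foreign      // both regressors included
regress price weight              // foreign omitted: the coefficient on weight changes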
Goodness of fit measures and hypothesis testing ( R 2 , testing each b using t - test and
F - test for joint testing) are just ways of making sure if our model reliably predicts the variation
in the dependent variable. The results from these post-estimation techniques are meaningful if
and only if we estimated the model with the right functional form. So, a model that includes the
appropriate independent variables may be mis-specified because the model may not reflect the
algebraic form of the relationship between the dependent and independent variables. As
indicated earlier, theory is often silent over whether a model should be estimated in level terms,
as a log-linear structure, as a polynomial in one or more of the independent variables, or in
logarithms.
For instance, suppose that the true model specifies a nonlinear relationship between Y and X ' s –
such as a polynomial relationship–and we omit the squared term. Doing so would be mis-
specifying the functional form. Likewise, if the true model expresses a constant-elasticity
relationship, the model fitted based on logarithms of Y and X could render conclusions different
from those of a model fitted in level terms of the variables. Thus, in a misspecification of the
functional form, we have all the appropriate variables at hand and we only have to choose the
appropriate algebraic form in which they enter the regression function. Ramsey’s regression
specification error test (RESET), implemented by Stata’s estat ovtest3, can provide useful
information about the problem of functional misspecification and omitted variable bias for linear
models. Ramsey’s RESET runs an augmented regression that includes the original regressors,
powers of the predicted values from the original regression and powers of the original regressors.
Usually, the powers of the predicted values are 2 or 3. Using an F-test, we then test the null
hypothesis of no misspecification (H0: α2 = α3 = ... = 0, where the α’s are the coefficients on the
added powers). If the null hypothesis of no misspecification holds, our model has no omitted
variables and no misspecification of functional form.
Consider the following Stata output and the subsequent RESET test. Only the powers of the
predicted values are used in the Stata output below. Moreover, Stata’s default use of 3 power
levels was used.
. regress wage hours iq kww educ exper age famsize meduc
3
In Stata version 8 and below, this command is only ovtest.
Source | SS df MS Number of obs = 857
-------------+------------------------------ F( 8, 848) = 25.87
Model | 27852695.7 8 3481586.96 Prob > F = 0.0000
Residual | 114113060 848 134567.288 R-squared = 0.1962
-------------+------------------------------ Adj R-squared = 0.1886
Total | 141965756 856 165847.846 Root MSE = 366.83
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours | -3.390489 1.757339 -1.93 0.054 -6.839733 .0587553
iq | 3.907971 1.046046 3.74 0.000 1.854829 5.961114
kww | 7.105818 2.12599 3.34 0.001 2.932998 11.27864
educ | 43.57976 7.97551 5.46 0.000 27.9257 59.23381
exper | 12.54908 4.003661 3.13 0.002 4.690833 20.40733
age | 7.069505 5.356417 1.32 0.187 -3.443885 17.58289
famsize | -2.150576 5.995267 -0.36 0.720 -13.91788 9.616727
meduc | 10.93596 4.917083 2.22 0.026 1.28488 20.58704
_cons | -612.0657 193.4389 -3.16 0.002 -991.7408 -232.3906
------------------------------------------------------------------------------
For the above linear regression function of the wage variable and independent variables (level–
level form), the following RESET test is produced.
estat ovtest
The F - test from Stata’s RESET output shows the null hypothesis of no misspecification is
rejected ( F * = 3.04 is greater than Fc = 2.64 ). Notice that the above test incorporates the powers
of the predicted values only, which is Stata’s default. However, the powers of the regressors can
also be used. The RESET test including the powers of the original regressors is presented as
follows:
estat ovtest,rhs
Ramsey RESET test using powers of the independent variables
Ho: model has no omitted variables
F(24, 824) = 1.56
Prob > F = 0.0419
We can reject RESET’s null hypothesis of no omitted variables for the model. Both of these
tests therefore indicate misspecification (omitted variable bias or functional-form misspecification).
Let us run the regression again including an interaction variable
expeduc (= educ*exper), created using Stata’s generate command. In the Stata output below,
the interaction variable (expeduc) is evidently significant, so a model excluding that term can
be considered mis-specified, which is also in line with the RESET test given below.
regress wage hours iq kww educ exper age famsize meduc expeduc
------------------------------------------------------------------------------
wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
hours | -3.275315 1.752191 -1.87 0.062 -6.714461 .1638299
iq | 3.861757 1.042794 3.70 0.000 1.814994 5.908521
kww | 6.529982 2.130969 3.06 0.002 2.347383 10.71258
educ | 2.517139 17.90253 0.14 0.888 -32.62138 37.65566
exper | -40.03575 20.92583 -1.91 0.056 -81.10832 1.036823
age | 9.190347 5.402864 1.70 0.089 -1.414225 19.79492
famsize | -1.675904 5.97861 -0.28 0.779 -13.41053 10.05872
meduc | 10.62493 4.902567 2.17 0.030 1.002327 20.24754
expeduc | 3.979945 1.55473 2.56 0.011 .9283689 7.031521
_cons | -101.9723 277.2744 -0.37 0.713 -646.1978 442.2532
------------------------------------------------------------------------------
estat ovtest
The calculated F-value (which is equal to 1.97) is less than the F-critical (2.65) which means that
we cannot reject the hypothesis of no misspecification (you can alternatively check the F–test p-
value). Notice that in this particular model specification, though the interaction term is
significant, the variable educ has become insignificant whereas the exper variable is only
significant at 10%.
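Since the wage data used above are not distributed with this module, the same workflow can be sketched on Stata's built-in auto data; the interaction variable wtdisp below is purely illustrative.

sysuse auto, clear
regress mpg weight displacement
estat ovtest                        // RESET with powers of the fitted values (Stata default)
estat ovtest, rhs                   // RESET with powers of the regressors
gen wtdisp = weight*displacement    // hypothetical interaction term
regress mpg weight displacement wtdisp
estat ovtest                        // re-check after re-specifying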
1.6.4 Heteroskedasticity
There are often instances in econometric modeling where the assumption of homoskedastic
variance of the error term is unreasonable (the constant variance assumption of the error terms
fails). As an illustration, consider a cross section of firms in one industry. Error terms associated
with large firms may have larger variances than the error terms of the smaller firms in the
industry. Another example is the cross section of household income and expenditure. It seems
reasonable to observe that low-income households would spend rather steadily, while the
spending pattern of high-income households would be more volatile. This suggests that in a model
where expenditure is the dependent variable, the error variance associated with high-income
households would be greater than the error variance of low-income households. This
phenomenon of non-constant variance of the error terms is known as heteroskedasticity: a
systematic pattern of the error terms in which the variances of the error terms are not constant.
The case of heteroskedasticity therefore means that the variance of the error term can vary
among observation units (individuals, households, firms, farms, etc.):

Var(εi) = σi²

Heteroskedasticity in particular can also occur if the variance of the errors is a function of
explanatory variables:

Var(εi) = σi² = f(Xi)
In figure 1.4, the presence or absence of heteroskedasticity is demonstrated. Panel (a) shows the
case of homoskedasticity (constant variance). Panels (b), (c) and (d) illustrate cases of
heteroskedasticity. In panel (b), the variance (σi²) is observed to be high in the middle of the
data. In panel (c), the variance of the error term is observed to be higher for low values of the X
variable (i.e. income). Panel (d) shows the case of error variance increasing with the variable X
(the variance of the error term increases with income).
But why does heteroskedasticity occur? Some of the reasons why the variances of the error
terms may not be constant include:
(1) Error-learning models: as people learn, their errors of behavior become smaller, in which
case the variance of the error terms (σi²) is expected to decrease.
(2) As data collection techniques improve, the error variance (σi²) is also expected to decrease.
(3) The presence of outliers may also result in heteroskedasticity.
(4) Heteroskedasticity may also arise because some important variables are omitted
from the model or the wrong functional form has been used.
Figure 1.4: Patterns of the error terms plotted against income. Panel (a): homoskedastic pattern of errors; panels (b), (c) and (d): heteroskedastic errors.
Under heteroskedasticity, the usual OLS formulae give biased estimators of the true variances (standard errors) of the
estimated parameters (the formulae used to estimate the standard errors of the coefficients are no longer correct). In that
case, the usual t-tests and F-tests will be misleading for drawing inferences (conclusions).
When heteroskedasticity is present, Ordinary Least Squares estimation also places more weight on the
observations with large error variances than on those with smaller variances.
Panel (a) in figure 1.5 indicates homoskedastic error terms, as there is no systematic pattern
between the residuals (êi) and the independent variable (X). Panels (b)–(f) indicate systematic
relationships between the residuals (êi) and X or Ŷ, which suggest the presence of
heteroskedasticity in the data. Visual inspection of the data and of such plots can therefore
tell whether a systematic pattern is present. If there is a constant spread of the residuals
across all values of X, it is an indication of homoskedastic error variance. It is therefore always
a good idea to inspect the data before going to the more formal tests for the presence of
heteroskedasticity. Some of the formal tests for heteroskedasticity are the White test, the Goldfeld-
Quandt test and the Breusch-Pagan test.
Figure 1.5: Detecting heteroskedasticity using plots of the residuals (êi) against the explanatory variable Xi (panels (a)–(e)) and against the fitted values Ŷi (panel (f))
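A sketch of both the informal and the formal checks, using Stata's built-in auto data (the variable choices are arbitrary):

sysuse auto, clear
regress mpg weight foreign
rvfplot                 // residual-versus-fitted plot for visual inspection
estat hettest           // Breusch-Pagan test (based on fitted values by default)
estat hettest, rhs      // Breusch-Pagan test using the right-hand-side variables
estat imtest, white     // White's general test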
The Breusch-Pagan test
If the regression model we have is given by

Yi = β0 + β1X1i + β2X2i + β3X3i + εi
Assume that the error variance σi² is described as a function of Zi variables (where some or all
of the X's can serve as Z's), given by

σi² = f(α0 + α1Z1i + α2Z2i + ... + αkZki)

in which case σi² is a linear function of the Z's. The test proceeds as follows:

1. Estimate the original model by OLS and obtain the residuals, êi.
2. Obtain the estimate of the error variance, σ̂² = Σêi²/N (see footnote 4).
3. Compute the normalized residuals squared, êi²/σ̂².
4. Run the regression of the normalized squared residuals on the Z variables (the auxiliary
regression):

êi²/σ̂² = α0 + α1Z1 + α2Z2 + α3Z3 + ... + αkZk + ui

5. Obtain the test statistic, which is equal to ½ × ESS from this auxiliary regression.
6. Assuming the error terms (εi) are normally distributed, one can show that if there is
homoskedasticity the test statistic is asymptotically distributed as Chi-square with
K − 1 degrees of freedom.
4 Remember that this formula for the estimated residual variance is the maximum likelihood
estimator. The estimated residual variance using OLS is σ̂² = Σêi²/(N − K).
7. Given the following hypotheses

H0: σi² = σ²  (homoskedastic)
H1: H0 is not true  (heteroskedastic)

the null hypothesis of homoskedasticity is rejected if the test statistic (½ × ESS) is larger than the
Chi-square critical value (χ²(1−α, K−1)).
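The steps above can be carried out by hand in Stata. The sketch below uses the auto data, with the regressors themselves serving as the Z variables, and computes the ½ × ESS statistic directly:

sysuse auto, clear
regress mpg weight foreign
predict e, residuals
gen e2 = e^2
quietly summarize e2
gen g = e2/r(mean)              // e_i^2 / sigma-hat^2, with sigma-hat^2 = sum(e^2)/N
regress g weight foreign        // auxiliary regression on the Z's
display "BP statistic = " e(mss)/2   // half the explained sum of squares; compare with a chi-square critical value (2 df here)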
Goldfeld-Quandt test for heteroskedasticity
This method of testing for heteroskedasticity is applicable if one assumes that the heteroskedastic
variance is positively related to one of the explanatory variables in the regression model. Assume
that we are considering the two-variable model:
Yi = β0 + β1Xi + εi   (1.10)

Suppose the variance σi² is positively related to Xi as

σi² = σ²Xi²
The Goldfeld-Quandt test procedure involves the calculation of two least-squares regression
lines, one using data thought to be associated with low variance errors and the other using data
thought to be associated with high variance errors. If the residual variances associated with each
regression line are approximately equal, the homoskedastic assumption should not be rejected,
but if the residual variance increases substantially, it is possible to reject the null hypothesis. The
test is carried out in the following step-wise manner:
1. Order or rank the data (observations) according to the values of Xi, the variable that is
thought to be related to the error variance (begin with the lowest X value).
2. Omit the middle C observations (see footnote 5), where C is usually about one-fifth of the total sample size
(N). Next, divide the remaining (N − C) observations into two groups, each of which has
(N − C)/2 observations.

5 Notice that C must be small enough to ensure that sufficient degrees of freedom are available to
allow for the proper estimation of each separate regression.
3. Fit (estimate) separate regressions, the first for the portion of the data associated with low
values of X (indicated by subscript 1) and the second associated with high values of X
(indicated by subscript 2).
4. Calculate the Residual Sum Squares ( RSS ) associated with each regression: RSS1 ,
associated with low X ' s , and RSS 2 , associated with high X ' s .
5. The test statistic is λ = (RSS2/df) / (RSS1/df), where the degrees of freedom of each regression are
df = (N − C − 2K)/2 and K is the number of coefficients to be estimated in one of the
regressions. Assuming the errors are normally distributed, the test statistic follows an
F-distribution with (N − C − 2K)/2 degrees of freedom in both the numerator and the
denominator.
6. Given the following hypotheses

H0: σi² = σ²  (homoskedastic)
H1: H0 is not true  (heteroskedastic)

the null hypothesis of homoskedasticity is rejected at the chosen level of significance if the
calculated test statistic is larger than the critical value of the F-distribution.
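A manual sketch of the Goldfeld-Quandt procedure on the auto data (N = 74; ordering by weight and omitting the middle 14 observations so that each group keeps 30):

sysuse auto, clear
sort weight                      // step 1: rank by the variable thought to drive the variance
regress mpg weight in 1/30       // low-X group
scalar rss1 = e(rss)
regress mpg weight in 45/74      // high-X group (middle observations 31-44 omitted)
scalar rss2 = e(rss)
display "lambda = " rss2/rss1    // compare with the F critical value with (28, 28) df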
When the error variances are known, Weighted Least Squares provides an illustration of how
heteroskedasticity can be corrected. Unlike classical OLS, which treats all observations equally,
Weighted Least Squares assigns different weights to the observations using the known variances.
The weighted least squares estimation procedure (which can be derived from the maximum
likelihood function) can be illustrated using the two-variable model.
Yi = β0 + β1Xi + εi

If the heteroskedastic variances of the error terms are known and are given by

Var(εi) = σi²

then dividing each term of the equation by σi (i.e., weighting the original data) gives

Yi/σi = β0(1/σi) + β1(Xi/σi) + εi/σi

This equation can be written in the usual regression-model format as

Yi* = β0* + β1*Xi* + εi*   (1.11)
What is the purpose of transforming the original model (i.e. dividing it by σi)? To see this,
notice the following feature of the transformed error term, εi*:

Var(εi*) = E(εi*²) = E(εi/σi)² = (1/σi²)E(εi²) = (1/σi²)σi² = 1    (since σi² is known and E(εi²) = σi²)

The variance of the transformed disturbance term εi* is now constant, i.e. homoskedastic. Since
we are still retaining the other assumptions of the classical model, the finding that εi* is
homoskedastic suggests that if we apply OLS to the transformed model it will produce estimates
that are BLUE. In short, the estimated β0* and β1* are now BLUE, and not the OLS estimators
β0 and β1.
the OLS estimators b 0 and b1 . This procedure of transforming the original variables in such a
way that the transformed variables satisfy the assumptions of the classical model and then
applying OLS to them is known as the method of generalized least squares (GLS). In short, GLS
is OLS on the transformed variables that satisfy the standard least-squares assumptions. The
estimator thus obtained is known as GLS (WLS) estimator, and it is this estimator that is BLUE.
The appropriate estimators of β0* and β1* are then obtained by minimizing the residual sum of squares of the
transformed model:

Σ(êi*)² = Σ(Yi* − β̂0*  − β̂1*Xi*)²

Minimize  Σ[(Yi − β̂0 − β̂1Xi)/σi]²   ⇔   Minimize  Σ(Yi* − β̂0* − β̂1*Xi*)²   (1.12)

Partially differentiating with respect to the two coefficients and solving gives

β̂1* = [Σwi Σ(wiXiYi) − Σ(wiXi) Σ(wiYi)] / [Σwi Σ(wiXi²) − (Σ(wiXi))²],   where wi = 1/σi²   (1.13)
Since equation (1.12) minimizes a weighted Residual Sum Squares, it is appropriately known as
weighted least squares (WLS), and the estimators thus obtained and given in equation (1.13) are
known as WLS estimators. But WLS is just a special case of the more general estimating
technique, GLS. In the context of heteroskedasticity, one can treat the two terms WLS and GLS
interchangeably.
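In Stata, WLS can be run with analytic weights, which are treated as inversely proportional to the error variance. The sketch below assumes, purely for illustration, that the error variance is proportional to weight, so each observation is weighted by 1/weight:

sysuse auto, clear
gen w = 1/weight                         // assumed weights: Var(e_i) taken proportional to weight
regress price weight foreign [aweight = w]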
If there is an indication (for example from visual inspection of the plot of the residuals against
the X's) that the variance of the error terms (εi) is proportional to the square of the explanatory
variable X, one may transform the original model by dividing it through by Xi:

Yi/Xi = β0/Xi + β1(Xi/Xi) + εi/Xi   ⇔   Yi/Xi = β0(1/Xi) + β1 + ui   (1.14)

where ui is the transformed error term, equal to εi/Xi, and it is now easy to verify that

Var(ui) = E(ui²) = E(εi/Xi)² = (1/Xi²)E(εi²) = (1/Xi²)σ²Xi² = σ²    (since here E(εi²) = σ²Xi²)

Hence the variance of ui (= εi/Xi) is now homoskedastic, and one may proceed to apply OLS to
the transformed equation, regressing Yi/Xi on 1/Xi. Notice that in the transformed regression the
intercept term β1 is the slope coefficient of the original equation and the slope coefficient β0 is
the intercept term of the original model.

If it is believed that the variance of the error term (εi) is proportional to Xi rather than to Xi², then the original
model can instead be transformed by dividing it through by √Xi.
Apart from transforming the model to obtain homoskedastic error terms the following can also
be used as remedies for heteroskedasticity.
2. Heteroskedasticity-consistent standard errors (White standard errors)
Where re-specification will not solve the problem, use robust heteroskedasticity-consistent
(White) standard errors.
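In Stata this simply means adding the vce(robust) option, which leaves the coefficient estimates unchanged and replaces the conventional standard errors with White's heteroskedasticity-consistent ones (auto data used only for illustration):

sysuse auto, clear
regress price weight foreign                 // conventional OLS standard errors
regress price weight foreign, vce(robust)    // White (robust) standard errors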
1.6.8 Autocorrelation
The classical regression model includes one important assumption about the independence of the
error terms from observation to observation. This assumption is that the error terms from any two
observations are uncorrelated with each other, which implies that there is no autocorrelation (no
serial correlation):

Cov(εi, εj) = 0   ⇒   E(εiεj) = 0   for all i, j, i ≠ j
If this assumption is violated, the errors in one time period are correlated with their own values
in other periods and there is the problem of autocorrelation–sometimes referred to as serial
correlation. All time-series variables can exhibit autocorrelation, with the values in a given time
period depending on the values of the same series in previous periods. The assumption that errors
corresponding to different observations are uncorrelated often breaks down in time-series data.
When error terms from different (usually adjacent) time periods are correlated with each other,
we say that the error terms are serially-correlated or auto-correlated. Autocorrelation occurs in
time-series studies when errors associated with observations in a given time period carry over
into future time periods. Autocorrelation can be positive as well as negative, although most
economic time series generally exhibit positive autocorrelation. This is because most of them
either move upward or downward over extended time periods; and they do not exhibit constant
up-and-down movements. Incorrect functional form, omitted variables and an inadequate
dynamic specification of the model may all lead to the problem of autocorrelation.
When the error term in one period is related to the error term in the preceding period alone, we say that the
εt's follow a first-order autoregressive scheme, AR(1).
That is, in its simplest form, the errors in one period are related to those in the previous period by
a simple first-order autoregressive process. Thus, the error term in the simplest classical
regression model

Yt = β0 + β1Xt + εt   (1.16)

is assumed to depend upon its predecessor as follows:

εt = ρεt−1 + vt
When autocorrelation is present, the usual OLS standard errors are no longer correct, so the
t-values will also be affected. If there is positive autocorrelation, the standard errors will be
underestimated and the t-values will be biased upwards. The variance of the error term will also
be underestimated under positive autocorrelation, so that the R-squared will be exaggerated. Overall,
OLS is no longer BLUE. The F-test formula will also be incorrect. Forecasts based on the OLS
regression model will be inefficient (they will have larger variances than those from some other
techniques). We cannot make reliable inferences using the computed standard errors.
1.6.10 Detecting and testing for autocorrelation
A good starting point to detect autocorrelation is to visually inspect the plot between residuals
( e t ) and lagged residuals ( e t-1 ). The existence of systematic pattern between them indicates the
presence of autocorrelation.
(Plots: residuals et on the vertical axis against lagged residuals et−1 on the horizontal axis, with and without a systematic pattern.)
Based on the impression obtained from the graphical analysis of residuals and lagged residuals,
one can proceed to the formal test in order to make sure that autocorrelation in fact exists in a
particular regression model. There are some tests readily available in order to test for
autocorrelation.
Asymptotic tests
The OLS residuals provide useful information about the possible presence of autocorrelation in
the equation’s error term. A very good starting point in this case is to consider the regression of
the OLS residual ( e t ) upon its lag ( e t-1 ). This regression may be done with or without an
intercept, which might lead to marginally different results. This auxiliary regression not only
produces an estimate for the first-order autocorrelation coefficient but also routinely provides a
standard error for the estimate. In the absence of lagged dependent variables, the corresponding
t-ratio is asymptotically valid. In fact, the resulting test statistic can be shown to be
asymptotically equal to

t ≈ √T · ρ̂
which provides an alternative way of computing the test statistic. Consequently, at the 5%
significance level we reject the null hypothesis of no autocorrelation against a two-sided
alternative if |t| > 1.96. If the alternative is positive autocorrelation (ρ > 0), which is often
expected a priori, the null hypothesis is rejected at the 5% level if t > 1.64.
The Breusch-Godfrey test
This test for error autocorrelation is based on an auxiliary regression involving the residuals from
the original regression ( e t ), regressed on a set of lagged residuals, e t -s (up to order S ) and all
the variables which were used in the initial regression. Essentially, we are testing that the
coefficients of the lagged residuals in the auxiliary regression are all zero.
This alternative test is based on the R² of the auxiliary regression (including the intercept term).
If we take the R² of this regression and multiply it by the effective number of observations,
T − K, we obtain a test statistic that, under the null hypothesis, has a Chi-squared (χ²)
distribution with S degrees of freedom (the number of lagged residuals included). An R² close to zero
in this auxiliary regression implies that the lagged residuals are not explaining the current residuals, and a
simple way to test ρ = 0 is by computing the test statistic (T − K)R². If the test statistic is larger than the
Chi-squared critical value (χ²(α, S)), we reject the null hypothesis of no autocorrelation (ρ = 0). If the model
of interest includes a lagged dependent variable (or other explanatory variables that are
correlated with lagged error terms), the above tests are still appropriate provided that the
regressors Xt are included in the auxiliary regression.
Given the original model, the Breusch-Godfrey test for higher-order autoregressive processes
proceeds as follows.
1. Estimate the original model and obtain residuals
2. Run the regression of the residuals on original independent variables and lagged residuals
of order S , AR ( S )
êt = Xt′α + ρ1êt−1 + ρ2êt−2 + ... + ρSêt−S + vt
3. Obtain the R² from this auxiliary regression and calculate the test statistic (T − K)R².
4. If the test statistic is larger than the Chi-square critical value χ²(α, S), reject the null
hypothesis of no autocorrelation.
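A sketch of the test on simulated data with AR(1) errors (ρ = 0.7 and the other numbers are arbitrary choices); Stata's estat bgodfrey implements the auxiliary-regression version of the test after a time-series regression:

clear
set obs 100
set seed 123
gen t = _n
tsset t
gen x = rnormal()
gen v = rnormal()
gen e = v
replace e = 0.7*e[_n-1] + v if _n > 1    // build AR(1) errors observation by observation
gen y = 1 + 2*x + e
regress y x
estat bgodfrey, lags(1)                  // Breusch-Godfrey test against AR(1)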
The Durbin-Watson test
A popular test for first order autocorrelation is the Durbin-Watson test, which has a known small
sample distribution under some restrictive conditions. Some of these restrictions are the
regression model includes a constant; only applies for first order autocorrelation, and regression
does not include a lagged dependent variable. The Durbin-Watson test involves the calculation of
a test statistic based on the residuals from the Ordinary Least Squares regression. This test
statistic is famously known as Durbin-Watson, dw statistic.
dw = Σ(t=2 to T) (êt − êt−1)² / Σ(t=1 to T) êt²
Notice that the numerator cannot include a difference of the first observation in the sample since
no earlier observation is available. When successive values of ê t are close to each other, the dw
statistic will be low, suggesting the presence of positive autocorrelation. The dw statistic lies in
the range of 0 and 4, with a value near 2 indicating no first order autocorrelation. By making
several approximations, it is possible to show that dw = 2(1 − ρ̂). Thus, when there is no
autocorrelation (ρ = 0), the dw statistic will be close to 2. Positive autocorrelation is associated
with dw values below 2 and negative autocorrelation is associated with dw values above 2.
For hypothesis testing, there are upper (dU) and lower (dL) limits for the critical values of the
dw test statistic. These critical values dL and dU depend on the number of observations (T)
and the number of independent variables (K).
Decision zones for the dw statistic (which runs from 0 to 4): reject H0 (positive autocorrelation) for dw below dL; zone of indecision (inconclusive region) between dL and dU; accept H0 (no autocorrelation) between dU and 4 − dU; zone of indecision (inconclusive region) between 4 − dU and 4 − dL; reject H0 (negative autocorrelation) for dw above 4 − dL.
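On the same kind of simulated AR(1) data as in the Breusch-Godfrey sketch above, the dw statistic can be obtained directly after the regression; here it should fall well below 2, pointing to positive autocorrelation:

clear
set obs 100
set seed 123
gen t = _n
tsset t
gen x = rnormal()
gen v = rnormal()
gen e = v
replace e = 0.7*e[_n-1] + v if _n > 1
gen y = 1 + 2*x + e
regress y x
estat dwatson          // Durbin-Watson d statistic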
1. If it is pure autocorrelation, one can use appropriate transformation of the original model so
that in the transformed model we do not have the problem of (pure) autocorrelation. As in the
case of heteroskedasticity, we will have to use some type of generalized least-square (GLS)
method.
2. In large samples, we can use the Newey–West method to obtain standard errors of OLS
estimators that are corrected for autocorrelation. This method is actually an extension of White’s
heteroskedasticity-consistent standard errors method.
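A sketch of the Newey-West correction on the same kind of simulated AR(1) data (the lag length of 2 is an arbitrary choice): the point estimates are the OLS ones, only the standard errors change.

clear
set obs 100
set seed 123
gen t = _n
tsset t
gen x = rnormal()
gen v = rnormal()
gen e = v
replace e = 0.7*e[_n-1] + v if _n > 1
gen y = 1 + 2*x + e
newey y x, lag(2)      // OLS coefficients with HAC (Newey-West) standard errors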
The Generalized Least Square (GLS) estimator
Limiting ourselves to the first-order autoregressive process, we will discuss how to correct for
autocorrelation in this section. GLS works by transforming the original model in such a way
that the transformed model fulfils the usual OLS assumptions (and therefore produces efficient estimates).
Consider the regression model given by:
Yt = β0 + β1Xt + εt   (1.19)

whose first-order autoregressive process, AR(1), is given by

εt = ρεt−1 + vt
The GLS estimator works differently in two ways: when the coefficient of autocorrelation ( r ) is
known and when not known.
When ρ is known
If the coefficient of first-order autocorrelation is known, the problem of autocorrelation can be
easily solved. If equation (1.19) holds at time t, it also holds at time (t − 1). Hence,

Yt−1 = β0 + β1Xt−1 + εt−1   (1.20)

Multiplying equation (1.20) by ρ on both sides, we get

ρYt−1 = ρβ0 + ρβ1Xt−1 + ρεt−1

Subtracting this from equation (1.19) gives

(Yt − ρYt−1) = (1 − ρ)β0 + β1(Xt − ρXt−1) + (εt − ρεt−1)   (1.21)

We can express equation (1.21) as

Yt* = β0* + β1*Xt* + εt*   (1.22)

where Yt* = (Yt − ρYt−1), β0* = (1 − ρ)β0, Xt* = (Xt − ρXt−1), β1* = β1 and εt* = (εt − ρεt−1) = vt.
Since the error term in (1.22) satisfies the usual OLS assumptions, we can
apply OLS to the * *
transformed variables Y and X and obtain estimators with all the optimum
properties, namely,
BLUE. In effect, running (1.22) is tantamount to using generalized least squares
(GLS) –recall
that GLS is nothing but OLS applied to the transformed model that
satisfies the classical
assumptions.
When ρ is not known, a common practical shortcut is the first-difference method, which assumes ρ = 1. Setting ρ = 1 in (1.21) gives
Yt - Yt-1 = β1(Xt - Xt-1) + (εt - εt-1)      (1.23)
Since the error term in (1.23) is free from first-order autocorrelation, to run the regression (1.23) all one has to do is form the first differences of both the dependent variable and the regressor(s) and run the regression on these first differences. The first-difference transformation may be appropriate if the coefficient of autocorrelation is very high, say in excess of 0.8, or the Durbin-Watson dw is quite low. An interesting feature of the first-difference model (1.23) is that there is no intercept in it. Hence, to estimate (1.23), you have to use the regression-through-the-origin routine (that is, suppress the intercept term), which is now available in most software packages.
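In practice these estimators are available as built-in routines. The sketch below again assumes tsset data and hypothetical variables y and x:

    tsset year
    regress D.y D.x, noconstant   // first-difference model (1.23), regression through the origin
    prais y x, corc               // Cochrane-Orcutt feasible GLS: rho is estimated from the data
                                  // and the quasi-difference transformation applied automatically

The prais command without the corc option uses the Prais-Winsten transformation, which retains the first observation.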
Model Selection
The choice of model usually depends on the nature of the dependent variable in the study. For instance, if the dependent variable has two outcomes, the choices can be represented by a binary model, whereas if the dependent variable is continuous, other models such as linear regression are used, depending on its detailed nature (Gujarati, 2004). Furthermore, for categorical dependent variables the choice of model depends on the nature of the response. For instance, where the outcome variable has more than two unordered values, the multinomial logit or probit model is appropriate, while the ordinal logit or probit model is applied where there is a clear natural ranking from low to high among the outcomes but the distance between adjacent categories is unknown. A sketch of how these cases map to estimation commands is given below.
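As a hedged sketch (the variable names wage, employed, choice, rating, x1 and x2 are hypothetical), the mapping of these cases to Stata estimation commands is:

    regress wage x1 x2      // continuous dependent variable: linear regression
    logit employed x1 x2    // binary (two-outcome) dependent variable
    mlogit choice x1 x2     // unordered categorical outcome with more than two values
    ologit rating x1 x2     // ordered categorical outcome, e.g. a Likert-scale response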
Example: Determinants of higher education institutions in promoting students' entrepreneurship across disciplines: evidence from Dire Dawa, Haramaya and Adama University.
[Figure: conceptual framework. The university environment factors (curriculum, delivery method, assessment method, university commitment, learning facilities, entrepreneurship course and field of study) are linked to the dependent variable, students' entrepreneurship promotion.]
In this study the dependent variable (students' entrepreneurship promotion) was measured using a five-point Likert scale (highly promoted, promoted, undecided, low promoted and very low promoted). As the dependent variable contains five ordered responses, an ordinal model was used to examine the relationship between the independent variables and the dependent variable and to reach a conclusion. According to Gujarati (2004), an ordinal model may be a logit or a probit. Because it is easy to apply and is expressed in terms of probabilities, the logit model is preferred; thus, in this study the ordinal logit model was applied.
When modeling these types of outcomes, numerical values are assigned to the outcomes, but the values are ordinal and reflect only the ranking of the outcomes. That is, we might assign the dependent variable the value 1 for "highly promoted", 2 for "promoted", 3 for "undecided", 4 for "low promoted" and 5 for "very low promoted".
Consider the generic latent-variable regression function given by:
Yi* = Xi'β + εi
If the latent variable Yi* denotes a natural ordering among the possible outcomes, then the observed dependent variable can be assumed to follow a data-generating process of the following type:
Yi = 1 if Yi* ≤ μ1; Yi = 2 if μ1 < Yi* ≤ μ2; Yi = 3 if μ2 < Yi* ≤ μ3; Yi = 4 if μ3 < Yi* ≤ μ4; Yi = 5 if Yi* > μ4,
where Yi is the observed score for the dependent variable, given numerical values as follows: 1 for "highly promoted", 2 for "promoted", 3 for "undecided", 4 for "low promoted" and 5 for "very low promoted"; Yi* is the unobservable value of the dependent variable; Xi is a vector of variables that explain the variation in the observed dependent variable; β is a vector of coefficients; μ1, ..., μ4 are the threshold parameters to be estimated along with β; and εi is a disturbance term (logistically distributed in the ordinal logit specification). These threshold parameters, which usually must be estimated, determine how the values of Yi* are translated into the five possible values of Yi.
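Under these assumptions the model can be fitted by ordinal logit. A sketch for the present example, with illustrative stand-in names for the study's regressors:

    ologit promotion i.curriculum i.delivery i.assessment i.commitment ///
           i.facilities i.entre_course i.field

In the output, the slope coefficients correspond to β, and the cut-points reported as /cut1 to /cut4 are the estimated threshold parameters μ1 to μ4.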
1.5. Test of the Model
To make the regression results of the model ready for discussion and to obtain reliable output, different tests were run. These tests are mainly intended to check whether the proportional odds assumption and the other classical linear regression model (CLRM) assumptions are fulfilled when the dependent variable is regressed on the independent variables. Each test, its decision rule and its implications are discussed as follows.
1. Cross-tabulation of the categorical variables
Before running the model, a cross-tabulation of the response variable with the categorical variables was made to see whether any cells are empty or extremely small. In this study all independent variables are categorical. A cross-tabulation of the response variable with each of these categorical variables was made, and none of the cells is too small or empty (see appendix A), so the model could be run.
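A sketch of this check in Stata, using the illustrative variable names introduced above:

    tabulate promotion curriculum      // response variable against one categorical regressor
    tabulate promotion delivery        // repeated for each categorical regressor

Any cell with a zero or very small frequency signals that the corresponding category may need to be merged or dropped before estimation.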
2. Proportional odds assumption test
One of the assumptions underlying ordinal logistic regression is that the relationship between each pair of
outcome groups is the same. In other words, ordinal logistic regression assumes that the coefficients that
describe the relationship between, say, the lowest versus all higher categories of the response variable are the
same as those that describe the relationship between the next lowest category and all higher categories, etc.
This is called the proportional odds assumption or the parallel regression assumption. Because the
relationship between all pairs of groups is the same, there is only one set of coefficients (only one model). If
this was not the case, we would need different models to describe the relationship between each pair of
outcome groups. Thus, we need to test the proportional odds assumption, and there are two tests that can be used to do so: the omodel test and the Brant test.
In this study the first method, omodel, was applied. To apply it, one first needs to download the user-written command omodel (type findit omodel). Accordingly, omodel was downloaded from the internet and run in Stata. As a rule of thumb, the proportional odds assumption is fulfilled if the result of the test is insignificant; that is, Prob > chi2 should be insignificant at the 1, 5 or 10 percent significance level. In line with this, the omodel result of this study is insignificant (i.e. 0.2700), so the proportional odds assumption is fulfilled (see appendix D).
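A sketch of this test, again with illustrative variable names (omodel and brant are user-written commands that must first be installed, e.g. via findit):

    omodel logit promotion curriculum delivery assessment commitment
    ologit promotion curriculum delivery assessment commitment
    brant                              // Brant test of the parallel regression assumption

In both cases an insignificant chi-square p-value means the proportional odds assumption is not rejected.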
3. Testing the overall model fitness
The p-value of the overall test (reported as Prob > chi2 for the ordered logit model) is used to determine the overall significance of the model. In other words, it describes the reliability of the group of independent variables in predicting the dependent variable. If this p-value is less than 5 percent, the group of independent variables has a statistically significant relationship with the dependent variable, i.e. it reliably predicts the dependent variable; whereas if the p-value is more than 5 percent, we conclude that the group of independent variables does not show a statistically significant relationship with the dependent variable and does not reliably predict it (Gujarati, 2004). Since the p-value of the group of independent variables in this model is 0.000, which is less than 5 percent, it is possible to conclude that they reliably predict the dependent variable. Hence, the requirement for fitness of the model is safely fulfilled (see appendix B or C).
4. Test of Model Specification Error
Model specification error can occur when one or more relevant variables are omitted from the model or one
or more irrelevant variables are included in the model. If relevant variables are omitted from the model, the
common variance they share with included variables may be wrongly attributed to those variables, and the
error term is inflated. On the other hand, if irrelevant variables are included in the model, the common
variance they share with included variables may be wrongly attributed to them. Model specification errors
can substantially affect the estimates of the regression coefficients (Gujarati, 2004). In this study, to detect whether there is a model specification error, both the linktest and the ovtest were used.
Linktest: used to detect the inclusion of one or more irrelevant variables in the model. In the linktest two new variables are created: the variable of prediction, _hat, and the variable of squared prediction, _hatsq. The model is then refitted using these two variables. If the p-value of _hatsq is insignificant at the 10 percent level, the model is specified correctly, whereas if the p-value of _hatsq is significant, there is a model specification error (Gujarati, 2004). In this study, as the p-value of _hatsq is insignificant (i.e. 0.151), there is no model specification error (see appendix E).
Ovtest: used to detect whether one or more relevant variables are omitted. Its decision rule is: if the Prob > F result is significant, the Ramsey RESET null hypothesis (Ho: the model has no omitted variables) is rejected, indicating a model specification error (omitted variables); otherwise the null hypothesis is not rejected, indicating no specification error. In this study, as the Prob > F of the model is insignificant (0.2100), the Ramsey RESET null hypothesis is not rejected, which indicates that there is no model misspecification error (see appendix F).
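A sketch of these checks (variable names illustrative; note that linktest can be run directly after ologit, while ovtest is defined after regress, so an auxiliary OLS fit is shown for it):

    ologit promotion curriculum delivery assessment commitment
    linktest                           // _hatsq should be insignificant
    regress promotion curriculum delivery assessment commitment
    estat ovtest                       // Ramsey RESET; Ho: model has no omitted variables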
5. Test of multicollinearity
An important assumption of the multiple regression model is that the independent variables are not perfectly collinear. Multicollinearity is the existence of a "perfect", or exact, linear relationship among some or all of the explanatory variables of a regression model (Gujarati, 2004). In this paper, the VIF (variance inflation factor) was utilized to detect whether there is a collinearity problem. As a rule of thumb, a variable whose VIF value is greater than 10, or whose 1/VIF value is less than 0.1, indicates a possible multicollinearity problem. In this study there is no serious collinearity among the explanatory variables, because all VIF values are below 5.03 and all 1/VIF values are greater than 0.199 (see appendix G).
In addition, a pairwise correlation matrix of the selected variables was employed to check whether a multicollinearity problem exists in the model, using the correlation command (pwcorr). If the correlation between any pair of variables is greater than or equal to 0.8, or less than or equal to -0.8, the result would indicate a serious problem of near-perfect positive or negative correlation. This study tested the model for such a correlation problem using the pairwise correlation (pwcorr) test, and the result showed that the pairwise correlations of all variables lie well within these bounds (i.e. between -0.075 and 0.73) (see appendix H). Thus, based on this result, there is no intolerable problem of correlation.
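A sketch of both collinearity checks (variable names illustrative; vif is defined after regress, so an auxiliary OLS fit is used):

    regress promotion curriculum delivery assessment commitment
    estat vif                          // rule of thumb: VIF > 10 or 1/VIF < 0.1 signals a problem
    pwcorr curriculum delivery assessment commitment, sig   // pairwise correlations with p-values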
6. Test of Heteroscedasticity
Another assumption of the CLRM is the homogeneity of the variance of the residuals. If the model is well fitted, there should be no pattern in the residuals plotted against the fitted values. If the variance of the residuals is non-constant, the residual variance is said to be "heteroscedastic". There are graphical and non-graphical methods of detecting a heteroscedasticity problem (Gujarati, 2004). In this study, hettest was used to check whether there is a heteroscedasticity problem. The Breusch-Pagan/Cook-Weisberg test shows that the null hypothesis (Ho: constant variance) could not be rejected, because the test returned a p-value of 0.5857 (58.57 percent), which is greater than the 1, 5 and 10 percent significance levels. Thus, the result indicates that the error terms have equal variance. Therefore, there is no serious problem of heteroscedasticity and the model is well fitted (see appendix I).
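A sketch of this test (variable names illustrative; hettest is defined after regress, so it is shown on an auxiliary OLS fit):

    regress promotion curriculum delivery assessment commitment
    estat hettest                      // Breusch-Pagan/Cook-Weisberg; Ho: constant variance

A large p-value, as in this study, means the null of constant variance is not rejected.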