
Time Series (Part II)

Luca Gambetti
UAB

IDEA Winter 2011

1
Contacts

Prof.: Luca Gambetti


Office: B3-1130 Edifici B
Office hours:
email: luca.gambetti@uab.es
webpage: http://pareto.uab.es/lgambetti/

2
Goal of the course

The main objective of the course is to provide the students with the knowledge
of a comprehensive set of tools necessary for empirical research with time series
data.

Description

This is the second part of an introductory 40-hour course in Time Series Analysis with applications in macroeconomics. This part focuses on the theory of multivariate time series models.

3
Contents

1. VAR: preliminaries, representation, stationarity, second moments, specification (1-2).
2. VAR: estimation and hypothesis testing, Granger causality (1-2).
3. VAR: forecasting and impulse response functions (1).
4. Structural VAR: theory (1).
5. Structural VAR: applications (1).
6. Nonstationary data - Cointegration (1).
7. Structural Factor models and FAVARs (1).
8. Bayesian VARs (1).

4
References

1. P. J. Brockwell and R. A. Davis (2009), Time Series: Theory and Methods, Springer-Verlag: Berlin.

2. F. Canova (2007), Methods for Applied Macroeconomic Research, Princeton University Press: Princeton.

3. J. D. Hamilton (1994), Time Series Analysis, Princeton University Press: Princeton.

4. H. Lütkepohl (2005), New Introduction to Multiple Time Series Analysis, Springer-Verlag: Berlin.

5
Grades

40% problem sets


60% final take-home exam.

Econometric Software

GRETL, MATLAB.

6
1. STATIONARY VECTOR PROCESSES¹

¹ This part is partly based on the Hamilton textbook and Marco Lippi’s notes.

7
1 Some Preliminary Definitions and Results

• Random Vector : A vector X = (X1, ..., Xn) whose components are scalar-
valued random variables on a probability space.

• Vector Random Process: A family of random vectors {Xt, t ∈ T } defined on a probability space, where T is a set of time points. Typically T = R, T = Z or T = N, the sets of real, integer and natural numbers, respectively.

• Time Series Vector : A particular realization of a vector random process.

8
1.1 The Lag operator

• The lag operator L maps a sequence {Xt} into a sequence {Yt} such that
Yt = LXt = Xt−1, for all t.

• If we apply L repeatedly to a process, for instance L(L(LXt)), we will use the convention L(L(LXt)) = L³Xt = Xt−3.

• If we apply L to a constant c, Lc = c.

• Inversion: L⁻¹ is the inverse of L, i.e. the operator such that L⁻¹(LXt) = Xt.

9
1.2 Polynomials in the lag operator

We can form polynomials: α(L) = 1 + α1L + α2L² + ... + αpL^p is a polynomial in the lag operator of order p and is such that α(L)Xt = Xt + α1Xt−1 + ... + αpXt−p.

• Lag polynomials can also be inverted. For a polynomial φ(L), we are looking for the values of the coefficients αi of φ(L)⁻¹ = α0 + α1L + α2L² + ... such that φ(L)⁻¹φ(L) = 1.

Case 1: p = 1. Let φ(L) = (1 − φL) with |φ| < 1. To find the inverse write

(1 − φL)(α0 + α1L + α2L² + ...) = 1

note that all the coefficients of the non-zero powers of L must be equal to zero.
This gives

α0 = 1
−φα0 + α1 = 0 ⇒ α1 = φ
−φα1 + α2 = 0 ⇒ α2 = φ²
−φα2 + α3 = 0 ⇒ α3 = φ³
10
and so on. In general αk = φ^k, so (1 − φL)⁻¹ = Σ_{j=0}^∞ φ^j L^j, provided that |φ| < 1. It is easy to check this because

(1 − φL)(1 + φL + φ²L² + ... + φ^k L^k) = 1 − φ^(k+1) L^(k+1)

so

(1 + φL + φ²L² + ... + φ^k L^k) = (1 − φ^(k+1) L^(k+1)) / (1 − φL)

and, as k → ∞, Σ_{j=0}^k φ^j L^j → 1/(1 − φL).
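To make the inversion concrete, here is a minimal MATLAB sketch (the value φ = 0.6 and the truncation order K are illustrative choices): it builds the truncated series Σ_{j=0}^K φ^j L^j and checks that multiplying it by (1 − φL) gives 1 up to a remainder of order φ^(K+1).

% Minimal sketch: numerically verify (1 - phi*L)^{-1} = sum_j phi^j L^j
phi = 0.6;                           % illustrative value with |phi| < 1
K   = 50;                            % truncation order
alpha = phi.^(0:K);                  % coefficients alpha_j = phi^j
prod_poly = conv([1 -phi], alpha);   % coefficients of (1 - phi*L)*(1 + phi*L + ... + phi^K L^K)
% All coefficients should be ~0 except the constant term (= 1) and a
% remainder of order phi^(K+1) on the highest power of L.
disp(prod_poly(1:5));                % approximately [1 0 0 0 0]
disp(prod_poly(end));                % approximately -phi^(K+1), negligible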

11
Case 2: p = 2. Let φ(L) = (1 − φ1L − φ2L²). To find the inverse it is useful to factor the polynomial in the following way

(1 − φ1L − φ2L²) = (1 − λ1L)(1 − λ2L)

where λ1, λ2 are the reciprocals of the roots of the left-hand side polynomial or, equivalently, the eigenvalues of the matrix [φ1 φ2; 1 0] (rows separated by semicolons).

Suppose |λ1|, |λ2| < 1 and λ1 ≠ λ2. We have that (1 − φ1L − φ2L²)⁻¹ = (1 − λ1L)⁻¹(1 − λ2L)⁻¹. Therefore we can use what we have seen above for the case p = 1. We can write

(1 − λ1L)⁻¹(1 − λ2L)⁻¹ = (λ1 − λ2)⁻¹ [ λ1/(1 − λ1L) − λ2/(1 − λ2L) ]
  = [λ1/(λ1 − λ2)] (1 + λ1L + λ1²L² + ...) − [λ2/(λ1 − λ2)] (1 + λ2L + λ2²L² + ...)
  = (c1 + c2) + (c1λ1 + c2λ2)L + (c1λ1² + c2λ2²)L² + ...

12
where c1 = λ1/(λ1 − λ2), c2 = −λ2/(λ1 − λ2)
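A minimal MATLAB sketch of the same idea for p = 2 (the values φ1 = 0.5, φ2 = 0.3 are illustrative): it reads λ1, λ2 off the eigenvalues of the matrix above and checks that c1λ1^k + c2λ2^k reproduces the coefficients of (1 − φ1L − φ2L²)⁻¹ obtained from the usual recursion.

% Minimal sketch: invert 1 - phi1*L - phi2*L^2 through its factorization
phi1 = 0.5; phi2 = 0.3;                        % illustrative values (stable, distinct lambdas)
lam  = eig([phi1 phi2; 1 0]);                  % lambda1, lambda2
l1 = lam(1); l2 = lam(2);
c1 =  l1/(l1 - l2);  c2 = -l2/(l1 - l2);
K = 10;
psi_factor = c1*l1.^(0:K) + c2*l2.^(0:K);      % coefficients from the factorization
% same coefficients from the recursion psi_k = phi1*psi_{k-1} + phi2*psi_{k-2}
psi_rec = zeros(1, K+1); psi_rec(1) = 1; psi_rec(2) = phi1;
for k = 3:K+1
    psi_rec(k) = phi1*psi_rec(k-1) + phi2*psi_rec(k-2);
end
disp(max(abs(psi_factor - psi_rec)))           % should be ~0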

• Matrix polynomial in the lag operator: A(L) is a matrix whose elements are polynomials in the lag operator, e.g.

A(L) = [1 L; 0 2+L] = [1 0; 0 2] + [0 1; 0 1] L

We also define

A(0) = [1 0; 0 2]   and   A(1) = [1 1; 0 3]

13
1.3 Covariance Stationarity

Let Yt be an n-dimensional random vector, Yt′ = [Y1t, ..., Ynt]. Then Yt is covariance (weakly) stationary if E(Yt) = µ and the autocovariance matrices E[(Yt − µ)(Yt−j − µ)′] = Γj for all t, j, that is, both are independent of t and finite.
− Stationarity of each of the components of Yt does not imply stationarity of the vector Yt. Stationarity in the vector case requires that the components of the vector are stationary and costationary.
− Although γj = γ−j for a scalar process, the same is not true for a vector process. The correct relation is

Γj′ = Γ−j

14
1.4 Convergence of random variables

We refresh three concepts of stochastic convergence in the univariate case and


then we extend the three cases to the multivariate framework.

Let {xT , T = 1, 2, ...} be a sequence of random variables.

• Convergence in probability. The sequence {xT} converges in probability to the random variable x if for every ε > 0

lim_{T→∞} Pr(|xT − x| > ε) = 0

When the above condition is satisfied, the variable x is called the probability limit or plim of the sequence {xT}, indicated as

plim xT = x

• Convergence in mean square. The sequence of random variables {xT} converges in mean square to x, denoted by xT →m.s. x, if

lim_{T→∞} E(xT − x)² = 0

15
• Convergence in distribution. Let FT denote the cumulative distribution function of xT and F the cumulative distribution function of the scalar x. The sequence is said to converge in distribution, written xT →d x (or xT →L x), if for all real numbers c for which F is continuous

lim_{T→∞} FT(c) = F(c).

If F is the distribution function of, say, a N(µ, σ²) random variable, we will equivalently write

xT →d N(µ, σ²)

• The concepts of stochastic convergence can be generalized to the multivariate setting. Suppose {XT}, T = 1, 2, ..., is a sequence of n-dimensional random vectors and X is an n-dimensional random vector. Then

1. XT →p X if XkT →p Xk for k = 1, ..., n
2. XT →m.s. X if lim_{T→∞} E[(XT − X)′(XT − X)] = 0
3. XT →d X if lim_{T→∞} FT(c) = F(c), where FT and F are the joint distribution functions of XT and X.
16
Proposition C.1L. Suppose {XT}, T = 1, 2, ..., is a sequence of n-dimensional random vectors. Then the following relations hold:
(a) XT →m.s. X ⇒ XT →p X ⇒ XT →d X
(c) (Slutsky's Theorem) If g : R^n → R^m is a continuous function, then
    XT →p X ⇒ g(XT) →p g(X)   (i.e. plim g(XT) = g(plim XT))
    XT →d X ⇒ g(XT) →d g(X)

Example. Suppose that XT →d N(0, 1). Then XT² converges in distribution to the square of a N(0, 1) random variable, i.e. XT² →d χ²(1).

17
Proposition C.2L. Suppose {XT} and {YT} are sequences of n × 1 random vectors, {AT} is a sequence of n × n random matrices, X is an n × 1 random vector, c is a fixed n × 1 vector, and A is a fixed n × n matrix.
1. If plim XT, plim YT and plim AT exist, then
   (a) plim(XT + YT) = plim XT + plim YT, plim(XT − YT) = plim XT − plim YT
   (b) plim c′XT = c′ plim XT
   (c) plim XT′YT = (plim XT)′(plim YT)
   (d) plim AT XT = (plim AT)(plim XT)
2. If XT →d X and plim(XT − YT) = 0, then YT →d X.
3. If XT →d X and plim YT = c, then
   (a) XT + YT →d X + c
   (b) YT′XT →d c′X
4. If XT →d X and plim AT = A, then AT XT →d AX
5. If XT →d X and plim AT = 0, then plim AT XT = 0

18
Example. Let {XT} be a sequence of n × 1 random vectors with XT →d N(µ, Ω), and let {YT} be a sequence of n × 1 random vectors with YT →p C. Then by 3.(b), YT′XT →d N(C′µ, C′ΩC).

19
1.5 Limit Theorems

The Law of Large Numbers and the Central Limit Theorem are the most impor-
tant results for computing the limits of sequences of random variables.

There are many versions of LLN and CLT that differ on the assumptions about
the dependence of the variables.

Proposition C.12L (Weak law of large numbers)

1. (iid sequences) Let {Yt} be an i.i.d. sequence of random variables with finite mean µ. Then

   ȲT = T⁻¹ Σ_{t=1}^T Yt →p µ

2. (independent sequences) Let {Yt} be a sequence of independent random variables with E(Yt) = µ < ∞ and E|Yt|^(1+ε) ≤ c < ∞ for some ε > 0 and a finite constant c. Then T⁻¹ Σ_{t=1}^T Yt →p µ.

3. (uncorrelated sequences) Let {Yt} be a sequence of uncorrelated random variables with E(Yt) = µ < ∞ and Var(Yt) ≤ c < ∞ for some finite constant c. Then T⁻¹ Σ_{t=1}^T Yt →p µ.

20

4. (stationary processes) Let Yt be a covariance stationary process with finite E(Yt) = µ and E[(Yt − µ)(Yt−j − µ)] = γj, with absolutely summable autocovariances Σ_{j=0}^∞ |γj| < ∞. Then ȲT →m.s. µ and hence ȲT →p µ.

• Weak stationarity and absolutely summable autocovariances are sufficient conditions for a law of large numbers to hold.

Proposition C.13L (Central limit theorem)

1. (i.i.d. sequences) Let {Yt} be a sequence of n-dimensional iid(µ, Σ) random vectors. Then

   √T (ȲT − µ) →d N(0, Σ)

2. (stationary processes) Let Yt = µ + Σ_{j=0}^∞ Φjεt−j be an n-dimensional stationary random process with εt an i.i.d. vector, E(Yt) = µ < ∞ and Σ_{j=0}^∞ ||Φj|| < ∞. Then

   √T (ȲT − µ) →d N(0, Σ_{j=−∞}^∞ Γj)

where Γj is the autocovariance matrix at lag j.
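As an illustration of the stationary-process CLT, the following minimal MATLAB Monte Carlo (θ = 0.5, σ = 1, T = 500 and the number of replications are illustrative choices) uses the scalar MA(1) Yt = εt + θεt−1, whose autocovariances sum to (1 + θ)²σ²; the simulated variance of √T(ȲT − µ) should be close to that number.

% Minimal sketch: CLT for the sample mean of a scalar MA(1) process
rng(1);
theta = 0.5; sigma = 1;              % illustrative parameters
T = 500; nrep = 5000;                % sample size and Monte Carlo replications
lrv = (1 + theta)^2 * sigma^2;       % long-run variance: sum of all autocovariances
z = zeros(nrep, 1);
for r = 1:nrep
    e = sigma*randn(T+1, 1);
    y = e(2:end) + theta*e(1:end-1); % MA(1) with zero mean
    z(r) = sqrt(T)*mean(y);          % sqrt(T)*(Ybar - mu), with mu = 0
end
fprintf('simulated variance = %.3f, sum of autocovariances = %.3f\n', var(z), lrv);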

21
2 Some stationary processes

2.1 White Noise (WN)

An n-dimensional vector white noise εt = [ε1t, ..., εnt]′ ∼ WN(0, Ω) is such that E(εt) = 0 and Γk = Ω (Ω a symmetric positive definite matrix) if k = 0 and Γk = 0 if k ≠ 0. If εt, ετ are independent for t ≠ τ the process is an independent vector white noise (i.i.d.). If in addition εt ∼ N the process is a Gaussian WN.

Important: A vector whose components are white noise is not necessarily a white noise. Example: let ut be a scalar white noise and define εt = (ut, ut−1)′. Then

E(εtεt′) = [σu² 0; 0 σu²]   and   E(εtεt−1′) = [0 0; σu² 0] ≠ 0.
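A quick simulation of the example above (a minimal MATLAB sketch, with σu = 1 as an illustrative value): stacking ut and ut−1 gives a vector whose lag-1 autocovariance matrix is clearly nonzero, so the stacked process is not a vector white noise.

% Minimal sketch: (u_t, u_{t-1})' is not a vector white noise
rng(1);
T = 100000;
u = randn(T, 1);                                 % scalar white noise, sigma_u = 1
Eps = [u(2:end), u(1:end-1)];                    % rows are eps_t' = (u_t, u_{t-1})
G0 = (Eps'*Eps)/(T-1);                           % ~ [1 0; 0 1]
G1 = (Eps(2:end,:)'*Eps(1:end-1,:))/(T-2);       % ~ E(eps_t eps_{t-1}') = [0 0; 1 0], not zero
disp(G0); disp(G1);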

22
2.2 Vector Moving Average (VMA)

Given the n-dimensional vector white noise εt, a vector moving average of order q is defined as

Yt = µ + εt + C1εt−1 + ... + Cqεt−q

where the Cj are n × n matrices of coefficients and µ is the mean of Yt.

• The VMA(1)
Let us consider the VMA(1)

Yt = µ + εt + C1εt−1

with εt ∼ WN(0, Ω). The variance of the process is given by

Γ0 = E[(Yt − µ)(Yt − µ)′] = Ω + C1ΩC1′

with autocovariances

Γ1 = C1Ω,   Γ−1 = ΩC1′,   Γj = 0 for |j| > 1

23
• The VMA(q)
Let us consider the VMA(q)

Yt = µ + εt + C1εt−1 + ... + Cqεt−q

with εt ∼ WN(0, Ω) and µ the mean of Yt. The variance of the process is given by

Γ0 = E[(Yt − µ)(Yt − µ)′] = Ω + C1ΩC1′ + C2ΩC2′ + ... + CqΩCq′

with autocovariances

Γj = CjΩ + Cj+1ΩC1′ + Cj+2ΩC2′ + ... + CqΩC′q−j   for j = 1, 2, ..., q
Γj = ΩC′−j + C1ΩC′−j+1 + C2ΩC′−j+2 + ... + Cq+jΩCq′   for j = −1, −2, ..., −q
Γj = 0   for |j| > q
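A minimal MATLAB sketch (the matrices C1 and Ω are illustrative) that computes Γ0 and Γ1 of a bivariate VMA(1) from the formulas above and compares them with sample moments from a long simulated path.

% Minimal sketch: autocovariances of a bivariate VMA(1)
rng(1);
C1 = [0.5 0.2; -0.1 0.4];            % illustrative MA coefficient matrix
Om = [1 0.3; 0.3 1];                 % illustrative innovation covariance
G0 = Om + C1*Om*C1';                 % Gamma_0
G1 = C1*Om;                          % Gamma_1; Gamma_{-1} = Om*C1'
% simulation check
T = 200000;
e = (chol(Om, 'lower') * randn(2, T+1))';        % rows are e_t' with covariance Om
Y = e(2:end, :) + e(1:end-1, :) * C1';           % Y_t = e_t + C1*e_{t-1}
G0_hat = (Y'*Y)/T;
G1_hat = (Y(2:end,:)'*Y(1:end-1,:))/(T-1);       % ~ E(Y_t Y_{t-1}') = C1*Om
disp(G0 - G0_hat); disp(G1 - G1_hat);            % both ~ 0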

24
• The VMA(∞)
A useful process, as we will see, is the VMA(∞)

Yt = µ + Σ_{j=0}^∞ Cjεt−j     (1)

A very important result is that if the sequence {Cj} is absolutely summable (i.e. Σ_{j=0}^∞ ||Cj|| < ∞ where ||Cj|| = Σ_m Σ_n |c_mn,j|, or equivalently each sequence formed by the elements of the matrix is absolutely summable), then the infinite sum above generates a well defined (mean square convergent) process (see for instance Proposition C.10L).

Proposition (10.2H). Let Yt be an n × 1 vector satisfying

Yt = µ + Σ_{j=0}^∞ Cjεt−j

where εt is a vector WN with E(εt) = 0, E(εtεt−j′) = Ω for j = 0 and zero otherwise, and {Cj}_{j=0}^∞ is absolutely summable. Let Yit denote the ith element of Yt and µi the ith element of µ. Then

25
(a) The autocovariance between the ith variable at time t and the jth variable s periods earlier, E[(Yit − µi)(Yjt−s − µj)], exists and is given by the row i, column j element of

Γs = Σ_{v=0}^∞ Cs+v ΩCv′

for s = 0, 1, 2, ....
(b) The sequence of matrices {Γs}_{s=0}^∞ is absolutely summable.

If furthermore {εt}_{t=−∞}^∞ is an i.i.d. sequence with E|εi1t εi2t εi3t εi4t| < ∞ for i1, i2, i3, i4 = 1, 2, ..., n, then also
(c) E|Yi1t1 Yi2t2 Yi3t3 Yi4t4| < ∞ for all t1, t2, t3, t4
(d) (1/T) Σ_{t=1}^T Yit Yjt−s →p E(Yit Yjt−s), for i, j = 1, 2, ..., n and for all s

Implications:
1. Result (a) implies that the second moments of an MA(∞) with absolutely summable coefficients can be found by taking the limit of the autocovariances of an MA(q).
26
2. Result (b) ensures ergodicity for the mean
3. Result (c) says that Yt has bounded fourth moments
4. Result (d) says that Yt is ergodic for second moments

27
2.3 Invertibility and fundamentalness

• The VMA is invertible if and only if the determinant of C(L) vanishes only outside the unit circle, i.e. if det(C(z)) ≠ 0 for all |z| ≤ 1.

• If the process is invertible it possesses a unique VAR representation (clear


later on).

Example. Consider the process

[Y1t; Y2t] = [1 L; 0 θ−L] [ε1t; ε2t]

Then det(C(z)) = θ − z, which is zero for z = θ. Obviously the process is invertible if and only if |θ| > 1.

• The VMA is fundamental if and only if det(C(z)) ≠ 0 for all |z| < 1. In the previous example the process is fundamental if and only if |θ| ≥ 1. In the case |θ| = 1 the process is fundamental but noninvertible.

28
• Provided that |θ| > 1, the MA process can be inverted and the shock can be obtained as a combination of present and past values of Yt. In fact

[1 −L/(θ−L); 0 1/(θ−L)] [Y1t; Y2t] = [ε1t; ε2t]

• Notice that for any noninvertible process whose determinant does not vanish on the unit circle there is an invertible process with an identical autocovariance structure.

Example: univariate MA. Consider the MA(1)

yt = ut + m ut−1

with ut ∼ WN and |m| > 1, i.e. yt is noninvertible. The autocovariances are E(yt²) = (1 + m²)σu², E(ytyt−j) = mσu² for j = 1, −1 and E(ytyt−j) = 0 for |j| > 1. Now consider this alternative representation

yt = vt + (1/m) vt−1

29
which is invertible and

vt = (1 + (1/m)L)⁻¹ yt = (1 + (1/m)L)⁻¹ (1 + mL) ut

where vt is indeed a white noise process with variance m²σu².
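A minimal MATLAB simulation of the two representations (m = 2 and σu = 1 are illustrative values): it recovers vt = (1 + (1/m)L)⁻¹ yt with a recursive filter and checks that it behaves like a white noise with variance m²σu².

% Minimal sketch: invertible representation of a noninvertible MA(1)
rng(1);
m = 2; sigma_u = 1;                        % illustrative values, |m| > 1
T = 200000;
u = sigma_u * randn(T, 1);
y = filter([1 m], 1, u);                   % y_t = u_t + m*u_{t-1}
v = filter(1, [1 1/m], y);                 % v_t = y_t - (1/m)*v_{t-1}, i.e. (1 + (1/m)L)^{-1} y_t
gamma0 = var(v);                           % ~ m^2 * sigma_u^2
gamma1 = mean(v(2:end).*v(1:end-1));       % ~ 0: v is (approximately) white noise
fprintf('var(v) = %.3f (theory %.3f), lag-1 autocovariance = %.4f\n', ...
        gamma0, m^2*sigma_u^2, gamma1);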

30
2.4 Wold Decomposition

Any zero-mean stationary vector process Yt admits the following representation

Yt = C(L)εt + µt     (2)

where C(L)εt is the stochastic component, with C(L) = Σ_{i=0}^∞ CiL^i, and µt is the purely deterministic component, the one perfectly forecastable using linear combinations of past Yt.

If µt = 0 the process is said to be regular. Here we only consider regular processes.

(2) represents the Wold representation of Yt, which is unique and for which the following properties hold:
(a) εt is the innovation for Yt, i.e. εt = Yt − Proj(Yt|Yt−1, Yt−2, ...).
(b) εt is white noise: E(εt) = 0, E(εtετ′) = 0 for t ≠ τ, E(εtεt′) = Ω.
(c) The coefficients are square summable: Σ_{j=0}^∞ ||Cj||² < ∞.
(d) C0 = I.

31
• The result is very powerful since it holds for any covariance stationary process.

• However the theorem does not imply that (2) is the true representation of the process. For instance the process could be stationary but non-linear or non-invertible.

32
2.5 Other fundamental MA(∞) Representations

It is easy to extend the Wold representation to the general class of fundamental MA(∞) representations. For any nonsingular matrix R of constants we have

Yt = C(L)R ut = D(L)ut

where ut = R⁻¹εt ∼ WN(0, R⁻¹Ω(R⁻¹)′).

• Fundamentalness is ensured since ut is a linear combination of the Wold shocks. The roots of the determinant of D(L) will coincide with those of C(L).

33
3 VAR: representations

• If the MA matrix of lag polynomials is invertible, then for a vector stationary


process a Vector Autoregressive (VAR) representation exists.

• We define C(L)−1 as an (n × n) lag polynomial such that C(L)−1C(L) = I;


i.e. when these lag polynomial matrices are matrix-multiplied, all the lag terms
cancel out. This operation in effect converts lags of the errors into lags of the
vector of dependent variables.

• Thus we move from MA coefficients to VAR coefficients. Define A(L) = C(L)⁻¹. Then given the (invertible) MA coefficients, it is easy to map these into the VAR coefficients:

Yt = C(L)εt
A(L)Yt = εt     (3)

where A(L) = A0 + A1L + A2L² + ... and the Aj for all j are (n × n) matrices of coefficients.

34
• To show that this matrix lag polynomial exists and how it maps into the coefficients of C(L), note that by assumption we have the identity

(A0 + A1L + A2L² + ...)(I + C1L + C2L² + ...) = I

After distributing, the identity implies that the coefficients on the nonzero powers of the lag operator must be zero, which implies the following recursive solution for the VAR coefficients:

A0 = I
A1 = −A0C1
...
Ak = −A0Ck − A1Ck−1 − ... − Ak−1C1
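A minimal MATLAB sketch of this recursion (the MA matrices C1 and C2 are illustrative, with Cj = 0 for j > 2): it computes A0, ..., AK and then verifies that the coefficient on L^k of A(L)C(L) is approximately zero.

% Minimal sketch: recover VAR coefficients A_k from (invertible) MA coefficients C_j
n = 2; K = 20;
C = zeros(n, n, K+1); C(:,:,1) = eye(n);        % C_0 = I
C(:,:,2) = [0.5 0.2; -0.1 0.4];                 % illustrative C_1
C(:,:,3) = [0.1 0.0;  0.0 0.1];                 % illustrative C_2 (C_j = 0 for j > 2)
A = zeros(n, n, K+1); A(:,:,1) = eye(n);        % A_0 = I
for k = 1:K
    S = zeros(n);
    for j = 1:k
        S = S + A(:,:,k-j+1) * C(:,:,j+1);      % A_{k-1}C_1 + A_{k-2}C_2 + ... + A_0 C_k
    end
    A(:,:,k+1) = -S;                            % A_k = -(A_0 C_k + ... + A_{k-1} C_1)
end
% check: coefficient on L^k of A(L)C(L) should be ~0 for k >= 1
k = 5; chk = zeros(n);
for j = 0:k, chk = chk + A(:,:,j+1) * C(:,:,k-j+1); end
disp(chk)                                       % ~ zeros(2)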

• As noted, the VAR is of infinite order (i.e. infinite number of lags required
to fully represent joint density). In practice, the VAR is usually restricted for
estimation by truncating the lag-length.

• The pth-order vector autoregression, denoted VAR(p) is given by

Yt = A1Yt−1 + A2Yt−2 + ... + ApYt−p + εt     (4)

35
Note: Here we are considering zero mean processes. In case the mean of Yt is not
zero we should add a constant in the VAR equations.

• VAR(1) representation. Any VAR(p) can be rewritten as a VAR(1). To form a VAR(1) from the general model we define the np × 1 vectors

et′ = [εt′, 0, ..., 0],   𝒴t′ = [Yt′, Yt−1′, ..., Yt−p+1′]

and the np × np matrix

A = [ A1   A2   ...  Ap−1  Ap
      In   0    ...  0     0
      0    In   ...  0     0
      ...  ...  ...  ...   ...
      0    0    ...  In    0 ]

Therefore we can rewrite the VAR(p) as a VAR(1)

𝒴t = A𝒴t−1 + et

This is also known as the companion form of the VAR(p).

36
• SUR representation. The VAR(p) can be stacked as

Y = XΓ + u

where Y = [Y1, ..., YT]′, X = [X1, ..., XT]′ with Xt = [Yt−1′, Yt−2′, ..., Yt−p′]′, u = [ε1, ..., εT]′ and Γ = [A1, ..., Ap]′.

• Vec representation. Let vec denote the column stacking operator, i.e. if

X = [X11 X12; X21 X22; X31 X32]

then

vec(X) = [X11, X21, X31, X12, X22, X32]′

Let γ = vec(Γ); then the VAR can be rewritten as

Yt = (In ⊗ Xt′)γ + εt

37
4 VAR: Stationarity

4.1 Stability and stationarity

Consider the VAR(1)

Yt = µ + AYt−1 + εt

Substituting backward we obtain

Yt = µ + AYt−1 + εt
   = µ + A(µ + AYt−2 + εt−1) + εt
   = (I + A)µ + A²Yt−2 + Aεt−1 + εt
   ...
   = (I + A + ... + A^(j−1))µ + A^j Yt−j + Σ_{i=0}^{j−1} A^i εt−i

If all the eigenvalues of A are smaller than one in modulus then

1. A^j = PΛ^jP⁻¹ → 0.
2. the sequence A^i, i = 0, 1, ..., is absolutely summable.

38

3. the infinite sum Σ_{i=0}^∞ A^i εt−i exists in mean square (see e.g. Proposition C.10L);
4. (I + A + ... + A^j)µ → (I − A)⁻¹µ and A^j → 0 as j goes to infinity.

Therefore if the eigenvalues are smaller than one in modulus then Yt has the following representation

Yt = (I − A)⁻¹µ + Σ_{i=0}^∞ A^i εt−i

• Note that the eigenvalues λ of A satisfy det(λI − A) = 0. Therefore the eigenvalues correspond to the reciprocals of the roots of the determinant of A(z) = I − Az.

• A VAR(1) is called stable if det(I − Az) ≠ 0 for |z| ≤ 1. Equivalently, stability requires that all the eigenvalues of A are smaller than one in absolute value.

39
• For a VAR(p) the stability condition requires that all the eigenvalues of A (the AR matrix of the companion form of Yt) are smaller than one in modulus or, equivalently, that all the roots are larger than one in modulus. Therefore a VAR(p) is called stable if det(I − A1z − A2z² − ... − Apz^p) ≠ 0 for |z| ≤ 1.

• A condition for stationarity: A stable VAR process is stationary.

• Notice that the converse is not true. An unstable process can be stationary.

• Notice that the vector MA(∞) representation of a stationary VAR satisfies the absolute summability condition, so that the assumptions of Proposition 10.2H hold.
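A minimal MATLAB sketch (the VAR(2) coefficient matrices are illustrative) that checks stability by building the companion matrix A and verifying that all its eigenvalues are smaller than one in modulus.

% Minimal sketch: check stability of a VAR(p) through its companion matrix
A1 = [0.5 0.1; 0.4 0.5];             % illustrative coefficient matrices of a VAR(2)
A2 = [0.0 0.0; 0.25 0.0];
n = 2;
A = [A1 A2; eye(n) zeros(n)];        % companion matrix (np x np)
lambda = eig(A);
is_stable = all(abs(lambda) < 1);
fprintf('max |eigenvalue| = %.3f, stable = %d\n', max(abs(lambda)), is_stable);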

40
4.2 Back the Wold representation

• How can we find the Wold representation starting from a VAR?

• We need to invert the VAR representation. So let us rewrite the VAR(p) as a VAR(1). Substituting backward in the companion form we have

𝒴t = A^j 𝒴t−j + A^(j−1) et−j+1 + ... + A et−1 + et

If the conditions for stationarity are satisfied, the series Σ_{j=0}^∞ A^j converges and 𝒴t has a VMA(∞) representation in terms of the Wold shock et given by

𝒴t = (I − AL)⁻¹et = Σ_{j=0}^∞ A^j et−j = 𝒞(L)et

where 𝒞0 = A⁰ = I, 𝒞1 = A, 𝒞2 = A², ..., 𝒞k = A^k. The coefficients Cj of the Wold representation of the original Yt will be the n × n upper-left block of 𝒞j = A^j.
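A minimal MATLAB sketch (reusing the illustrative VAR(2) from the stability sketch above) that recovers the Wold coefficients Cj as the n × n upper-left block of A^j.

% Minimal sketch: Wold / VMA(inf) coefficients of a VAR(p) from companion powers
A1 = [0.5 0.1; 0.4 0.5]; A2 = [0 0; 0.25 0];    % illustrative VAR(2)
n = 2; p = 2; J = 10;
A = [A1 A2; eye(n) zeros(n)];                   % companion matrix
C = zeros(n, n, J+1);
Apow = eye(n*p);
for j = 0:J
    C(:,:,j+1) = Apow(1:n, 1:n);                % C_j = upper-left n x n block of A^j
    Apow = Apow * A;
end
disp(C(:,:,1))   % C_0 = I
disp(C(:,:,2))   % C_1 = A1
disp(C(:,:,3))   % C_2 = A1^2 + A2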

41
Example. A stationary VAR(1)

[Y1t; Y2t] = [0.5 0.3; 0.02 0.8] [Y1t−1; Y2t−1] + [ε1t; ε2t]

E(εtεt′) = Ω = [1 0.3; 0.3 0.1],   λ = (0.81, 0.48)

Figure 1: blue: Y1, green: Y2.


42
5 VAR: second moments

Let us consider the companion form of a stationary (zero mean for simplicity) VAR(p) defined earlier

𝒴t = A𝒴t−1 + et     (5)

The variance of 𝒴t is given by

Σ = E(𝒴t𝒴t′) = AΣA′ + Ω     (6)

A closed form solution to (6) can be obtained in terms of the vec operator. Let A, B, C be matrices such that the product ABC exists. A property of the vec operator is that

vec(ABC) = (C′ ⊗ A)vec(B)

Applying the vec operator to both sides of (6) we have

vec(Σ) = (A ⊗ A)vec(Σ) + vec(Ω)

If we define 𝒜 = A ⊗ A then we have

vec(Σ) = (I − 𝒜)⁻¹vec(Ω)
43
The jth autocovariance of 𝒴t (denoted Γj) can be found by postmultiplying (5) by 𝒴t−j′ and taking expectations:

E(𝒴t𝒴t−j′) = AE(𝒴t−1𝒴t−j′) + E(et𝒴t−j′)

Thus

Γj = AΓj−1   or   Γj = A^jΓ0

The variance and the jth autocovariance of the original series Yt are given by the first n rows and columns (the upper-left n × n block) of Σ and Γj respectively.
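A minimal MATLAB sketch (using the A and Ω of the VAR(1) example above, so the companion form is the VAR itself) that solves vec(Σ) = (I − A ⊗ A)⁻¹vec(Ω) and computes the first autocovariances Γj = A^jΓ0.

% Minimal sketch: unconditional second moments of a stable VAR(1)
A  = [0.5 0.3; 0.02 0.8];            % illustrative VAR(1) coefficient matrix (stable)
Om = [1 0.3; 0.3 0.1];               % innovation covariance from the example above
n  = size(A, 1);
vecSigma = (eye(n^2) - kron(A, A)) \ Om(:);     % vec(Sigma) = (I - A kron A)^{-1} vec(Om)
Sigma  = reshape(vecSigma, n, n);               % Gamma_0
Gamma1 = A * Sigma;                             % Gamma_1 = A * Gamma_0
Gamma2 = A * Gamma1;                            % Gamma_2 = A^2 * Gamma_0
disp(Sigma); disp(Gamma1); disp(Gamma2);
% sanity check of equation (6): Sigma = A*Sigma*A' + Om
disp(norm(Sigma - (A*Sigma*A' + Om)))           % ~ 0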

44
6 VAR: specification

• Specification of the VAR is key for empirical analysis. We have to decide about
the following:
1. Number of lags p.
2. Which variables.
3. Type of transformations.

45
6.1 Number of lags

As in the univariate case, care must be taken to account for all systematic dy-
namics in multivariate models. In VAR models, this is usually done by choosing
a sufficient number of lags to ensure that the residuals in each of the equations
are white noise.

• AIC: Akaike information criterion. Choose the p that minimizes

AIC(p) = T ln |Ω̂| + 2(n²p)

• BIC: Bayesian information criterion. Choose the p that minimizes

BIC(p) = T ln |Ω̂| + (n²p) ln T

• HQ: Hannan-Quinn information criterion. Choose the p that minimizes

HQ(p) = T ln |Ω̂| + 2(n²p) ln ln T
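A minimal MATLAB sketch of lag-length selection (the data are simulated from an illustrative VAR(1) and the maximum lag pmax is an arbitrary choice): for each candidate p it estimates the VAR by OLS equation by equation, forms Ω̂ from the residuals, and evaluates the three criteria over a common sample.

% Minimal sketch: choose the VAR lag length with AIC, BIC and HQ
rng(1);
n = 2; Tobs = 300; pmax = 6;
A1 = [0.5 0.3; 0.02 0.8];                        % illustrative DGP: a stable VAR(1)
Y = zeros(Tobs, n);
for t = 2:Tobs, Y(t,:) = (A1*Y(t-1,:)')' + randn(1, n); end
crit = zeros(pmax, 3);
for p = 1:pmax
    T  = Tobs - pmax;                            % common sample across p
    X  = ones(T, 1);                             % constant
    for j = 1:p, X = [X, Y(pmax+1-j:Tobs-j, :)]; end
    Yp = Y(pmax+1:Tobs, :);
    B  = X \ Yp;                                 % OLS, equation by equation
    U  = Yp - X*B;
    Omh = (U'*U)/T;                              % residual covariance
    crit(p,:) = T*log(det(Omh)) + (n^2*p) * [2, log(T), 2*log(log(T))];
end
[~, pAIC] = min(crit(:,1)); [~, pBIC] = min(crit(:,2)); [~, pHQ] = min(crit(:,3));
fprintf('p chosen: AIC = %d, BIC = %d, HQ = %d\n', pAIC, pBIC, pHQ);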
46
• p̂ obtained using BIC and HQ is consistent, while p̂ obtained with AIC is not.

• AIC overestimates the true order with positive probability and underestimates the true order with zero probability.

• Suppose a VAR(p) is fitted to Y1, ..., YT (Yt not necessarily stationary). In small samples the following relations hold:

p̂BIC ≤ p̂AIC if T ≥ 8
p̂BIC ≤ p̂HQ for all T
p̂HQ ≤ p̂AIC if T ≥ 16

47
6.2 Type of variables

Variable selection is a key step in the specification of the model.

VAR models are small-scale models, so usually 2 to 8 variables are used.

Variable selection depends on the particular application; in general, all the variables conveying relevant information should be included.

We will see several examples.

48
6.3 Type of transformations

• Problem: many economic time series display a trend over time and are clearly non-stationary (the mean is not constant).

• Trend-stationary series

Yt = µ + bt + εt, εt ∼ W N.

• Difference-stationary series

Yt = µ + Yt−1 + εt, εt ∼ W N.

49
Figure 1: blue: log(GDP), green: log(CPI).

50
• These series can be thought of as generated by some nonstationary process. Here are some examples.

Example: Difference stationary

[Y1t; Y2t] = [0.01; 0.02] + [0.7 0.3; −0.2 1.2] [Y1t−1; Y2t−1] + [ε1t; ε2t],   Ω = [1 0.3; 0.3 0.1],   λ = (1, 0.9)

51
Example: Trend stationary

[Y1t; Y2t] = [0; 0.01] t + [0.5 0.3; 0.02 0.8] [Y1t−1; Y2t−1] + [ε1t; ε2t],   Ω = [1 0.3; 0.3 0.1],   λ = (0.81, 0.48)

52
So:

1) How do I know whether the series are stationary?

2) What to do if they are non stationary?

53
• Dickey-Fuller test. In 1979, Dickey and Fuller proposed the following test for stationarity.

1. Estimate with OLS the following equation

∆xt = b + γxt−1 + εt

2. Test the null γ = 0 against the alternative γ < 0.

3. If the null is not rejected then

xt = b + xt−1 + εt

which is a random walk with drift.

4. On the contrary, if γ < 0, then xt is a stationary AR

xt = b + axt−1 + εt

with a = 1 + γ < 1.
• An alternative is to specify the equation augmented by a deterministic trend

∆xt = b + γxt−1 + ct + εt

54

With this specification, under the alternative the process is stationary around a deterministic linear trend.

• Augmented Dickey-Fuller test. In the augmented version of the test, p lags of ∆xt can be added, i.e.

A(L)∆xt = b + γxt−1 + εt

or

A(L)∆xt = b + γxt−1 + ct + εt

• If the test statistic is smaller than the (negative) critical value, then the null hypothesis of a unit root is rejected.
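A minimal MATLAB sketch of the augmented Dickey-Fuller regression with a constant (the function name adf_sketch is a hypothetical helper, to be saved as adf_sketch.m; the number of lags p is a user choice). It returns the OLS estimate of γ and its t-statistic, which must be compared with the Dickey-Fuller critical values (tabulated, e.g., in Hamilton), not with the standard normal ones.

% Minimal sketch: ADF regression  dx_t = b + gamma*x_{t-1} + a_1 dx_{t-1} + ... + a_p dx_{t-p} + e_t
function [gamma_hat, t_stat] = adf_sketch(x, p)
    x  = x(:);
    dx = diff(x);
    T  = length(dx);
    y  = dx(p+1:end);                          % dependent variable
    X  = [ones(T-p,1), x(p+1:end-1)];          % constant and x_{t-1}
    for j = 1:p
        X = [X, dx(p+1-j:end-j)];              % lagged differences
    end
    beta = X \ y;                              % OLS
    u    = y - X*beta;
    s2   = (u'*u)/(length(y) - size(X,2));
    V    = s2 * inv(X'*X);                     % covariance matrix of beta
    gamma_hat = beta(2);
    t_stat    = beta(2)/sqrt(V(2,2));          % to be compared with DF critical values
end

Usage would be something like [gamma_hat, tstat] = adf_sketch(x, 4), with the number of lags chosen, for instance, by an information criterion.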

55
• Transformations I: first differences
Let ∆ = 1 − L be the first-difference filter, i.e. a filter such that ∆Yt = Yt − Yt−1, and let us consider the simple case of a random walk with drift

Yt = µ + Yt−1 + εt

where εt is WN. By applying the first-difference filter (1 − L) the process is transformed into a stationary process

∆Yt = µ + εt

Let us now consider a process with a deterministic trend

Yt = µ + δt + εt

By differencing the process we obtain

∆Yt = δ + ∆εt

which is a stationary process but is not invertible because it contains a unit root in the MA part.

56
• log(GDP) and log(CPI) in first differences

57
• Transformations II: removing a deterministic trend
Removing a deterministic trend (linear or quadratic) from a trend-stationary variable is ok. However this is not enough if the process is a unit root with drift. To see this consider again the process

Yt = µ + Yt−1 + εt

This can be written as

Yt = µt + Y0 + Σ_{j=1}^t εj

By removing the deterministic trend the mean of the process becomes constant, but the variance grows over time, so the process is not stationary.

58
• log(GDP) and log(CPI) linearly detrended

59
• Transformations of trending variables: Hodrick-Prescott filter
The filter separates the trend from the cyclical component of a scalar time series. Suppose yt = gt + ct, where gt is the trend component and ct is the cycle. The trend is obtained by solving the following minimization problem

min_{ {gt}_{t=1}^T }  Σ_{t=1}^T ct² + λ Σ_{t=2}^{T−1} [(gt+1 − gt) − (gt − gt−1)]²

The parameter λ is a positive number (for quarterly data usually λ = 1600) which penalizes variability in the growth (trend) component, while the first part is the penalty on the cyclical component. The larger λ, the smoother the trend component.
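Stacking the trend in a vector g, the first-order conditions of the minimization are linear and give g = (I + λD′D)⁻¹y, where D is the (T−2) × T matrix of second differences. A minimal MATLAB sketch (the function name hp_sketch is a hypothetical helper, to be saved as hp_sketch.m):

% Minimal sketch: Hodrick-Prescott filter via the second-difference matrix
function [g, c] = hp_sketch(y, lambda)
    y = y(:);                                  % column vector
    T = length(y);
    e = ones(T, 1);
    D = spdiags([e -2*e e], 0:2, T-2, T);      % row t of D*g is g_{t+2} - 2*g_{t+1} + g_t
    g = (speye(T) + lambda*(D'*D)) \ y;        % trend: solves the first-order conditions
    c = y - g;                                 % cyclical component
end

For quarterly data one would call, e.g., [g, c] = hp_sketch(y, 1600), with y a series such as log(GDP) stored as a column vector.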

60
