
LECTURE 6

KALMAN FILTER

Keith Law
FROM CONTEMPORANEOUS TO TEMPORAL

• We have so far focused on specifying and estimating factor models based on contemporaneous (same-time) relationships amongst the observed variables. For example, we use PCA to decompose the covariance or correlation matrix of the N observable variables into a set of K latent factors

• We have not yet studied dynamic patterns for prediction purposes

• Sometimes a system may be governed by unobservable variable(s), called state variable(s), which we can only estimate and ‘update’ through the arrival of observable variables

• This is called a ‘learning’ process. The main idea comes from Bayesian statistics

• A well-known filtering method for linear state-space processes, under the assumption of multivariate normality for the state variables, is the Kalman filter
KALMAN FILTER

• The Kalman filter is a powerful tool for combining information in the presence of uncertainty

• Denote the state variable as 𝑥𝑡 (which can be a vector of variables), which is unobservable, and the observable variable as 𝑦𝑡 (which can be a vector as well)

• The goal of the Kalman filter is to update knowledge of the state variable recursively when a new data point becomes available. That is, knowing the conditional distribution of 𝑥𝑡 given the information at 𝑡 − 1 (ℱ𝑡−1) and the new data 𝑦𝑡, we want to obtain the conditional distribution of 𝑥𝑡 given ℱ𝑡, where ℱ𝑡 = {𝑦1, …, 𝑦𝑡}

• To derive the Kalman filter, we need to consider the joint conditional distribution of (𝑥𝑡, 𝑣𝑡)′ given ℱ𝑡−1, where 𝑣𝑡 = 𝑦𝑡 − 𝑦𝑡|𝑡−1 and 𝑦𝑡|𝑡−1 ≡ 𝐸𝑡−1[𝑦𝑡] ≡ 𝐸[𝑦𝑡 | ℱ𝑡−1]

• By the multivariate conditional normality theorem (next page), if the state variable and observable variables are jointly multivariate normally distributed, then the conditional distributions of 𝑥𝑡 and 𝑣𝑡 given ℱ𝑡−1 are normal as well
MULTIVARIATE CONDITIONAL NORMALITY THEOREM
• For a k-dimensional random vector 𝑊, the multivariate normal density is given by:

$$f(W \mid \mu, \Sigma) = \frac{1}{(2\pi)^{k/2}\,|\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(W - \mu)'\,\Sigma^{-1}(W - \mu)\right)$$
• Multivariate Conditional Normality theorem: Suppose 𝑊 = (𝑥, 𝑦, 𝑧)′ where 𝑥, 𝑦 and 𝑧 are three random vectors such that their joint distribution is multivariate normal. Assume $\Sigma_W$ is non-singular and $\Sigma_{yz} = \Sigma_{zy}' = \mathbf{0}$ (block zero matrix). Then the conditional distribution of 𝑥 given 𝑦 is normal with

$$\mathrm{Var}(x \mid y) = \Sigma_{x|y} = \Sigma_{xx} - \Sigma_{xy}\Sigma_{yy}^{-1}\Sigma_{yx},$$
$$E(x \mid y) = \mu_{x|y} = \mu_x + \Sigma_{xy}\Sigma_{yy}^{-1}(y - \mu_y)$$

And the conditional distribution of 𝑥 given 𝑦 and 𝑧 is also normal with

$$\mathrm{Var}(x \mid y, z) = \Sigma_{x|y,z} = \Sigma_{x|y} - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx},$$
$$E(x \mid y, z) = \mu_{x|y,z} = \mu_{x|y} + \Sigma_{xz}\Sigma_{zz}^{-1}(z - \mu_z)$$
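As a quick numerical sanity check of the first pair of formulas, the Monte Carlo sketch below (scalar 𝑥, 𝑦, 𝑧 with made-up covariance values) compares 𝐸(𝑥|𝑦) and 𝑉𝑎𝑟(𝑥|𝑦) against the OLS slope and residual variance of 𝑥 regressed on 𝑦:

```python
import numpy as np

# Joint covariance of (x, y, z) with Sigma_yz = 0, as the theorem assumes.
# All numeric values are made up purely for illustration.
Sigma = np.array([[2.0, 0.8, 0.5],
                  [0.8, 1.5, 0.0],    # Sigma_yz = 0
                  [0.5, 0.0, 1.2]])
mu = np.array([1.0, -0.5, 2.0])

rng = np.random.default_rng(0)
W = rng.multivariate_normal(mu, Sigma, size=200_000)
x, y = W[:, 0], W[:, 1]

# Theorem: E(x|y) is linear in y with slope Sigma_xy / Sigma_yy, and
# Var(x|y) = Sigma_xx - Sigma_xy Sigma_yy^{-1} Sigma_yx
slope_theory = Sigma[0, 1] / Sigma[1, 1]                   # ~0.533
var_theory = Sigma[0, 0] - Sigma[0, 1] ** 2 / Sigma[1, 1]  # ~1.573

# Monte Carlo: OLS of x on y recovers the same slope and residual variance
slope_mc = np.cov(x, y)[0, 1] / np.var(y)
resid = x - x.mean() - slope_mc * (y - y.mean())
print(slope_theory, slope_mc)
print(var_theory, resid.var())
```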
LINEAR STATE-SPACE MODEL
For 𝑡 = 1, …, 𝑇:
𝑥𝑡+1 = 𝑏𝑡 + 𝐴𝑡𝑥𝑡 + 𝜖𝑡+1 (transition equation)
𝑦𝑡 = 𝑑𝑡 + 𝐶𝑡𝑥𝑡 + 𝜀𝑡 (measurement equation)
where 𝑥1|0 ~ 𝑁(𝜇0, Σ0), 𝜖𝑡 ~ 𝑁(0, 𝑄𝑡), 𝜀𝑡 ~ 𝑁(0, 𝑅𝑡)
• 𝑥𝑡 is the state variable (vector); the transition equation is a first-order Markov process with transition matrix 𝐴𝑡
• 𝑦𝑡 is observable; the measurement or observation equation relates the observation to the state vector 𝑥𝑡 through the measurement matrix 𝐶𝑡
• 𝑏𝑡 (transition offsets) and 𝑑𝑡 (observation offsets) are deterministic (can be zero vectors)
• 𝜖𝑡 (state errors) and 𝜀𝑡 (measurement errors) are Gaussian white noise with state (transition) covariance 𝑄𝑡
and observation covariance 𝑅𝑡 , both positive-definite
• Initial state 𝑥1|0 ~ 𝑁(𝜇0, Σ0); 𝜇0 and Σ0 are given, and 𝑥1 is independent of 𝜖𝑡 and 𝜀𝑡
• 𝐴𝑡, 𝐶𝑡, 𝑄𝑡, 𝑅𝑡 are known system matrices. These matrices are often sparse, and can be functions of some parameters 𝜃, which can be estimated by the maximum likelihood method
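To make the notation concrete, here is a minimal simulation sketch of this model with time-invariant system matrices; all numeric values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
T, k, n = 200, 2, 1                         # periods, state dim, obs dim

A = np.eye(k)                               # transition matrix A_t
C = np.array([[1.0, 0.5]])                  # measurement matrix C_t
b, d = np.zeros(k), np.zeros(n)             # offsets b_t, d_t
Q = 0.01 * np.eye(k)                        # state (transition) covariance
R = 0.25 * np.eye(n)                        # observation covariance
mu0, Sigma0 = np.zeros(k), np.eye(k)        # initial state distribution

x = rng.multivariate_normal(mu0, Sigma0)    # draw x_{1|0}
xs, ys = [], []
for t in range(T):
    y = d + C @ x + rng.multivariate_normal(np.zeros(n), R)   # measurement
    xs.append(x.copy()); ys.append(y)
    x = b + A @ x + rng.multivariate_normal(np.zeros(k), Q)   # transition
xs, ys = np.array(xs), np.array(ys)         # latent states, observations
```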
A DEEPER LOOK
For 𝑡 = 1, …, 𝑇:
𝑥𝑡+1 = 𝑏𝑡 + 𝐴𝑡𝑥𝑡 + 𝜖𝑡+1
𝑦𝑡 = 𝑑𝑡 + 𝐶𝑡𝑥𝑡 + 𝜀𝑡
where 𝑥1|0 ~ 𝑁(𝜇0, Σ0), 𝜖𝑡 ~ 𝑁(0, 𝑄𝑡), 𝜀𝑡 ~ 𝑁(0, 𝑅𝑡)
• If 𝐴𝑡 = 𝐼, then the regression coefficients (state equations) follow independent random walks; that is, each factor’s impact on the observations is serially correlated, but its evolution is independent of that of the other factors
• If 𝑄 = 𝟎, the state is constant over time and the state-space model reduces to a simple OLS regression
• In this dynamic setting, an observable process 𝑦𝑡 is driven by a latent (unobservable) process 𝑥𝑡 up to (independent) Gaussian errors. The latent process is autoregressive of order 1; that is, it depends only on its own previous realization

• This is a Bayesian concept: we estimate (or update) the (multi-dimensional) time-varying 𝑥𝑡 as we observe 𝑦𝑡 over time
LINEAR STATE-SPACE MODELS - EXAMPLES

CAPM with time-varying coefficients:

Denoting 𝑟𝑡 as the excess return of an asset and 𝑟𝑀,𝑡 as the excess return of the market:
𝑟𝑡 = 𝛼𝑡 + 𝛽𝑡𝑟𝑀,𝑡 + 𝜀𝑡,
𝛼𝑡+1 = 𝛼𝑡 + 𝜂𝑡+1,
𝛽𝑡+1 = 𝛽𝑡 + 𝜖𝑡+1

We can cast the problem into linear state-space form as:

$$x_t = \begin{pmatrix} \alpha_t \\ \beta_t \end{pmatrix}, \quad A_t = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad b_t = d_t = 0, \quad C_t = (1 \;\; r_{M,t}), \quad R_t = \sigma_{\varepsilon}^2, \quad Q_t = \begin{pmatrix} \sigma_{\eta}^2 & 0 \\ 0 & \sigma_{\epsilon}^2 \end{pmatrix}$$
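A minimal sketch of how these matrices could be assembled in code; the function name and the three variance arguments are illustrative placeholders (in practice they would be estimated, e.g. by maximum likelihood):

```python
import numpy as np

def capm_state_space(r_M, sigma_eps2, sigma_eta2, sigma_epsilon2):
    """Build the system matrices for the time-varying CAPM.
    r_M: array of market excess returns (length T)."""
    T = len(r_M)
    A = np.eye(2)                                # alpha_t, beta_t: random walks
    C = np.column_stack([np.ones(T), r_M])       # row t is C_t = (1, r_{M,t})
    Q = np.diag([sigma_eta2, sigma_epsilon2])    # state covariance Q_t
    R = np.array([[sigma_eps2]])                 # observation variance R_t
    return A, C, Q, R
```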
KALMAN FILTER RECURSION
For 𝑡 = 1, …, 𝑇:

$$x_{t+1|t} = E_t[b_t + A_t x_t + \epsilon_{t+1}] = b_t + A_t x_{t|t} \quad (1)$$

$$\Sigma_{t+1|t} = \mathrm{Var}_t(A_t x_t + \epsilon_{t+1}) = A_t \Sigma_{t|t} A_t' + Q_t \quad (2)$$

Since $y_t = d_t + C_t x_t + \varepsilon_t$, we have $y_{t|t-1} = d_t + C_t x_{t|t-1}$, so

$$v_t = y_t - y_{t|t-1} = y_t - (d_t + C_t x_{t|t-1}) = C_t (x_t - x_{t|t-1}) + \varepsilon_t \quad (3)$$

$$E_{t-1}(v_t) = 0, \qquad \mathrm{cov}(v_t, y_j) = 0 \ \text{for} \ 1 \le j < t, \qquad V_t = \mathrm{Var}_{t-1}(v_t) = \mathrm{Var}(v_t) = C_t \Sigma_{t|t-1} C_t' + R_t$$

$$x_{t|t} = E_t[x_t] = E[x_t \mid \mathcal{F}_{t-1}, v_t] = E[x_t \mid \mathcal{F}_{t-1}] + \mathrm{cov}(x_t, v_t)\, V_t^{-1} v_t = x_{t|t-1} + \Sigma_{t|t-1} C_t' V_t^{-1} v_t$$

So, (1) can be written as:

$$x_{t+1|t} = b_t + A_t x_{t|t-1} + A_t \Sigma_{t|t-1} C_t' V_t^{-1} v_t = b_t + A_t x_{t|t-1} + K_t v_t, \quad \text{where } K_t = A_t \Sigma_{t|t-1} C_t' V_t^{-1} \ \text{(Kalman gain)}$$

$$\Sigma_{t|t} = \mathrm{Var}(x_t \mid \mathcal{F}_t) = \mathrm{Var}(x_t \mid \mathcal{F}_{t-1}, v_t) = \Sigma_{t|t-1} - \mathrm{cov}(x_t, v_t)\, V_t^{-1}\, \mathrm{cov}(x_t, v_t)' = \Sigma_{t|t-1} - \Sigma_{t|t-1} C_t' V_t^{-1} C_t \Sigma_{t|t-1}$$

(2) can be written as:

$$\Sigma_{t+1|t} = A_t \Sigma_{t|t-1} A_t' - A_t \Sigma_{t|t-1} C_t' V_t^{-1} C_t \Sigma_{t|t-1} A_t' + Q_t = A_t \Sigma_{t|t-1} L_t' + Q_t, \quad \text{where } L_t = A_t - K_t C_t$$
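A minimal Python sketch of this recursion, assuming time-invariant 𝐴, 𝑄, 𝑅 and a per-period measurement row C[t] as produced by the CAPM helper above (no numerical safeguards; not production code):

```python
import numpy as np

def kalman_filter(y, A, C, Q, R, mu0, Sigma0, b=None, d=None):
    """Kalman filter recursion (1)-(3). Returns the one-step state
    predictions, filtered states, innovations v_t and variances V_t."""
    T, k = len(y), len(mu0)
    b = np.zeros(k) if b is None else b
    d = np.zeros(R.shape[0]) if d is None else d
    x_pred, P_pred = np.asarray(mu0, float), np.asarray(Sigma0, float)
    out = {key: [] for key in ("x_pred", "P_pred", "x_filt", "P_filt", "v", "V")}
    for t in range(T):
        Ct = np.atleast_2d(C[t])                    # measurement matrix C_t
        v = y[t] - (d + Ct @ x_pred)                # innovation v_t, eq. (3)
        V = Ct @ P_pred @ Ct.T + R                  # V_t = C_t P C_t' + R_t
        G = P_pred @ Ct.T @ np.linalg.inv(V)        # Sigma_{t|t-1} C_t' V_t^{-1}
        x_filt = x_pred + G @ v                     # x_{t|t}
        P_filt = P_pred - G @ Ct @ P_pred           # Sigma_{t|t}
        for key, val in zip(out, (x_pred, P_pred, x_filt, P_filt, v, V)):
            out[key].append(val)
        x_pred = b + A @ x_filt                     # eq. (1): x_{t+1|t}
        P_pred = A @ P_filt @ A.T + Q               # eq. (2): Sigma_{t+1|t}
    return {key: np.array(val) for key, val in out.items()}
```

For the CAPM example, one would call kalman_filter(r, A, C, Q, R, mu0, Sigma0) with r the asset's excess returns and the matrices from capm_state_space.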
STATE ESTIMATION ERROR

Define the state error as: 𝑒𝑡 = 𝑥𝑡 − 𝑥𝑡|𝑡−1

$$\mathrm{Var}(e_t \mid \mathcal{F}_{t-1}) = \mathrm{Var}(x_t \mid \mathcal{F}_{t-1}) = \Sigma_{t|t-1}$$

$$e_{t+1} = x_{t+1} - x_{t+1|t} = A_t(x_t - x_{t|t-1}) + \epsilon_{t+1} - K_t v_t = A_t e_t + \epsilon_{t+1} - K_t\left(C_t(x_t - x_{t|t-1}) + \varepsilon_t\right) = L_t e_t + \epsilon_{t+1} - K_t \varepsilon_t$$

Now we obtain a state-space form for 𝑣𝑡:

$$v_t = C_t e_t + \varepsilon_t$$
$$e_{t+1} = L_t e_t + \epsilon_{t+1} - K_t \varepsilon_t$$

with 𝑒1 = 𝑥1 − 𝑥1|0, for 𝑡 = 1, …, 𝑇
STATE SMOOTHING

The innovations 𝑣𝑡 are serially independent, and {𝑣𝑡, …, 𝑣𝑇} is independent of ℱ𝑡−1. Hence:

$$x_{t|T} = E[x_t \mid \mathcal{F}_T] = E[x_t \mid \mathcal{F}_{t-1}, v_t, \dots, v_T] = E[x_t \mid \mathcal{F}_{t-1}] + \mathrm{cov}\big(x_t, (v_t', \dots, v_T')'\big) \big[\mathrm{Var}\big((v_t', \dots, v_T')'\big)\big]^{-1} (v_t', \dots, v_T')'$$

Since the joint covariance matrix of the innovations is block-diagonal, $\mathrm{diag}(V_t, V_{t+1}, \dots, V_T)$, this simplifies to

$$x_{t|T} = x_{t|t-1} + \sum_{i=t}^{T} \mathrm{cov}(x_t, v_i)\, V_i^{-1} v_i$$

The covariance matrices $\mathrm{cov}(x_t, v_i)$ for $i = t, \dots, T$ are:

$$\mathrm{cov}(x_t, v_i) = E[x_t v_i'] = E\big[x_t (C_i e_i + \varepsilon_i)'\big] = E[x_t e_i']\, C_i'$$
STATE SMOOTHING


$$E[x_t e_t'] = E\big[x_t (x_t - x_{t|t-1})'\big] = \mathrm{Var}(x_t \mid \mathcal{F}_{t-1}) = \Sigma_{t|t-1}$$

$$E[x_t e_{t+1}'] = E\big[x_t (L_t e_t + \epsilon_{t+1} - K_t \varepsilon_t)'\big] = \Sigma_{t|t-1} L_t'$$

$$E[x_t e_{t+2}'] = \Sigma_{t|t-1} L_t' L_{t+1}'$$
$$\vdots$$
$$E[x_t e_T'] = \Sigma_{t|t-1} L_t' \cdots L_{T-1}'$$

Plugging into $x_{t|T} = x_{t|t-1} + \sum_{i=t}^{T} \mathrm{cov}(x_t, v_i) V_i^{-1} v_i = x_{t|t-1} + \sum_{i=t}^{T} E[x_t e_i'] C_i' V_i^{-1} v_i$:

$$x_{T|T} = x_{T|T-1} + \Sigma_{T|T-1} C_T' V_T^{-1} v_T$$
$$x_{T-1|T} = x_{T-1|T-2} + \Sigma_{T-1|T-2} C_{T-1}' V_{T-1}^{-1} v_{T-1} + \Sigma_{T-1|T-2} L_{T-1}' C_T' V_T^{-1} v_T$$
$$\vdots$$
$$x_{t|T} = x_{t|t-1} + \Sigma_{t|t-1} C_t' V_t^{-1} v_t + \Sigma_{t|t-1} L_t' C_{t+1}' V_{t+1}^{-1} v_{t+1} + \cdots + \Sigma_{t|t-1} L_t' L_{t+1}' \cdots L_{T-1}' C_T' V_T^{-1} v_T$$
STATE SMOOTHING

Defining $q_T = 0$ and $q_{t-1} = C_t' V_t^{-1} v_t + L_t' q_t$, we can simplify the recursive formula as

$$x_{t|T} = x_{t|t-1} + \Sigma_{t|t-1}\, q_{t-1}, \quad \text{for } t = T, \dots, 1$$

How about $\Sigma_{t|T}$?

$$\Sigma_{t|T} = \mathrm{Var}(x_t \mid \mathcal{F}_{t-1}, v_t, \dots, v_T) = \Sigma_{t|t-1} - \sum_{i=t}^{T} \mathrm{cov}(x_t, v_i)\, V_i^{-1}\, \mathrm{cov}(x_t, v_i)'$$

$$\Sigma_{t|T} = \Sigma_{t|t-1} - \Sigma_{t|t-1} C_t' V_t^{-1} C_t \Sigma_{t|t-1} - \Sigma_{t|t-1} L_t' C_{t+1}' V_{t+1}^{-1} C_{t+1} L_t \Sigma_{t|t-1} - \cdots - \Sigma_{t|t-1} L_t' \cdots L_{T-1}' C_T' V_T^{-1} C_T L_{T-1} \cdots L_t \Sigma_{t|t-1}$$

Defining $M_T = 0$ and $M_{t-1} = C_t' V_t^{-1} C_t + L_t' M_t L_t$, we have

$$\Sigma_{t|T} = \Sigma_{t|t-1} - \Sigma_{t|t-1} M_{t-1} \Sigma_{t|t-1}, \quad \text{for } t = T, \dots, 1$$
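A sketch of this backward pass, consuming the output dictionary of the kalman_filter sketch above (same time-invariant 𝐴 assumption):

```python
import numpy as np

def kalman_smoother(filt, A, C):
    """Backward recursion: q_{t-1} = C_t' V_t^{-1} v_t + L_t' q_t and
    M_{t-1} = C_t' V_t^{-1} C_t + L_t' M_t L_t, starting from q_T = M_T = 0."""
    x_pred, P_pred = filt["x_pred"], filt["P_pred"]
    v, V = filt["v"], filt["V"]
    T, k = x_pred.shape
    q, M = np.zeros(k), np.zeros((k, k))
    x_smooth = np.empty((T, k)); P_smooth = np.empty((T, k, k))
    for t in range(T - 1, -1, -1):               # t = T, ..., 1
        Ct = np.atleast_2d(C[t])
        Vinv = np.linalg.inv(V[t])
        K = A @ P_pred[t] @ Ct.T @ Vinv          # Kalman gain K_t
        L = A - K @ Ct                           # L_t = A_t - K_t C_t
        q = Ct.T @ Vinv @ v[t] + L.T @ q         # q_{t-1}
        M = Ct.T @ Vinv @ Ct + L.T @ M @ L       # M_{t-1}
        x_smooth[t] = x_pred[t] + P_pred[t] @ q               # x_{t|T}
        P_smooth[t] = P_pred[t] - P_pred[t] @ M @ P_pred[t]   # Sigma_{t|T}
    return x_smooth, P_smooth
```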
FORECASTING

So far we have been dealing with the state variable, and have defined filtering as 𝐸[𝑥𝑡 | ℱ𝑡−1] ≡ 𝑥𝑡|𝑡−1 and smoothing as 𝐸[𝑥𝑡 | ℱ𝑇] ≡ 𝑥𝑡|𝑇

But the ultimate goal is to forecast the observable variable via the state variable, i.e. 𝐸[𝑦𝑡+1 | ℱ𝑡] ≡ 𝑦𝑡+1|𝑡

Suppose the forecast origin is 𝑡 and we are interested in predicting 𝑦𝑡+𝑗 for 𝑗 = 1, …, ℎ, where ℎ > 0:

$$y_{t+1|t} = d_{t+1} + C_{t+1} x_{t+1|t}$$

The forecast error is $\xi_{t+1} = y_{t+1} - y_{t+1|t} = C_{t+1} e_{t+1} + \varepsilon_{t+1}$, with $\mathrm{Var}(\xi_{t+1}) = C_{t+1} \Sigma_{t+1|t} C_{t+1}' + R_{t+1}$

This is precisely the innovation covariance matrix $V_{t+1}$ of the Kalman filter
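Continuing the sketch above, a one-step-ahead forecast from origin 𝑡 (the last filtered period) might look like this; multi-step forecasts for 𝑗 = 2, …, ℎ would iterate the transition equation on the state mean and covariance before applying the measurement equation:

```python
import numpy as np

def forecast_one_step(filt, A, C_next, Q, R, b=None, d=None):
    """y_{t+1|t} = d_{t+1} + C_{t+1} x_{t+1|t}, with forecast-error
    variance V_{t+1} = C_{t+1} Sigma_{t+1|t} C_{t+1}' + R_{t+1}."""
    x_filt, P_filt = filt["x_filt"][-1], filt["P_filt"][-1]
    k = x_filt.shape[0]
    b = np.zeros(k) if b is None else b
    x_next = b + A @ x_filt                    # x_{t+1|t}, eq. (1)
    P_next = A @ P_filt @ A.T + Q              # Sigma_{t+1|t}, eq. (2)
    C_next = np.atleast_2d(C_next)             # C_{t+1}, e.g. (1, r_{M,t+1})
    d = np.zeros(C_next.shape[0]) if d is None else d
    y_hat = d + C_next @ x_next                # point forecast y_{t+1|t}
    y_var = C_next @ P_next @ C_next.T + R     # forecast-error variance V_{t+1}
    return y_hat, y_var
```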
INITIALIZATION

The whole iteration starts with 𝑥1|0 ~ 𝑁(𝜇0, Σ0). How do we set 𝜇0 and Σ0?


1. If the sample period 𝑇 is long enough, the initialization should have little impact, as the filter is given a long period to converge
2. In the Bayesian framework, we have the non-informative (diffuse) prior concept. Basically, we set a large variance so that the probability mass is ‘diffused’ and the prior has little impact on the posterior (a code sketch follows this list). We will discuss this concept later in the course
3. We may also estimate 𝜇0 and Σ0 by maximum likelihood from the data; the standard algorithm is Expectation-Maximization (EM). This can be done jointly with other parameters such as the state (transition) covariance and the observation covariance
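In code, a diffuse initialization is simply a very large prior variance; the scale 1e7 below is an arbitrary ‘large’ value for illustration (exact diffuse initializations also exist but are beyond this sketch):

```python
import numpy as np

k = 2                          # state dimension (CAPM example)
mu0 = np.zeros(k)              # non-informative prior mean
Sigma0 = 1e7 * np.eye(k)       # diffuse prior: huge variance, weak influence
# filt = kalman_filter(y, A, C, Q, R, mu0, Sigma0)   # then run the filter
```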
EXAMPLE: DYNAMIC FACTOR EXPOSURE

Suppose we use the 5 PCs from the factors of HW3 and run the NKY returns on these factors with dynamic factor loadings. The left chart shows the filtered states; the right chart shows the smoothed states
