Unit 15 Stationary Processes: Structure
Structure
15.1 Introduction
Objectives
15.2 Stationary Processes
Stationary Process
Strict Stationary Process
Weak Stationary Process
15.3 Autocovariance and Autocorrelation
Autocovariance and Autocorrelation Coefficients
Estimation of Autocovariance and Autocorrelation Coefficients
15.4 Correlogram of Stationary Processes
Interpretation of Correlogram
15.5 Summary
15.6 Solutions / Answers
15.1 INTRODUCTION
In Units 13 and 14, you have learnt that a time series can be decomposed
into four components, i.e., Trend (T), Cyclic (C), Seasonal (S) and Irregular
(I) components. We have discussed methods for smoothing or filtering the
time series and for estimating Trend, Seasonal and Cyclic components. We
have also explained how to use them for forecasting.
In this unit, we describe a very important class of time series, called the
stationary time series. In Sec. 15.2, we explain the concept of stationary
process and define weak and strict stationary processes. We discuss
autocovariance, autocorrelation function and correlogram of a stationary
process in Secs. 15.3 and 15.4. If a time series is stationary, we can model it
and draw further inferences and make forecasts. If a time series is not
stationary, we cannot do any further analysis and hence cannot make
reliable forecasts. If a time series shows a particular type of non-stationarity
and some simple transformations can make it stationary, then we can
model it.
In the next unit, we shall discuss certain stationary linear models such as
Auto Regressive (AR), Moving Average (MA) and mixed Autoregressive
Moving Average (ARMA) processes. We shall also discuss how to deal
with models with trend by considering an integrated model called
Autoregressive Integrated Moving Average (ARIMA) model.
Objectives
After studying this unit, you should be able to:
describe stationary processes;
define weak and strict stationary processes;
define autocovariance and autocorrelation coefficients;
estimate autocovariance and autocorrelation coefficients;
plot the correlogram and interpret it; and
make proper choice of probability models for further studies.
Time Series Modelling
15.2 STATIONARY PROCESSES
In the course MST-004, you have studied random variables and their
properties. Recall that a random variable Y is a function defined on a sample
space. A family of random variables defined on the same sample space
taking values over time is known as a random process. Most physical processes in real life involve random components or variables, and a random process may be described as a statistical phenomenon that evolves in time according to some laws of probability. Formally, a random process is a family of random variables defined on a given probability space and indexed by a parameter t. The length of a queue in a system and the number of accidents in a particular city in successive weeks are examples of random processes. Mathematically, the random process {Y(t); t ∈ T} is a collection of random variables, where T is an index set and all the random variables Yt are defined on the same sample space. If T takes a continuous range of values, the random process is said to be a continuous parameter process; if T takes a discrete set of values, it is said to be a discrete parameter process. We use the notation Yt when we deal with discrete parameter processes, denote a random variable by the capital letter Y, and assume that it is observable at discrete time points t1, t2, .... When T represents time, the random process is referred to as a time series.
In Units 13 and 14, we have dealt with one set of observations recorded at
different times. Thus, we had only a single outcome of the process and a
single observation on the random variable at time t. This sample may be
regarded as one time series out of the infinite set of time series, which might
have been observed. This infinite set of time series is called an Ensemble.
Every member of the ensemble can be taken as a possible realisation of the
stochastic process and the observed time series can be considered as one
particular realisation.
15.3 AUTOCOVARIANCE AND AUTOCORRELATION
15.3.1 Autocovariance and Autocorrelation Coefficients
For a stationary process, the mean is constant over time: E[Yt] = µ.
In the course MST-002, you have learnt how to calculate the covariance and correlation between two variables for given N pairs of observations {(x1, y1), (x2, y2), …, (xN, yN)} on two variables X and Y. Recall that the covariance of X and Y is computed as
Cov(X, Y) = E[(X − µX)(Y − µY)] … (1)
For a stationary process, the autocovariance coefficient at lag k is defined as
γk = Cov(Yt, Yt+k) = E[(Yt − µ)(Yt+k − µ)] … (2)
From equation (1), taking X = Y = Yt, we note that
γ0 = Var(Yt) = σY² ≥ 0 … (3)
Therefore, the autocorrelation coefficient at lag k is
ρk = γk / γ0, with ρ0 = 1 … (4)
Stationary Processes
15.3.2 Estimation of Autocovariance and Autocorrelation
Coefficients
So far, we have defined the autocovariance and autocorrelation coefficients
for a random process. You would now like to estimate them for a finite time
series for which N observations y1, y2, ..., yN are available. We shall denote
a realisation of the random process Y1, Y2, …, YN by small letters y1, y2, ...,
yN. The mean µ can be estimated by
ȳ = Σ yi / N, the sum running over i = 1, 2, …, N … (5)
The autocovariance coefficient at lag k can be estimated by
ck = Σ (yt − ȳ)(yt+k − ȳ) / (N − k), the sum running over t = 1, 2, …, N − k … (6)
and the autocorrelation coefficient at lag k by
rk = ck / c0 … (7)
For the data of Example 1, from equation (5) we get
ȳ = Σ yi / N = 510/10 = 51.0
From equation (6), for k = 0, the autocovariance coefficient is
c0 = Σ (yi − ȳ)²/N = (Σ yi² − N ȳ²)/N = (27906 − 26010)/10 = 189.6
For k = 1,
c1 = Σ (yt − ȳ)(yt+1 − ȳ)/9, the sum running over t = 1, 2, …, 9
   = −166.33
From equation (7),
r1 = c1/c0 = −166.33/189.6 = −0.88
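These calculations are easy to mechanise. Below is a small Python sketch (the function names are ours, not part of the unit) implementing the estimators of equations (6) and (7), with the divisors N for c0 and N − k for ck used above; it is run on the data of exercise E1, whose answer is worked out at the end of the unit.

```python
def autocovariance(y, k):
    """Sample autocovariance c_k, with divisor N - k (N when k = 0), as in this unit."""
    n = len(y)
    ybar = sum(y) / n
    total = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k))
    return total / (n - k)

def autocorrelation(y, k):
    """Sample autocorrelation r_k = c_k / c_0, as in equation (7)."""
    return autocovariance(y, k) / autocovariance(y, 0)

# Data of exercise E1
y = [1.6, 0.8, 1.2, 0.5, 0.6, 1.5, 0.8, 1.2, 0.5, 1.3]
print(round(autocovariance(y, 0), 3))   # c0 = 0.152
print(round(autocorrelation(y, 1), 3))  # r1 = -0.475
```

The same two functions reproduce every ck and rk computed in this section by changing the lag argument k.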
You may now like to solve a problem to assess your understanding.
E1) Ten successive observations on a stationary time series are as
follows:
1.6, 0.8, 1.2, 0.5, 0.6, 1.5, 0.8, 1.2, 0.5, 1.3.
Plot the observations and calculate r1.
E2) Fifteen successive observations on a stationary time series are as
follows:
34, 24, 23, 31, 38, 34, 35, 31, 29,
28, 25, 27, 32, 33, 30.
Plot the observations and calculate r1.
Fig. 15.1: a) Plot of a time series for N = 200; b) correlogram for lag k = 1, 2, …, 17.
Fig. 15.2: a) Plot of an alternating time series; b) correlogram of the alternating series with lag up to 15.
Non-Stationary Time Series
If a time series contains trend, it is said to be non-stationary. Such a series is usually very smooth in nature and, because the observations are dominated by the trend, its autocorrelations move towards zero very slowly (see Fig. 15.3). One should remove the trend from such a time series before doing any further analysis.
Fig. 15.3: a) Plot of a non-stationary time series; b) correlogram of the non-stationary series.
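The slow decay described above can be checked numerically. The following Python sketch (the helper name r_k is ours) computes sample autocorrelations of a purely linear trend yt = t and shows that they die out only very slowly with the lag.

```python
def r_k(y, k):
    """Sample autocorrelation r_k = c_k / c_0, divisors as in this unit."""
    n = len(y)
    ybar = sum(y) / n
    c0 = sum((v - ybar) ** 2 for v in y) / n
    ck = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k)) / (n - k)
    return ck / c0

# A series that is nothing but trend: y_t = t for t = 1, ..., 50
trend = list(range(1, 51))
print([round(r_k(trend, k), 2) for k in (1, 5, 10)])   # large at every lag, decaying slowly
```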
Seasonal Time Series
If a time series has a dominant seasonal pattern, the time plot will show cyclical behaviour with the periodicity of the seasonal effect. If the data have been recorded on a monthly basis and the seasonal effect is of twelve months, i.e., s = 12, we would expect a highly negative autocorrelation at lag 6 (r6) and a highly positive autocorrelation at lag 12 (r12). In the case of quarterly data, we expect to find a large negative r2 and a large positive r4, and this behaviour is repeated at r6, r8 and so on. The cyclical pattern of the correlogram will thus be similar to that of the time plot of the data.
Fig. 15.4: Time plot of the average rainfall at a certain place, in successive months from 2010 to 2012.
Therefore, in this case the correlogram may not contain any more
information than what is given by the time plot of the time series.
Fig. 15.5: a) Smoothed plot of the average rainfall at a certain place, in successive months from 2010 to 2012; b) correlogram of monthly observations of the seasonal time series.
Fig. 15.5a shows a time plot of monthly rainfall and Fig. 15.5b shows the correlogram. Both show a cyclical pattern and the presence of a strong 12-monthly seasonal effect. However, it is doubtful that in such cases the correlogram gives any more information about the presence of the seasonal effect than the time plot shown in Fig. 15.4.
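The expected sign pattern can be illustrated on a synthetic seasonal series. In the Python sketch below, the sinusoidal series with period 12 is our own stand-in for monthly seasonal data; it gives a strongly negative r6 and a strongly positive r12, exactly as described above.

```python
import math

def r_k(y, k):
    """Sample autocorrelation r_k = c_k / c_0, divisors as in this unit."""
    n = len(y)
    ybar = sum(y) / n
    c0 = sum((v - ybar) ** 2 for v in y) / n
    ck = sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k)) / (n - k)
    return ck / c0

# Ten "years" of a purely seasonal monthly pattern with period s = 12
y = [math.sin(2 * math.pi * t / 12) for t in range(120)]
print(round(r_k(y, 6), 2), round(r_k(y, 12), 2))   # strongly negative at lag 6, positive at lag 12
```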
In general, the interpretation of a correlogram is not easy and requires experience and insight. Estimated autocorrelations (rk) are subject to sampling fluctuations, and if N is small, their variances are large. We shall discuss this in more detail when we consider particular processes. When all the population autocorrelations ρk (k ≠ 0) are zero, as happens for a completely random series, the values of rk are approximately distributed as N(0, 1/N). This provides a good guide for testing whether the population autocorrelations are all zero, i.e., whether the process is completely random.
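In practice, the N(0, 1/N) approximation gives limits ±1.96/√N within which about 95% of the rk of a completely random series should lie. The Python sketch below (the function names are ours) applies this check; the sample autocorrelations used are illustrative.

```python
import math

def random_series_bound(n, z=1.96):
    """Approximate 95% limits +/- z / sqrt(n) for r_k of a completely random series."""
    return z / math.sqrt(n)

def significant_lags(r_values, n):
    """Lags (1-based) whose |r_k| falls outside the random-series limits."""
    bound = random_series_bound(n)
    return [k + 1 for k, r in enumerate(r_values) if abs(r) > bound]

# Illustrative sample autocorrelations from a series of N = 500 observations
r = [0.09, -0.08, 0.07, -0.06, -0.05, 0.04]
print(round(random_series_bound(500), 4))   # about 0.0877
print(significant_lags(r, 500))             # only lag 1 falls outside the limits
```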
Example 2: For the time series given in Example 1, calculate r1, r2, r3, r4 and
r5 and plot a correlogram.
Solution: From Example 1 and its results we have the following:
ȳ = 51.0, c0 = 189.6, c1 = −166.33 and r1 = −0.88
Now we form the table for the calculations as follows:

S. No.   Yt    Yt²    (Yt − Ȳ)   (Yt − Ȳ)(Yt+2 − Ȳ)   (Yt − Ȳ)(Yt+3 − Ȳ)   (Yt − Ȳ)(Yt+4 − Ȳ)   (Yt − Ȳ)(Yt+5 − Ȳ)
1        47    2209   −4
2        64    4096   13
3        23    529    −28        112
…

For k = 2,
c2 = Σ (yt − ȳ)(yt+2 − ȳ)/8, the sum running over t = 1, 2, …, 8
   = 876/8 = 109.5
r2 = c2/c0 = 109.5/189.6 = 0.58
For k = 3, we get
c3 = Σ (yt − ȳ)(yt+3 − ȳ)/7, the sum running over t = 1, 2, …, 7
   = −311/7 = −44.43
r3 = c3/c0 = −44.43/189.6 = −0.2343
For k = 4, we get
c4 = Σ (yt − ȳ)(yt+4 − ȳ)/6, the sum running over t = 1, 2, …, 6
   = −234/6 = −39
r4 = c4/c0 = −39/189.6 = −0.2057
For k = 5, we get
c5 = Σ (yt − ȳ)(yt+5 − ȳ)/5, the sum running over t = 1, 2, …, 5
   = −479/5 = −95.8
r5 = c5/c0 = −95.8/189.6 = −0.5052
Thus, we have obtained the autocorrelation coefficients r1 = −0.88, r2 = 0.58, r3 = −0.2343, r4 = −0.2057 and r5 = −0.5052.
Now we plot the correlogram for the given time series by plotting the values
of the autocorrelation coefficients versus the lag k for k = 1, 2, …, 5. The
correlogram is shown in Fig. 15.6.
Fig. 15.7: Correlogram for 10 sample autocorrelation coefficients of the series of 200
observations.
Example 4: A random walk {St, t = 0, 1, 2, …} starting at zero is obtained as the cumulative sum of independently and identically distributed (i.i.d.) random variables. Check whether the series is stationary or non-stationary.
Solution: A random walk with zero mean is obtained by defining S0 = 0 and
St = Y1 + Y2 + … + Yt, for t = 1, 2, …
where {Yt} is i.i.d. noise with mean zero and variance σ². Then we have
E(St) = 0 and E(St²) = Var(St) = tσ² < ∞ for all t.
Since the variance tσ² changes with t, there is a systematic change in the variance of the process over time. Hence the random walk is non-stationary.
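The result Var(St) = tσ² can be illustrated by simulation. The Python sketch below (the seed, sample sizes and time points are our own choices) generates many independent random walks with standard normal steps and compares the empirical variance of St at t = 20 and t = 200.

```python
import random

random.seed(1)   # fixed seed so the illustration is reproducible

def random_walk(t_max):
    """One realisation S_1, ..., S_t_max with i.i.d. standard normal steps."""
    s, path = 0.0, []
    for _ in range(t_max):
        s += random.gauss(0.0, 1.0)
        path.append(s)
    return path

walks = [random_walk(200) for _ in range(2000)]
# E(S_t) = 0, so the mean of S_t^2 estimates Var(S_t) = t * sigma^2
var_20 = sum(w[19] ** 2 for w in walks) / len(walks)
var_200 = sum(w[199] ** 2 for w in walks) / len(walks)
print(round(var_20, 1), round(var_200, 1))   # near 20 and 200: variance grows with t
```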
E3) Calculate r2, r3, r4 and r5 for the time series given in Exercise 1 and plot
a correlogram.
E4) Calculate r2, r3, r4 and r5 for the time series given in Exercise 2 and plot
a correlogram.
E5) A computer generates a series of 500 observations that are supposed to
be random. The first 10 sample autocorrelation coefficients of the
series are:
r1 = 0.09, r2 = –0.08, r3 = 0.07, r4 = –0.06, r5 = –0.05, r6 = 0.04,
r7 = −0.03, r8 = 0.02, r9 = −0.02, r10 = −0.01
Plot the correlogram.
Let us now summarise the concepts that we have discussed in this unit.
15.5 SUMMARY
1. A time series is said to be stationary if there is no systematic change in
mean, variance and covariance of the observations over a period of time.
If a time series is stationary, we can model it and draw further inferences
and make forecasts. If a time series is not stationary, we cannot do any
further analysis and make reliable forecasts. If a time series shows a
particular type of non-stationarity and some simple transformations can
make it stationary, then we can model it.
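The "simple transformations" referred to above usually mean differencing, which will be taken up with the ARIMA model in the next unit. As a minimal Python sketch (the data are hypothetical), first differencing removes a linear trend completely, leaving a series with constant mean:

```python
def difference(y, d=1):
    """Apply first differencing d times: z_t = y_t - y_(t-1)."""
    for _ in range(d):
        y = [y[t] - y[t - 1] for t in range(1, len(y))]
    return y

# Hypothetical series with a pure linear trend: y_t = 3 + 2t
y = [3 + 2 * t for t in range(10)]
print(difference(y))        # [2, 2, 2, 2, 2, 2, 2, 2, 2]: the trend is gone
```

A quadratic trend would similarly be removed by differencing twice (d = 2).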
15.6 SOLUTIONS / ANSWERS
E1) We form the table for the calculations as follows:

S. No.   Yt    Yt²     (Yt − Ȳ)   (Yt − Ȳ)(Yt+1 − Ȳ)
1        1.6   2.56    0.60
2        0.8   0.64    −0.20      −0.12
3        1.2   1.44    0.20       −0.04
4        0.5   0.25    −0.50      −0.10
5        0.6   0.36    −0.40      0.20
6        1.5   2.25    0.50       −0.20
7        0.8   0.64    −0.20      −0.10
8        1.2   1.44    0.20       −0.04
9        0.5   0.25    −0.50      −0.10
10       1.3   1.69    0.30       −0.15
Total    10    11.52              −0.65

From equation (5),
ȳ = Σ yi / N = 10/10 = 1.0
From equation (6), for k = 0,
c0 = (Σ yi² − N ȳ²)/N = (11.52 − 10)/10 = 0.152
For k = 1,
c1 = Σ (yt − ȳ)(yt+1 − ȳ)/(N − 1) = −0.65/9 = −0.072
From equation (7),
r1 = c1/c0 = −0.072/0.152 = −0.475
E2) We first plot the stationary time series values as shown in Fig. 15.9.
From equation (6), for k = 0,
c0 = (Σ yi² − N ȳ²)/N = 5.145
For k = 1,
c1 = Σ (yt − ȳ)(yt+1 − ȳ)/(N − 1) = 1.773/14 = 0.1266
From equation (7),
r1 = c1/c0 = 0.1266/5.145 = 0.0246
E3) In Exercise 1, we have obtained the following values:
ȳ = 1.0, c0 = 0.152, c1 = −0.072 and r1 = −0.475
We form the table for the calculations as follows:

S. No.   Yt    Yt²    (Yt − Ȳ)   (Yt − Ȳ)(Yt+2 − Ȳ)   (Yt − Ȳ)(Yt+3 − Ȳ)   (Yt − Ȳ)(Yt+4 − Ȳ)   (Yt − Ȳ)(Yt+5 − Ȳ)
1        1.6   2.56   0.60
2        0.8   0.64   −0.20
3        1.2   1.44   0.20       0.12
4        0.5   0.25   −0.50      0.10                 −0.30
5        0.6   0.36   −0.40      −0.08                0.08                 −0.24
6        1.5   2.25   0.50       −0.25                0.10                 −0.10                0.30
7        0.8   0.64   −0.20      0.08                 0.10                 −0.04                0.04
8        1.2   1.44   0.20       0.10                 −0.08                −0.10                0.04
9        0.5   0.25   −0.50      0.10                 −0.25                0.20                 0.25
10       1.3   1.69   0.30       0.06                 −0.06                0.15                 −0.12
Total    10    11.52             0.23                 −0.41                −0.13                0.51

From equation (6),
c2 = 0.23/8 = 0.02875, c3 = −0.41/7 = −0.0586, c4 = −0.13/6 = −0.0217 and c5 = 0.51/5 = 0.102
From equation (7),
r2 = 0.02875/0.152 = 0.189, r3 = −0.0586/0.152 = −0.385, r4 = −0.0217/0.152 = −0.143 and r5 = 0.102/0.152 = 0.671
Now to plot the correlogram for the given time series, we plot the
values of the autocorrelation coefficients versus lag k for all
k = 1, 2, …, 5. The correlogram is shown in Fig. 15.10.
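As a cross-check of the values used in this solution, the Python sketch below (the helper name c_k is ours) recomputes ck and rk = ck/c0 for the E1 data with the divisors used in this unit.

```python
def c_k(y, k):
    """Sample autocovariance with divisor N - k (N when k = 0), as in this unit."""
    n = len(y)
    ybar = sum(y) / n
    return sum((y[t] - ybar) * (y[t + k] - ybar) for t in range(n - k)) / (n - k)

y = [1.6, 0.8, 1.2, 0.5, 0.6, 1.5, 0.8, 1.2, 0.5, 1.3]   # data of E1
for k in range(1, 6):
    print(k, round(c_k(y, k), 4), round(c_k(y, k) / c_k(y, 0), 3))
```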