PTSP
STOCHASTIC PROCESS
Probability introduced through sets and relative frequency
• Experiment: a random experiment is an action or process that leads to one of several possible outcomes.
  Example — Experiment: course grades; Outcomes: F, D, C, B, A, A+
Sample Space
• The list of all possible outcomes is called the Sample Space.
• The individual outcomes are called the Simple Events.
• This list must be exhaustive, i.e. ALL possible outcomes must be included.
• Die roll: {1,2,3,4,5} is not exhaustive; the sample space is {1,2,3,4,5,6}.
Prior and Posterior Probabilities: P(B|A) and P(A|B)
• The probabilities P(A) and P(A^C) are called prior probabilities because they are determined prior to the decision about taking the preparatory course.
• The conditional probability P(A | B) is called a posterior probability (or revised probability), because the prior probability is revised after the decision about taking the preparatory course.
Total probability theorem
• Take events Ai for i = 1 to k to be:
  – Mutually exclusive: Ai ∩ Aj = ∅ for all i ≠ j
  – Exhaustive: A1 ∪ … ∪ Ak = S
• Total probability: P(B) = Σi P(B | Ai) P(Ai)
• Bayes' rule:
  P(Aj | B) = P(Aj ∩ B) / P(B) = P(B | Aj) P(Aj) / Σi P(B | Ai) P(Ai)
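As a quick illustration, the sketch below evaluates the total probability theorem and Bayes' rule numerically; the prior and conditional probabilities used are assumed values chosen only for this example.

```python
# A minimal sketch of the total probability theorem and Bayes' rule.
# The prior and conditional probabilities below are assumed for illustration.

p_A = [0.5, 0.3, 0.2]            # P(A1), P(A2), P(A3): mutually exclusive, exhaustive
p_B_given_A = [0.1, 0.4, 0.7]    # P(B | Ai)

# Total probability theorem: P(B) = sum_i P(B | Ai) P(Ai)
p_B = sum(pb * pa for pb, pa in zip(p_B_given_A, p_A))

# Bayes' rule: P(Aj | B) = P(B | Aj) P(Aj) / P(B)
posteriors = [pb * pa / p_B for pb, pa in zip(p_B_given_A, p_A)]

print(p_B)          # 0.31
print(posteriors)   # the posteriors sum to 1
```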
Independence
• Do A and B depend on one another?
  – If they do, B is more likely when A has occurred, and A is more likely when B has occurred.
• If independent:
  P(A ∩ B) = P(A) P(B),  P(A | B) = P(A),  P(B | A) = P(B)
• If dependent:
  P(A ∩ B) ≠ P(A) P(B); in general P(A ∩ B) = P(A | B) P(B) = P(B | A) P(A)
Random variable
• Random variable
  – A rule that assigns a numerical value to each outcome of a particular experiment
  [Figure: mapping from the sample space S to points …, −3, −2, −1, 0, 1, 2, 3, … on the real line]
• Example 1: Machine Breakdowns
  – Sample space: S = {electrical, mechanical, misuse}
  – Each of these failures may be associated with a repair cost
  – State space: {50, 200, 350}
  – Cost is a random variable taking the values 50, 200, and 350
• Probability Mass Function (p.m.f.)
  – A set of probability values pi assigned to each of the values xi taken by the discrete random variable X
  – 0 ≤ pi ≤ 1 and Σi pi = 1
  – Probability: P(X = xi) = pi
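The short sketch below checks these p.m.f. properties for the machine-breakdown repair cost; the probability values 0.2, 0.5 and 0.3 are assumptions made only for illustration.

```python
# A small sketch of a p.m.f. for the machine-breakdown repair cost.
# The probabilities 0.2, 0.5 and 0.3 are assumed values, not given in the notes.
pmf = {50: 0.2, 200: 0.5, 350: 0.3}

assert all(0 <= p <= 1 for p in pmf.values())   # 0 <= p_i <= 1
assert abs(sum(pmf.values()) - 1.0) < 1e-12     # sum_i p_i = 1

print(pmf[200])                                 # P(X = 200) = 0.5
```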
Continuous and Discrete random
variables
• Discrete random variables have a countable number
of outcomes
– Examples: Dead/alive, treatment/placebo, dice, counts,
etc.
• Continuous random variables have an infinite
continuum of possible values.
– Examples: blood pressure, weight, the speed of a car, the
real numbers from 1 to 6.
• Distribution function: F(t) = P(X ≤ t)
Probability Density Function (pdf)
• X : continuous r.v.; then its pdf f(x) satisfies
  1. f(x) ≥ 0,
  2. ∫_−∞^∞ f(x) dx = 1,
and the distribution function is F(t) = ∫_−∞^t f(x) dx.
Binomial
• Suppose that the probability of success is p
• Examples
– Toss of a coin (S = head): p = 0.5 q = 0.5
– Roll of a die (S = 1): p = 0.1667 q = 0.8333
– Fertility of a chicken egg (S = fertile): p = 0.8 q = 0.2
binomial
• Imagine that a trial is repeated n times
• Examples
– A coin is tossed 5 times
– A die is rolled 25 times
– 50 chicken eggs are examined
• Assume p remains constant from trial to trial and that the trials are
statistically independent of each other
• Example
– What is the probability of obtaining 2 heads from a coin that
was tossed 5 times?
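A small sketch answering that question with the binomial p.m.f. P(X = k) = C(n, k) p^k (1 − p)^(n−k):

```python
# Probability of exactly k successes in n independent trials with success probability p.
from math import comb

def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Probability of obtaining exactly 2 heads in 5 tosses of a fair coin
print(binomial_pmf(2, 5, 0.5))   # 10 * 0.5**5 = 0.3125
```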
Uniform distribution
  F(x) = 0                 for x < a
  F(x) = (x − a)/(b − a)   for a ≤ x < b
  F(x) = 1                 for x ≥ b
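A short sketch of this piecewise cdf as a function:

```python
# Uniform(a, b) distribution function F(x) as defined above.
def uniform_cdf(x, a, b):
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Example: for X uniform on (0, 2), P(X <= 0.5) = 0.25
print(uniform_cdf(0.5, 0.0, 2.0))
```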
Gaussian (Normal) Distribution
• Bell shaped pdf – intuitively pleasing!
• Central Limit Theorem: the mean of a large number n of mutually independent r.v.'s (having arbitrary distributions) approaches a Normal distribution as n → ∞.
Expectation
• Discrete r.v.: E[X] = X̄ = Σi xi P(xi)   (expectation of X = mean of X = average of X)
• Continuous r.v. with p.d.f. fX(x): E[X] = X̄ = ∫ x fX(x) dx, the integral taken over the state space
• If the pdf is symmetric about a point a, i.e. fX(a − x) = fX(a + x) for all x, then E[X] = a.
• A function of a r.v. X is again a r.v.: Y = g(X).
  Ex: Y = g(X) = X² with P(X = −1) = P(X = 0) = P(X = 1) = 1/3
  ⇒ P(Y = 0) = 1/3, P(Y = 1) = 2/3.
Expectation of a function of a r.v. and conditional expectation
• Discrete r.v.: E[g(X)] = Σi g(xi) P(xi)
• Conditional expectation of a r.v. X given an event B:
  E[X | B] = ∫ x fX(x | B) dx      continuous r.v.
  E[X | B] = Σi xi P(xi | B)       discrete r.v.
Ex: B = {X ≤ b}:
  fX(x | X ≤ b) = fX(x) / ∫_−∞^b fX(x′) dx′   for x ≤ b,   and 0 for x > b
  E[X | X ≤ b] = ∫_−∞^b x fX(x) dx / ∫_−∞^b fX(x) dx
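A numerical sketch of E[X | X ≤ b], assuming (for illustration only) an exponential pdf fX(x) = e^(−x) for x ≥ 0 and b = 1:

```python
# Numerical check of E[X | X <= b] for an assumed pdf f_X(x) = exp(-x), x >= 0, b = 1.
import numpy as np
from scipy.integrate import quad

b = 1.0
f_X = lambda x: np.exp(-x)

numerator, _ = quad(lambda x: x * f_X(x), 0.0, b)   # integral of x f_X(x) up to b
denominator, _ = quad(f_X, 0.0, b)                  # P[X <= b]
print(numerator / denominator)                      # about 0.418
```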
Moments
n-th moment of a r.v. X:
  mn = E[X^n] = ∫ x^n fX(x) dx       continuous r.v.
  mn = E[X^n] = Σi xi^n P(xi)        discrete r.v.
  m0 = 1,   m1 = X̄
Properties of expectation:
  (1) E[c] = c, where c is a constant
  (2) E[a g(X) + b h(X)] = a E[g(X)] + b E[h(X)]
  Proof of (1): E[c] = ∫ c fX(x) dx = c ∫ fX(x) dx = c
Central moments: μn = E[(X − X̄)^n],   μ2 = σX² = m2 − m1²
Ex: for the pdf fX(x) = (1/b) e^(−(x−a)/b), x ≥ a:
  m3 = E[X³] = ∫ x³ (1/b) e^(−(x−a)/b) dx = a³ + 3a²b + 6ab² + 6b³
  μ3 = E[(X − X̄)³] = m3 − 3 X̄ m2 + 2 X̄³
     = a³ + 3a²b + 6ab² + 6b³ − 3(a + b){(a + b)² + b²} + 2(a + b)³ = 2b³
Skewness of a r.v. X:  μ3 / σX³ = 2b³ / b³ = 2
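A simulation sketch that checks the skewness value of 2 for the shifted exponential pdf above; the particular values of a and b are assumptions chosen only to run the check.

```python
# Monte Carlo check that the skewness of f_X(x) = (1/b) exp(-(x-a)/b), x >= a, is 2.
import numpy as np

rng = np.random.default_rng(0)
a, b = 1.0, 2.0                                    # assumed values for the check
x = a + rng.exponential(scale=b, size=1_000_000)   # samples of X

mu3 = np.mean((x - x.mean()) ** 3)                 # third central moment, approx 2*b**3
print(mu3 / x.std() ** 3)                          # skewness, approx 2
```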
Chebyshev's inequality:  P[|X − X̄| ≥ ε] ≤ σX² / ε²
Proof:
  σX² = ∫ (x − X̄)² fX(x) dx ≥ ∫_{|x−X̄|≥ε} (x − X̄)² fX(x) dx
      ≥ ε² ∫_{|x−X̄|≥ε} fX(x) dx = ε² P[|X − X̄| ≥ ε]
Markov's inequality: for a nonnegative r.v. X (P[X < 0] = 0),
  P[X ≥ a] ≤ E[X] / a
Ex 3.2-3:  P[|X − X̄| ≥ 3σX] ≤ σX² / (9σX²) = 1/9
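A quick simulation sketch of Chebyshev's inequality, using an assumed standard normal X and ε = 3σX as in Ex 3.2-3:

```python
# Empirical check that P[|X - mean| >= eps] <= var / eps**2.
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=1_000_000)    # assumed standard normal samples
eps = 3.0                         # eps = 3 * sigma_X here

empirical = np.mean(np.abs(x - x.mean()) >= eps)   # about 0.0027
bound = x.var() / eps**2                           # about 1/9
print(empirical, bound)                            # empirical probability <= bound
```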
Characteristic function of a r.v. X:
  ΦX(ω) = E[e^(jωX)] = ∫ fX(x) e^(jωx) dx        (a Fourier-transform relationship)
  fX(x) = (1/2π) ∫ ΦX(ω) e^(−jωx) dω             (inversion)
  |ΦX(ω)| ≤ ∫ fX(x) dx = 1 = ΦX(0)
Moments from the characteristic function:
  d^nΦX(ω)/dω^n = ∫ fX(x) (jx)^n e^(jωx) dx,  so at ω = 0
  d^nΦX(ω)/dω^n |_(ω=0) = j^n ∫ x^n fX(x) dx = j^n E[X^n],
  hence  mn = (−j)^n d^nΦX(ω)/dω^n |_(ω=0).
Functions That Give Moments
Moment generating function of a r.v. X:
  MX(v) = E[e^(vX)] = ∫ fX(x) e^(vx) dx
  d^n MX(v)/dv^n |_(v=0) = ∫ x^n fX(x) dx = mn
Ex: for fX(x) = (1/b) e^(−(x−a)/b), x ≥ a:
  ΦX(ω) = E[e^(jωX)] = ∫_a^∞ (1/b) e^(−(x−a)/b) e^(jωx) dx = e^(jωa) / (1 − jωb)
Transformation of a random variable: Y = T(X)
  Monotone increasing T: T(x1) < T(x2) for any x1 < x2;
  monotone decreasing T: T(x1) > T(x2) for any x1 < x2.
Assume a monotone increasing T and Y = T(X):
  FY(y0) = P[Y ≤ y0] = P[X ≤ x0] = FX(x0),   where x0 = T⁻¹(y0)
  fY(y0) dy = fX(x0) dx
  fY(y0) = fX[T⁻¹(y0)] dT⁻¹(y0)/dy0,   i.e.   fY(y) = fX[T⁻¹(y)] dT⁻¹(y)/dy = fX(x) dx/dy
Assume a monotone decreasing T and Y = T(X):
  FY(y0) = P[Y ≤ y0] = P[X ≥ x0] = 1 − FX(x0)
  fY(y) = −fX(x) dx/dy
For any monotone T:
  fY(y) = fX(x) |dx/dy|
Nonmonotone T, Y = T(X):
  fY(y) = Σn fX(xn) / |dT(x)/dx| evaluated at x = xn,
where the xn are the roots of T(x) = y.
Ex 3.4-2:  Y = T(X) = cX²  (nonmonotone)
For y ≥ 0 the roots of y = cx² are x = ±√(y/c), and |dT/dx| = 2c|x| = 2√(cy), so
  fY(y) = [ fX(√(y/c)) + fX(−√(y/c)) ] / (2√(cy)),   y ≥ 0,
and fY(y) = 0 for y < 0.
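A simulation sketch of Ex 3.4-2, comparing a histogram-style estimate of fY(y) for Y = cX² with the formula just derived; X is assumed standard normal and c = 2 purely for illustration.

```python
# Check f_Y(y) = [f_X(sqrt(y/c)) + f_X(-sqrt(y/c))] / (2 sqrt(c y)) for Y = c X**2.
import numpy as np

rng = np.random.default_rng(2)
c = 2.0
x = rng.normal(size=1_000_000)        # assumed: X standard normal
y = c * x**2

f_X = lambda t: np.exp(-t**2 / 2) / np.sqrt(2 * np.pi)
y_grid = np.array([0.5, 1.0, 2.0, 4.0])
f_Y = (f_X(np.sqrt(y_grid / c)) + f_X(-np.sqrt(y_grid / c))) / (2 * np.sqrt(c * y_grid))

h = 0.05                              # half-width of the counting window
empirical = np.array([np.mean(np.abs(y - yv) < h) / (2 * h) for yv in y_grid])
print(np.round(f_Y, 4))
print(np.round(empirical, 4))         # close to the derived f_Y values
```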
MULTIPLE RANDOM VARIABLES and OPERATIONS:
MULTIPLE RANDOM VARIABLES :
Vector Random Variables
A vector random variable X is a function that assigns a
vector of real numbers to each outcome ζ in S, the sample
space of the random experiment
The joint probability mass function of X and Y is
  pX,Y(xj, yk) = P[X = xj, Y = yk],   j = 1, 2, …,  k = 1, 2, …           (4.4)
The probability of any event A is the sum of the pmf over the outcomes in A:
  P[X in A] = Σ_{(xj, yk) in A} pX,Y(xj, yk).                             (4.5)
  Σj Σk pX,Y(xj, yk) = 1.                                                 (4.6)
The marginal probability mass functions:
  pX(xj) = P[X = xj] = P[X = xj and Y = anything] = Σk pX,Y(xj, yk),      (4.7a)
  pY(yk) = P[Y = yk] = Σj pX,Y(xj, yk).                                   (4.7b)
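A small sketch of Eqs. (4.7a)-(4.7b): the marginal p.m.f.'s are row and column sums of the joint p.m.f. table (the table entries below are assumed values).

```python
# Marginal pmfs as row/column sums of an assumed joint pmf table p_{X,Y}(x_j, y_k).
import numpy as np

p_xy = np.array([[0.10, 0.20],        # rows index x_j, columns index y_k
                 [0.30, 0.15],
                 [0.05, 0.20]])

assert abs(p_xy.sum() - 1.0) < 1e-12  # Eq. (4.6)

p_x = p_xy.sum(axis=1)                # p_X(x_j) = sum_k p_{X,Y}(x_j, y_k), Eq. (4.7a)
p_y = p_xy.sum(axis=0)                # p_Y(y_k) = sum_j p_{X,Y}(x_j, y_k), Eq. (4.7b)
print(p_x, p_y)                       # [0.3 0.45 0.25] [0.45 0.55]
```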
The Joint cdf of X and Y
The joint cumulative distribution function of X and Y is defined as the probability of the product-form event {X ≤ x1} ∩ {Y ≤ y1}:
  FX,Y(x1, y1) = P[X ≤ x1, Y ≤ y1].                                       (4.8)
The joint cdf is nondecreasing in the "northeast" direction,
  (i) FX,Y(x1, y1) ≤ FX,Y(x2, y2) if x1 ≤ x2 and y1 ≤ y2,
and it is continuous from the "north" and the "east", e.g.
  lim_{y→b⁺} FX,Y(x, y) = FX,Y(x, b).
The Joint pdf of Two Jointly Continuous Random
Variables
We say that the random variables X and Y are jointly
continuous if the probabilities of events involving (X, Y) can
be expressed as an integral of a pdf. There is a
nonnegative function fX,Y(x,y), called the joint probability
density function, that is defined on the real plane such that
for every event A, a subset of the plane,
  P[X in A] = ∫∫_A fX,Y(x′, y′) dx′ dy′,                                  (4.9)
as shown in Fig. 4.7. When A is the entire plane, the integral must equal one:
  1 = ∫∫ fX,Y(x′, y′) dx′ dy′.                                            (4.10)
The joint cdf of jointly continuous random variables can be obtained in terms of the joint pdf by integrating over the semi-infinite rectangle defined by the point (x, y).
The marginal pdf's fX(x) and fY(y) are obtained by taking the derivative of the corresponding marginal cdf's, FX(x) = FX,Y(x, ∞) and FY(y) = FX,Y(∞, y):
  fX(x) = d/dx ∫_−∞^x [ ∫_−∞^∞ fX,Y(x′, y′) dy′ ] dx′ = ∫_−∞^∞ fX,Y(x, y′) dy′,     (4.15a)
  fY(y) = ∫_−∞^∞ fX,Y(x′, y) dx′.                                                   (4.15b)
INDEPENDENCE OF TWO RANDOM VARIABLES
X and Y are independent discrete random variables if and only if
  pX,Y(xj, yk) = P[X = xj] P[Y = yk]   for all xj and yk.
Conditional Probability
In Section 2.4, we know that
  P[Y in A | X = x] = P[Y in A, X = x] / P[X = x].                        (4.22)
If X is discrete, then Eq. (4.22) can be used to obtain the conditional cdf of Y given X = xk:
  FY(y | xk) = P[Y ≤ y, X = xk] / P[X = xk],   for P[X = xk] > 0.         (4.23)
The conditional pdf of Y given X = xk, if the derivative exists, is given by
  fY(y | xk) = d/dy FY(y | xk).                                           (4.24)
MULTIPLE RANDOM VARIABLES
Joint Distributions
The joint cumulative distribution function of X1, X2, …, Xn is defined as the probability of an n-dimensional semi-infinite rectangle associated with the point (x1, …, xn):
  FX1,…,Xn(x1, …, xn) = P[X1 ≤ x1, X2 ≤ x2, …, Xn ≤ xn].
EXAMPLE 4.31 Sum of Two Random
Variables
Let Z = X + Y. Find FZ(z) and fZ(z) in terms of the joint
pdf of X
and Y.
The cdf of Z is
  FZ(z) = ∫_−∞^∞ ∫_−∞^(z−x′) fX,Y(x′, y′) dy′ dx′.
The pdf of Z is
  fZ(z) = d/dz FZ(z) = ∫_−∞^∞ fX,Y(x′, z − x′) dx′.                       (4.53)
Thus the pdf for the sum of two random variables is given by a superposition integral. If X and Y are independent random variables, then by Eq. (4.21) the pdf is given by the convolution integral of the marginal pdf's of X and Y:
  fZ(z) = ∫_−∞^∞ fX(x′) fY(z − x′) dx′.                                   (4.54)
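A numerical sketch of Eq. (4.54): approximating the convolution integral for Z = X + Y with two assumed uniform(0, 1) marginals, whose sum has the triangular density on (0, 2).

```python
# Discretized convolution f_Z = f_X * f_Y for independent X, Y ~ uniform(0, 1).
import numpy as np

dx = 0.001
x = np.arange(0.0, 1.0, dx)
f_X = np.ones_like(x)                 # uniform(0,1) pdf on its support
f_Y = np.ones_like(x)

f_Z = np.convolve(f_X, f_Y) * dx      # approximates the convolution integral
z = np.arange(len(f_Z)) * dx

# The triangular density equals 0.5 at z = 0.5, peaks at 1 for z = 1, and is 0.5 at z = 1.5
print(np.interp([0.5, 1.0, 1.5], z, f_Z))
```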
pdf of Linear Transformations
We consider first the linear transformation of two random
variables
  V = aX + bY,   W = cX + eY,   i.e.   [V; W] = [a  b; c  e] [X; Y].
Denote the above matrix by A. We will assume A has an inverse, so each point (v, w) has a unique corresponding point (x, y) obtained from
  [x; y] = A⁻¹ [v; w].                                                    (4.56)
In Fig. 4.15, the infinitesimal rectangle and the parallelogram are equivalent events, so their probabilities must be equal. Thus
  fV,W(v, w) = fX,Y(x, y) / (dP / dx dy),                                 (4.57)
where x and y are related to (v, w) by Eq. (4.56). It can be shown that dP = |ae − bc| dx dy, so the "stretch factor" is
  dP / (dx dy) = |ae − bc| = |A|,
where |A| is the determinant of A.
Let the n-dimensional vector Z be
  Z = AX,
where A is an invertible n × n matrix. The joint pdf of Z is then
  fZ(z) = fX(A⁻¹z) / |A|.
EXPECTED VALUE OF FUNCTIONS OF RANDOM VARIABLES
Let Z = g(X, Y). Then
  E[Z] = ∫∫ g(x, y) fX,Y(x, y) dx dy        X, Y jointly continuous       (4.64)
  E[Z] = Σi Σn g(xi, yn) pX,Y(xi, yn)       X, Y discrete.
*Joint Characteristic Function
The joint characteristic function of n random variables is defined as
  ΦX1,X2,…,Xn(ω1, ω2, …, ωn) = E[ e^(j(ω1X1 + ω2X2 + … + ωnXn)) ].        (4.73a)
For two random variables,
  ΦX,Y(ω1, ω2) = E[ e^(j(ω1X + ω2Y)) ].                                   (4.73b)
If X and Y are jointly continuous random variables, then
  ΦX,Y(ω1, ω2) = ∫∫ fX,Y(x, y) e^(j(ω1x + ω2y)) dx dy.                    (4.73c)
The inversion formula for the Fourier transform implies that the joint pdf is given by
  fX,Y(x, y) = (1/4π²) ∫∫ ΦX,Y(ω1, ω2) e^(−j(ω1x + ω2y)) dω1 dω2.         (4.74)
JOINTLY GAUSSIAN RANDOM VARIABLES
The random variables X and Y are said to be jointly
Gaussian if their
joint pdf has the form
  fX,Y(x, y) = 1 / (2π σ1 σ2 √(1 − ρ²X,Y)) ·
      exp{ −[ (x − m1)²/σ1² − 2ρX,Y (x − m1)(y − m2)/(σ1σ2) + (y − m2)²/σ2² ] / (2(1 − ρ²X,Y)) }     (4.79)
The pdf is constant for values x and y for which the argument of the exponent is constant:
  (x − m1)²/σ1² − 2ρX,Y (x − m1)(y − m2)/(σ1σ2) + (y − m2)²/σ2² = constant.
When ρX,Y = 0, X and Y are independent; when ρX,Y ≠ 0, the major axis of the ellipse is oriented along the angle
  θ = (1/2) arctan[ 2ρX,Y σ1 σ2 / (σ1² − σ2²) ].                          (4.80)
Note that the angle is 45º when the variances are equal.
The marginal pdf of X is found by integrating fX,Y(x, y) over all y:
  fX(x) = e^(−(x − m1)²/(2σ1²)) / (√(2π) σ1),                             (4.81)
that is, X is a Gaussian random variable with mean m1 and variance σ1².
n Jointly Gaussian Random Variables
The random variables X1, X2, …, Xn are said to be jointly Gaussian if their joint pdf is given by
  fX(x) = fX1,X2,…,Xn(x1, x2, …, xn)
        = exp{ −(1/2) (x − m)ᵀ K⁻¹ (x − m) } / ( (2π)^(n/2) |K|^(1/2) ),  (4.83)
where x and m are column vectors defined by
  x = [x1, x2, …, xn]ᵀ,   m = [E[X1], E[X2], …, E[Xn]]ᵀ,
and K is the covariance matrix defined by
  K = [ VAR[X1]       COV[X1, X2]   …   COV[X1, Xn]
        COV[X2, X1]   VAR[X2]       …   COV[X2, Xn]
        ⁝                                 ⁝
        COV[Xn, X1]   …                  VAR[Xn]   ].                     (4.84)
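A sampling sketch of Eq. (4.83): a jointly Gaussian vector is completely described by its mean vector m and covariance matrix K; the particular m and K below are assumed values.

```python
# Draw jointly Gaussian samples and verify the sample mean and covariance match m and K.
import numpy as np

m = np.array([1.0, -2.0])
K = np.array([[2.0, 0.8],
              [0.8, 1.0]])

rng = np.random.default_rng(3)
samples = rng.multivariate_normal(mean=m, cov=K, size=500_000)

print(samples.mean(axis=0))            # approximately m
print(np.cov(samples, rowvar=False))   # approximately K
```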
Transformations of Random Vectors
Let X1, …, Xn be random variables associated with some experiment, and let the random variables Z1, …, Zn be defined by n functions of X = (X1, …, Xn):
  Z1 = g1(X),   Z2 = g2(X),   …,   Zn = gn(X).
The joint cdf of Z1, …, Zn at the point z = (z1, …, zn) is
  FZ(z1, …, zn) = P[ X in { x′ : gk(x′) ≤ zk, k = 1, …, n } ].            (4.55b)
Stochastic Processes
For every outcome ξk, a waveform X(t, ξk) is assigned. The collection of such waveforms forms a stochastic process. The set of outcomes {ξk} and the time index t can be continuous or discrete.
[Figure: realizations X(t, ξ1), X(t, ξ2), …, X(t, ξn) plotted against t, with sample instants t1 and t2 marked]
FX(x, t) = P{X(t) ≤ x} is the first-order distribution, and fX(x, t) = ∂FX(x, t)/∂x represents the first-order probability density function of the process X(t).
For t = t1 and t = t2, X(t) represents two different random
variables X1 = X(t1) and X2 = X(t2) respectively. Their joint
distribution is given by
  FX(x1, x2, t1, t2) = P{X(t1) ≤ x1, X(t2) ≤ x2}
and
  fX(x1, x2, t1, t2) = ∂²FX(x1, x2, t1, t2) / (∂x1 ∂x2)
represents the second-order density function of the process X(t). Similarly fX(x1, x2, …, xn, t1, t2, …, tn) represents the nth-order density function of the process X(t). Complete specification of the stochastic process X(t) requires knowledge of fX(x1, x2, …, xn, t1, t2, …, tn) for all ti, i = 1, 2, …, n and for all n (an almost impossible task in reality).
Mean of a Stochastic Process:
  μ(t) = E{X(t)} = ∫ x fX(x, t) dx
represents the mean value of the process X(t). In general, the mean of a process can depend on the time index t.
The autocorrelation function of a process X(t) is defined as
  RXX(t1, t2) = E{X(t1) X*(t2)} = ∫∫ x1 x2* fX(x1, x2, t1, t2) dx1 dx2.
For example, with
  z = ∫_−T^T X(t) dt,
the autocorrelation determines the mean-square value
  E[|z|²] = ∫_−T^T ∫_−T^T E{X(t1) X*(t2)} dt1 dt2 = ∫_−T^T ∫_−T^T RXX(t1, t2) dt1 dt2.     (14-10)
Stationary Stochastic Processes
Stationary processes exhibit statistical properties
that are invariant to shift in the time index. Thus, for
example, second-order stationarity implies that the
statistical properties of the pairs
{X(t1) , X(t2) } and {X(t1+c) , X(t2+c)} are the same
for any c. Similarly first-order stationarity implies that the
statistical properties of X(ti) and X(ti+c) are the same for
any c.
In strict terms, the statistical properties are governed by the joint probability density function. Hence a process is nth-order Strict-Sense Stationary (S.S.S) if
  fX(x1, x2, …, xn, t1, t2, …, tn) = fX(x1, x2, …, xn, t1 + c, t2 + c, …, tn + c)          (14-14)
for any c, where the left side represents the joint density function of the random variables X1 = X(t1), X2 = X(t2), …, Xn = X(tn) and the right side corresponds to the joint density function of the random variables X1′ = X(t1 + c), X2′ = X(t2 + c), …, Xn′ = X(tn + c).
A process X(t) is said to be strict-sense stationary if (14-14) is true for all ti, i = 1, 2, …, n, for n = 1, 2, …, and for any c.
For a first-order strict-sense stationary process, from (14-14) we have
  fX(x, t) = fX(x, t + c)                                                 (14-15)
for any c. In particular c = −t gives
  fX(x, t) = fX(x),                                                       (14-16)
i.e., the first-order density of X(t) is independent of t. In that case
  E[X(t)] = ∫ x f(x) dx = μ, a constant.                                  (14-17)
Similarly, for a second-order strict-sense stationary process we have from (14-14)
  fX(x1, x2, t1, t2) = fX(x1, x2, t1 + c, t2 + c)
for any c. For c = −t2 we get
  fX(x1, x2, t1, t2) = fX(x1, x2, t1 − t2),                               (14-18)
i.e., the second-order density function of a strict-sense stationary process depends only on the difference of the time indices τ = t1 − t2. In that case the autocorrelation function is given by
  RXX(t1, t2) = E{X(t1) X*(t2)} = ∫∫ x1 x2* fX(x1, x2, τ = t1 − t2) dx1 dx2
              = RXX(t1 − t2) = RXX(τ) = RXX*(−τ).                         (14-19)
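A short sketch estimating RXX(t1, t2) by ensemble averaging over many realizations of an assumed stationary discrete-time process (white Gaussian noise), showing that the estimate depends only on the lag t1 − t2.

```python
# Ensemble-average estimate of the autocorrelation of a stationary process.
import numpy as np

rng = np.random.default_rng(4)
n_realizations, n_samples = 20_000, 64
X = rng.normal(size=(n_realizations, n_samples))   # each row is one realization

def R_hat(t1, t2):
    """Estimate of E{X(t1) X(t2)} by averaging over realizations."""
    return np.mean(X[:, t1] * X[:, t2])

# Approximately equal for equal lags: R(10,10) ~ R(30,30) ~ 1, R(10,12) ~ R(30,32) ~ 0
print(R_hat(10, 10), R_hat(30, 30))
print(R_hat(10, 12), R_hat(30, 32))
```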
For a Gaussian process, the joint characteristic function of the random variables X(t1), …, X(tn) has the form
  ΦX(ω1, ω2, …, ωn) = exp{ j Σk μ(tk) ωk − (1/2) Σl Σk CXX(tl, tk) ωl ωk },               (14-22)
where CXX(ti, tk) is as defined in (14-9). If X(t) is wide-sense stationary, then using (14-20)-(14-21) in (14-22) we get
  ΦX(ω1, ω2, …, ωn) = exp{ jμ Σi ωi − (1/2) Σi Σk CXX(ti − tk) ωi ωk },                   (14-23)
and hence if the set of time indices is shifted by a constant c to generate a new set of jointly Gaussian random variables X1′ = X(t1 + c), X2′ = X(t2 + c), …, Xn′ = X(tn + c), then their joint characteristic function is identical to (14-23). Thus the sets of random variables {Xi} and {Xi′} have the same joint probability distribution for all n and all c, establishing the strict-sense stationarity of Gaussian processes from their wide-sense stationarity.
To summarize, if X(t) is a Gaussian process, then
  wide-sense stationarity (w.s.s)  ⟹  strict-sense stationarity (s.s.s).
Notice that the joint p.d.f of Gaussian random variables depends only on their second-order statistics, which is also the basis of wide-sense stationarity; this is why the two notions coincide for Gaussian processes.
Systems with Stochastic Inputs
A deterministic system¹ transforms each input waveform X(t, ξi) into an output waveform Y(t, ξi) = T[X(t, ξi)] by operating only on the time variable t. Thus a set of realizations at the input corresponding to a process X(t) generates a new set of realizations {Y(t, ξ)} at the output.
[Fig. 14.3: input realization X(t, ξi) → T[·] → output realization Y(t, ξi)]
¹ A stochastic system, on the other hand, operates on both the variables t and ξ.
Linear Systems: L[·] represents a linear system if
  L{a1 X(t1) + a2 X(t2)} = a1 L{X(t1)} + a2 L{X(t2)}.                     (14-28)
Let
  Y(t) = L{X(t)}                                                          (14-29)
represent the output of a linear system.
Time-Invariant System: L[·] represents a time-invariant system if a shift of the input produces only the same shift of the output, i.e. if Y(t) = L{X(t)} implies L{X(t − t0)} = Y(t − t0) for every t0.
[Fig. 14.5: an impulse δ(t) applied to an LTI system produces the impulse response h(t); an arbitrary input X(t) applied to the LTI system produces the output Y(t)]
For an LTI system with impulse response h(t), the input and output autocorrelations are related by the cascade
  RXX(t1, t2) → [h*(t2)] → RXY(t1, t2) → [h(t1)] → RYY(t1, t2).
In particular, if X(t) is wide-sense stationary, then μX(t) = μX, so that from (14-34)
  μY(t) = μX ∫ h(τ) dτ = c, a constant.                                   (14-38)
Also RXX(t1, t2) = RXX(t1 − t2), so that (14-35) reduces to
  RXY(t1, t2) = RXX(τ) ∗ h*(−τ) = RXY(τ),   τ = t1 − t2,
and
  RYY(τ) = RXX(τ) ∗ h*(−τ) ∗ h(τ).                                        (14-41)
From (14-38)-(14-40), the output process is also wide-sense stationary. This gives rise to the following representations:
(a) If X(t) is a wide-sense stationary process, then the output Y(t) of an LTI system h(t) is also a wide-sense stationary process.
(b) If X(t) is a strict-sense stationary process, then the output Y(t) of an LTI system h(t) is also a strict-sense stationary process (see text for proof).
(c) If X(t) is a Gaussian process (also stationary), then the output Y(t) of a linear system is also a Gaussian process (also stationary).
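A simulation sketch of the w.s.s.-in / w.s.s.-out behaviour: assumed white w.s.s. noise (RXX(τ) = δ(τ)) is passed through a short LTI filter h, and the measured output autocorrelation is compared with RXX(τ) ∗ h*(−τ) ∗ h(τ), which for white input is just the deterministic autocorrelation of h.

```python
# LTI filtering of white w.s.s. noise: check R_YY(tau) against the autocorrelation of h.
import numpy as np

rng = np.random.default_rng(5)
h = np.array([1.0, 0.5, 0.25])                 # assumed impulse response
x = rng.normal(size=2_000_000)                 # white noise, unit variance
y = np.convolve(x, h, mode="full")[: len(x)]   # filter output

r_yy_theory = np.correlate(h, h, mode="full")  # lags -2..2: [0.25, 0.625, 1.3125, 0.625, 0.25]
r_yy_est = [np.mean(y[k:] * y[: len(y) - k]) for k in range(3)]   # lags 0, 1, 2

print(r_yy_theory[2:])                         # [1.3125, 0.625, 0.25]
print(np.round(r_yy_est, 3))                   # close to the theoretical values
```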
Discrete Time Stochastic Processes:
A discrete-time process X(nT) is characterized by its mean
  μn = E{X(nT)}                                                           (14-57)
and autocorrelation
  R(n1, n2) = E{X(n1T) X*(n2T)},                                          (14-58)
respectively. As before, the strict-sense and wide-sense stationarity definitions apply here also. For example, X(nT) is wide-sense stationary if
  E{X(nT)} = μ, a constant, and R(n1, n2) = R(n1 − n2),                   (14-59)
i.e., the autocorrelation (and hence the autocovariance C(n1, n2)) depends only on the difference n1 − n2.
Thus |X(ω)|² represents the distribution of the energy of the signal over frequency; the energy in the band (ω, ω + Δω) is obtained by integrating |X(ω)|² over that band (see Fig. 18.1).
[Fig. 18.1: a signal X(t) and its energy spectrum |X(ω)|², with the energy in the band (ω, ω + Δω) shaded]
However, for stochastic processes, a direct application of (18-1) generates a sequence of random variables for every ω. Moreover, for a stochastic process, E{|X(t)|²} represents the ensemble average power (instantaneous energy) at the instant t.
The quantity
  (1/2T) ∫_−T^T ∫_−T^T RXX(t1, t2) e^(−jω(t1 − t2)) dt1 dt2               (18-5)
represents the power distribution of X(t) based on (−T, T).
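A short numerical sketch of the idea behind (18-5): averaging the squared magnitude of the finite-window transform over many realizations estimates the power distribution, here for assumed discrete-time white Gaussian noise, whose spectrum should be flat.

```python
# Averaged periodogram of white noise: the estimated power distribution is flat.
import numpy as np

rng = np.random.default_rng(6)
n_realizations, n = 2_000, 256
segments = rng.normal(size=(n_realizations, n))   # finite-length realizations

periodograms = np.abs(np.fft.fft(segments, axis=1)) ** 2 / n
psd_estimate = periodograms.mean(axis=0)          # ensemble average over realizations

print(psd_estimate.mean(), psd_estimate.std())    # approx 1.0 with small spread
```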
For wide-sense stationary (w.s.s) processes, it is possible to further simplify (18-5). Thus if X(t) is assumed to be w.s.s, then RXX(t1, t2) = RXX(t1 − t2) and (18-5) simplifies to an expression involving RXX(τ), τ = t1 − t2, only.