
STATS Notes

This document provides an introduction to statistics, including summation notations, population and sample parameters, margin of error, and probability and non-probability sampling methods. It defines summation notations and indexes, explains the difference between population and sample means and variances, introduces the formula for calculating margin of error, and outlines probability sampling methods like random, systematic, and stratified sampling as well as non-probability methods like quota sampling.

STAT 1202 : Introductory Statistics

LECTURE 1 : INTRODUCTION TO STATISTICS

SUMMATION NOTATIONS

$\sum_{i=1}^{10} i = 1 + 2 + 3 + \dots + 9 + 10$
↳ index starts from 1
→ index ends at 10

$\sum_{i=1}^{10} x_i = x_1 + x_2 + \dots + x_{10}$

POPULATION & SAMPLES

Parameters of the population:
° $\mu$ = population mean
° $\sigma^2$ = population variance ($\sigma$ = standard deviation)

MARGIN OF ERROR

→ Difference between population mean & sample mean

FORMULA : $z\frac{\sigma}{\sqrt{n}}$ (means) / $z\sqrt{\frac{p(1-p)}{n}}$ (proportions)
↳ * need the probability of being selected (p) to find the MOE (non-probability sampling : margin of error cannot be calculated)

PROBABILITY SAMPLING
Every member of the population has a known probability of being selected.

° Random
↳ members are selected randomly; each has an equal chance of being selected, independent of another member being selected

° Systematic
↳ first member is selected from the sampling frame at random, using a random number; the rest follow at a fixed interval (Eg : to choose 100 people out of 500, randomly choose a starting point & select every 5th person)

° Stratified
↳ … chance of being selected
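The systematic sampling example above (100 people out of 500, every 5th person) can be sketched in a few lines. This is only an illustrative sketch: the function name `systematic_sample`, the numbered list `people`, and the fixed seed are assumptions made for the example, not part of the notes.

```python
import random


def systematic_sample(frame, sample_size, seed=None):
    """Pick every k-th member after a random start, with k = len(frame) // sample_size."""
    rng = random.Random(seed)           # seeded only so the example is repeatable
    step = len(frame) // sample_size    # 500 // 100 = 5, i.e. every 5th person
    start = rng.randrange(step)         # random starting point within the first k members
    return frame[start::step][:sample_size]


# Hypothetical sampling frame: people numbered 1 to 500.
people = list(range(1, 501))
sample = systematic_sample(people, 100, seed=1)
print(sample[:5])
```

Only the starting point is random; once it is chosen, every later member of the sample is fixed by the interval.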

NON-PROBABILITY SAMPLING (not everybody has a chance)

° Quota (to compare 2 diff groups in the population, eg : males vs females)
↳ sample will be proportional to actual population

° Convenience
↳ choosing samples based on ease of accessibility (eg : first 100 people who walk into school)

° Purposive / Judgemental / Subjective
↳ choosing samples with a specific goal

° Self-selection
↳ participants choose to be part of the sample themselves

° Snowball sampling
↳ finding specific hard-to-reach samples (eg : drug addicts)
ERRORS

NON-SAMPLING ERROR
- Imperfect research design
- Mistake in execution
- Any other reason not related to sample selection
- difficult to quantify

SAMPLING ERROR
- Margin of Error
- ↓ with larger sample size

° Systematic Error
- imperfect aspect of the research / any factor not related to the sample selection

° Random Errors

HOW TO REDUCE
- larger sample size
- estimate population parameters : confidence intervals

DATA

NUMERICAL (quantitative)
- DISCRETE (countable)
- CONTINUOUS

CATEGORICAL
- qualitative (non-numerical)

NOMINAL DATA - categorical, qualitative, no order / ranking (eg : gender, blood type, postal code)

ORDINAL DATA - ordered categories, has ranking / order (eg : managerial positions in a company)

INTERVAL DATA - differences between measurements, NO TRUE ZERO (eg : 15°C - 21°C, 20°C - 26°C)
- quantitative, continuous

RATIO DATA - differences between measurements, TRUE ZERO
* TRUE ZERO : absence of the property being measured
↳ 0°C : no true zero - 0°C does not mean no heat
↳ 0 Kelvin : true zero - there is no heat at 0 K

GRAPHICAL PRESENTATION OF DATA

CATEGORICAL | NUMERICAL

° Line chart (shows how a value changes over time)

° Crosstable (categorical)

° Bar chart (categorical)

° Pie chart (categorical)

° Frequency distribution
↳ class intervals are mutually exclusive
↳ class interval width = (max - min) / no. of desired intervals
(eg table columns : Age | Frequency, with a row such as 31-36 | 11)

° Histogram (no spacing b/w the bars)
↳ too many class intervals (narrow) : jagged distribution with gaps from empty classes; poor indication of how frequency varies across classes
↳ too few class intervals (wide) : compress variation too much; yield blocky distribution; obscure impt patterns of variation

° Stem & leaf display
↳ stem (tens) | leaf (ones), eg : 52, 55, 63, 69, 82, 83 → 5 | 2 5 ; 6 | 3 9 ; 8 | 2 3

° Scatter plot (observations taken from 2 numerical variables)
↳ provides a basis to see the relation b/w 2 groups

DATA PRESENTATION ERRORS

* histogram interval widths must be equal
* compressing / distorting the vertical axis
* no zero point on the vertical axis (makes data appear vastly diff when in actuality there is little diff)
< GRAPH PLOTTING ON EXCEL >

° BAR & PIE CHARTS

1. Select data
2. Insert (top left toggle)
3. Recommended charts → choose the chart you want
4. Add chart element → axis titles
↳ x-axis : 1m
↳ y-axis : 1m
↳ chart title : 1m
↳ correct data & presentation : 3m
5. Add chart element → data label

TO HIGHLIGHT THE DATA OF ONE OF THE BARS :
i. Double click on the bar you want to highlight
ii. Format data point → Fill → solid fill → change colour to make it stand out

° STACK, PIE & PIVOT

TO PRESENT IN A CHART :
1. Select any data in the table
2. Insert → pivot table
3. Drag the fields into the pivot table areas
↳ variations for the columns (eg : Investor A, Investor B, Investor C) → drag to COLUMNS
↳ data to be summarised (eg : investment) → drag to VALUES

To change type of data presented (sum / average / count / max / min etc) :
right click → field settings → change 'sum' to what you want

CHANGING BIN WIDTH
→ click on graph to show 'Format Data Series'
→ 3rd toggle → Bins → Bin width
LECTURE 2 : DESCRIPTIVE STATISTICS

MEAN | MEDIAN | MODE

° Mean : $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

° Median : arrange in ascending order, then find the median position
↳ Odd no. of values : median = middle number
↳ Even no. of values : median = average of the 2 middle numbers

° Mode : most frequent value

SHAPE OF DISTRIBUTION
° Symmetrical distribution : mean = median = mode
° Positive skew / right skewed : mode < median < mean
° Negative skew / left skewed : mean < median < mode

QUARTILES

LOCATIONS
° Q1 (25%) : $\frac{1}{4}(n+1)$
° Q2 (50%) : $\frac{2}{4}(n+1)$ → median
° Q3 (75%) : $\frac{3}{4}(n+1)$
IQR : Q3 - Q1 → middle 50%

Eg : 0.8, 0.9, 1.0, 1.2, 1.3, 1.5, 1.6, 2.0

Q1 : $\frac{1}{4}(8+1) = 2.25$ → $0.9 + \frac{1}{4}(1.0 - 0.9) = 0.925$
↳ position 2 is 0.9 and position 3 is 1.0; Q1 is 1/4 of the way from position 2 to position 3
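The quartile location rule and the worked Q1 above can be checked with a short sketch. The helper name `quartile` is an assumption for illustration; the data are the eight values from the example.

```python
def quartile(values, q):
    """Return the q-th quartile (q = 1, 2, 3) using the (n + 1) location rule."""
    data = sorted(values)
    n = len(data)
    position = q * (n + 1) / 4      # 1-based location, may be fractional
    lower = int(position)           # position of the value just below the location
    fraction = position - lower     # how far to move towards the next value
    if lower < 1:
        return data[0]
    if lower >= n:
        return data[-1]
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])


data = [0.8, 0.9, 1.0, 1.2, 1.3, 1.5, 1.6, 2.0]
q1, q2, q3 = (quartile(data, q) for q in (1, 2, 3))
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

For this data the sketch gives Q1 = 0.925 as in the worked example, with Q2 = 1.25 and Q3 = 1.575 (up to floating-point rounding). Python's `statistics.quantiles(data, n=4)` uses the same (n + 1) rule with its default "exclusive" method.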
BOX & WHISKERS

Minimum (smallest value that is not an outlier) | Q1 | Q2 (Median) | Q3 | Maximum (largest value that is not an outlier)
↳ the box from Q1 to Q3 holds the middle 50% of values (IQR)
↳ shows the median & the variability

° values > 1.5 box lengths from Q1 / Q3 : OUTLIER
° values > 3 box lengths from Q1 / Q3 : EXTREME OUTLIER

VARIATION

↳ population variance ($\sigma^2$)
↳ sample variance ($s^2$)

$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2 = \frac{1}{N}\sum_{i=1}^{N} x_i^2 - \mu^2$

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$

STANDARD DEVIATION

↳ population std dev. ($\sigma$)
↳ sample std dev. ($s$)

$\sigma = \sqrt{\sigma^2}$ / $s = \sqrt{s^2}$
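A short sketch of the variance and standard deviation formulas above, using Python's standard `statistics` module. The data set is made up for illustration and is not from the notes.

```python
import statistics

# Hypothetical data, chosen so the numbers come out cleanly.
data = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(data)
mu = sum(data) / N

# Population variance: divide the squared deviations by N.
pop_var = sum((x - mu) ** 2 for x in data) / N
# Shortcut form: mean of the squares minus the square of the mean.
pop_var_shortcut = sum(x * x for x in data) / N - mu ** 2
# Sample variance: divide by n - 1 instead.
samp_var = sum((x - mu) ** 2 for x in data) / (N - 1)

pop_sd = pop_var ** 0.5
samp_sd = samp_var ** 0.5

# The library functions follow the same definitions.
print(pop_var, statistics.pvariance(data))
print(samp_var, statistics.variance(data))
```

Dividing by n - 1 rather than N makes the sample variance slightly larger than the population formula applied to the same numbers.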
COMPARING VARIANCES : COEFFICIENT OF VARIATION (CV)

- describes the dispersion of a variable without depending on the variable's measurement unit (relative variation)
- in percentage
- Used to compare datasets measured in different units

Population CV : $\frac{\sigma}{\mu} \times 100\%$ / Sample CV : $\frac{s}{\bar{x}} \times 100\%$

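MEASURE OF GROUP DATA

* $f_i$ : frequency of class i, $m_i$ : midpoint of class i
* TOTAL COUNTS : $N = \sum f_i$ (population), $n = \sum f_i$ (sample)

POPULATION
° MEAN : $\mu = \frac{1}{N}\sum f_i m_i$
° VARIATION : $\sigma^2 = \frac{1}{N}\left[\sum f_i (m_i - \mu)^2\right] = \frac{\sum f_i m_i^2}{N} - \mu^2$

SAMPLE
° MEAN : $\bar{x} = \frac{1}{n}\sum f_i m_i$
° VARIATION : $s^2 = \frac{1}{n-1}\left[\sum f_i (m_i - \bar{x})^2\right] = \frac{\sum f_i m_i^2 - n\bar{x}^2}{n-1}$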
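The grouped-data formulas can be sketched as below. The class midpoints and frequencies are invented for illustration; only the formulas come from the notes.

```python
# Hypothetical grouped data: class midpoints m_i and frequencies f_i.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

n = sum(frequencies)                                             # total counts
mean = sum(f * m for f, m in zip(frequencies, midpoints)) / n

# Sum of f_i * (m_i - mean)^2, shared by both variance formulas.
squared_dev = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints))
population_variance = squared_dev / n
sample_variance = squared_dev / (n - 1)

# Shortcut form of the population variance.
shortcut = sum(f * m * m for f, m in zip(frequencies, midpoints)) / n - mean ** 2

print(mean, population_variance, sample_variance)
```

Here the mean is 16 and the population variance is 49, and the shortcut form gives the same 49.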
LECTURE 3 : PROBABILITY

PROBABILITY :
Measures the chance that an event will occur

° Collectively exhaustive events : together they cover the whole sample space

CLASSICAL METHOD
All outcomes are equally likely to happen
° $P(A) = \frac{\text{no. of outcomes that satisfy A}}{\text{total no. of outcomes in sample space}}$

° Intersection ($A \cap B$)

° Union ($A \cup B$)
THE ADDITIVE RULE : $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

° Complement ($\bar{A}$)
$P(\bar{A}) = 1 - P(A)$

CONDITIONAL PROBABILITY
° $P(A|B) = \frac{P(A \cap B)}{P(B)}$ ; $P(B|A) = \frac{P(A \cap B)}{P(A)}$
RELATIVE FREQUENCY METHOD
(when classical method cannot be used)

$P(A) = \frac{n_A}{n}$ when n approaches $\infty$

STATISTICALLY INDEPENDENT EVENTS :

One event occurring does not make the occurrence of the other event any more / less probable

TESTS OF INDEPENDENCE
° $P(A|B) = \frac{P(A \cap B)}{P(B)} = P(A)$ * $P(B) > 0$
° $P(B|A) = \frac{P(A \cap B)}{P(A)} = P(B)$ * $P(A) > 0$
° $P(A \cap B) = P(A)P(B)$

VS
Mutually exclusive :
° $P(A \cap B) = 0$

* If two events (each with probability > 0) are mutually exclusive, they CANNOT be independent
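The additive rule, conditional probability, and the independence test can be tried on a single roll of a fair die with the classical method. The events A, B, C and D are made up for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}


def prob(event):
    """Classical method: outcomes that satisfy the event / total outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


A = {2, 4, 6}        # even number
B = {4, 5, 6}        # greater than 3
C = {1, 2, 3, 4}     # at most 4
D = {1, 3, 5}        # odd number, mutually exclusive with A

p_union = prob(A) + prob(B) - prob(A & B)       # additive rule
p_a_given_b = prob(A & B) / prob(B)             # conditional probability

independent_ab = prob(A & B) == prob(A) * prob(B)
independent_ac = prob(A & C) == prob(A) * prob(C)
independent_ad = prob(A & D) == prob(A) * prob(D)

print(p_union, p_a_given_b, independent_ab, independent_ac, independent_ad)
```

A and C pass the independence test, A and B do not, and the mutually exclusive pair A and D fails it because P(A ∩ D) = 0 while P(A)P(D) > 0.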
WEEK 4 : DISCRETE PROBABILITY DISTRIBUTIONS

DISCRETE RANDOM VARIABLE is a variable in which the set of possible values consists of isolated points on the number line.

Let X be a discrete random variable and x be one of its possible values :
Probability of X taking on a specific value x is $P(X = x)$.

PROBABILITY DISTRIBUTION FUNCTION is a representation of the probabilities for all the possible outcomes.

Eg : x = 2, $P(X = x) = \frac{1}{4} = 0.25$

PROPERTIES OF DISCRETE PROBABILITY
° $0 \le P(x) \le 1$
° $\sum_x P(x) = 1$

DISCRETE RANDOM VARIABLE
° MEAN : $E(X) = \mu = \sum_x [x P(x)]$
° VARIANCE : $Var(X) = \sum_x (x - E(X))^2 P(x) = E(X^2) - \mu^2 = \sum_x x^2 P(x) - \mu^2$

DISCRETE DATA :
° MEAN : $\mu = \frac{1}{N}\sum x_i$
° VARIANCE : $\sigma^2 = \frac{1}{N}\sum (x_i - \mu)^2 = \frac{1}{N}\sum x_i^2 - \mu^2$

° CUMULATIVE PROBABILITY : $F(x_0) = P(X \le x_0) = \sum_{x \le x_0} P(x)$
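The mean, variance, and cumulative probability of a discrete random variable can be computed directly from its probability table. The table below (number of heads in two fair coin tosses) is an assumed example; it happens to agree with P(X = 2) = 0.25 from the notes.

```python
# Hypothetical probability distribution: number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

mean = sum(x * p for x, p in pmf.items())                      # E(X)
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())    # sum of (x - E(X))^2 P(x)
variance_shortcut = sum(x * x * p for x, p in pmf.items()) - mean ** 2


def cdf(x0):
    """Cumulative probability F(x0) = P(X <= x0)."""
    return sum(p for x, p in pmf.items() if x <= x0)


print(mean, variance, cdf(1))
```

Both variance forms give 0.5 here, and the probabilities sum to 1 as the properties require.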

DISCRETE PROBABILITY DISTRIBUTION

1. BERNOULLI DISTRIBUTION
- Events w/ 2 outcomes (Y/N)
° Mean : $E(X) = p$
° Variance : $Var(X) = p(1-p)$

2. BINOMIAL DISTRIBUTION
- counting success / failure of 2-state outcomes
- n fixed
- observations are independent
$X \sim B(n, p)$

FORMULAS
° $C^n_x = \frac{n!}{x!(n-x)!}$ : no. of sequences with x successes; the sequences are mutually exclusive (cannot occur at the same time)
° $n! = n(n-1)(n-2) \dots \times 2 \times 1$

3. POISSON DISTRIBUTION
- used to determine the probability of a random variable which characterises the no. of occurrences of a certain event in a given continuous interval (Eg : no. of accidents on a highway from 6-8PM)

$P(x) = \frac{e^{-\lambda}\lambda^x}{x!}$ for $x = 0, 1, 2, \dots$ ; $P(x) = 0$ for all other values of x
* where $\lambda$ is the mean rate (occurrence of event per unit time)

° MEAN : $E(X) = \lambda$
° VARIANCE : $Var(X) = \lambda$
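A small sketch of the binomial and Poisson probabilities. The binomial probability function $P(x) = C^n_x p^x (1-p)^{n-x}$ is the standard textbook form and is an assumption here, since only the $C^n_x$ term is legible in the notes; the function names and parameter values are also illustrative.

```python
import math


def binomial_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) sequences, each with probability p^x (1-p)^(n-x)."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


# Hypothetical parameters: 4 fair coin tosses, and a mean rate of 2 events per interval.
p_two_heads = binomial_pmf(2, 4, 0.5)
p_no_events = poisson_pmf(0, 2)
print(p_two_heads, p_no_events)
```

With these values, `p_two_heads` is 6/16 = 0.375 and `p_no_events` is $e^{-2}$.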
LECTURE 5 : CONTINUOUS PROBABILITY

Continuous Random Variable X
- probability density function, $f(x)$, is a function of x when X is a continuous random variable
- probability = area under the distribution curve (integration!)
- P at a specific value : $P(X = a) = 0$, eg $P(X \le 2) = P(X < 2)$ since $P(X = 2) = 0$
↳ * beware how qn is phrased

Properties of probability density function

① $f(x) \ge 0$ for all values of x
$\int_{\text{all } x} f(x)\,dx = 1$ (P(all x) = 1)

② Mean : $E(X) = \int_{\text{all } x} x f(x)\,dx$

③ Variance : $Var(X) = \int_{\text{all } x} (x - E(X))^2 f(x)\,dx = \int_{\text{all } x} x^2 f(x)\,dx - (E(X))^2$

④ Cumulative probability : $F(x_0) = P(X \le x_0) = \int_{x_{min}}^{x_0} f(x)\,dx$

° At a point, height of the curve : $f(x)$ / PDF (y axis)
° Discrete : PMF = probability
° Continuous : probability = area under the curve between a & b, $P(a \le X \le b) = \int_a^b f(x)\,dx$

Uniform distribution :
equal probabilities for all equal-width intervals within the range of the random var.

$f(x) = \frac{1}{b-a}$ for $a \le x \le b$
$f(x) = 0$ for all other values of x

Total probability under graph = 1
$x_{min} = a$, $x_{max} = b$

Exponential Distribution : distribution for events that randomly happen (eg : customers walking into a shop) & time between 2 consecutive occurrences (eg : time between 2 consecutive transactions at the ATM)

$f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$

MEAN : $E(X) = \frac{1}{\lambda}$

VARIANCE : $Var(X) = \frac{1}{\lambda^2}$

$P(a \le X \le b) = e^{-\lambda a} - e^{-\lambda b}$

Normal distribution :
random variable has an infinite theoretical range

$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$

° Mean : $E(X) = \mu$, $X \sim N(\mu, \sigma^2)$
° Variance : $Var(X) = E(X - \mu)^2 = \sigma^2$
° The curve is symmetrical about $\mu$ : Mean = Median = Mode

Standardization :
standard normal distribution Z, where $Z \sim N(0, 1)$
° X → Z : $Z = \frac{X - \mu}{\sigma}$

Eg ($\mu = 8$, $\sigma = 5$) :
$P(X < 8.6) = P\left(Z < \frac{8.6 - 8.0}{5.0}\right) = P(Z < 0.12)$ → TABLE 4 = 0.5478
$P(X > 8.6) = 1 - P(X < 8.6) = 1 - 0.5478 = 0.4522$

Probability of Normal Distribution
$P(Z < z) = 0.2$ → 20% (Table 5)
Table 5 : $z = -0.8416$
$x = \mu + z\sigma = 8 + (-0.8416)(5) = 3.792$

Empirical Rule
1. $\mu \pm 1\sigma$ contains 68.26% of data
2. $\mu \pm 2\sigma$ contains 95.44% of data
3. $\mu \pm 3\sigma$ contains 99.74% of data
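The table look-ups in the standardization example can be reproduced with `statistics.NormalDist` from the Python standard library. The sketch assumes the example's values $\mu = 8$ and $\sigma = 5$.

```python
from statistics import NormalDist

X = NormalDist(mu=8, sigma=5)    # X ~ N(8, 5^2)
Z = NormalDist()                 # standard normal, Z ~ N(0, 1)

z = (8.6 - 8) / 5                # standardise: Z = (X - mu) / sigma
p_below = Z.cdf(z)               # P(X < 8.6) = P(Z < 0.12)
p_above = 1 - p_below            # P(X > 8.6)

z_20 = Z.inv_cdf(0.2)            # z with 20% of the area to its left
x_20 = 8 + z_20 * 5              # convert back: x = mu + z * sigma

within_1sd = Z.cdf(1) - Z.cdf(-1)    # share of data within mu +/- 1 sigma

print(round(p_below, 4), round(p_above, 4), round(z_20, 4), round(x_20, 3))
```

Rounded to four decimal places these match the table values 0.5478, 0.4522 and -0.8416. The one-sigma share computes to about 68.27%, which printed tables round to 68.26%.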
LECTURE 6 : SAMPLING DISTRIBUTION

Population ($\mu$, $\sigma$) vs Sample ($\bar{x}$, $s$)

° Sampling frame error : certain sample elements not listed or not accurately represented in the sampling frame (eg : database not updated)
° Random sampling error : error due to the sample selected : the difference b/w population & sample result (aka margin of error)

Using sample mean to est. population mean :
$\bar{x} - \text{m.o.e} \le \mu \le \bar{x} + \text{m.o.e}$
$\bar{x} - z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z\frac{\sigma}{\sqrt{n}}$
↳ need the sampling frame to calculate the probability of selection; without it the m.o.e cannot be estimated (eg : convenience sample)

Central Limit Theorem
(i) If population is normally distributed, the sampling distribution of means will also be normally distributed.
(ii) If population is not normally distributed, the CLT states that the sampling distribution of means will tend towards a normal distribution as sample size increases ($n \ge 30$).

Population mean : $\mu$
Population variance : $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$

Sampling distribution of means
° mean : $\mu_{\bar{x}} = \mu$
° variance : $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$
° $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$ → STANDARD ERROR
° sample mean : $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
° If $n \ge 30$, by CLT : $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
° $Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Sampling distributions of proportions
- NOT normally distributed : follows a binomial distribution, but can be approximated to a normal distribution if $np(1-p) > 5$ (BY CLT)
° mean : $\mu_{\hat{p}} = p$
° variance : $\sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}$
° $\hat{p} \sim N\left(p, \frac{p(1-p)}{n}\right)$
° $Z = \frac{\hat{p} - p}{\sigma_{\hat{p}}} = \frac{\hat{p} - p}{\sqrt{\frac{p(1-p)}{n}}}$
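The standard error and Z formulas for sample means and sample proportions can be sketched as follows. The function names and input numbers are assumptions for illustration only.

```python
import math


def standard_error_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


def standard_error_proportion(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


def z_for_sample_mean(x_bar, mu, sigma, n):
    """Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / standard_error_mean(sigma, n)


# Hypothetical numbers: population mean 8, sigma 5, sample of 25 with mean 8.6.
se = standard_error_mean(5, 25)
z = z_for_sample_mean(8.6, 8, 5, 25)
se_p = standard_error_proportion(0.5, 100)
print(se, z, se_p)
```

A sample mean of 8.6 sits 0.6 standard errors above the population mean here, whereas a single observation of 8.6 would sit only 0.12 standard deviations above it.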

Point & Interval estimates

Confidence Interval :
Lower confidence limit | point estimate | Upper confidence limit
- confidence level (CL) : eg 85%, 90%, 95%
- the point estimate is exactly at the centre of the confidence interval if it is an unbiased estimate
- $\bar{x}$ is an estimate for $\mu$
- $s^2$ is an estimate for $\sigma^2$

(i) known population variance, $\sigma^2$ (assume normal distribution)

CI : $\mu = \bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, where $\alpha = 1 - \frac{C}{100}$ for a C% confidence level
→ $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ is the margin of error

(ii) unknown population variance, $\sigma^2$ (assume normal distribution)

use t-score distribution → need to find degree of freedom (v)

CI : $\mu = \bar{x} \pm t_{v, \alpha/2}\frac{s}{\sqrt{n}}$, where $v = n - 1$
→ $t_{v, \alpha/2}\frac{s}{\sqrt{n}}$ is the margin of error
→ lower limit : $\bar{x} - t_{v, \alpha/2}\frac{s}{\sqrt{n}}$ ; upper limit : $\bar{x} + t_{v, \alpha/2}\frac{s}{\sqrt{n}}$

Interpretation of the CI : We are C% confident that the population mean is between the lower limit & the upper limit.
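A minimal sketch of the known-variance interval (case (i)), with the z value taken from `statistics.NormalDist` instead of a table. The function name `z_interval` and the input numbers are assumptions. The unknown-variance case needs a t value, which the Python standard library does not provide, so it is left out.

```python
import math
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)      # z value with alpha/2 in each tail
    margin_of_error = z * sigma / math.sqrt(n)
    return x_bar - margin_of_error, x_bar + margin_of_error


# Hypothetical sample: mean 8.6 from n = 25, with known sigma = 5.
lower, upper = z_interval(8.6, 5, 25, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```

With these inputs the margin of error is about 1.96, giving an interval of roughly 6.64 to 10.56.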

Confidence Interval for proportion

CI : $p = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
- sample size is large (i.e. : $np(1-p) > 5$), so $\hat{p}$ is normally distributed by CLT

Sample size needed for a given margin of error (m.o.e)
° POPULATION MEAN : $n = \left(\frac{z_{\alpha/2}\,\sigma}{\text{m.o.e}}\right)^2$
° POPULATION PROPORTION : $n = \frac{z_{\alpha/2}^2\,\hat{p}(1-\hat{p})}{\text{m.o.e}^2}$
* ALWAYS ROUND UP!
* if $\hat{p}$ is not given in question, use the worst case possible scenario : take $\hat{p}$ to be 0.5!

Reducing Margin of Error ($z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$)
° $\sigma$ : depends on population
° n : sample size is restricted due to financial reasons
° z : ↓ CL
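The sample-size rules (always round up, use p = 0.5 as the worst case) can be sketched as below. The function names and the example inputs are assumptions, and the z value comes from `statistics.NormalDist` instead of a table.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, moe, confidence=0.95):
    """Smallest n so that z * sigma / sqrt(n) is at most the margin of error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / moe) ** 2)            # always round UP


def sample_size_proportion(moe, confidence=0.95, p=0.5):
    """Smallest n for a proportion; p = 0.5 is the worst case when p is not given."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * p * (1 - p) / moe ** 2)   # always round UP


n_mean = sample_size_mean(sigma=5, moe=1)
n_prop = sample_size_proportion(moe=0.05)
print(n_mean, n_prop)
```

At 95% confidence these give 97 for the mean (sigma 5, m.o.e 1) and 385 for the proportion (m.o.e 0.05). Rounding down would leave the margin of error slightly larger than required.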
LECTURE 9 : SAMPLING DISTRIBUTIONS (2 SAMPLES)

Dependent samples
Difference between paired values : $d_i = x_i - y_i$
* Both populations are normally distributed

° Sample mean : $\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i$
° Sample variance : $s_d^2 = \frac{1}{n-1}\sum_{i=1}^{n}(d_i - \bar{d})^2$
° Sampling distribution of the mean paired difference : Mean : $\mu_{\bar{d}} = \mu_d$, variance : $\frac{\sigma_d^2}{n}$

Confidence Interval
(i) variance unknown

CI : $\mu_d = \bar{d} \pm t_{v, \alpha/2}\frac{s_d}{\sqrt{n}}$, where $v = n - 1$

If the CI does not include zero or negative numbers : sample A - sample B > 0

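Independent samples

$(\bar{x}_1 - \bar{x}_2)$ is an estimate of $(\mu_1 - \mu_2)$
CI : $(\bar{x}_1 - \bar{x}_2) \pm ME$

(i) variances known

CI : $(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$

(ii) variances unknown, unequal variances

CI : $(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_{v, \alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

where v (Welch-Satterthwaite formula) :
$v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
* round DOWN non-integer df (smaller df for prudency)

(iii) variances unknown, equal variances

Pooled sample variance ($s_p^2$) :
$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

CI : $(\mu_1 - \mu_2) = (\bar{x}_1 - \bar{x}_2) \pm t_{v, \alpha/2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
where $v = n_1 + n_2 - 2$

In each case the term after ± is the margin of error.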
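The pooled variance and the Welch-Satterthwaite degrees of freedom are easy to get wrong by hand, so a small sketch may help. The function names and the sample sizes and variances below are made up for illustration.

```python
import math


def pooled_variance(n1, var1, n2, var2):
    """Pooled sample variance, used when the two unknown variances are assumed equal."""
    return ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)


def welch_df(n1, var1, n2, var2):
    """Welch-Satterthwaite degrees of freedom, rounded DOWN to an integer."""
    a = var1 / n1
    b = var2 / n2
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return math.floor(v)


# Hypothetical samples: n1 = 10 with s1^2 = 4, n2 = 15 with s2^2 = 9.
sp2 = pooled_variance(10, 4, 15, 9)
v = welch_df(10, 4, 15, 9)
print(round(sp2, 4), v)
```

Here the unrounded Welch value is about 22.99, so rounding down gives v = 22, while the equal-variance case would use v = 10 + 15 - 2 = 23.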

(iv) Proportions

$Var(\hat{p}_1 - \hat{p}_2) = \frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}$

CI : $(p_1 - p_2) = (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$

* CI should NEVER be interpreted as a probability (eg : lecture 9 example - CI does NOT show that women have a higher probability of getting a college degree than men, only that women have a HIGHER PROPORTION of college degrees than men).
LECTURE 10 : HYPOTHESIS TESTING (1 SAMPLE)

° Lower tail test : $H_0: \mu \ge 3$, $H_1: \mu < 3$ → critical region $z < -z_\alpha$
° Upper tail test : $H_0: \mu \le 3$, $H_1: \mu > 3$ → critical region $z > z_\alpha$
° Two tail test : $H_0: \mu = 3$, $H_1: \mu \ne 3$ → critical region $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

STEP 1 : Hypothesis
Eg : left-tail test
$H_0: \mu \ge 3$
$H_1: \mu < 3$
$\alpha = 5\%$

STEP 2 : Test statistics
$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

METHOD 1 : USING CRITICAL VALUE (test stat in relation to $z_\alpha$)
- If the test statistic falls in the critical region → rej. $H_0$
- $z_{0.05} = 1.6449$
- We rej. $H_0$ if test stat < $-z_\alpha$ (for lower tail test)

METHOD 2 : P-VALUE
- The tail area beyond the test stat is called the p-value : the probability, if $H_0$ is true, of getting a test stat at least as extreme as the one observed (eg : test stat = -2)
- If p-value < $\alpha$ → rej. $H_0$ (for left-tail test)
- p-value = $P(Z < \text{test stat}) = P(Z < -2) = 1 - P(Z < 2) = 1 - 0.97725 = 0.02275$
- For 2-tail : p-value = $2[P(Z > |\text{test stat}|)]$
- since 0.02275 < 0.05 → rej. $H_0$

Hypothesis Test of Means

(i) $\sigma$ known
* Assume normal distribution
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$, $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
$\bar{X} \sim N(\mu_{\bar{x}}, \sigma_{\bar{x}}^2)$
Test stat : $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

(ii) $\sigma$ unknown
$\mu_{\bar{x}} = \mu$, $\sigma_{\bar{x}}^2 \approx \frac{s^2}{n}$, $\sigma_{\bar{x}} \approx \frac{s}{\sqrt{n}}$
Test stat : $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$, where $v = n - 1$
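Both decision methods for the left-tail example can be run side by side. This sketch assumes $\sigma$ is known; the sample values (mean 2.6, $\sigma = 1$, n = 25) are hypothetical, chosen so that the test statistic is -2 as in the p-value example.

```python
import math
from statistics import NormalDist


def z_test_lower_tail(x_bar, mu0, sigma, n, alpha=0.05):
    """Lower-tail z-test of H0: mu >= mu0 against H1: mu < mu0, with sigma known."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))    # test statistic
    critical = NormalDist().inv_cdf(alpha)        # -z_alpha, edge of the critical region
    p_value = NormalDist().cdf(z)                 # area to the left of the test statistic
    return z, critical, p_value


z, critical, p_value = z_test_lower_tail(x_bar=2.6, mu0=3, sigma=1, n=25)
reject_by_critical_value = z < critical           # METHOD 1
reject_by_p_value = p_value < 0.05                # METHOD 2
print(round(z, 2), round(critical, 4), round(p_value, 5))
```

The critical value comes out as -1.6449 and the p-value as 0.02275, matching the figures above. The two methods always reach the same decision, since they compare the same tail area in two different ways.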

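Hypothesis Test of Proportions

$\mu_{\hat{p}} = p$, $\sigma_{\hat{p}}^2 = \frac{p(1-p)}{n}$, $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$
* Assume normal distribution : since $np(1-p) > 5$ → CLT → $\hat{p} \sim N(\mu_{\hat{p}}, \sigma_{\hat{p}}^2)$

Test stat : $z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$, where $p_0$ is the proportion we test against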
Conclusion for Hypothesis Testing :

Since the test statistic falls / does not fall in the critical region, we reject / do not reject $H_0$ at α% significance level.

We therefore conclude that there is sufficient / insufficient evidence to reject the claim that . . . ($H_0$).

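Outcomes : $H_0$ is true vs $H_0$ is not true

° DO NOT reject $H_0$, $H_0$ is true : ✓ Confidence Level ($1 - \alpha$)
° DO NOT reject $H_0$, $H_0$ is not true : TYPE II ERROR ($\beta$) - probability of not rej. a false $H_0$
° Reject $H_0$, $H_0$ is true : TYPE I ERROR ($\alpha$) - probability of rej. a true $H_0$
° Reject $H_0$, $H_0$ is not true : ✓ Power ($1 - \beta$)

* ↑ type-I error, ↓ type-II error & vice versa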

Calculating Type II Error

STEP 1 : FROM HYPOTHESISED MEAN GRAPH
1. Find $z_\alpha$ (eg : $z_{0.05} = 1.6449$)
2. Find $\bar{x}_c$ using $\bar{x}_c = \mu_0 - z_\alpha\frac{\sigma}{\sqrt{n}}$ (lower tail test)

STEP 2 : FROM ACTUAL MEAN GRAPH (true mean curve, $\mu = \mu^*$)
3. Find $\beta = P(\bar{X} > \bar{x}_c \mid \mu = \mu^*)$, where $\bar{X} \sim N\left(\mu^*, \frac{\sigma^2}{n}\right)$
↳ area where the hypothesised $H_0$ was not rejected

OR

3. Find $z_\beta$ using $z_\beta = \frac{\bar{x}_c - \mu^*}{\sigma/\sqrt{n}}$
4. Find $\beta = P(Z > z_\beta)$ where $Z \sim N(0, 1)$

$\beta = P(Z > z_\beta) = P(\bar{X} > \bar{x}_c)$
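The Type II error steps can be sketched for a lower-tail test as below. The true mean of 2.9, $\sigma = 0.5$ and n = 100 are assumed values for illustration, as is the function name.

```python
import math
from statistics import NormalDist


def type_ii_error_lower_tail(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for a lower-tail z-test: P(do not reject H0) when the true mean is mu_true."""
    se = sigma / math.sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)          # step 1: find z_alpha
    x_crit = mu0 - z_alpha * se                        # step 2: critical sample mean
    beta = 1 - NormalDist(mu_true, se).cdf(x_crit)     # step 3: area above x_crit on the true curve
    return x_crit, beta


x_crit, beta = type_ii_error_lower_tail(mu0=3, mu_true=2.9, sigma=0.5, n=100)
power = 1 - beta
print(round(x_crit, 4), round(beta, 3), round(power, 3))
```

With these numbers beta is roughly 0.36, so the power is roughly 0.64. The closer the true mean is to the hypothesised mean, the larger beta becomes, approaching $1 - \alpha$ when the two are equal.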
LECTURE 11 : HYPOTHESIS TESTING (2 SAMPLES)

Dependent samples

(i) $\sigma$ unknown
$d_i = x_i - y_i$
$\bar{d} = \frac{\sum d_i}{n}$
$s_d^2 = \frac{\sum (d_i - \bar{d})^2}{n - 1}$
$\mu_{\bar{d}} = \mu_D$, $Var(\bar{d}) = \frac{s_d^2}{n}$

Test stat : $t = \frac{\bar{d} - \mu_D}{s_d/\sqrt{n}}$, where $v = n - 1$
↳ $t_{v, \alpha}$ for one tail
↳ $t_{v, \alpha/2}$ for two tail

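Independent samples

(i) $\sigma$ known

$Var(\bar{X}_1 - \bar{X}_2) = Var(\bar{X}_1) + Var(\bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$

Test stat : $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
↳ if we hypothesise that there is no difference, $(\mu_1 - \mu_2) = 0$

If we set the hypothesis as $H_0: \mu_1 - \mu_2 \le 0$, $H_1: \mu_1 - \mu_2 > 0$ → claiming that $\mu_1 > \mu_2$

(ii) $\sigma$ unknown & unequal

$Var(\bar{X}_1) = \sigma_{\bar{x}_1}^2 \approx \frac{s_1^2}{n_1}$, $Var(\bar{X}_2) = \sigma_{\bar{x}_2}^2 \approx \frac{s_2^2}{n_2}$
$Var(\bar{X}_1 - \bar{X}_2) = Var(\bar{X}_1) + Var(\bar{X}_2) = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$

Test stat : $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$, where v = WS formula (refer to formula sheet)
* ROUND DOWN the df!

(iii) $\sigma$ unknown & equal

$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Test stat : $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $v = n_1 + n_2 - 2$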

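Test for Proportions

$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$ (numerator : total count of "successes" in both samples)

Test stat : $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$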
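The pooled two-proportion test statistic can be sketched as follows. The success counts and sample sizes are hypothetical, and the function name is an assumption.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion of successes."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)       # total successes over total sample size
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Hypothetical samples: 60 successes out of 100 vs 45 successes out of 100.
z = two_proportion_z(60, 100, 45, 100)
print(round(z, 3))
```

The resulting z of about 2.12 would then be compared with the critical value, or converted to a p-value, in the same way as in the one-sample tests.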
LECTURE 12 : CORRELATION & REGRESSION

CORRELATION

Correlation measures the LINEAR relationship between two variables
° Correlation ≠ causation : spurious correlation (3rd variable causing a correlation b/w the 2)
° causation → correlation

Population correlation coefficient, $\rho = \frac{\sum (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum (x_i - \mu_x)^2 \sum (y_i - \mu_y)^2}}$

Sample correlation coefficient, $r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}} = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{SS_{xy}}{(n-1)\,s_x s_y} = \frac{Cov(x, y)}{s_x s_y}$

° sample std dev. of X : $s_x = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
° sample std dev. of Y : $s_y = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n - 1}}$
° $SS_{xx} = \sum (x_i - \bar{x})^2$
° $SS_{yy} = \sum (y_i - \bar{y})^2$
° $SS_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$

Covariance between X & Y : $Cov(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
° $Cov(x, y) = 0$ when X & Y are independent ($r = 0$)
° Cov. is tied to the unit of x & y
eg : x - weight / kg, y - height / cm → cov. unit : kg·cm → cannot be used to compare
↳ use r-value to compare instead (formula allows the units in the num. & the denom. to cancel out → r-value has no units)

General Guidelines
> VERY STRONG CORRELATION COEFFICIENT : $r \ge 0.8$
> RELATIVELY STRONG CORRELATION COEFFICIENT : $0.6 \le r \le 0.7$
> MILD / RELATIVELY WEAK CORRELATION COEFFICIENT : $0.5 \le r \le 0.6$
> WEAK CORRELATION COEFFICIENT : $0.4 \le r \le 0.5$
> VERY WEAK : $r \le 0.3$

REGRESSION : how much y will Δ given a Δ in x

X : Independent Variable / Exogenous variable
Y : Dependent Variable / Endogenous variable

Simple linear regression model : $y = \beta_0 + \beta_1 x + \varepsilon$
where $\varepsilon$ is the random error term, $\varepsilon \sim N(0, \sigma_\varepsilon^2)$

Simple linear regression equation (best fit line) / Predictive Modelling :
$\hat{y} = b_0 + b_1 x$ (aka y = mx + c)
where $b_0$ : Y-intercept, $b_0 = \bar{y} - b_1\bar{x}$
$b_1$ : gradient of the slope, $b_1 = \frac{SS_{xy}}{SS_{xx}}$

Least square line :
$e_i$ = difference between raw & predicted value of y
↳ when you square all the $e_i$ & sum them together, the line which gives the minimum $\sum e_i^2$ will be the best-fit line.
Assumptions for Regression Analysis :

i. True relationship form is linear (y is a linear function of x + $\varepsilon$)
ii. $\varepsilon_i$ is independent of x
iii. $\varepsilon_i$ is a random variable with mean 0 & constant variance $\sigma_\varepsilon^2$
iv. $\varepsilon_i$ are not correlated with one another

> Sum of squared error (SSE) : $\sum (y_i - \hat{y}_i)^2 = \sum e_i^2$
↳ difference between raw y & predicted value $\hat{y}$

> Sum of squares regression (SSR) / SS Model : $\sum (\hat{y}_i - \bar{y})^2$
↳ difference between predicted value $\hat{y}$ & mean of y (ONLY look @ predicted line)

> Sum of squares total (SST) : $SSE + SSR = \sum (y_i - \bar{y})^2$ (= $SS_{yy}$)
↳ difference between raw y & $\bar{y}$

Point estimate for $\sigma_\varepsilon^2$ : $s_e^2 = \frac{SSE}{n - 2}$
where $s_e$ is the standard error of the estimates

$R^2$ (coefficient of determination) : portion of total variation of y that is explained by x

SST = SSR + SSE : $\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

$R^2 = \frac{SSR}{SST} = \frac{SST - SSE}{SST} = 1 - \frac{SSE}{SST}$
$R^2 = r^2$ for simple regression

Least square estimates
$b_1 = \frac{Cov(x, y)}{s_x^2} = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{r\,s_x s_y}{s_x^2} = r\left(\frac{s_y}{s_x}\right)$

Significance Test / inferences about the slope

step 1 : Hypothesis
$H_0: \beta_1 = 0$
$H_1: \beta_1 \ne 0$

step 2 : Decision rule
p-value ≤ $\alpha$ → rej. $H_0$

step 3 : conclusion
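The correlation, least-squares, and $R^2$ formulas from this lecture can be tied together in one short sketch. The five data points are invented for illustration, and the slope significance test is not included because it needs a t distribution, which the Python standard library does not provide.

```python
import math


def simple_linear_regression(xs, ys):
    """Least-squares fit y_hat = b0 + b1 * x, returning b0, b1, r and R^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)          # equals SST
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    b1 = ss_xy / ss_xx                                 # gradient
    b0 = y_bar - b1 * x_bar                            # Y-intercept
    r = ss_xy / math.sqrt(ss_xx * ss_yy)               # sample correlation coefficient

    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1 - sse / ss_yy                        # coefficient of determination
    return b0, b1, r, r_squared


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b0, b1, r, r_squared = simple_linear_regression(xs, ys)
print(round(b0, 2), round(b1, 2), round(r, 4), round(r_squared, 2))
```

For this data the fitted line is $\hat{y} = 2.2 + 0.6x$, and $R^2 = 0.6$ equals $r^2$, as expected for simple regression.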
