Correlation and Regression
CHAPTER FOUR
The primary objective of correlation analysis is to measure the strength or degree of relationship
between two or more variables. If the change in one variable affects a change in the other variable,
the variables are said to be correlated.
For example, the production of paddy is dependent on the rainfall. Here production of paddy is
considered to be a dependent variable.
Types of Correlation
Positive or negative
Simple or multiple
Linear or non-linear
Positive or negative
If the two variables deviate in the same direction, that is, if an increase (or decrease) in one results in a corresponding increase (or decrease) in the other, the correlation is said to be direct or positive. But if they constantly deviate in opposite directions, that is, if an increase (or decrease) in one results in a corresponding decrease (or increase) in the other, the correlation is said to be inverse or negative. If the variables are independent, there cannot be any correlation and the variables are said to be uncorrelated.
For example, the correlation between (1) the heights and weights of a group of persons and (2) income and expenditure is positive, and the correlation between (1) the price and demand of a commodity and (2) the volume and pressure of a perfect gas is negative. There is no correlation between income and height.
Correlation only between two variables is called simple correlation. For example, correlation
between income and expenditure.
In multiple correlation, three or more variables are studied. Example: Qd = f(P, PC, PS, t, y).
Correlation is said to be linear when the amount of change in one variable tends to bear a constant
ratio to the amount of change in the other. The graph of the variables having a linear relationship
will form a straight line.
Example: X = 1, 2, 3, 4, 5, 6, 7, 8
Y = 5, 7, 9, 11, 13, 15, 17, 19
Y = 3 + 2X
The correlation would be non-linear if the amount of change in one variable does not bear a constant ratio to the amount of change in the other variable.
Suppose (x1, y1), (x2, y2), ..., (xn, yn) are n pairs of observations. If the values of the variables x and y are plotted along the x-axis and y-axis respectively in the xy-plane, the diagram of dots so obtained is known as a scatter diagram.
Interpretation of r
r = +1 indicates a perfect positive relationship between x and y. The scatter diagram will be as in fig. 1.1.
r = -1 indicates a perfect negative relationship between x and y. The scatter diagram will be as in fig. 1.2.
r = 0 means there is no linear relationship between x and y. In this case the two variables are linearly independent. The scatter diagram will be as in fig. 1.5 and 1.6.
0 < r < 1 indicates a positive relationship between x and y. In this case the scatter diagram will be as in fig. 1.3.
-1 < r < 0 indicates a negative relationship between x and y. In this case the scatter diagram will be as in fig. 1.4.
Correlation coefficient
The numerical value by which we measure the strength of linear relationship between two or more
variables is called correlation coefficient.
Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the correlation coefficient between x and y is denoted by rxy and defined as
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \qquad (1) \]
Equation (1) is also called Karl Pearson's coefficient of correlation formula, given in 1890.
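The definition in equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `pearson_r` and the sample data are assumptions chosen only to show the steps (deviations from the means, their products, and the two sums of squares).

```python
import math

def pearson_r(x, y):
    """Correlation coefficient from deviations about the means, as in equation (1)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of products of deviations
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: square roots of the two sums of squared deviations
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data: y tends to rise with x, but not on a perfect straight line
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 3))
```

For this hypothetical data the sum of products is 6 and the sums of squares are 10 and 6, so r = 6/√60, a fairly strong positive correlation.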
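Equivalently, in computational form,
\[ r = \frac{\sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}}{\sqrt{\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]\left[\sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}\right]}} \]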
Assumptions of Pearson's Correlation Coefficient
There is a linear relationship between the two variables, i.e., when the two variables are plotted on a scatter diagram, the points form a straight line.
A cause and effect relation exists between the different forces operating on the items of the two variable series.
0.7 ≤ c < 1 = Strong positive correlation
0.4 ≤ c < 0.7 = Fairly positive correlation
0 < c < 0.4 = Weak positive correlation
c = 0 = No correlation
Properties of correlation coefficient
3. The correlation coefficient is symmetric, i.e., rxy = ryx.
4. The correlation coefficient is the geometric mean of the regression coefficients, i.e., rxy = ±√(byx · bxy).
5. For two independent variables, the correlation coefficient is zero.
6. It is always unit free.
Advantages of Pearson's Coefficient
It summarizes in one value both the degree and the direction of correlation.
Limitations of Pearson's Coefficient
It always assumes a linear relationship.
Interpreting the value of r is difficult.
The value of the correlation coefficient is affected by extreme values.
It is a time-consuming method.
Coefficient of Determination
A convenient way of interpreting the value of the correlation coefficient is to use the square of the coefficient of correlation, which is called the coefficient of determination.
Suppose r = 0.9; then r² = 0.81. This would mean that 81% of the variation in the dependent variable has been explained by the independent variable.
The maximum value of r2 is 1 because it is possible to explain all of the variation in y but it is not
possible to explain more than all of it.
This implies that in the first case 36% of the total variation is explained whereas in second case
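Theorem: Show that the correlation coefficient lies between -1 and +1, i.e., -1 ≤ rxy ≤ 1.
Proof: Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the correlation coefficient between x and y is denoted by rxy and defined as
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \qquad (1) \]
Suppose \(x_i - \bar{x} = X\) and \(y_i - \bar{y} = Y\); therefore
\[ r = \frac{\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} \]
Let us consider the following expression, which is always non-negative:
\[ \sum\left(\frac{X}{\sqrt{\sum X^2}} \pm \frac{Y}{\sqrt{\sum Y^2}}\right)^2 \ge 0 \]
or,
\[ \frac{\sum X^2}{\sum X^2} \pm \frac{2\sum XY}{\sqrt{\sum X^2}\,\sqrt{\sum Y^2}} + \frac{\sum Y^2}{\sum Y^2} \ge 0 \]
or, \(1 \pm 2r + 1 \ge 0\)
or, \(2(1 \pm r) \ge 0\)
Taking the positive sign, \(1 + r \ge 0\) ……(i)
or, \(r \ge -1\)
or, \(-1 \le r\) …………(ii)
and, taking the negative sign, \(1 - r \ge 0\)
or, \(1 \ge r\)
or, \(r \le 1\) …………..(iii)
From (ii) and (iii), \(-1 \le r \le 1\), i.e., the correlation coefficient lies between -1 and +1.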
Theorem: Show that for two independent variables the correlation coefficient is zero.
Proof: Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the arithmetic mean of the xi is x̄ and that of the yi is ȳ. Since x and y are independent,
\[ \mathrm{Cov}(x,y) = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{n} = 0 \]
or,
\[ \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) = 0 \]
We know,
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} = \frac{0}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} = 0 \quad \text{(proved)} \]
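Solution: Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the correlation coefficient between x and y is
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \qquad (1) \]
Now, \(y_i = m x_i + c\) ………..(ii), so that \(\bar{y} = m\bar{x} + c\).
Therefore,
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(m x_i + c - m\bar{x} - c)}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(m x_i + c - m\bar{x} - c)^2}} \]
\[ = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(m x_i - m\bar{x})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(m x_i - m\bar{x})^2}} \]
\[ = \frac{m\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{m^2\sum_{i=1}^{n}(x_i-\bar{x})^2}} \]
\[ = \frac{m\sum_{i=1}^{n}(x_i-\bar{x})^2}{\sqrt{m^2}\,\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{m}{|m|} \]
= 1 when m > 0 (and -1 when m < 0).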
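The result of this derivation can be checked numerically. The sketch below is an illustration only: the helper name `pearson_r` is an assumption, the rising line Y = 3 + 2X is the linear example given earlier in this chapter, and the falling line is a hypothetical one added to show the negative case.

```python
import math

def pearson_r(x, y):
    """Correlation coefficient from deviations about the means."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y_rising = [3 + 2 * xi for xi in x]     # y = mx + c with m = 2 > 0
y_falling = [20 - 2 * xi for xi in x]   # hypothetical line with m = -2 < 0

r_rising = pearson_r(x, y_rising)       # expected to be +1 up to rounding
r_falling = pearson_r(x, y_falling)     # expected to be -1 up to rounding
print(r_rising, r_falling)
```

The intercept c has no effect on the result, and only the sign of the slope m decides whether r is +1 or -1.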
Calculate the deviations x and y of the two series from their respective means.
Square each deviation of x and y, then obtain the sums of the squared deviations, i.e., Σx² and Σy².
Multiply each deviation under x by the corresponding deviation under y to obtain the products xy. Then obtain the sum of the products, i.e., Σxy.
Application Problem-1: A research physician recorded the pulse rates and the temperatures of water submerging the faces of ten small children in cold water to control abnormally rapid heartbeats. The results are presented in the following table. Calculate the correlation coefficient between temperature of water and reduction in pulse rate.
Temperature of water      68  65  70  62  60  55  58  65  69  63
Reduction in pulse rate    2   5   1  10   9  13  10   3   4   6
Solution: Calculating table of correlation coefficient.
Temperature (x)   Reduction (y)   x²      y²     xy
68                2               4624    4      136
65                5               4225    25     325
70                1               4900    1      70
62                10              3844    100    620
60                9               3600    81     540
55                13              3025    169    715
58                10              3364    100    580
65                3               4225    9      195
69                4               4761    16     276
63                6               3969    36     378
Σx = 635          Σy = 63         Σx² = 40537   Σy² = 541   Σxy = 3835
We know,
\[ r_{xy} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sqrt{\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right]}} \]
\[ = \frac{3835 - \frac{635 \times 63}{10}}{\sqrt{\left[40537 - \frac{635^2}{10}\right]\left[541 - \frac{63^2}{10}\right]}} \]
= -0.94
The result, -0.94, indicates a high negative correlation between temperature of water and reduction in pulse rate.
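The arithmetic of this solution can be reproduced with the computational formula. The sketch below is only a check of the working: the function name `pearson_r_from_sums` is an assumption, and the reduction values are the ones listed where this problem is restated in the regression part of the chapter.

```python
import math

temperature = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]
reduction = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]

def pearson_r_from_sums(x, y):
    """Computational form: uses the five column totals of the calculating table."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

r = pearson_r_from_sums(temperature, reduction)
print(round(r, 2))  # -0.94
```

The totals computed inside the function match the table (635, 63, 40537, 541 and 3835), so the numerator is -165.5 and the two bracketed terms are 214.5 and 144.1.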
Assignment problem-1: Compute r for the following paired sets of values:
i. (x, y): (1, 2), (2, 3), (3, 5), (4, 4), (5, 7)
ii. (x, y): (1, 1), (2, 3), (3, 5), (4, 7), (5, 9)
iii. (x, y): (1, 10), (2, 8), (3, 6), (4, 4), (5, 2)
iv. (x, y): (2, 9), (3, 5), (4, 6), (5, 2), (6, 1)
v. (x, y): (-2, 4), (-1, 1), (0, 0), (1, 1), (2, 4)
Solution 1: (x, y): (1, 2), (2, 3), (3, 5), (4, 4), (5, 7)
The formula for finding the correlation coefficient is
\[ r_{xy} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sqrt{\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right]}} \]
x     y     x²    y²    xy
1     2     1     4     2
2     3     4     9     6
3     5     9     25    15
4     4     16    16    16
5     7     25    49    35
Σx = 15   Σy = 21   Σx² = 55   Σy² = 103   Σxy = 74
Therefore,
\[ r_{xy} = \frac{74 - \frac{15 \times 21}{5}}{\sqrt{\left[55 - \frac{15^2}{5}\right]\left[103 - \frac{21^2}{5}\right]}} \]
= 0.90
Comment: There exists a strong positive relationship between x and y.
Assignment Problem-2: The following table gives the ages and blood pressure of 10 women:
Age in years (x)      56   42   36   47   49   42   72   63   55   60
Blood pressure (y)   147  125  118  128  125  140  155  160  149  150
Draw a scatter diagram
Assignment Problem-3: The scores of 12 students in their mathematics and physics classes are:
Mathematics   2  3  4  4  5  6  6  7  7  8  10  10
Physics       1  3  2  4  4  4  6  4  6  7   9  10
Comment on the following:
(i) r = 0 indicates that the correlation coefficient between x and y is zero, i.e., there is no linear relationship.
(ii) r = -1 indicates a perfect negative correlation between x and y.
(iii) r = 1 indicates a perfect positive correlation between x and y.
(iv) r ≥ 1, i.e., r = 1 and r > 1: r = 1 is possible, but r > 1 is not possible, because the correlation coefficient lies between -1 and +1.
(v) r < -1 is not possible, because the correlation coefficient lies between -1 and +1.
Uses of correlation:
1. To find the relationship between two variables.
2. To find the relationship between a dependent variable and the combined influence of a group of independent variables.
3. To solve many problems in biology.
4. In social studies, such as the relationship between crime and education, correlation analysis has a definite role to play.
5. In economics it is used especially widely.
RANK CORRELATION
Rank correlation: In some situations it is difficult to measure the values of the variables from a bivariate distribution numerically, but they can be ranked. The correlation coefficient between these two sets of ranks is usually called the rank correlation coefficient, given by Spearman (1904). It is denoted by R. This is the only method for finding the relationship between two qualitative variables like beauty, honesty, intelligence, efficiency and so on.
When there are no ties, the formula for computing Spearman's rank correlation coefficient is
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} \]
where d is the difference between the two ranks of a pair and n is the number of pairs.
Remarks:
(ii) Like the simple correlation coefficient, the rank correlation coefficient lies between -1 and +1.
Note: For finding the rank correlation coefficient, we may have two types of data:
Actual observations are given
Ranks are given
If R = +1, there is complete agreement in the order of the ranks and the ranks are in the same direction.
If R = -1, there is complete disagreement in the order of the ranks and the ranks are in opposite directions.
Application Problem-1: Obtain the rank correlation coefficient for the following data:
A: 80 75 90 70 65 60
B: 65 70 60 75 85 80
Solution: Here the ranks of the scores are not given. Let us start ranking from the highest value for both variables, as shown in the table below:
A (x)   B (y)   Rank of x (R1)   Rank of y (R2)   d = R1 - R2   d²
80      65      2                5                -3            9
75      70      3                4                -1            1
90      60      1                6                -5            25
70      75      4                3                1             1
65      85      5                1                4             16
60      80      6                2                4             16
Total                                             Σd = 0        Σd² = 68
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 68}{6(6^2-1)} = -0.94 \]
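The ranking and the formula used in this solution can be sketched as follows. This is a minimal illustration assuming there are no tied values; the function names `ranks_descending` and `spearman_no_ties` are hypothetical and not from the original notes.

```python
def ranks_descending(values):
    """Rank 1 goes to the largest value. Assumes all values are distinct."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_no_ties(x, y):
    """R = 1 - 6*sum(d^2) / (n*(n^2 - 1)), valid only when there are no ties."""
    n = len(x)
    rank_x = ranks_descending(x)
    rank_y = ranks_descending(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

a = [80, 75, 90, 70, 65, 60]
b = [65, 70, 60, 75, 85, 80]
print(ranks_descending(a))               # [2, 3, 1, 4, 5, 6]
print(ranks_descending(b))               # [5, 4, 6, 3, 1, 2]
print(round(spearman_no_ties(a, b), 2))  # -0.94
```

Ranking from the lowest value instead would give the same R, because both sets of ranks are reversed together and the differences d only change sign.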
Application Problem-2: Obtain the rank correlation coefficient for the following data:
Examiner   A  B  C  D  E
I          1  2  3  4  5
II         2  3  1  5  4
Solution: Here the ranks of the scores are given:
Ranking by examiner-I: R1   Ranking by examiner-II: R2   d = R1 - R2   d²
1                           2                            -1            1
2                           3                            -1            1
3                           1                            2             4
4                           5                            -1            1
5                           4                            1             1
Total                                                    Σd = 0        Σd² = 8
\[ R = 1 - \frac{6\sum d^2}{n(n^2-1)} = 1 - \frac{6 \times 8}{5(5^2-1)} = 0.6 \]
Comment: There is a positive rank correlation between the rankings of the two examiners.
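When there are ties, the formula becomes
\[ R = 1 - \frac{6\left[\sum d^2 + \frac{1}{12}\left(m_1^3 - m_1\right) + \frac{1}{12}\left(m_2^3 - m_2\right) + \cdots\right]}{n(n^2-1)} \]
where m1, m2, ... are the numbers of observations sharing each tied rank.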
Application Problem-3: The following data refer to the marks obtained by 8 students in mathematics and statistics:
Marks in mathematics   20  80  40  12  28  20  15  60
Marks in statistics    30  60  20  30  50  30  40  10
Compute the rank correlation coefficient and comment.
Solution: Let the marks obtained in mathematics be x and the marks obtained in statistics be y.
x     y     Rank of x (R1)   Rank of y (R2)   d = R1 - R2   d²
20    30    3.5              4                -0.5          0.25
80    60    8                8                0             0
40    20    6                2                4             16
12    30    1                4                -3            9
28    50    5                7                -2            4
20    30    3.5              4                -0.5          0.25
15    40    2                6                -4            16
60    10    7                1                6             36
                                                            Σd² = 81.5
Here, m1 = 2, m2 = 3, n = 8
\[ R = 1 - \frac{6\left[81.5 + \frac{1}{12}\left(2^3 - 2\right) + \frac{1}{12}\left(3^3 - 3\right)\right]}{8(8^2-1)} = 0 \]
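The tie-corrected calculation above can be sketched in Python. This is an illustration only: the function names are hypothetical, tied values are given the average of the ranks they occupy (ranking from the lowest value, as in the table), and the last statistics mark is taken as 10, the value used in the solution table.

```python
from collections import Counter

def average_ranks(values):
    """Rank 1 goes to the smallest value; tied values share the mean of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(values):
        first = ordered.index(v) + 1            # first rank position occupied by v
        count = ordered.count(v)                # how many observations share v
        rank_of[v] = first + (count - 1) / 2    # mean of first .. first + count - 1
    return [rank_of[v] for v in values]

def spearman_with_ties(x, y):
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    # One correction term (m^3 - m)/12 for every group of m tied values
    correction = sum((m ** 3 - m) / 12
                     for data in (x, y)
                     for m in Counter(data).values() if m > 1)
    return 1 - 6 * (sum_d2 + correction) / (n * (n ** 2 - 1))

maths = [20, 80, 40, 12, 28, 20, 15, 60]
stats = [30, 60, 20, 30, 50, 30, 40, 10]
print(average_ranks(maths))   # [3.5, 8.0, 6.0, 1.0, 5.0, 3.5, 2.0, 7.0]
print(spearman_with_ties(maths, stats))
```

Here the sum of squared differences is 81.5 and the two correction terms are 0.5 and 2, so the bracket equals 84 and R works out to zero.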
Assignment problem-4:
Profit (Tk.Crore):x 25 28 27 33 31 10 16 16 18 23
(ii) Calculate Karl Pearson’s and Spearman rank correlation coefficients and comment.
Assignment problem-5:
The following figures relate to advertisement expenditure and sales of a company:
Adv. Exp. (Tk. Lac)   62  67  73  78  85  78  91  92  96  98
Sales (Tk. Crore)     11  13  17  18  21  24  21  27  26  21
Coefficient and comment.
Website:
http://www.pindling.org/Math/Statistics/Textbook/Examples/Chapter3/chapter3_examples.htm
REGRESSION ANALYSIS
What is regression?
Ans: The probable movement of one variable in terms of the other variables is called
regression.
In other words the statistical technique by which we can estimate the unknown value of
one variable (dependent) from the known value of another variable is called regression.
The term "regression" was used by the famous biometrician Sir F. Galton (1822-1911) in 1877.
Regression analysis.
Ans: Regression analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data.
Regression coefficient.
Ans: The mathematical measures of regression are called the coefficient of regression.
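Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the regression coefficient of y on x is denoted by byx and defined by
\[ b_{yx} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \]
Similarly, the regression coefficient of x on y is denoted by bxy and defined by
\[ b_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2} \]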
Regression lines:
If we consider two variables X and Y, we shall have two regression lines: the regression line of Y on X and the regression line of X on Y. The regression line of Y on X gives the most probable values of Y for given values of X, and the regression line of X on Y gives the most probable values of X for given values of Y. Thus we have two regression lines. However, when there is either perfect positive or perfect negative correlation between the two variables, the two regression lines will coincide, i.e., we will have one line.
Regression equation:
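The regression equation of y on x is expressed as follows:
\[ \hat{y} = a + bx \]
where y is the dependent variable, x is the independent variable, a is the intercept term (assumed mean) and b is the slope of the line.
Here,
\[ a = \bar{y} - b\bar{x} = \frac{\sum y_i}{n} - b\,\frac{\sum x_i}{n} \]
and
\[ b = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} \]
Similarly, the regression equation of x on y is expressed as
\[ \hat{x} = a + by \]
Here, \(a = \bar{x} - b\bar{y}\)
and
\[ b = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum y_i^2 - \frac{(\sum y_i)^2}{n}} \]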
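The two regression equations can be sketched with one helper, since the line of x on y uses the same formulas with the roles of the variables swapped. This is a minimal illustration: `fit_line` is a hypothetical name, and the data are the linear example Y = 3 + 2X given earlier in this chapter.

```python
def fit_line(x, y):
    """Regression of y on x. Returns (a, b) such that y_hat = a + b*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)  # slope
    a = sum_y / n - b * sum_x / n                                  # intercept
    return a, b

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [5, 7, 9, 11, 13, 15, 17, 19]

a_yx, b_yx = fit_line(x, y)   # regression of y on x
a_xy, b_xy = fit_line(y, x)   # regression of x on y: swap the arguments
print(a_yx, b_yx)
print(a_xy, b_xy)
```

Because these points lie exactly on a straight line, the first fit recovers intercept 3 and slope 2, and the second fit is the same line solved for x (intercept -1.5, slope 0.5), which agrees with the remark that the two regression lines coincide under perfect correlation.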
Ans: 1. The regression coefficient is independent of change of origin but not of scale.
4. The geometric mean of the regression coefficients is equal to the correlation coefficient, i.e., rxy = ±√(byx · bxy).
5. The arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient, i.e., (byx + bxy)/2 ≥ rxy.
6. If one of the regression coefficients is greater than unity, the other must be less than unity.
Coefficient of Determination, r² or R²:
It is a measure that allows us to determine how certain one can be in making predictions from a certain model/graph. The coefficient of determination is the ratio of the explained variation to the total variation.
The coefficient of determination is such that 0 ≤ r² ≤ 1, and denotes the strength of the linear association between x and y.
The coefficient of determination represents the proportion of the variation in y that is accounted for by the line of best fit. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
The coefficient of determination is a measure of how well the regression line represents the data. If the regression line passes exactly through every point on the scatter plot, it would be able to explain all of the variation. The further the line is away from the points, the less it is able to explain.
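The ratio of explained variation to total variation can be computed directly from a fitted line. The sketch below is illustrative: the function name `r_squared` is an assumption, and the data are the first set of pairs from Assignment problem-1 in the correlation part of this chapter.

```python
def r_squared(x, y):
    """Explained variation divided by total variation for the line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx                 # slope of the regression line of y on x
    a = y_bar - b * x_bar           # intercept
    fitted = [a + b * xi for xi in x]
    explained = sum((f - y_bar) ** 2 for f in fitted)   # variation captured by the line
    total = sum((yi - y_bar) ** 2 for yi in y)          # total variation in y
    return explained / total

x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 4, 7]
print(round(r_squared(x, y), 2))  # 0.82
```

For these pairs the explained variation is 12.1 out of a total of 14.8, so about 82% of the variation in y is explained, which matches the square of the correlation coefficient of about 0.90 for the same data.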
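Show that the correlation coefficient is the geometric mean of the regression coefficients, i.e., rxy = ±√(byx · bxy).
Proof: Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the correlation coefficient between x and y is denoted by rxy and defined as
\[ r_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \qquad (1) \]
Again, the regression coefficient of y on x is
\[ b_{yx} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} \]
Again, the regression coefficient of x on y is
\[ b_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2} \]
Multiplying,
\[ b_{yx} \times b_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}) \, \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2 \, \sum_{i=1}^{n}(y_i-\bar{y})^2} \]
Taking the square root, with the sign of the regression coefficients,
\[ \pm\sqrt{b_{yx}\, b_{xy}} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} = r_{xy} \quad \text{(proved)} \]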
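The identity just proved can be checked numerically. In this sketch the variable names are assumptions, the data are the fourth set of pairs from Assignment problem-1 in the correlation part of this chapter, and `math.copysign` is used to give the square root the sign of the regression coefficients.

```python
import math

x = [2, 3, 4, 5, 6]
y = [9, 5, 6, 2, 1]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

b_yx = s_xy / s_xx                        # regression coefficient of y on x
b_xy = s_xy / s_yy                        # regression coefficient of x on y
r_direct = s_xy / math.sqrt(s_xx * s_yy)  # correlation coefficient from its definition

# Geometric mean of the regression coefficients, carrying their common sign
r_from_b = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

print(round(r_direct, 4), round(r_from_b, 4))
```

Both regression coefficients are negative here, so the negative root is taken and the two printed values agree.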
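The arithmetic mean of the two regression coefficients is greater than or equal to the correlation coefficient, i.e.,
\[ \frac{b_{yx} + b_{xy}}{2} \ge r_{xy} \]
Proof: Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of observations. Then the regression coefficient of y on x is denoted by byx and the regression coefficient of x on y is denoted by bxy.
The arithmetic mean of byx and bxy is
\[ \text{A.M} = \frac{b_{yx} + b_{xy}}{2} \]
and the geometric mean is
\[ \text{G.M} = \sqrt{b_{yx}\, b_{xy}}, \quad \text{i.e., } r_{xy} = \sqrt{b_{yx}\, b_{xy}} \]
We know that A.M ≥ G.M,
or,
\[ \frac{b_{yx} + b_{xy}}{2} \ge \sqrt{b_{yx}\, b_{xy}} \]
or,
\[ \frac{b_{yx} + b_{xy}}{2} \ge r \quad \text{(proved)} \]
Note that A.M ≥ G.M applies to non-negative numbers; when both regression coefficients are negative, the inequality holds for the absolute values.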
Uses of regression.
Ans: (i) To find whether a relationship exists or not.
(ii) To find the strength of the relationship.
Distinguish between correlation coefficient and regression coefficient.
Correlation coefficient:
1. The numerical value by which we measure the strength of linear relationship between two or more variables is called correlation coefficient.
2. Correlation coefficient is independent of change of origin and scale of measurement.
6. When r = 0, the variables are uncorrelated.
Regression coefficient:
1. The mathematical measures of regression are called the coefficient of regression.
2. Regression coefficient is independent of change of origin but not of scale.
6. When r = 0, the two lines of regression are perpendicular to each other.
Application problem-1: A researcher wants to find out if there is any relationship between the ages of husbands and the ages of wives. In other words, do old husbands have old wives and young husbands have young wives? He took a random sample of 7 couples whose respective ages are given below:
(d) Compute the value of correlation coefficient with the help of regression coefficients.
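Solution: Let x be the age of the husband and y the age of the wife. The equation of the best-fitted regression line of y on x is ŷ = a + bx,
where
\[ b = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} \quad \text{and} \quad a = \bar{y} - b\bar{x} \]
Computation table
x     y     x²     y²     xy
39    37    1521   1369   1443
25    18    625    324    450
29    20    841    400    580
35    25    1225   625    875
32    25    1024   625    800
27    20    729    400    540
37    30    1369   900    1110
Σx = 224   Σy = 175   Σx² = 7334   Σy² = 4643   Σxy = 5798
(a) Here,
\[ b = \frac{5798 - \frac{224 \times 175}{7}}{7334 - \frac{224^2}{7}} = 1.193 \]
and
\[ a = \bar{y} - b\bar{x} = \frac{\sum y}{n} - b\,\frac{\sum x}{n} = \frac{175}{7} - (1.193)\frac{224}{7} = 25 - 38.176 = -13.176 \]
Therefore, ŷ = -13.176 + 1.193x.
(b) Hence, if the age of the husband is 45, the probable age of the wife would be
ŷ = -13.176 + 1.193x = -13.176 + 1.193 × 45 = 40.51 years.
(c) The equation of the best-fitted regression line of x on y is x̂ = a + by,
where
\[ b = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum y_i^2 - \frac{(\sum y_i)^2}{n}} = \frac{5798 - \frac{224 \times 175}{7}}{4643 - \frac{175^2}{7}} = 0.739 \]
and
\[ a = \bar{x} - b\bar{y} = \frac{\sum x}{n} - b\,\frac{\sum y}{n} = \frac{224}{7} - 0.739 \times \frac{175}{7} = 13.525 \]
Therefore, x̂ = 13.525 + 0.739y.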
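The working of parts (a) to (c) can be re-checked from the column totals alone, and the same totals give the correlation coefficient asked for in part (d). This is a sketch under the assumption that x is the husband's age and y the wife's age, as part (b) implies; the variable names are illustrative.

```python
import math

# Column totals from the computation table (n = 7 couples)
n = 7
sum_x, sum_y = 224, 175
sum_x2, sum_y2, sum_xy = 7334, 4643, 5798

s_xy = sum_xy - sum_x * sum_y / n           # shared numerator of both slopes

b_yx = s_xy / (sum_x2 - sum_x ** 2 / n)     # slope of y on x
a_yx = sum_y / n - b_yx * sum_x / n         # intercept of y on x

b_xy = s_xy / (sum_y2 - sum_y ** 2 / n)     # slope of x on y
a_xy = sum_x / n - b_xy * sum_y / n         # intercept of x on y

wife_age_at_45 = a_yx + b_yx * 45           # part (b)
r = math.sqrt(b_yx * b_xy)                  # part (d): positive root, both slopes positive

print(round(b_yx, 3), round(b_xy, 3))
print(round(wife_age_at_45, 2), round(r, 3))
```

The intercepts differ slightly in the third decimal from the hand calculation because the code carries the unrounded slope, while the hand calculation rounds b to three decimals first.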
Application problem-2: A research physician recorded the pulse rates and the temperatures of water submerging the faces of ten small children in cold water to control abnormally rapid heartbeats. The results are presented in the following table. Calculate the correlation coefficient and regression coefficients between temperature of water and reduction in pulse rate.
Temperature of water      68  65  70  62  60  55  58  65  69  63
Reduction in pulse rate    2   5   1  10   9  13  10   3   4   6
Also show that (i) (byx + bxy)/2 ≥ rxy
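Solution: Calculating table of correlation coefficient and regression coefficients.
Temperature (x)   Reduction (y)   x²      y²     xy
68                2               4624    4      136
65                5               4225    25     325
70                1               4900    1      70
62                10              3844    100    620
60                9               3600    81     540
55                13              3025    169    715
58                10              3364    100    580
65                3               4225    9      195
69                4               4761    16     276
63                6               3969    36     378
Σx = 635          Σy = 63         Σx² = 40537   Σy² = 541   Σxy = 3835
We know,
\[ r_{xy} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sqrt{\left[\sum x_i^2 - \frac{(\sum x_i)^2}{n}\right]\left[\sum y_i^2 - \frac{(\sum y_i)^2}{n}\right]}} = \frac{3835 - \frac{635 \times 63}{10}}{\sqrt{\left[40537 - \frac{635^2}{10}\right]\left[541 - \frac{63^2}{10}\right]}} = -0.94 \]
We know, the regression coefficient of y on x is
\[ b_{yx} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum x_i^2 - \frac{(\sum x_i)^2}{n}} = \frac{3835 - \frac{635 \times 63}{10}}{40537 - \frac{635^2}{10}} = \frac{-165.5}{214.5} = -0.77 \]
Again, the regression coefficient of x on y is
\[ b_{xy} = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(y_i-\bar{y})^2} = \frac{\sum x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum y_i^2 - \frac{(\sum y_i)^2}{n}} = \frac{3835 - \frac{635 \times 63}{10}}{541 - \frac{63^2}{10}} = \frac{-165.5}{144.1} = -1.15 \]
(i) The arithmetic mean of the regression coefficients is
\[ \frac{b_{yx} + b_{xy}}{2} = \frac{-0.77 - 1.15}{2} = -0.96 \]
Since both regression coefficients are negative, the comparison is made on absolute values: 0.96 ≥ 0.94, i.e., |(byx + bxy)/2| ≥ |rxy|.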
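The regression coefficients and the comparison in part (i) can be checked with a short script. This is a sketch only; the variable names are assumptions, and the comparison is done on absolute values because both regression coefficients are negative for this data.

```python
import math

temperature = [68, 65, 70, 62, 60, 55, 58, 65, 69, 63]   # x
reduction = [2, 5, 1, 10, 9, 13, 10, 3, 4, 6]            # y

n = len(temperature)
sum_x, sum_y = sum(temperature), sum(reduction)
s_xy = sum(a * b for a, b in zip(temperature, reduction)) - sum_x * sum_y / n
s_xx = sum(a * a for a in temperature) - sum_x ** 2 / n
s_yy = sum(b * b for b in reduction) - sum_y ** 2 / n

b_yx = s_xy / s_xx     # regression coefficient of y on x
b_xy = s_xy / s_yy     # regression coefficient of x on y

# Correlation coefficient as the geometric mean, with the sign of the coefficients
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
mean_b = (b_yx + b_xy) / 2

print(round(b_yx, 2), round(b_xy, 2), round(r, 2))   # -0.77 -1.15 -0.94
print(abs(mean_b) >= abs(r))                         # True
```

The signed arithmetic mean is about -0.96, which is smaller than r = -0.94, so the inequality is meaningful here only in magnitude.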
Assignment Problem-1: The following data give the test scores and sales made by nine salesmen of a big departmental store during the last year:
Test scores (y)            14  19  24  21  26  22  15  20  19
Sales in lakh Taka (x)     31  36  48  37  50  45  33  41  39
(a) Find the regression equation of test scores on sales.
Ans: ŷ = -2.4 + 0.56x
(b) Find the test score when the sale is Tk. 40 lakh.
Ans: 20
(c) Find the regression equation of sales on test scores.
Ans: x̂ = 7.8 + 1.61y
(d) Predict the value of sales if the test score is 30.
Ans: 56.1 lakh
(e) Compute the value of the correlation coefficient with the help of the regression coefficients.
Assignment Problem-2: The following table gives the ages and blood pressure of 10 women:
Age in years (x)      56   42   36   47   49   42   72   63   55   60
Blood pressure (y)   147  125  118  128  125  140  155  160  149  150
(i) Obtain the regression line of y on x. Ans: ŷ = 83.76 + 1.11x
(ii) Estimate the blood pressure of a woman whose age is 50 years. Ans: 139.26
Assignment Problem-3: Consider the following data set on two variables x and y:
x: 1 2 3 4 5 6
y: 6 4 3 5 4 2
Assignment Problem-4: Cost accountants often estimate overhead based on production. At the Standard Knitting Company, they have collected information on overhead expenses and units produced at different plants and want to estimate a regression equation to predict future overhead.
Units 56 40 48 30 41 42 55 35
(i) Draw a scatter diagram and comment.
(iii) Estimate overhead when 65 units are produced.
Assignment Problem-5: The following data refer to information about annual sales
Year of experience 7 4 5 6 11 12 13 17
Final examination
SPRING-14(CSE)
(b) The following data give the hardness (X) and tensile strength (Y) of 7 samples of metal in certain units.
X  146  152  158  164  170  176  182
Y   75   78   77   89   82   85   86
(i) Obtain the regression equation of y on x. (ii) Estimate the value of y when x is 69.
Spring-2012 (EEE)
(a) What is regression analysis? Write down the properties of regression coefficient.
(b) Prove that the correlation coefficient is the geometric mean of the regression coefficients.
(c) The regression coefficient of y on x is 0.5 and that of x on y is 1.9. Find the coefficient of correlation and also show that (byx + bxy)/2 ≥ rxy.
Spring-2012 (ETE)
young husbands have young wives? He took a random sample of 7 couples whose
respective ages are given below:
(iii) Compute the value of correlation coefficient with the help of regression
coefficients.
Autumn-13(CSE)
(a) What are regression coefficients? Point out the properties of regression coefficients.
(b) The following data give the hardness (X) and tensile strength (Y) of 7 samples of metal in certain units.
X  146  152  158  164  170  176  182
Y   75   78   77   89   82   85   86
(ii) Estimate x when y is 79.
(iii)
Spring-13(CSE)
(b) A researcher wants to find out if there is any relationship between the heights of the
sons and the heights of the fathers. He took a random sample of six fathers and their six
sons. Their heights in inches are given below:
Height of father(In inches): y 68 63 66 67 65 67
(i) Fit a regression line of the height of father y on the height of son x.
(ii) Predict the height of father if son’s height is 65 inches.
(c) The regression coefficient of y on x is -0.8 and that of x on y is -0.6. Find the
coefficient of correlation and comment.
“THE END”