Linear Regression-2: Prof. Asim Tewari IIT Bombay
Asim Tewari, IIT Bombay ME 781: Statistical Machine Learning and Data Mining
Simple Linear Regression
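It assumes that there is approximately a linear
relationship between X and Y:
$Y \approx \beta_0 + \beta_1 X$
or, writing the error term explicitly,
$Y = \beta_0 + \beta_1 X + \epsilon$
β0 and β1 are the intercept and the slope, known as the
model coefficients or parameters.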
Estimating the Coefficients
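• Least squares approach
The least squares approach chooses the parameters
to minimize the residual sum of squares (RSS):
$\mathrm{RSS} = e_1^2 + e_2^2 + \cdots + e_n^2$
where $e_i = y_i - \hat{y}_i$ represents the ith residual
and $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ is the prediction for the ith observation.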
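As a minimal sketch of the least squares approach, the RSS has a well-known closed-form minimizer: the slope estimate is the sum of (xi − x̄)(yi − ȳ) divided by the sum of (xi − x̄)², and the intercept estimate is ȳ minus the slope estimate times x̄. The function names and the toy data below are hypothetical, chosen only for illustration.

```python
def fit_simple_linear_regression(x, y):
    """Return (beta0_hat, beta1_hat) that minimize the residual sum of squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1_hat = sxy / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    return beta0_hat, beta1_hat


def residual_sum_of_squares(x, y, beta0_hat, beta1_hat):
    """RSS = sum of squared residuals e_i = y_i - (beta0_hat + beta1_hat * x_i)."""
    return sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta0_hat, beta1_hat = fit_simple_linear_regression(x, y)
rss = residual_sum_of_squares(x, y, beta0_hat, beta1_hat)
print(beta0_hat, beta1_hat, rss)
```

Any other choice of intercept and slope gives an RSS at least as large on the same data, which is one way to sanity-check the fit.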
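Assessing the Accuracy of the Coefficient Estimates
Standard errors associated with the coefficients:
$\mathrm{SE}(\hat{\beta}_0)^2 = \sigma^2 \left[ \frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right]$
$\mathrm{SE}(\hat{\beta}_1)^2 = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
where $\sigma^2 = \mathrm{Var}(\epsilon)$. In general σ² is not known, but it can be
estimated from the data.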
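A minimal sketch of how these standard errors might be computed from data. It assumes the usual formulas for simple linear regression, with the unknown error variance estimated by RSS/(n − 2). The function name and the toy data are hypothetical.

```python
import math


def coefficient_standard_errors(x, y):
    """Return (beta0_hat, beta1_hat, se_beta0, se_beta1) for a least squares line."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least 3 paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    beta1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    rss = sum((yi - (beta0_hat + beta1_hat * xi)) ** 2 for xi, yi in zip(x, y))
    # Assumption: sigma^2 is unknown and is estimated by RSS / (n - 2).
    sigma2_hat = rss / (n - 2)
    se_beta0 = math.sqrt(sigma2_hat * (1.0 / n + x_bar ** 2 / sxx))
    se_beta1 = math.sqrt(sigma2_hat / sxx)
    return beta0_hat, beta1_hat, se_beta0, se_beta1


# Hypothetical toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta0_hat, beta1_hat, se_beta0, se_beta1 = coefficient_standard_errors(x, y)
print(se_beta0, se_beta1)
```

The slope's standard error shrinks as the x values become more spread out, because the sum of squared deviations of x appears in the denominator.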
Hypothesis tests on the coefficients
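The most common test in this setting, assumed here to be the one intended, is the null hypothesis of no relationship between X and Y against the two-sided alternative. With β1 the slope and SE(β̂1) its standard error, the hypotheses and the test statistic can be written as:

```latex
\begin{aligned}
H_0 &: \beta_1 = 0 \quad \text{(there is no relationship between } X \text{ and } Y\text{)} \\
H_a &: \beta_1 \neq 0 \quad \text{(there is some relationship between } X \text{ and } Y\text{)} \\
t &= \frac{\hat{\beta}_1 - 0}{\mathrm{SE}(\hat{\beta}_1)}
\end{aligned}
```

The statistic t measures how many standard errors the estimated slope lies away from 0. Under the null hypothesis, and the usual assumptions on the errors, it follows a t-distribution with n−2 degrees of freedom.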
The p-value is defined as the probability of observing
any value equal to |t| or larger in absolute value,
for a t-distribution with n−2 degrees of freedom,
assuming β1 = 0.
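A minimal sketch of this definition, using only the Python standard library. It integrates the t-distribution density numerically with Simpson's rule instead of calling a statistics library, so it is an approximation, and the function names are hypothetical. In practice a library routine such as scipy.stats.t.sf would normally be preferred.

```python
import math


def t_density(u, df):
    """Density of Student's t-distribution with df degrees of freedom at u."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(u * u / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """Approximate P(|T| >= |t_stat|) for T ~ t(df), with df = n - 2 here."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_density(0.0, df) + t_density(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_density(k * h, df)
    central_mass = total * h / 3  # approximates P(0 <= T <= |t_stat|)
    # By symmetry, P(|T| >= |t_stat|) = 1 - 2 * P(0 <= T <= |t_stat|).
    return max(0.0, 1.0 - 2.0 * central_mass)


# Hypothetical example: t = 2.5 from a fit on n = 30 observations.
print(two_sided_p_value(2.5, df=30 - 2))
```

For very large |t| the true p-value is far below floating-point resolution of this subtraction, so the sketch simply returns a value at or near 0.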
• The p-value represents the chance of seeing results
at least as extreme as yours if there were no real
association (i.e. if they happened by chance alone).
For the Advertising data, the least squares fit for the regression of sales onto TV is shown.
The fit is found by minimizing the sum of squared errors. Each grey line segment represents
an error, and the fit makes a compromise by averaging their squares. In this case a linear fit
captures the essence of the relationship, although it is somewhat deficient in the left of the
plot.
P-value
For the Advertising data, coefficients of the least squares model for the regression of number
of units sold on TV advertising budget. An increase of $1,000 in the TV advertising budget is
associated with an increase in sales by around 50 units (Recall that the sales variable is in
thousands of units, and the TV variable is in thousands of dollars).
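The unit conversion behind this statement can be checked with a short calculation. This is only a sketch: the slope value 0.0475 is an assumption (the coefficient itself is not stated in the text), and the function name is hypothetical.

```python
def extra_units_sold(slope, budget_increase_dollars):
    """Convert a slope in (thousands of units) per (thousand dollars) into units.

    slope: estimated change in sales, in thousands of units, for each
           additional $1,000 of TV advertising budget.
    """
    budget_increase_thousands = budget_increase_dollars / 1000.0
    extra_sales_thousands = slope * budget_increase_thousands
    return extra_sales_thousands * 1000.0


assumed_slope = 0.0475  # assumed value for illustration, not taken from the text
print(extra_units_sold(assumed_slope, 1000))
```

With this assumed slope, a $1,000 increase corresponds to roughly 47.5 additional units, in line with the "around 50 units" quoted above.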
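SE of the mean of a random variable (RV)
$\mathrm{Var}(\hat{\mu}) = \mathrm{SE}(\hat{\mu})^2 = \frac{\sigma^2}{n}$
where $\hat{\mu}$ is the sample mean, σ is the standard deviation of each
observation and n is the number of independent observations.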
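A minimal sketch of the standard error of a sample mean, assuming independent observations and estimating the unknown standard deviation by the sample standard deviation (n − 1 in the denominator). The function name and the sample values are hypothetical.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Estimate SE(mean) = s / sqrt(n), with s the sample standard deviation."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least 2 observations")
    return statistics.stdev(sample) / math.sqrt(n)


# Hypothetical sample, made up for illustration only.
sample = [4.0, 6.0, 8.0, 10.0]
print(statistics.mean(sample), standard_error_of_mean(sample))
```

Because of the division by the square root of n, quadrupling the number of observations roughly halves the standard error, provided the spread of the data stays the same.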
Assessing the Accuracy of the Model
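Residual Standard Error (RSE)
$\mathrm{RSE} = \sqrt{\frac{1}{n-2}\,\mathrm{RSS}} = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}$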
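A minimal sketch of the RSE computation, assuming the usual definition for simple linear regression: the square root of RSS divided by n − 2. The function name, the observations and the fitted values below are hypothetical.

```python
import math


def residual_standard_error(y, y_hat):
    """RSE = sqrt(RSS / (n - 2)) for observed values y and fitted values y_hat."""
    n = len(y)
    if n != len(y_hat) or n < 3:
        raise ValueError("need at least 3 paired observed and fitted values")
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(rss / (n - 2))


# Hypothetical observations and fitted values, made up for illustration only.
y = [2.1, 3.9, 6.2, 7.8, 10.1]
y_hat = [2.04, 4.03, 6.02, 8.01, 10.0]
print(residual_standard_error(y, y_hat))
```

The RSE is expressed in the units of Y, so it can be read as a typical size of the deviation of an observation from the fitted line.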