Correlation and Regression Analysis

Regression Analysis
MIT 18.S096, Lecture 6, Dr. Kempthorne, Fall 2013

Outline

1. Regression Analysis
   - Linear Regression: Overview
   - Ordinary Least Squares (OLS)
   - Gauss-Markov Theorem
   - Generalized Least Squares (GLS)
   - Distribution Theory: Normal Regression Models
   - Maximum Likelihood Estimation
   - Generalized M Estimation

Multiple Linear Regression: Setup

Data set of n cases, i = 1, 2, ..., n:
- 1 response (dependent) variable: y_i, i = 1, 2, ..., n
- p explanatory (independent) variables: x_i = (x_{i,1}, x_{i,2}, ..., x_{i,p})^T, i = 1, 2, ..., n

Goal of regression analysis: extract/exploit the relationship between y_i and x_i. Examples:
- Prediction
- Causal inference
- Approximation
- Functional relationships

General Linear Model

For each case i, the conditional distribution [y_i | x_i] is given by

    y_i = \hat{y}_i + \epsilon_i,   where   \hat{y}_i = \beta_1 x_{i,1} + \beta_2 x_{i,2} + ... + \beta_p x_{i,p},

- \beta = (\beta_1, \beta_2, ..., \beta_p)^T are the p regression parameters (constant over all cases);
- \epsilon_i is the residual (error) variable (varies over all cases).

There is extensive breadth in the possible models:
- Polynomial approximation: x_{i,j} = (x_i)^j, i.e., the explanatory variables are different powers of the same variable x_i.
- Fourier series: x_{i,j} = sin(j x_i) or cos(j x_i), i.e., the explanatory variables are different sine/cosine terms of a Fourier series expansion.
- Time series regressions: time is indexed by i, and the explanatory variables include lagged response values.

Note: linearity of \hat{y}_i (in the regression parameters) is maintained even when the explanatory variables are non-linear in x_i; a short design-matrix sketch follows below.
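The following is a minimal, hypothetical sketch (not from the lecture; the names x, X_poly, and X_fourier are invented) of how the polynomial and Fourier families above reduce to the same linear-in-\beta form through the choice of design matrix:

```python
import numpy as np

# Columns of each design matrix are non-linear in x_i, but the model
# y ~ X @ beta stays linear in the parameters beta.

x = np.linspace(0.0, 2.0 * np.pi, 50)      # one underlying variable x_i

# Polynomial approximation: x_{i,j} = (x_i)^j for j = 1, ..., p
p = 3
X_poly = np.column_stack([x**j for j in range(1, p + 1)])      # shape (50, 3)

# Fourier series: columns sin(j x_i) and cos(j x_i) for j = 1, ..., J
J = 2
X_fourier = np.column_stack(
    [np.sin(j * x) for j in range(1, J + 1)]
    + [np.cos(j * x) for j in range(1, J + 1)]
)                                                              # shape (50, 4)
```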
Steps for Fitting a Model

(1) Propose a model in terms of:
    - the response variable Y (specify the scale),
    - the explanatory variables X_1, X_2, ..., X_p (include different functions of the explanatory variables if appropriate),
    - assumptions about the distribution of \epsilon over the cases.
(2) Specify/define a criterion for judging different estimators.
(3) Characterize the best estimator and apply it to the given data.
(4) Check the assumptions in (1).
(5) If necessary, modify the model and/or assumptions and go to (1).

Specifying Assumptions in (1) for the Residual Distribution

- Gauss-Markov: zero mean, constant variance, uncorrelated.
- Normal-linear models: the \epsilon_i are i.i.d. N(0, \sigma^2) random variables.
- Generalized Gauss-Markov: zero mean and a general covariance matrix (possibly correlated, possibly heteroscedastic).
- Non-normal/non-Gaussian distributions (e.g., Laplace, Pareto, contaminated normal: some fraction (1 - \delta) of the \epsilon_i are i.i.d. N(0, \sigma^2) random variables; the remaining fraction (\delta) follows some contamination distribution).

Specifying the Estimator Criterion in (2)

- Least squares
- Maximum likelihood
- Robust (contamination-resistant)
- Bayes (assume the \beta_j are random variables with a known prior distribution)
- Accommodating incomplete/missing data

Case Analyses for (4): Checking Assumptions

- Residual analysis: the model errors \epsilon_i are unobservable; the model residuals for fitted regression parameters \hat{\beta}_j are

      e_i = y_i - (\hat{\beta}_1 x_{i,1} + \hat{\beta}_2 x_{i,2} + ... + \hat{\beta}_p x_{i,p}).

- Influence diagnostics (identify cases which are highly "influential").
- Outlier detection.

Ordinary Least Squares Estimates

Least squares criterion: for \beta = (\beta_1, \beta_2, ..., \beta_p)^T, define

    Q(\beta) = \sum_{i=1}^n (y_i - \hat{y}_i)^2
             = \sum_{i=1}^n [y_i - (\beta_1 x_{i,1} + \beta_2 x_{i,2} + ... + \beta_p x_{i,p})]^2.

The ordinary least squares (OLS) estimate \hat{\beta} minimizes Q(\beta).

Matrix notation:

    y = (y_1, y_2, ..., y_n)^T,   \beta = (\beta_1, ..., \beta_p)^T,

    X = [ x_{1,1}  x_{1,2}  ...  x_{1,p}
          x_{2,1}  x_{2,2}  ...  x_{2,p}
            ...
          x_{n,1}  x_{n,2}  ...  x_{n,p} ]   (an n x p matrix).

Solving for the OLS Estimate

With \hat{y} = X\beta,

    Q(\beta) = \sum_{i=1}^n (y_i - \hat{y}_i)^2 = (y - X\beta)^T (y - X\beta).

The OLS estimate \hat{\beta} solves \partial Q(\beta)/\partial \beta_j = 0 for j = 1, 2, ..., p:

    \partial Q(\beta)/\partial \beta_j
      = \partial/\partial \beta_j \sum_{i=1}^n [y_i - (x_{i,1}\beta_1 + x_{i,2}\beta_2 + ... + x_{i,p}\beta_p)]^2
      = \sum_{i=1}^n 2(-x_{i,j}) [y_i - (x_{i,1}\beta_1 + ... + x_{i,p}\beta_p)]
      = -2 X_{[j]}^T (y - X\beta),

where X_{[j]} is the jth column of X. Stacking these partial derivatives,

    \partial Q/\partial \beta = -2 X^T (y - X\beta).

So the OLS estimate \hat{\beta} solves the "normal equations"

    X^T (y - X\hat{\beta}) = 0
      <=>  X^T X \hat{\beta} = X^T y
      =>   \hat{\beta} = (X^T X)^{-1} X^T y.

N.B. For \hat{\beta} to exist (uniquely), (X^T X) must be invertible, which holds if and only if X has full column rank.
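As a concrete illustration (simulated data, not part of the lecture), the normal-equations solution agrees with a standard least-squares routine:

```python
import numpy as np

# Simulate a small regression problem.
rng = np.random.default_rng(0)
n, p = 100, 3
X = rng.normal(size=(n, p))                 # design matrix, full column rank
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Normal-equations solution beta_hat = (X^T X)^{-1} X^T y;
# np.linalg.solve avoids forming the explicit inverse.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# In practice, an orthogonalization-based solver is numerically preferable:
beta_hat_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta_hat, beta_hat_lstsq)
```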
(Ordinary) Least Squares Fit

OLS estimate:

    \hat{\beta} = (\hat{\beta}_1, \hat{\beta}_2, ..., \hat{\beta}_p)^T = (X^T X)^{-1} X^T y.

Fitted values:

    \hat{y} = (\hat{y}_1, ..., \hat{y}_n)^T,   with   \hat{y}_i = x_{i,1}\hat{\beta}_1 + ... + x_{i,p}\hat{\beta}_p,

so that

    \hat{y} = X\hat{\beta} = X (X^T X)^{-1} X^T y = H y,

where H = X (X^T X)^{-1} X^T is the n x n "hat matrix". The hat matrix H projects R^n onto the column space of X.

Residuals: \hat{\epsilon}_i = y_i - \hat{y}_i, i = 1, 2, ..., n; in vector form,

    \hat{\epsilon} = (\hat{\epsilon}_1, ..., \hat{\epsilon}_n)^T = y - \hat{y} = (I_n - H) y.

Normal equations: X^T (y - X\hat{\beta}) = X^T \hat{\epsilon} = 0_p.

N.B. The least squares residual vector \hat{\epsilon} is orthogonal to the column space of X; a numerical check of these projection facts follows below.
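A small numerical check (illustrative, with simulated data as before) that H is an orthogonal projection and that the residuals are orthogonal to the columns of X:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix H = X (X^T X)^{-1} X^T
y_hat = H @ y                               # fitted values
resid = y - y_hat                           # residuals (I_n - H) y

assert np.allclose(H @ H, H)                # H is idempotent (a projection)
assert np.allclose(H, H.T)                  # and symmetric (orthogonal projection)
assert np.allclose(X.T @ resid, 0.0)        # residuals orthogonal to col(X)
```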
Gauss-Markov Theorem: Assumptions

The data y = (y_1, ..., y_n)^T and X (as above) follow a linear model satisfying the Gauss-Markov assumptions if y is an observation of a random vector Y = (Y_1, Y_2, ..., Y_n)^T such that

    E(Y | X, \beta) = X\beta,   where \beta = (\beta_1, \beta_2, ..., \beta_p)^T is the p-vector of regression parameters, and
    Cov(Y | X, \beta) = \sigma^2 I_n,   for some \sigma^2 > 0.

I.e., the random variables generating the observations are uncorrelated and have constant variance \sigma^2 (conditional on X and \beta).

Gauss-Markov Theorem

For known constants c_1, c_2, ..., c_p, c_{p+1}, consider the problem of estimating

    \theta = c_1 \beta_1 + c_2 \beta_2 + ... + c_p \beta_p + c_{p+1}.

Under the Gauss-Markov assumptions, the estimator

    \hat{\theta} = c_1 \hat{\beta}_1 + c_2 \hat{\beta}_2 + ... + c_p \hat{\beta}_p + c_{p+1},

where \hat{\beta}_1, ..., \hat{\beta}_p are the least squares estimates, is

1) an unbiased estimator of \theta, and
2) a linear estimator of \theta; that is, \hat{\theta} = \sum_{i=1}^n b_i y_i for some known (given X) constants b_i.

Theorem: Under the Gauss-Markov assumptions, the estimator \hat{\theta} has the smallest ("best") variance among all linear unbiased estimators of \theta; i.e., \hat{\theta} is BLUE, the Best Linear Unbiased Estimator.

Gauss-Markov Theorem: Proof

Without loss of generality, assume c_{p+1} = 0 and define c = (c_1, c_2, ..., c_p)^T. The least squares estimate of \theta = c^T \beta is

    \hat{\theta} = c^T \hat{\beta} = c^T (X^T X)^{-1} X^T y = d^T y,

a linear estimate in y with coefficients d = (d_1, d_2, ..., d_n)^T. Consider an alternative linear estimate of \theta,

    \tilde{\theta} = b^T y,

with fixed coefficients b = (b_1, ..., b_n)^T. Define f = b - d and note that

    \tilde{\theta} = b^T y = (d + f)^T y = \hat{\theta} + f^T y.

If \tilde{\theta} is unbiased then, because \hat{\theta} is unbiased,

    0 = E(f^T y) = f^T E(y) = f^T X\beta   for all \beta in R^p,

so f is orthogonal to the column space of X, and in particular f is orthogonal to d = X (X^T X)^{-1} c.

The orthogonality of f to d then implies

    Var(\tilde{\theta}) = Var(b^T y) = Var(d^T y + f^T y)
      = Var(d^T y) + Var(f^T y) + 2 Cov(d^T y, f^T y)
      = Var(\hat{\theta}) + Var(f^T y) + 2 d^T Cov(y) f
      = Var(\hat{\theta}) + Var(f^T y) + 2 d^T (\sigma^2 I_n) f
      = Var(\hat{\theta}) + Var(f^T y) + 2 \sigma^2 d^T f
      = Var(\hat{\theta}) + Var(f^T y) + 2 \sigma^2 \cdot 0
      >= Var(\hat{\theta}).

Generalized Least Squares (GLS) Estimates

Consider generalizing the Gauss-Markov assumptions for the linear regression model to

    Y = X\beta + \epsilon,

where the random n-vector \epsilon satisfies E[\epsilon] = 0_n and E[\epsilon \epsilon^T] = \sigma^2 \Sigma, with
- \sigma^2 an unknown scale parameter, and
- \Sigma a known (n x n) positive definite matrix specifying the relative variances and correlations of the component observations.

Transform the data (Y, X) to Y* = \Sigma^{-1/2} Y and X* = \Sigma^{-1/2} X; the model becomes

    Y* = X*\beta + \epsilon*,   where E[\epsilon*] = 0_n and E[\epsilon* (\epsilon*)^T] = \sigma^2 I_n.

By the Gauss-Markov theorem, the BLUE ("GLS" estimate) of \beta is

    \hat{\beta} = ((X*)^T X*)^{-1} (X*)^T Y* = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y.
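A minimal numerical sketch of the whitening argument, under an invented diagonal \Sigma (not from the lecture). Cholesky-based whitening is used here in place of the symmetric square root \Sigma^{-1/2}; it likewise restores the Gauss-Markov assumptions and yields the same GLS estimator:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 3
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0, 0.5])

rel_var = np.linspace(0.1, 4.0, n)           # known relative variances
Sigma = np.diag(rel_var)                     # Sigma: known, positive definite
eps = rng.normal(scale=np.sqrt(rel_var))     # heteroscedastic errors
Y = X @ beta_true + eps

# Whitening: with Sigma = L L^T (Cholesky), applying L^{-1} gives errors
# with covariance sigma^2 I_n, so OLS on the transformed data is BLUE.
L = np.linalg.cholesky(Sigma)
X_star = np.linalg.solve(L, X)
Y_star = np.linalg.solve(L, Y)

# OLS on whitened data = GLS: (X^T Sigma^{-1} X)^{-1} X^T Sigma^{-1} Y.
beta_gls, *_ = np.linalg.lstsq(X_star, Y_star, rcond=None)
```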

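Finally, returning to the Gauss-Markov theorem: a quick Monte Carlo check (illustrative, with invented data) that a competing linear unbiased estimator b^T y = (d + f)^T y, with f orthogonal to col(X), has variance at least Var(d^T y):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta = np.array([2.0, -1.0, 0.5])
c = np.array([1.0, 1.0, 0.0])                # target theta = c^T beta
sigma = 0.5

H = X @ np.linalg.solve(X.T @ X, X.T)        # hat matrix
d = X @ np.linalg.solve(X.T @ X, c)          # OLS coefficients: theta_hat = d^T y
f = (np.eye(n) - H) @ rng.normal(size=n)     # some f orthogonal to col(X)
b = d + f                                    # competing linear unbiased estimator

reps = 20000
Y = X @ beta + sigma * rng.normal(size=(reps, n))   # repeated samples of y
print(np.var(Y @ d), np.var(Y @ b))          # Var(theta_hat) <= Var(theta_tilde)
```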