[R] Regression Identity
Peter Dalgaard
pdalgd at gmail.com
Wed Jul 18 10:26:09 CEST 2012
On Jul 18, 2012, at 05:11, darnold wrote:
> Hi,
>
> I see a lot of folks verify the regression identity SST = SSE + SSR
> numerically, but I cannot seem to find a proof. I wonder if any folks on
> this list could guide me to a mathematical proof of this fact.
>
Wrong list, isn't it?
http://stats.stackexchange.com/ is -----> _that_ way...
Anyway: any mathematical statistics book should have it somewhere. There are two basic approaches, depending on the level of abstraction one expects from students.
First principles: Write out SST = sum(((y - yhat) + (yhat - ybar))^2), expand the square, and use the normal equations to show that the sum of cross-product terms, sum((y - yhat)*(yhat - ybar)), is zero. This is a bit tedious, but straightforward in principle.
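A minimal numerical sketch in R (made-up data, an ordinary lm() fit) showing that the cross-product term vanishes and the sums of squares add up as claimed:

set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20)          # made-up data for illustration
fit <- lm(y ~ x)
yhat <- fitted(fit)
ybar <- mean(y)

SST <- sum((y - ybar)^2)
SSR <- sum((yhat - ybar)^2)
SSE <- sum((y - yhat)^2)
cross <- sum((y - yhat) * (yhat - ybar))   # the cross-product term

all.equal(SST, SSR + SSE)   # TRUE
all.equal(cross, 0)         # TRUE (zero up to rounding error)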
Linear algebra: The vector of least squares fitted values is the orthogonal projection of y onto a subspace of R^N (N = number of observations); provided the model contains an intercept, the vector (yhat - ybar) lies in that subspace. Hence the vector of residuals is orthogonal to (yhat - ybar), and the N-dimensional version of the Pythagorean theorem gives
||yhat - ybar||^2 + ||y - yhat||^2 == ||y - ybar||^2
since the three vectors involved form a right-angled triangle. (http://en.wikipedia.org/wiki/Pythagorean_theorem, scroll down to "Inner product spaces".)
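The same check from the geometric angle, again with made-up data: the residual vector is (numerically) orthogonal to (yhat - ybar), so the squared norms add.

set.seed(1)
x <- 1:20
y <- 3 + 2 * x + rnorm(20)          # made-up data for illustration
fit <- lm(y ~ x)
yhat <- fitted(fit); ybar <- mean(y); e <- residuals(fit)

sum(e * (yhat - ybar))              # ~ 0: residuals orthogonal to (yhat - ybar)
sum((yhat - ybar)^2) + sum(e^2)     # ||yhat - ybar||^2 + ||y - yhat||^2
sum((y - ybar)^2)                   # ||y - ybar||^2  -- same value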
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com