[R] OLS standard errors
Daniel Malter
daniel at umd.edu
Tue Feb 26 08:25:05 CET 2008
Hi,
the standard errors of the coefficients in two regressions that I computed
by hand and using lm() differ by about 1%. Can somebody help me to identify
the source of this difference? The coefficient estimates are the same, but
the standard errors differ.
####Simulate data
happiness=0
income=0
gender=(rep(c(0,1,1,0),25))
for(i in 1:100){
happiness[i]=1000+i+rnorm(1,0,40)
income[i]=2*i+rnorm(1,0,40)
}
####Run lm()
reg=lm(happiness~income+factor(gender))
summary(reg)
####Compute coefficient estimates "by hand"
x=cbind(income,gender)
y=happiness
z=as.matrix(cbind(rep(1,100),x))
beta=solve(t(z)%*%z)%*%t(z)%*%y
####Compare estimates
cbind(reg$coef,beta) ##fine so far, they both look the same
reg$coef[1]-beta[1]
reg$coef[2]-beta[2]
reg$coef[3]-beta[3] ##differences are too small to cause a 1%
difference
####Check predicted values
estimates=c(beta[2],beta[3])
predicted=estimates%*%t(x)
predicted=as.vector(t(as.double(predicted+beta[1])))
cbind(reg$fitted,predicted) ##inspect fitted values
as.vector(reg$fitted-predicted) ##differences are marginal
#### Compute errors
e=NA
e2=NA
for(i in 1:length(happiness)){
e[i]=y[i]-predicted[i] ##for "hand-computed" regression
e2[i]=y[i]-reg$fitted[i] ##for lm() regression
}
#### Compute standard error of the coefficients
sqrt(abs(var(e)*solve(t(z)%*%z))) ##for "hand-computed" regression
sqrt(abs(var(e2)*solve(t(z)%*%z))) ##for lm() regression using e2 from
above
##Both are the same
####Compare to lm() standard errors of the coefficients again
summary(reg)
The diagonal elements of the variance/covariance matrices should be the
standard errors of the coefficients. Both are identical when computed by
hand. However, they differ from the standard errors reported in
summary(reg). The difference of 1% seems nonmarginal. Should I have
multiplied var(e)*solve(t(z)%*%z) by n and divided by n-1? But even if I do
this, I still observe a difference. Can anybody help me out what the source
of this difference is?
Cheers,
Daniel
-------------------------
cuncta stricte discussurus
More information about the R-help
mailing list