[R] How to use lm() with both subset and weights argument

Alp Atıcı alpatici at gmail.com
Fri Oct 19 21:47:27 CEST 2007


I'd like to fit a linear model on a subset of a data frame with given weights.

I am curious how lm() works when both the subset and weights arguments
are specified.

Let me give an example: filter is a logical vector with one element per
row of df, my data frame. What I want is for the linear regression
weight to be 0 for those rows where filter is FALSE and df$wght
otherwise. But when I do
summary(lm(formula,df,weights=df$wght,subset=filter))
the outcome is exactly the same as
summary(lm(formula,df,weights=df$wght))

However, when I put subset before weights,
summary(lm(formula,df,subset=filter,weights=df$wght))
the result I get is different. Is this the result I intend to achieve
or not?

I am worried that the result might differ from what I want to
achieve -- for instance, lm() might first take the rows for which
filter is TRUE but then use df$wght[1:sum(filter)] as the weights
vector instead of df$wght[filter].
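
One way to test this concern is to compare the fits on simulated data. The sketch below is only an illustration: the data, the formula y ~ x, and all object names other than df, filter and wght are made up. It fits the model with both argument orders, then compares against a fit where the rows and weights are subset by hand, a fit with zero weights on the excluded rows, and a fit using the misaligned weights df$wght[1:sum(filter)].

```r
# Simulated data; every value here is invented for illustration only.
set.seed(1)
n <- 20
df <- data.frame(x = rnorm(n), wght = runif(n, 0.5, 2))
df$y <- 1 + 2 * df$x + rnorm(n)
filter <- rep(c(TRUE, FALSE), length.out = n)  # keep every other row

# The two calls from the question; named arguments are matched by name,
# so their order in the call should not matter.
fit_a <- lm(y ~ x, df, weights = df$wght, subset = filter)
fit_b <- lm(y ~ x, df, subset = filter, weights = df$wght)

# Reference 1: subset rows and weights by hand, so the alignment is explicit.
sub <- df[filter, ]
fit_manual <- lm(y ~ x, sub, weights = sub$wght)

# Reference 2: keep all rows, but set the weight to 0 where filter is FALSE.
fit_zero <- lm(y ~ x, df, weights = df$wght * filter)

# The misaligned case feared above: the first sum(filter) weights.
fit_wrong <- lm(y ~ x, sub, weights = df$wght[1:sum(filter)])

# Compare coefficients across the fits.
print(all.equal(coef(fit_a), coef(fit_b)))
print(all.equal(coef(fit_a), coef(fit_manual)))
print(all.equal(coef(fit_a), coef(fit_zero)))
print(isTRUE(all.equal(coef(fit_a), coef(fit_wrong))))
```

If the first three comparisons print TRUE and the last prints FALSE, then subset is being applied to the weights as df$wght[filter], and the argument order makes no difference. In that case, differing results between the two orders on real data would point to something else having changed between the calls, such as filter or df being modified in between.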

All help is much appreciated. Thank you.


