[R] survey weights in sample with replacement
Mehtabul Azam
mazam at smu.edu
Wed Oct 31 17:51:14 CET 2007
Thanks Thomas! I am trying to draw a random sample from a household survey
which has 80,000 observations.
rural is the name of the dataset, while iwt is the survey weight assigned to each
observation.
The resulting errors are:
> z=sample(rural,5000,replace=TRUE, Prob=rural$iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = rural$iwt) :
unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047,
1047, 1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316,
342, 342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024, 2024,
254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623, 623, 623,
623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728, 728, 499, 499,
272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307, 307, 307, 307, 307,
> iwt=rural[,"iwt"]
> z=sample(rural,5000,replace=TRUE, Prob=iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = iwt) :
unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047,
1047, 1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316,
342, 342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024, 2024,
254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623, 623, 623,
623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728, 728, 499, 499,
272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307, 307, 307, 307, 307,
> iwt=as.vector(rural[,"iwt"])
> z=sample(rural,5000,replace=TRUE, Prob=iwt)
Error in sample(rural, 5000, replace = TRUE, Prob = iwt) :
unused argument(s) (Prob = c(133, 133, 166, 166, 166, 166, 1047,
1047, 1047, 1047, 288, 623, 623, 240, 240, 432, 144, 144, 719, 719, 316,
342, 342, 816, 816, 105, 158, 158, 1105, 1105, 101, 557, 557, 405, 405, 101,
304, 304, 1165, 1165, 193, 771, 771, 1060, 1060, 482, 530, 530, 2024, 2024,
254, 254, 241, 241, 241, 241, 674, 674, 674, 674, 137, 137, 623, 623, 623,
623, 603, 603, 603, 603, 285, 556, 556, 970, 970, 285, 728, 728, 499, 499,
272, 1349, 1349, 218, 218, 272, 1240, 1240, 95, 95, 307, 307, 307, 307, 307,
> summary(rural$iwt)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      1     400    1078    1894    2981   54320
>
I just want the random sample to look as close as possible to the population
(the weighted proportions generated from the sample).
I thought sample() would automatically normalize the probability vector. I am
not sure I am reading this right; I might be totally off track.
Regards,
Mehtab
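
Two things appear to cause the errors above. R matches argument names
case-sensitively, so Prob is not recognised as the prob argument of sample().
Also, a data frame is a list of columns, so sample(rural, ...) would draw
columns rather than households. A minimal sketch of one way around both:
draw row indices with the weights as prob, then subset the data frame. The
data below are a made-up stand-in for rural (the region column and all values
are hypothetical). As documented in ?sample, the weights need not sum to one.

```r
set.seed(42)

# Hypothetical stand-in for the 'rural' survey data (not the real survey)
n_hh <- 2000
rural <- data.frame(
  region = sample(c("north", "south", "east"), n_hh, replace = TRUE),
  iwt    = sample(c(133, 166, 288, 623, 1047), n_hh, replace = TRUE)
)

# Sample row indices, not the data frame itself; note lowercase 'prob'
idx <- sample(nrow(rural), size = 500, replace = TRUE, prob = rural$iwt)
z   <- rural[idx, ]

# Weighted shares in the full data vs. unweighted shares in the resample
pop_share <- tapply(rural$iwt, rural$region, sum) / sum(rural$iwt)
smp_share <- prop.table(table(z$region))
round(rbind(pop_share, smp_share), 3)
```

If the resample behaves as intended, the two rows should be close but not
identical, since the resample is itself random.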
-----Original Message-----
From: Thomas Lumley [mailto:tlumley at u.washington.edu]
Sent: Wednesday, October 31, 2007 9:44 AM
To: Azam, Mehtabul
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] survey weights in sample with replacement
On Tue, 30 Oct 2007, Azam, Mehtabul wrote:
> Hi,
> I am trying to draw a random sample from a household survey with
> sample weights. Is there any function in R or Splus which allows this?
>
It depends on exactly what you want.
The sample() function will draw unequal probability samples with
replacement.
sample() will also draw samples without replacement, but (as documented)
it uses sequential sampling and so does not actually generate
probabilities proportional to the specified weights for sample sizes
greater than 1.
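
The point about sequential sampling can be checked with a small simulation.
This is only a sketch on toy weights (hypothetical, not from any survey): it
compares the inclusion probabilities one would want, n * w / sum(w), with the
inclusion frequencies that sample() without replacement actually produces.
With only four units and half of them sampled, the gap is much larger than it
would be for a small sampling fraction from a large survey.

```r
set.seed(1)

w <- c(1, 2, 3, 4)  # toy weights (hypothetical)
n <- 2              # sample size, without replacement

# Inclusion probabilities exactly proportional to the weights
target <- n * w / sum(w)

# Empirical inclusion frequencies under sample()'s sequential scheme
reps  <- 20000
draws <- replicate(reps, sample(length(w), n, replace = FALSE, prob = w))
empirical <- tabulate(as.vector(draws), nbins = length(w)) / reps

round(rbind(target, empirical), 3)
```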
The error in sequential sampling is pretty small, but it has attracted a
lot of creativity in the survey literature (probably more than it
deserves). The 'sampling' package implements several algorithms for
drawing unequal probability samples without replacement that really are
proportional to the specified weights where this is achievable.
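
As an illustration of the kind of call involved, here is a hedged sketch
using two functions from the 'sampling' package, inclusionprobabilities() and
UPsystematic(). The weights are toy values (hypothetical), and systematic
sampling is just one of the package's algorithms, chosen here for brevity.
The code does nothing if the package is not installed.

```r
if (requireNamespace("sampling", quietly = TRUE)) {
  set.seed(1)

  w <- c(1, 2, 3, 4)  # toy weights (hypothetical)

  # First-order inclusion probabilities proportional to w, summing to n = 2
  pik <- sampling::inclusionprobabilities(w, 2)

  # 0/1 indicator vector marking the selected units
  s <- sampling::UPsystematic(pik)
  picked <- which(s == 1)
  print(picked)
}
```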
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlumley at u.washington.edu University of Washington, Seattle