[R] test for whether dataset comes from a known MVN
Desmond Campbell
desmondcampbell at yahoo.com
Thu Oct 11 20:28:28 CEST 2007
Dear Ben Bolker,
Thanks for replying and offering advice. Unfortunately, it doesn't solve my
problem.
1) The mshapiro.test() function in the mvnormtest package appears to be
applicable only to datasets containing 3 to 5000 samples, whereas my dataset
contains 100,000 samples.
2) As you said in your email, if my data is from the real world then any
test is likely to reject the null hypothesis, because of the power that such a
large dataset gives the test.
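
To illustrate the point about power, here is a minimal sketch in base R. The
shift of 0.05 standard deviations and the two sample sizes are hypothetical
values chosen for illustration, and the one-dimensional Kolmogorov-Smirnov
test stands in for whatever multivariate test is eventually used.

```r
# Sketch: sensitivity of a goodness-of-fit test at n = 100,000.
set.seed(42)
shift <- 0.05   # hypothetical tiny departure from H0: N(0, 1)

# Both samples come from N(shift, 1), not from the null N(0, 1).
small <- rnorm(100,    mean = shift)
large <- rnorm(100000, mean = shift)

# Test each sample against the standard normal.
p_small <- ks.test(small, "pnorm")$p.value
p_large <- ks.test(large, "pnorm")$p.value

print(c(n_100 = p_small, n_100000 = p_large))
```

With the larger sample the p-value should be very small, even though the
departure from the null is minor. The smaller sample will typically not
detect it.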
However, my data is not from the real world. I am conducting validation
studies, and if the program I am testing is working correctly, then the
dataset will be exactly normally distributed.
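
Because u and s2 are fully specified under H0, one possible approach (a
sketch under stated assumptions, not a definitive test) avoids mshapiro.test()
and its sample-size limit. Under H0, the squared Mahalanobis distances of the
points from u follow a chi-square distribution with p degrees of freedom, and
the whitened coordinates are i.i.d. standard normal, so each can be checked
with ks.test(). The values of u, s2, n and p below are hypothetical, and x is
simulated as a stand-in for the program's output. Both checks are necessary
conditions for MVN(u, s2) rather than a full characterisation.

```r
# Sketch: test H0: x ~ MVN(u, s2) when u and s2 are fully known.
set.seed(1)
p  <- 3
n  <- 100000
u  <- c(1, -2, 0.5)                       # hypothetical known mean vector
s2 <- matrix(c(2.0, 0.5, 0.3,
               0.5, 1.0, 0.2,
               0.3, 0.2, 1.5), nrow = p)  # hypothetical known covariance

# Simulated stand-in for the program's output: rows are MVN(u, s2) draws.
R <- chol(s2)                             # upper triangular, t(R) %*% R == s2
x <- sweep(matrix(rnorm(n * p), n, p) %*% R, 2, u, "+")

# Under H0, squared Mahalanobis distances are chi-square with p df.
d2 <- mahalanobis(x, center = u, cov = s2)
test_d2 <- ks.test(d2, "pchisq", df = p)

# Under H0, the whitened coordinates are i.i.d. standard normal.
z <- sweep(x, 2, u, "-") %*% solve(R)
test_z <- ks.test(as.vector(z), "pnorm")

p_values <- c(distance = test_d2$p.value, whitened = test_z$p.value)
print(p_values)
```

To use this on real output, replace the simulated x with the n-by-p matrix
produced by the program under validation. Small p-values would then indicate
a departure from MVN(u, s2).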
Thanks anyway.
regards
Desmond Campbell
> Campbell, Desmond wrote:
>
> Dear all,
>
> I have a multivariate dataset containing 100,000 or more points.
> I want to find the p-value for the dataset of points coming from a
> particular multivariate normal distribution
> With
> mean vector u
> Covariance matrix s2
> So
> H0: points ~ MVN( u, s2)
> H1: points not ~ MVN( u, s2)
> How do I find the p-value in R?
>
> Ben Bolker wrote:
> > Googling for "Shapiro-Wilk multivariate" brings up mshapiro.test()
> > in the mvnormtest package. However, I would strongly suspect that
> > if your data are from the real world that you will reject the null
> > hypothesis of multivariate normality when you have 100,000 points --
> > the power to detect tiny (unimportant?) deviations from MVN will be
> > very high.
> >
> > cheers
> > Ben Bolker