[R] Tricky vectorization problem
Mike Lawrence
Mike.Lawrence at DAL.CA
Sun Oct 7 01:53:04 CEST 2007
Hi all,
I'm using the code below within a loop that I run thousands of times
and even with the super-computing resources at my disposal this is
just too slow. The snippet below takes about 10s on my machines,
which is an order of magnitude or two slower than would be
preferable; in the end I'd like to set the number of monte carlo
experiments to 1e4 or even 1e5 to ensure stable results. I've tried
my hand at vectorizing it myself but I'm finding it tricky. Any help
would be greatly appreciated!
This code attempts to find the distribution of observed correlation
values between A & B when each are measured imperfectly:
start = proc.time()[1] #start a timer
rho = .5 #set the true correllation
Ns = 100 #set the number of Subjects
a.No = 1:100 #for each subject, set a number of observations in A
(1:100 chosen arbitrarily for demonstration)
b.No = 1:100 #for each subject, set a number of observations in B
v.a = 1 #set the between Ss variance in condition A
v.b = 2 #set the between Ss variance in condition B
s.a.w = 1:100 #for each subject, set a standard deviation in A (1:100
chosen arbitrarily for demonstration)
s.b.w = 1:100 #for each subject, set a standard deviation in B
mce = 1e3 #set the number of monte carlo experiments
sim.r = vector(length=mce) #set up a collector for the simulated
correlations
Sigma=matrix(c(v.a,sqrt(v.a*v.b)*rho,sqrt(v.a*v.b)*rho,v.b),2,2)
#define a covariance matrix for use in generating the correlated data
a = vector(length=Ns) #set up a collector for subject means in A
b = vector(length=Ns) #set up a collector for subject means in B
for(i in 1:mce){ #start a monte carlo experiment
sub.means=mvrnorm(Ns,rep(0,2),Sigma) #generate correlated ideal
means for for each subject in each condition
for(s in 1:Ns){ #for each subject
a[s] = mean(rnorm(a.No[s],sub.means[s,1],s.a.w[s])) #generate some
data for A and grab the mean
b[s] = mean(rnorm(b.No[s],sub.means[s,2],s.b.w[s])) #generate some
data for B and grab the mean
}
sim.r[i] = cor(a,b) #store the observed correlation between A and B
}
print(proc.time()[1]-start) #show the total time this took
--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University
Website: http://memetic.ca
Public calendar: http://icalx.com/public/informavore/Public
"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
- Piet Hein
More information about the R-help
mailing list