[R-sig-ME] parallel MCMCglmm, RNGstreams, starting values & priors
Jarrod Hadfield
j.hadfield at ed.ac.uk
Fri Aug 29 09:04:42 CEST 2014
Hi Ruben,
Can you share your data and I will take a look. Its definitely not
Monte Carlo error.
Cheers,
Jarrod
Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug 2014
22:44:47 +0200:
> Hi Jarrod,
>
> those two matched up quite well yes. I just completed another 20
> chains, using more variable
> starting values. There's still two fixed effects
> traitza_children:spouses and :male which haven't converged
> according to multi-chain (gelman), but have according to geweke.
> The offending traces: http://imgur.com/Qm6Ovfr
> These specific effects aren't of interest to me, so if this doesn't
> affect the rest of my estimates, I can be happy
> with this, but I can't conclude that, can I?
>
> I'm now also doing a run to see how it deals with the more intensely
> zero-inflated data when including
> the unmarried.
>
> Thanks a lot for all that help,
>
> Ruben
>
>> gelman.diag(mcmclist)
> Potential scale reduction factors:
>
> Point est. Upper C.I.
> (Intercept) 1.00 1.00
> traitza_children 1.40 1.65
> male 1.00 1.00
> spouses 1.00 1.00
> paternalage.mean 1.00 1.00
> paternalage.factor(25,30] 1.00 1.00
> paternalage.factor(30,35] 1.00 1.00
> paternalage.factor(35,40] 1.00 1.00
> paternalage.factor(40,45] 1.00 1.00
> paternalage.factor(45,50] 1.00 1.00
> paternalage.factor(50,55] 1.00 1.00
> paternalage.factor(55,90] 1.00 1.00
> traitza_children:male 1.33 1.54
> traitza_children:spouses 2.21 2.83
> traitza_children:paternalage.mean 1.01 1.02
> traitza_children:paternalage.factor(25,30] 1.05 1.08
> traitza_children:paternalage.factor(30,35] 1.08 1.13
> traitza_children:paternalage.factor(35,40] 1.15 1.25
> traitza_children:paternalage.factor(40,45] 1.15 1.26
> traitza_children:paternalage.factor(45,50] 1.26 1.43
> traitza_children:paternalage.factor(50,55] 1.15 1.25
> traitza_children:paternalage.factor(55,90] 1.14 1.23
>
> Multivariate psrf
>
> 8.99
>
>> summary(mcmclist)
>
> Iterations = 100001:149951
> Thinning interval = 50
> Number of chains = 20
> Sample size per chain = 1000
>
> 1. Empirical mean and standard deviation for each variable,
> plus standard error of the mean:
>
> Mean SD Naive
> SE Time-series SE
> (Intercept) 1.36326 0.04848
> 0.0003428 0.0003542
> traitza_children -0.76679 0.28738
> 0.0020321 0.0016682
> male 0.09980 0.01633
> 0.0001155 0.0001222
> spouses 0.12333 0.01957
> 0.0001384 0.0001414
> paternalage.mean 0.07215 0.02194
> 0.0001551 0.0001596
> paternalage.factor(25,30] -0.03381 0.04184
> 0.0002959 0.0003066
> paternalage.factor(30,35] -0.08380 0.04270
> 0.0003019 0.0003118
> paternalage.factor(35,40] -0.16502 0.04569
> 0.0003231 0.0003289
> paternalage.factor(40,45] -0.16738 0.05090
> 0.0003599 0.0003697
> paternalage.factor(45,50] -0.18383 0.05880
> 0.0004158 0.0004242
> paternalage.factor(50,55] -0.18241 0.07277
> 0.0005146 0.0005302
> paternalage.factor(55,90] -0.40612 0.09875
> 0.0006983 0.0007467
> traitza_children:male 0.12092 0.08223
> 0.0005815 0.0004697
> traitza_children:spouses 0.64881 0.21132
> 0.0014942 0.0008511
> traitza_children:paternalage.mean -0.02741 0.08550
> 0.0006046 0.0006221
> traitza_children:paternalage.factor(25,30] -0.17296 0.18680
> 0.0013209 0.0013750
> traitza_children:paternalage.factor(30,35] -0.19027 0.19267
> 0.0013624 0.0013901
> traitza_children:paternalage.factor(35,40] -0.24911 0.21282
> 0.0015049 0.0014391
> traitza_children:paternalage.factor(40,45] -0.29772 0.23403
> 0.0016548 0.0015956
> traitza_children:paternalage.factor(45,50] -0.51782 0.28589
> 0.0020215 0.0017602
> traitza_children:paternalage.factor(50,55] -0.46126 0.32064
> 0.0022673 0.0021397
> traitza_children:paternalage.factor(55,90] -0.38612 0.41461
> 0.0029317 0.0027396
>
> 2. Quantiles for each variable:
>
> 2.5% 25%
> 50% 75% 97.5%
> (Intercept) 1.26883 1.33106
> 1.36322 1.39575 1.4589722
> traitza_children -1.20696 -0.95751
> -0.81076 -0.63308 -0.0365042
> male 0.06785 0.08878
> 0.09970 0.11085 0.1320168
> spouses 0.08467 0.11030
> 0.12343 0.13643 0.1617869
> paternalage.mean 0.02950 0.05751
> 0.07202 0.08683 0.1153881
> paternalage.factor(25,30] -0.11581 -0.06174
> -0.03397 -0.00574 0.0473783
> paternalage.factor(30,35] -0.16656 -0.11250
> -0.08358 -0.05519 0.0003065
> paternalage.factor(35,40] -0.25518 -0.19530
> -0.16500 -0.13440 -0.0757366
> paternalage.factor(40,45] -0.26887 -0.20164
> -0.16675 -0.13335 -0.0677407
> paternalage.factor(45,50] -0.30080 -0.22320
> -0.18339 -0.14440 -0.0687967
> paternalage.factor(50,55] -0.32663 -0.23034
> -0.18227 -0.13317 -0.0415547
> paternalage.factor(55,90] -0.60202 -0.47303
> -0.40454 -0.33994 -0.2139128
> traitza_children:male -0.01083 0.06634
> 0.11024 0.16109 0.3295892
> traitza_children:spouses 0.37857 0.51072
> 0.59398 0.71395 1.2127940
> traitza_children:paternalage.mean -0.19138 -0.08250
> -0.02985 0.02493 0.1468989
> traitza_children:paternalage.factor(25,30] -0.57457 -0.28481
> -0.16489 -0.05151 0.1728148
> traitza_children:paternalage.factor(30,35] -0.61499 -0.30350
> -0.17736 -0.06299 0.1555147
> traitza_children:paternalage.factor(35,40] -0.74251 -0.36752
> -0.22966 -0.10777 0.1151897
> traitza_children:paternalage.factor(40,45] -0.84165 -0.42691
> -0.27729 -0.14322 0.1032436
> traitza_children:paternalage.factor(45,50] -1.21782 -0.66568
> -0.48420 -0.32873 -0.0476720
> traitza_children:paternalage.factor(50,55] -1.21327 -0.63623
> -0.43432 -0.24957 0.0955360
> traitza_children:paternalage.factor(55,90] -1.33772 -0.62227
> -0.35364 -0.11050 0.3361684
>
>> effectiveSize(mcmclist)
> (Intercept)
> traitza_children
> 18814.05
> 16359.33
> male
> spouses
> 18132.98
> 19547.05
> paternalage.mean
> paternalage.factor(25,30]
> 19238.72
> 18974.81
> paternalage.factor(30,35]
> paternalage.factor(35,40]
> 18874.33
> 19406.63
> paternalage.factor(40,45]
> paternalage.factor(45,50]
> 19075.18
> 19401.77
> paternalage.factor(50,55]
> paternalage.factor(55,90]
> 18960.11
> 17893.23
> traitza_children:male
> traitza_children:spouses
> 18545.55
> 14438.51
> traitza_children:paternalage.mean
> traitza_children:paternalage.factor(25,30]
> 18464.09
> 16943.43
> traitza_children:paternalage.factor(30,35]
> traitza_children:paternalage.factor(35,40]
> 16827.44
> 17230.04
> traitza_children:paternalage.factor(40,45]
> traitza_children:paternalage.factor(45,50]
> 17144.78
> 18191.67
> traitza_children:paternalage.factor(50,55]
> traitza_children:paternalage.factor(55,90]
> 17466.60
> 18540.59
>
>
> ### current script:
>
> # bsub -q mpi -W 24:00 -n 21 -R np20 mpirun -H localhost -n 21 R
> --slave -f "/usr/users/rarslan/rpqa/krmh_main/children.R"
> setwd("/usr/users/rarslan/rpqa/")
> library(doMPI)
> cl <- startMPIcluster(verbose=T,workdir="/usr/users/rarslan/rpqa/krmh_main/")
> registerDoMPI(cl)
> Children = foreach(i=1:clusterSize(cl),.options.mpi =
> list(seed=1337) ) %dopar% {
> library(MCMCglmm);library(data.table)
> setwd("/usr/users/rarslan/rpqa/krmh_main/")
> source("../1 - extraction functions.r")
> load("../krmh1.rdata")
>
> krmh.1 = recenter.pat(na.omit(krmh.1[spouses>0, list(idParents,
> children, male, spouses, paternalage)]))
>
> samples = 1000
> thin = 50; burnin = 100000
> nitt = samples * thin + burnin
>
> prior <- list(
> R=list(V=diag(2), nu=1.002, fix=2),
> G=list(G1=list(V=diag(2), nu=1, alpha.mu=c(0,0), alpha.V=diag(2)*1000))
> )
>
> start <- list(
> liab=c(rnorm( nrow(krmh.1)*2 )),
> R = list(R1 = rIW(diag(2), 10 )),
> G = list(G1 = rIW(diag(2), 10 ))
> )
>
> ( m1 = MCMCglmm( children ~ trait * (male + spouses +
> paternalage.mean + paternalage.factor),
> rcov=~idh(trait):units,
> random=~idh(trait):idParents,
> family="zapoisson",
> start = start,
> prior = prior,
> data=krmh.1,
> pr = F, saveX = F, saveZ = F,
> nitt=nitt,thin=thin,burnin=burnin)
> )
> m1$Residual$nrt<-2
> m1
> }
>
> save(Children,file = "Children.rdata")
> closeCluster(cl)
> mpi.quit()
>
> On 28 Aug 2014, at 20:59, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>
>> Hi,
>>
>> The posteriors for the two models look pretty close to me. Are the
>> scale reduction factors really as high as previously reported?
>> Before you had 1.83 for traitza_children:spouses, but the plot
>> suggests that it should be close to 1?
>>
>> Cheers,
>>
>> Jarrod
>>
>>
>>
>>
>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug 2014
>> 19:59:16 +0200:
>>
>>> Sure! Thanks a lot.
>>> I am using ~idh(trait):units already, sorry for saying that
>>> incorrectly in my last email.
>>> These models aren't the final thing, I will replace the
>>> paternalage.factor variable
>>> with its linear equivalent if that seems defensible (does so far)
>>> and in this model it seems
>>> okay to remove the za-effects for all predictors except spouses.
>>> So a final model would have fewer fixed effects. I also have
>>> datasets of 200k+ and 5m+,
>>> but I'm learning MCMCglmm with this smaller one because my wrong
>>> turns take less time.
>>>
>>> I've uploaded a comparison coef plot of two models:
>>> http://i.imgur.com/sHUfnmd.png
>>> m7 is with the default starting values, m1 is with the
>>> specification I sent in my last email. I don't
>>> know if such differences are something to worry about.
>>>
>>> I don't know what qualifies as highly overdispersed, here's a plot
>>> of the outcome for ever
>>> married people (slate=real data):
>>> http://imgur.com/14MywgZ
>>> here's with everybody born (incl. some stillborn etc.):
>>> http://imgur.com/knRGa1v
>>> I guess my approach (generating an overdispersed poisson with the
>>> parameters from
>>> the data and checking if it has as excess zeroes) is not the best
>>> way to diagnose zero-inflation,
>>> but especially in the second case it seems fairly clear-cut.
>>>
>>> Best regards,
>>>
>>> Ruben
>>>
>>>> summary(m1)
>>>
>>> Iterations = 50001:149951
>>> Thinning interval = 50
>>> Sample size = 2000
>>>
>>> DIC: 31249.73
>>>
>>> G-structure: ~idh(trait):idParents
>>>
>>> post.mean l-95% CI u-95% CI eff.samp
>>> children.idParents 0.006611 4.312e-08 0.0159 523.9
>>> za_children.idParents 0.193788 7.306e-02 0.3283 369.3
>>>
>>> R-structure: ~idh(trait):units
>>>
>>> post.mean l-95% CI u-95% CI eff.samp
>>> children.units 0.1285 0.1118 0.1452 716.1
>>> za_children.units 0.9950 0.9950 0.9950 0.0
>>>
>>> Location effects: children ~ trait * (male + spouses +
>>> paternalage.mean + paternalage.factor)
>>>
>>> post.mean l-95% CI
>>> u-95% CI eff.samp pMCMC
>>> (Intercept) 1.3413364 1.2402100
>>> 1.4326099 1789 <5e-04 ***
>>> traitza_children -0.8362879 -1.2007980
>>> -0.5016730 1669 <5e-04 ***
>>> male 0.0994902 0.0679050
>>> 0.1297394 2000 <5e-04 ***
>>> spouses 0.1236033 0.0839000
>>> 0.1624939 2000 <5e-04 ***
>>> paternalage.mean 0.0533892 0.0119569
>>> 0.0933960 2000 0.015 *
>>> paternalage.factor(25,30] -0.0275822 -0.1116421
>>> 0.0537359 1842 0.515
>>> paternalage.factor(30,35] -0.0691025 -0.1463214
>>> 0.0122393 1871 0.097 .
>>> paternalage.factor(35,40] -0.1419933 -0.2277379
>>> -0.0574678 1845 <5e-04 ***
>>> paternalage.factor(40,45] -0.1364952 -0.2362714
>>> -0.0451874 1835 0.007 **
>>> paternalage.factor(45,50] -0.1445342 -0.2591767
>>> -0.0421178 1693 0.008 **
>>> paternalage.factor(50,55] -0.1302972 -0.2642965
>>> 0.0077061 2000 0.064 .
>>> paternalage.factor(55,90] -0.3407879 -0.5168972
>>> -0.1493652 1810 <5e-04 ***
>>> traitza_children:male 0.0926888 -0.0147379
>>> 0.2006142 1901 0.098 .
>>> traitza_children:spouses 0.5531197 0.3870616
>>> 0.7314289 1495 <5e-04 ***
>>> traitza_children:paternalage.mean 0.0051463 -0.1279396
>>> 0.1460099 1617 0.960
>>> traitza_children:paternalage.factor(25,30] -0.1538957 -0.4445749
>>> 0.1462955 1781 0.321
>>> traitza_children:paternalage.factor(30,35] -0.1747883 -0.4757851
>>> 0.1162476 1998 0.261
>>> traitza_children:paternalage.factor(35,40] -0.2261843 -0.5464379
>>> 0.0892582 1755 0.166
>>> traitza_children:paternalage.factor(40,45] -0.2807543 -0.6079678
>>> 0.0650281 1721 0.100 .
>>> traitza_children:paternalage.factor(45,50] -0.4905843 -0.8649214
>>> -0.1244174 1735 0.010 **
>>> traitza_children:paternalage.factor(50,55] -0.4648579 -0.9215759
>>> -0.0002083 1687 0.054 .
>>> traitza_children:paternalage.factor(55,90] -0.3945406 -1.0230155
>>> 0.2481568 1793 0.195
>>> ---
>>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>>>
>>>> describe(krmh.1[spouses>0,])
>>> vars n mean sd median trimmed mad min
>>> max range skew kurtosis se
>>> children 2 6829 3.81 2.93 4.00 3.61 2.97 0.00
>>> 16.00 16.00 0.47 -0.46 0.04
>>> male 3 6829 0.46 0.50 0.00 0.45 0.00 0.00
>>> 1.00 1.00 0.14 -1.98 0.01
>>> spouses 4 6829 1.14 0.38 1.00 1.03 0.00 1.00
>>> 4.00 3.00 2.87 8.23 0.00
>>> paternalage 5 6829 3.65 0.80 3.57 3.60 0.80 1.83
>>> 7.95 6.12 0.69 0.70 0.01
>>> paternalage_c 6 6829 0.00 0.80 -0.08 -0.05 0.80 -1.82
>>> 4.30 6.12 0.69 0.70 0.01
>>> paternalage.mean 7 6829 0.00 0.68 -0.08 -0.05 0.59 -1.74
>>> 4.30 6.04 0.95 1.97 0.01
>>> paternalage.diff 8 6829 0.00 0.42 0.00 -0.01 0.38 -1.51
>>> 1.48 2.99 0.17 0.17 0.01
>>>
>>>> table(krmh.1$paternalage.factor)
>>>
>>> [0,25] (25,30] (30,35] (35,40] (40,45] (45,50] (50,55] (55,90]
>>> 309 1214 1683 1562 1039 623 269 130
>>>
>>> On 28 Aug 2014, at 19:05, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>>>
>>>> Hi Ruben,
>>>>
>>>> It might be hard to detect (near) ECPs with so many fixed effects
>>>> (can you post the model summary (and give us the mean and
>>>> standard deviation of any continuous covariates)). Also, the
>>>> complementary log-log link (which is the za specification) is
>>>> non-symmetric and runs into problems outside the range -35 to 3.5
>>>> so there may be a problem there, particularly if you use
>>>> rcov=~trait:units and the Poisson part is highly over-dispersed.
>>>> You could try rcov=~idh(trait):units and fix the non-identifiable
>>>> za residual variance to something smaller than 1 (say 0.5) - it
>>>> will mix slower but it will reduce the chance of over/underflow.
>>>>
>>>> Cheers,
>>>>
>>>> Jarrod
>>>>
>>>>
>>>>
>>>>
>>>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Thu, 28 Aug 2014
>>>> 18:45:30 +0200:
>>>>
>>>>> Hi Jarrod,
>>>>>
>>>>>> 1) it did not return an error with rcov = ~trait:units because
>>>>>> you used R1=rpois(2,1)+1 and yet this specification only fits a
>>>>>> single variance (not a 2x2 covariance matrix). R1=rpois(2,1)+1
>>>>>> is a bit of a weird specification since it has to be integer. I
>>>>>> would obtain starting values using rIW().
>>>>>
>>>>> I agree it's a weird specification, I was a bit lost and thought
>>>>> I could get away with just putting some random numbers in the
>>>>> starting value.
>>>>> I didn't do R1=rpois(2,1)+1 though, I did R1=diag(rpois(2,1)+1),
>>>>> so I got a 2x2 matrix, but yes, bound to be integer.
>>>>> I didn't know starting values should come from a conjugate
>>>>> distribution, though that probably means I didn't think about it
>>>>> much.
>>>>>
>>>>> I'm now doing
>>>>> start <- list(
>>>>> liab=c(rnorm( nrow(krmh.1)*2 )),
>>>>> R = list(R1 = rIW( diag(2), nrow(krmh.1)) ),
>>>>> G = list(G1 = rIW( diag(2), nrow(krmh.1)) )
>>>>> )
>>>>>
>>>>> Is this what you had in mind?
>>>>> I am especially unsure if I am supposed to use such a low
>>>>> sampling variability (my sample size is probably not even
>>>>> relevant for the starting values) and if I should start from
>>>>> diag(2).
>>>>>
>>>>> And, I am still happily confused that this specification still
>>>>> doesn't lead to errors with respect to rcov = ~trait:units .
>>>>> Does this mean I'm doing it wrong?
>>>>>
>>>>>> 3) a) how many effective samples do you have for each
>>>>>> parameter? and b) are you getting extreme category
>>>>>> problems/numerical issues? If you store the latent variables
>>>>>> (pl=TUE) what is their range for the Zi/za part?
>>>>>
>>>>> My parallel run using the above starting values isn't finished yet.
>>>>> a) After applying the above starting values I get, for the
>>>>> location effects 1600-2000 samples for a 2000 sample chain (with
>>>>> thin set to 50). G and R-structure are from 369
>>>>> (za_children.idParents) to 716 (and 0 for the fixed part).
>>>>> Effective sample sizes were similar for my run using the
>>>>> starting values for G/R that I drew from rpois, and using 40
>>>>> chains I of course get
>>>>> b) I don't think I am getting extreme categories. I would
>>>>> probably be getting extreme categories if I included the
>>>>> forever-alones (they almost never have children), but this way no.
>>>>> I wasn't sure how to examine the range of the latents separately
>>>>> for the za part, but for a single chain it looks okay:
>>>>>> quantile(as.numeric(m1$Liab),probs = c(0,0.01,0,0.99,1))
>>>>> 0% 1% 0% 99% 100%
>>>>> -4.934111 -1.290728 -4.934111 3.389847 7.484206
>>>>>
>>>>> Well, all considered now that I use the above starting value
>>>>> specification I get slightly different estimates for all
>>>>> za-coefficients. Nothing major, but still leading me to
>>>>> think my estimates aren't exactly independent of the starting
>>>>> values I use. I'll see what the parallel run yields.
>>>>>
>>>>> Thanks a lot,
>>>>>
>>>>> Ruben
>>>>>
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Jarrod
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Wed, 27 Aug
>>>>>> 2014 19:23:42 +0200:
>>>>>>
>>>>>>> Hi Jarrod,
>>>>>>>
>>>>>>> thanks again. I was able to get it running with your advice.
>>>>>>> Some points of confusion remain:
>>>>>>>
>>>>>>> - You wrote that zi/za models would return an error with rcov
>>>>>>> = ~trait:units + starting values. This did not happen in my
>>>>>>> case, so I didn't build MCMCglmm myself with your suggested
>>>>>>> edits. Also, have you considered putting your own MCMCglmm
>>>>>>> repo on Github? Your users would be able to install
>>>>>>> pre-releases and I'd think you'd get some time-saving pull
>>>>>>> requests too.
>>>>>>> - In my attempts to get my models to run properly, I messed up
>>>>>>> a prior and did not use fix=2 in my prior specification for my
>>>>>>> za models. This led to crappy convergence, it's much better
>>>>>>> now and for some of my simpler models I think I won't need
>>>>>>> parallel chains. I'm reminded of Gelman's folk theorem of
>>>>>>> statistical computing.
>>>>>>> - I followed your advice, but of course I could not set the
>>>>>>> true values as starting values, but wanted to set random, bad
>>>>>>> starting values. I pasted below what I arrived at, I'm
>>>>>>> especially unsure whether I specified the starting values for
>>>>>>> G and R properly (I think not).
>>>>>>> start <- list(
>>>>>>> liab=c(rnorm( nrow(krmh.1)*2 )),
>>>>>>> R = list(R1 = diag(rpois(2, 1)+1)),
>>>>>>> G = list(G1 = diag(rpois(2, 1)+1))
>>>>>>> )
>>>>>>>
>>>>>>>
>>>>>>> However, even though I may not need multiple chains for some
>>>>>>> of my simpler models, I've now run into conflicting
>>>>>>> diagnostics. The geweke.diag for each chain (and examination
>>>>>>> of the traces) gives
>>>>>>> satisfactory diagnostics. Comparing multiple chains using
>>>>>>> gelman.diag, however, leads to one bad guy, namely the
>>>>>>> traitza_children:spouses interaction.
>>>>>>> I think this implies that I've got some starting value
>>>>>>> dependence for this parameter, that won't be easily rectified
>>>>>>> through longer burnin?
>>>>>>> Do you have any ideas how to rectify this?
>>>>>>> I am currently doing sequential analyses on episodes of
>>>>>>> selection and in historical human data only those who marry
>>>>>>> have a chance at having kids. I exclude the unmarried
>>>>>>> from my sample where I predict number of children, because I
>>>>>>> examine that in a previous model and the zero-inflation (65%
>>>>>>> zeros, median w/o unmarried = 4) when including the unmarried
>>>>>>> is so excessive.
>>>>>>> Number of spouses is easily the strongest predictor in the
>>>>>>> model, but only serves as a covariate here. Since my other
>>>>>>> estimates are stable across chains and runs and agree well
>>>>>>> across models and with theory, I'm
>>>>>>> inclined to shrug this off. But probably I shouldn't ignore
>>>>>>> this sign of non-convergence?
>>>>>>>
>>>>>>>> gelman.diag(mcmc_1)
>>>>>>> Potential scale reduction factors:
>>>>>>>
>>>>>>> Point est. Upper C.I.
>>>>>>> (Intercept) 1.00 1.00
>>>>>>> traitza_children 1.27 1.39
>>>>>>> male 1.00 1.00
>>>>>>> spouses 1.00 1.00
>>>>>>> paternalage.mean 1.00 1.00
>>>>>>> paternalage.factor(25,30] 1.00 1.00
>>>>>>> paternalage.factor(30,35] 1.00 1.00
>>>>>>> paternalage.factor(35,40] 1.00 1.00
>>>>>>> paternalage.factor(40,45] 1.00 1.00
>>>>>>> paternalage.factor(45,50] 1.00 1.00
>>>>>>> paternalage.factor(50,55] 1.00 1.00
>>>>>>> paternalage.factor(55,90] 1.00 1.00
>>>>>>> traitza_children:male 1.22 1.32
>>>>>>> traitza_children:spouses 1.83 2.13
>>>>>>> traitza_children:paternalage.mean 1.02 1.02
>>>>>>> traitza_children:paternalage.factor(25,30] 1.03 1.05
>>>>>>> traitza_children:paternalage.factor(30,35] 1.05 1.08
>>>>>>> traitza_children:paternalage.factor(35,40] 1.10 1.15
>>>>>>> traitza_children:paternalage.factor(40,45] 1.12 1.17
>>>>>>> traitza_children:paternalage.factor(45,50] 1.19 1.28
>>>>>>> traitza_children:paternalage.factor(50,55] 1.12 1.18
>>>>>>> traitza_children:paternalage.factor(55,90] 1.11 1.17
>>>>>>>
>>>>>>> Multivariate psrf
>>>>>>>
>>>>>>> 7.27
>>>>>>>
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Ruben
>>>>>>>
>>>>>>>
>>>>>>> On 26 Aug 2014, at 13:04, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
>>>>>>>
>>>>>>>> Hi Ruben,
>>>>>>>>
>>>>>>>> There are 400 liabilities in a zapoisson model (2 per datum).
>>>>>>>> This code should work:
>>>>>>>>
>>>>>>>> g <-sample(letters[1:10], size = 200, replace = T)
>>>>>>>> pred <- rnorm(200)
>>>>>>>>
>>>>>>>> l1<-rnorm(200, -1, sqrt(1))
>>>>>>>> l2<-rnorm(200, -1, sqrt(1))
>>>>>>>>
>>>>>>>> y<-VGAM::rzapois(200, exp(l1), exp(-exp(l2)))
>>>>>>>>
>>>>>>>> # generate zero-altered data with an intercept of -1 (because
>>>>>>>> the intercept and variance are the same for both processes
>>>>>>>> this is just standard Poisson)
>>>>>>>>
>>>>>>>> dat<-data.frame(y=y, g = g, pred = pred)
>>>>>>>>
>>>>>>>>
>>>>>>>> start.1<-list(liab=c(l1,l2), R = list(R1=diag(2)), G=list(G1=diag(2)))
>>>>>>>> prior.1<-list(R=list(V=diag(2), nu=1.002, fix=2),
>>>>>>>> G=list(G1=list(V=diag(2), nu=2, alpha.mu=c(0,0),
>>>>>>>> alpha.V=diag(2)*1000)))
>>>>>>>>
>>>>>>>> m1<-MCMCglmm(y~trait + pred:trait, random=~us(trait):g,
>>>>>>>> family="zapoisson",rcov=~idh(trait):units, data=dat,
>>>>>>>> prior=prior.1, start= start.1)
>>>>>>>>
>>>>>>>> However, there are 2 bugs in the current version of MCMCglmm
>>>>>>>> that return an error message when the documentation implies
>>>>>>>> it should be fine:
>>>>>>>>
>>>>>>>> a) it should be possible to have R=diag(2) rather than R =
>>>>>>>> list(R1=diag(2)). This bug cropped up when I implemented
>>>>>>>> block-diagonal R structures. It can be fixed by inserting:
>>>>>>>>
>>>>>>>> if(!is.list(start$R)){
>>>>>>>> start$R<-list(R1=start$R)
>>>>>>>> }
>>>>>>>>
>>>>>>>> on L514 of MCMCglmm.R below
>>>>>>>>
>>>>>>>> if(!is.list(prior$R[[1]])){
>>>>>>>> prior$R<-list(R1=prior$R)
>>>>>>>> }
>>>>>>>>
>>>>>>>> b) rcov=~trait:units models for zi/za models will return an
>>>>>>>> error when passing starting values. To fix this insert
>>>>>>>>
>>>>>>>> if(diagR==3){
>>>>>>>> if(dim(start)[1]!=1){
>>>>>>>> stop("V is the wrong dimension for some
>>>>>>>> strart$G/start$R elements")
>>>>>>>> }
>>>>>>>> start<-diag(sum(nfl))*start[1]
>>>>>>>> }
>>>>>>>>
>>>>>>>> on L90 of priorfromat.R below
>>>>>>>>
>>>>>>>> if(is.matrix(start)==FALSE){
>>>>>>>> start<-as.matrix(start)
>>>>>>>> }
>>>>>>>>
>>>>>>>> I will put these in the new version.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> Jarrod
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon, 25 Aug
>>>>>>>> 2014 21:52:30 +0200:
>>>>>>>>
>>>>>>>>> Hi Jarrod,
>>>>>>>>>
>>>>>>>>> thanks for these pointers.
>>>>>>>>>
>>>>>>>>>>> You will need to provide over-dispersed starting values
>>>>>>>>>>> for multiple-chain convergence diagnostics to be useful
>>>>>>>>>>> (GLMM are so simple I am generally happy if the output of
>>>>>>>>>>> a single run looks reasonable).
>>>>>>>>>
>>>>>>>>> Oh, I would be happy with single chains, but since
>>>>>>>>> computation would take weeks this way, I wanted to
>>>>>>>>> parallelise and I would use the multi-chain convergence as a
>>>>>>>>> criterion that my parallelisation was proper
>>>>>>>>> and is as informative as a single long chain. There don't
>>>>>>>>> seem to be any such checks built-in – I was analysing my 40
>>>>>>>>> chains for a bit longer than I like to admit until I noticed
>>>>>>>>> they were identical (effectiveSize
>>>>>>>>> and summary.mcmc.list did not yell at me for this).
>>>>>>>>>
>>>>>>>>>>> # use some very bad starting values
>>>>>>>>> I get that these values are bad, but that is the goal for my
>>>>>>>>> multi-chain aim, right?
>>>>>>>>>
>>>>>>>>> I can apply this to my zero-truncated model, but am again
>>>>>>>>> getting stuck with the zero-altered one.
>>>>>>>>> Maybe I need only specify the Liab values for this?
>>>>>>>>> At least I'm getting nowhere with specifying R and G
>>>>>>>>> starting values here. When I got an error, I always
>>>>>>>>> went to the MCMCglmm source to understand why the checks
>>>>>>>>> failed, but I didn't always understand
>>>>>>>>> what was being checked and couldn't get it to work.
>>>>>>>>>
>>>>>>>>> Here's a failing example:
>>>>>>>>>
>>>>>>>>> l<-rnorm(200, -1, sqrt(1))
>>>>>>>>> t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
>>>>>>>>> g = sample(letters[1:10], size = 200, replace = T)
>>>>>>>>> pred = rnorm(200)
>>>>>>>>> y<-rpois(200,exp(l)-t)
>>>>>>>>> y[1:40] = 0
>>>>>>>>> # generate zero-altered data with an intercept of -1
>>>>>>>>>
>>>>>>>>> dat<-data.frame(y=y, g = g, pred = pred)
>>>>>>>>> set.seed(1)
>>>>>>>>> start_true = list(Liab=l, R = 1, G = 1 )
>>>>>>>>> m1<-MCMCglmm(y~1 + pred,random = ~ g,
>>>>>>>>> family="zapoisson",rcov=~us(trait):units, data=dat, start=
>>>>>>>>> start_true)
>>>>>>>>>
>>>>>>>>> # use true latent variable as starting values
>>>>>>>>> set.seed(1)
>>>>>>>>> # use some very bad starting values
>>>>>>>>> start_rand = list(Liab=rnorm(200), R = rpois(1, 1)+1, G =
>>>>>>>>> rpois(1, 1)+1 )
>>>>>>>>> m2<-MCMCglmm(y~1 + pred,random = ~ g,rcov=~us(trait):units,
>>>>>>>>> family="zapoisson", data=dat, start = start_rand)
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Ruben
>>>>>>>>>
>>>>>>>>> On 25 Aug 2014, at 18:29, Jarrod Hadfield
>>>>>>>>> <j.hadfield at ed.ac.uk> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Ruben,
>>>>>>>>>>
>>>>>>>>>> Sorry - I was wrong when I said that everything is Gibbs
>>>>>>>>>> sampled conditional on the latent variables. The location
>>>>>>>>>> effects (fixed and random effects) are also sampled
>>>>>>>>>> conditional on the (co)variance components so you should
>>>>>>>>>> add them to the starting values. In the case where the true
>>>>>>>>>> values are used:
>>>>>>>>>>
>>>>>>>>>> m1<-MCMCglmm(y~1, family="ztpoisson", data=dat,
>>>>>>>>>> start=list(Liab=l,R=1))
>>>>>>>>>>
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>> Jarrod
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Quoting Jarrod Hadfield <j.hadfield at ed.ac.uk> on Mon, 25
>>>>>>>>>> Aug 2014 17:14:14 +0100:
>>>>>>>>>>
>>>>>>>>>>> Hi Ruben,
>>>>>>>>>>>
>>>>>>>>>>> You will need to provide over-dispersed starting values
>>>>>>>>>>> for multiple-chain convergence diagnostics to be useful
>>>>>>>>>>> (GLMM are so simple I am generally happy if the output of
>>>>>>>>>>> a single run looks reasonable).
>>>>>>>>>>>
>>>>>>>>>>> With non-Gaussian data everything is Gibbs sampled
>>>>>>>>>>> conditional on the latent variables, so you only need to
>>>>>>>>>>> pass them:
>>>>>>>>>>>
>>>>>>>>>>> l<-rnorm(200, -1, sqrt(1))
>>>>>>>>>>> t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
>>>>>>>>>>> y<-rpois(200,exp(l)-t)+1
>>>>>>>>>>> # generate zero-truncated data with an intercept of -1
>>>>>>>>>>>
>>>>>>>>>>> dat<-data.frame(y=y)
>>>>>>>>>>> set.seed(1)
>>>>>>>>>>> m1<-MCMCglmm(y~1, family="ztpoisson", data=dat, start=list(Liab=l))
>>>>>>>>>>> # use true latent variable as starting values
>>>>>>>>>>> set.seed(1)
>>>>>>>>>>> m2<-MCMCglmm(y~1, family="ztpoisson", data=dat,
>>>>>>>>>>> start=list(Liab=rnorm(200)))
>>>>>>>>>>> # use some very bad starting values
>>>>>>>>>>>
>>>>>>>>>>> plot(mcmc.list(m1$Sol, m2$Sol))
>>>>>>>>>>> # not identical despite the same seed because of different
>>>>>>>>>>> starting values but clearly sampling the same posterior
>>>>>>>>>>> distribution:
>>>>>>>>>>>
>>>>>>>>>>> gelman.diag(mcmc.list(m1$Sol, m2$Sol))
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> Jarrod
>>>>>>>>>>>
>>>>>>>>>>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon, 25
>>>>>>>>>>> Aug 2014 18:00:08 +0200:
>>>>>>>>>>>
>>>>>>>>>>>> Dear Jarrod,
>>>>>>>>>>>>
>>>>>>>>>>>> thanks for the quick reply. Please, don't waste time
>>>>>>>>>>>> looking into doMPI – I am happy that I
>>>>>>>>>>>> get the expected result, when I specify that reproducible
>>>>>>>>>>>> seed, whyever that may be.
>>>>>>>>>>>> I'm pretty sure that is the deciding factor, because I
>>>>>>>>>>>> tested it explicitly, I just have no idea
>>>>>>>>>>>> how/why it interacts with the choice of family.
>>>>>>>>>>>>
>>>>>>>>>>>> That said, is setting up different RNG streams for my
>>>>>>>>>>>> workers (now that it works) __sufficient__
>>>>>>>>>>>> so that I get independent chains and can use
>>>>>>>>>>>> gelman.diag() for convergence diagnostics?
>>>>>>>>>>>> Or should I still tinker with the starting values myself?
>>>>>>>>>>>> I've never found a worked example of supplying starting
>>>>>>>>>>>> values and am thus a bit lost.
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry for sending further questions, I hope someone else
>>>>>>>>>>>> takes pity while
>>>>>>>>>>>> you're busy with lectures.
>>>>>>>>>>>>
>>>>>>>>>>>> Best wishes
>>>>>>>>>>>>
>>>>>>>>>>>> Ruben
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 25 Aug 2014, at 17:29, Jarrod Hadfield
>>>>>>>>>>>> <j.hadfield at ed.ac.uk> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Ruben,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I do not think the issue is with the starting values,
>>>>>>>>>>>>> because even if the same starting values were used the
>>>>>>>>>>>>> chains would still differ because of the randomness in
>>>>>>>>>>>>> the Markov Chain (if I interpret your `identical' test
>>>>>>>>>>>>> correctly). This just involves a call to GetRNGstate()
>>>>>>>>>>>>> in the C++ code (L 871 ofMCMCglmm.cc) so I think for
>>>>>>>>>>>>> some reason doMPI/foreach is not doing what you expect.
>>>>>>>>>>>>> I am not familiar with doMPI and am in the middle of
>>>>>>>>>>>>> writing lectures so haven't got time to look into it
>>>>>>>>>>>>> carefully. Outside of the context of doMPI I get the
>>>>>>>>>>>>> behaviour I expect:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> l<-rnorm(200, -1, sqrt(1))
>>>>>>>>>>>>> t<-(-log(1-runif(200)*(1-exp(-exp(l)))))
>>>>>>>>>>>>> y<-rpois(200,exp(l)-t)+1
>>>>>>>>>>>>> # generate zero-truncated data with an intercept of -1
>>>>>>>>>>>>>
>>>>>>>>>>>>> dat<-data.frame(y=y)
>>>>>>>>>>>>> set.seed(1)
>>>>>>>>>>>>> m1<-MCMCglmm(y~1, family="ztpoisson", data=dat)
>>>>>>>>>>>>> set.seed(2)
>>>>>>>>>>>>> m2<-MCMCglmm(y~1, family="ztpoisson", data=dat)
>>>>>>>>>>>>> set.seed(2)
>>>>>>>>>>>>> m3<-MCMCglmm(y~1, family="ztpoisson", data=dat)
>>>>>>>>>>>>>
>>>>>>>>>>>>> plot(mcmc.list(m1$Sol, m2$Sol))
>>>>>>>>>>>>> # different, as expected
>>>>>>>>>>>>> plot(mcmc.list(m2$Sol, m3$Sol))
>>>>>>>>>>>>> # the same, as expected
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Quoting Ruben Arslan <rubenarslan at gmail.com> on Mon, 25
>>>>>>>>>>>>> Aug 2014 16:58:06 +0200:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Dear list,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sorry for bumping my old post, I hope to elicit a
>>>>>>>>>>>>>> response with a more focused question:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When does MCMCglmm automatically start from different
>>>>>>>>>>>>>> values when using doMPI/foreach?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have done some tests with models of varying
>>>>>>>>>>>>>> complexity. For example, the script in my last
>>>>>>>>>>>>>> post (using "zapoisson") yielded 40 identical chains:
>>>>>>>>>>>>>>> identical(mcmclist[1], mcmclist[30])
>>>>>>>>>>>>>> TRUE
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A simpler (?) model (using "ztpoisson" and no specified
>>>>>>>>>>>>>> prior), however, yielded different chains
>>>>>>>>>>>>>> and I could use them to calculate gelman.diag()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Changing my script to the version below, i.e. seeding
>>>>>>>>>>>>>> foreach using .options.mpi=list( seed= 1337)
>>>>>>>>>>>>>> so as to make RNGstreams reproducible (or so I
>>>>>>>>>>>>>> thought), led to different chains even for the
>>>>>>>>>>>>>> "zapoisson" model.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In no case have I (successfully) tried to supplant the
>>>>>>>>>>>>>> default of MCMCglmm's "start" argument.
>>>>>>>>>>>>>> Is starting my models from different RNGsubstreams
>>>>>>>>>>>>>> inadequate compared to manipulating
>>>>>>>>>>>>>> the start argument explicitly? If so, is there any
>>>>>>>>>>>>>> worked example of explicit starting value manipulation
>>>>>>>>>>>>>> in parallel computation?
>>>>>>>>>>>>>> I've browsed the MCMCglmm source to understand how the
>>>>>>>>>>>>>> default starting values are generated,
>>>>>>>>>>>>>> but didn't find any differences with respect to RNG for
>>>>>>>>>>>>>> the two families "ztpoisson" and "zapoisson"
>>>>>>>>>>>>>> (granted, I did not dig very deep).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ruben Arslan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> # bsub -q mpi -W 12:00 -n 41 -R np20 mpirun -H
>>>>>>>>>>>>>> localhost -n 41 R --slave -f
>>>>>>>>>>>>>> "/usr/users/rarslan/rpqa/rpqa_main/rpqa_children_parallel.R"
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> library(doMPI)
>>>>>>>>>>>>>> cl <-
>>>>>>>>>>>>>> startMPIcluster(verbose=T,workdir="/usr/users/rarslan/rpqa/rpqa_main/")
>>>>>>>>>>>>>> registerDoMPI(cl)
>>>>>>>>>>>>>> Children_mcmc1 =
>>>>>>>>>>>>>> foreach(i=1:clusterSize(cl),.options.mpi =
>>>>>>>>>>>>>> list(seed=1337) ) %dopar% {
>>>>>>>>>>>>>> library(MCMCglmm);library(data.table)
>>>>>>>>>>>>>> load("/usr/users/rarslan/rpqa/rpqa1.rdata")
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> nitt = 130000; thin = 100; burnin = 30000
>>>>>>>>>>>>>> prior.m5d.2 = list(
>>>>>>>>>>>>>> R = list(V = diag(c(1,1)), nu = 0.002),
>>>>>>>>>>>>>> G=list(list(V=diag(c(1,1e-6)),nu=0.002))
>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> rpqa.1 = na.omit(rpqa.1[spouses>0, list(idParents,
>>>>>>>>>>>>>> children, male, urban, spouses, paternalage.mean,
>>>>>>>>>>>>>> paternalage.factor)])
>>>>>>>>>>>>>> (m1 = MCMCglmm( children ~ trait * (male + urban +
>>>>>>>>>>>>>> spouses + paternalage.mean + paternalage.factor),
>>>>>>>>>>>>>> rcov=~us(trait):units,
>>>>>>>>>>>>>> random=~us(trait):idParents,
>>>>>>>>>>>>>> family="zapoisson",
>>>>>>>>>>>>>> prior = prior.m5d.2,
>>>>>>>>>>>>>> data=rpqa.1,
>>>>>>>>>>>>>> pr = F, saveX = F, saveZ = F,
>>>>>>>>>>>>>> nitt=nitt,thin=thin,burnin=burnin))
>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> library(coda)
>>>>>>>>>>>>>> mcmclist =
>>>>>>>>>>>>>> mcmc.list(lapply(Children_mcmc1,FUN=function(x) {
>>>>>>>>>>>>>> x$Sol}))
>>>>>>>>>>>>>> save(Children_mcmc1,mcmclist, file =
>>>>>>>>>>>>>> "/usr/users/rarslan/rpqa/rpqa_main/rpqa_mcmc_kids_za.rdata")
>>>>>>>>>>>>>> closeCluster(cl)
>>>>>>>>>>>>>> mpi.quit()
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 04 Aug 2014, at 20:25, Ruben Arslan
>>>>>>>>>>>>>> <rubenarslan at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Dear list,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> would someone be willing to share her or his efforts
>>>>>>>>>>>>>>> in parallelising a MCMCglmm analysis?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I had something viable using harvestr that seemed to
>>>>>>>>>>>>>>> properly initialise
>>>>>>>>>>>>>>> the starting values from different random number
>>>>>>>>>>>>>>> streams (which is desirable,
>>>>>>>>>>>>>>> as far as I could find out), but I ended up being
>>>>>>>>>>>>>>> unable to use harvestr, because
>>>>>>>>>>>>>>> it uses an old version of plyr, where parallelisation
>>>>>>>>>>>>>>> works only for multicore, not for
>>>>>>>>>>>>>>> MPI.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I pasted my working version, that does not do anything
>>>>>>>>>>>>>>> about starting values or RNG
>>>>>>>>>>>>>>> at the end of this email. I can try to fumble further
>>>>>>>>>>>>>>> in the dark or try to update harvestr,
>>>>>>>>>>>>>>> but maybe someone has gone through all this already.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'd also appreciate any tips for elegantly
>>>>>>>>>>>>>>> post-processing such parallel data, as some of my usual
>>>>>>>>>>>>>>> extraction functions and routines are hampered by the
>>>>>>>>>>>>>>> fact that some coda functions
>>>>>>>>>>>>>>> do not aggregate results over chains. (What I get from
>>>>>>>>>>>>>>> a single-chain summary in MCMCglmm
>>>>>>>>>>>>>>> is a bit more comprehensive, than what I managed to
>>>>>>>>>>>>>>> cobble together with my own extraction
>>>>>>>>>>>>>>> functions).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The reason I'm parallelising my analyses is that I'm
>>>>>>>>>>>>>>> having trouble getting a good effective
>>>>>>>>>>>>>>> sample size for any parameter having to do with the
>>>>>>>>>>>>>>> many zeroes in my data.
>>>>>>>>>>>>>>> Any pointers are very appreciated, I'm quite
>>>>>>>>>>>>>>> inexperienced with MCMCglmm.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best wishes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ruben
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> # bsub -q mpi-short -W 2:00 -n 42 -R np20 mpirun -H
>>>>>>>>>>>>>>> localhost -n 41 R --slave -f
>>>>>>>>>>>>>>> "rpqa/rpqa_main/rpqa_children_parallel.r"
>>>>>>>>>>>>>>> library(doMPI)
>>>>>>>>>>>>>>> cl <- startMPIcluster()
>>>>>>>>>>>>>>> registerDoMPI(cl)
>>>>>>>>>>>>>>> Children_mcmc1 = foreach(i=1:40) %dopar% {
>>>>>>>>>>>>>>> library(MCMCglmm)
>>>>>>>>>>>>>>> load("rpqa1.rdata")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> nitt = 40000; thin = 100; burnin = 10000
>>>>>>>>>>>>>>> prior = list(
>>>>>>>>>>>>>>> R = list(V = diag(c(1,1)), nu = 0.002),
>>>>>>>>>>>>>>> G=list(list(V=diag(c(1,1e-6)),nu=0.002))
>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> MCMCglmm( children ~ trait -1 +
>>>>>>>>>>>>>>> at.level(trait,1):male + at.level(trait,1):urban +
>>>>>>>>>>>>>>> at.level(trait,1):spouses +
>>>>>>>>>>>>>>> at.level(trait,1):paternalage.mean +
>>>>>>>>>>>>>>> at.level(trait,1):paternalage.factor,
>>>>>>>>>>>>>>> rcov=~us(trait):units,
>>>>>>>>>>>>>>> random=~us(trait):idParents,
>>>>>>>>>>>>>>> family="zapoisson",
>>>>>>>>>>>>>>> prior = prior,
>>>>>>>>>>>>>>> data=rpqa.1,
>>>>>>>>>>>>>>> pr = F, saveX = T, saveZ = T,
>>>>>>>>>>>>>>> nitt=nitt,thin=thin,burnin=burnin)
>>>>>>>>>>>>>>> }
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> library(coda)
>>>>>>>>>>>>>>> mcmclist =
>>>>>>>>>>>>>>> mcmc.list(lapply(Children_mcmc1,FUN=function(x) {
>>>>>>>>>>>>>>> x$Sol}))
>>>>>>>>>>>>>>> save(Children_mcmc1,mcmclist, file = "rpqa_mcmc_kids_za.rdata")
>>>>>>>>>>>>>>> closeCluster(cl)
>>>>>>>>>>>>>>> mpi.quit()
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Ruben C. Arslan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Georg August University G�ttingen
>>>>>>>>>>>>>>> Biological Personality Psychology and Psychological Assessment
>>>>>>>>>>>>>>> Georg Elias M�ller Institute of Psychology
>>>>>>>>>>>>>>> Go�lerstr. 14
>>>>>>>>>>>>>>> 37073 G�ttingen
>>>>>>>>>>>>>>> Germany
>>>>>>>>>>>>>>> Tel.: +49 551 3920704
>>>>>>>>>>>>>>> https://psych.uni-goettingen.de/en/biopers/team/arslan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [[alternative HTML version deleted]]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>>>>>>> Scotland, with registration number SC005336.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>>>>> Scotland, with registration number SC005336.
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> R-sig-mixed-models at r-project.org mailing list
>>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>>>> Scotland, with registration number SC005336.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>>>> Scotland, with registration number SC005336.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> The University of Edinburgh is a charitable body, registered in
>>>>>> Scotland, with registration number SC005336.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> The University of Edinburgh is a charitable body, registered in
>>>> Scotland, with registration number SC005336.
>>>>
>>>>
>>>
>>>
>>
>>
>> --
>> The University of Edinburgh is a charitable body, registered in
>> Scotland, with registration number SC005336.
>>
>>
>
>
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
More information about the R-sig-mixed-models
mailing list