[R] Parallelizing GBM
Lorenzo Isella
lorenzo.isella at gmail.com
Sun Mar 24 14:28:24 CET 2013
Thanks a lot for the quick answer.
However, from what I see, the parallelization affects only the
cross-validation part in the gbm interface (but it changes nothing when
you call gbm.fit).
Am I missing anything here?
Is there any fundamental reason why gbm.fit cannot be parallelized?
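
To make the point about cross-validation concrete, here is a minimal
sketch. It assumes a gbm version whose gbm() function exposes the
n.cores argument (as in the parallel branch linked below). The toy data
frame and its column names are made up purely for illustration. Only
the cv.folds loop is spread across cores; each individual boosting fit
still runs serially.

```r
library(gbm)

set.seed(1)

# Hypothetical toy data, only for illustration
n <- 500
toy <- data.frame(x1 = runif(n), x2 = runif(n))
toy$y <- toy$x1 + 2 * toy$x2 + rnorm(n, sd = 0.1)

# The formula interface gbm() runs the cross-validation folds in parallel.
# n.cores = 4 mirrors the 4-core machine mentioned in the original question.
fit <- gbm(y ~ x1 + x2,
           data = toy,
           distribution = "gaussian",
           n.trees = 50,
           interaction.depth = 5,
           n.minobsinnode = 10,
           shrinkage = 0.001,
           bag.fraction = 0.5,
           cv.folds = 4,   # one fold per core
           n.cores = 4,
           verbose = FALSE)

# Cross-validated error, one value per boosting iteration
head(fit$cv.error)
```

Calling gbm.fit directly skips this cross-validation step, which is
consistent with seeing no change there.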
Lorenzo
On Sun, 24 Mar 2013 12:45:39 +0100, Max Kuhn <mxkuhn at gmail.com> wrote:
> See this:
>
> https://code.google.com/p/gradientboostedmodels/issues/detail?id=3
>
> and this:
>
> https://code.google.com/p/gradientboostedmodels/source/browse/?name=parallel
>
> Max
>
> On Sun, Mar 24, 2013 at 7:31 AM, Lorenzo Isella
> <lorenzo.isella at gmail.com> wrote:
>
>> Dear All,
>>
>> I am far from being a guru about parallel programming.
>>
>> Most of the time, I rely on randomForest for data mining large datasets.
>>
>> I would also like to try the gradient boosted methods in GBM,
>> but I need parallelization.
>>
>> I normally rely on gbm.fit for speed reasons, and I usually call it
>> this way
>>
>> gbm_model <- gbm.fit(trainRF, prices_train,
>>                      offset = NULL,
>>                      misc = NULL,
>>                      distribution = "multinomial",
>>                      w = NULL,
>>                      var.monotone = NULL,
>>                      n.trees = 50,
>>                      interaction.depth = 5,
>>                      n.minobsinnode = 10,
>>                      shrinkage = 0.001,
>>                      bag.fraction = 0.5,
>>                      nTrain = (n_train/2),
>>                      keep.data = FALSE,
>>                      verbose = TRUE,
>>                      var.names = NULL,
>>                      response.name = NULL)
>>
>> Does anybody know an easy way to parallelize the model (in this case it
>> means simply having 4 cores on the same machine working on the
>> problem)?
>>
>> Any suggestion is welcome.
>>
>> Cheers
>>
>>
>>
>> Lorenzo
>
>
>
> --
> Max