[R] Using cv.tree to assign cases to specific cv-groups
jshuter at uoguelph.ca
Fri Feb 8 22:07:34 CET 2008
Hello,
I would like to use cv.tree to run a 10-fold cross-validation
experiment on a tree object to help me choose a tree size.
Many users seem to allow their cases to be assigned to CV groups
randomly, but I have assigned each case to one of 10 cv groups, such
that the data from each of my experimental units is included in only
one cv-group.
According to the manual for the tree package (Ripley 2007), the
cv.tree argument "rand" [cv.tree(object, rand, FUN = prune.tree,
K=10)] optionally takes an integer vector, with length equal to the
number of cases used to create object, that assigns the cases to
different groups for cross-validation.
However, after searching the R-archives and various online sources, I
have been unable to find an example of code in which someone has
exercised this option, so I am unsure how to proceed.
Specifically, should I:
1. Create a 1-column data frame, with each case containing a number
from 1-10, in the same order as the cases in the original dataset
used to generate the tree object, and
2. Pass that data frame to the rand argument when I call cv.tree?
OR should I:
1. List the integers used for case assignment directly in the call
to cv.tree, as the value of the rand argument?
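To make the two options concrete, here is a minimal sketch of what I have in mind, not a confirmed recipe. It assumes the tree package is installed, and the data frame dat, its columns (unit, x1, x2, y), and the vector cv_group are all made up for illustration. Because the manual describes rand as an integer vector, the sketch passes a plain vector; a 1-column data frame would presumably need its column extracted first (e.g. groups_df[[1]]).

```r
library(tree)  # assumes the tree package is installed

set.seed(1)

# Hypothetical data: 20 experimental units with 5 cases each
dat <- data.frame(
  unit = rep(1:20, each = 5),
  x1   = rnorm(100),
  x2   = rnorm(100)
)
dat$y <- 2 * dat$x1 + ifelse(dat$x2 > 0, 1, -1) + rnorm(100, sd = 0.5)

fit <- tree(y ~ x1 + x2, data = dat)

# Give each unit one label from 1..10, then expand to one label per case,
# in the same row order as the data used to fit the tree
unit_ids   <- unique(dat$unit)
unit_group <- rep_len(1:10, length(unit_ids))
cv_group   <- unit_group[match(dat$unit, unit_ids)]

# Pass the integer vector itself as rand
cv_fit <- cv.tree(fit, rand = cv_group, FUN = prune.tree)

cv_fit$size  # candidate tree sizes
cv_fit$dev   # cross-validated deviance for each size
```

With no missing values in the fitting data, cv_group has one entry per row of dat. If rows were dropped for missing values when the tree was fitted, the vector would presumably have to match only the cases actually used.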
If anyone has any experience using cv.tree (or another function) to
assign specific cv-groups, any advice would be greatly appreciated!
Jen Shuter
University of Guelph