[R] How to split a factor (unique identifier) into several others?

Tribo Laboy tribolaboy at gmail.com
Fri Feb 8 15:33:58 CET 2008


Hi Greg,

The short example you gave cleared it up. I still have some issues
with getting used to R indexing. I was desperately trying to do:

> zzz <- rbind(fctrs_list[1], fctrs_list[2])

and was getting:

> zzz
     [,1]
[1,] Character,3
[2,] Character,3

instead of what I actually wanted:

> zzz <- rbind(fctrs_list[[1]], fctrs_list[[2]])
> zzz
     [,1]      [,2]        [,3]
[1,] "Sample1" "condition1" "place1"
[2,] "Sample1" "condition1" "place2"
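
For the archives, the difference between the two calls can be checked in a few lines. This is only a minimal sketch: the two input strings are hypothetical stand-ins for my real data, chosen so that "fctrs_list" has the same shape (a list of character vectors from strsplit).

```r
# Hypothetical stand-in for fctrs_list: a list of character vectors,
# as returned by strsplit().
fctrs_list <- strsplit(c("sample1_condition1_place1",
                         "sample1_condition1_place2"), "_")

# Single brackets keep the list wrapper: the result is a list of length 1.
class(fctrs_list[1])      # "list"

# Double brackets extract the element itself: a character vector.
class(fctrs_list[[1]])    # "character"

# rbind() on two one-element lists gives a 2 x 1 matrix of mode list,
# which prints as "Character,3" in each cell.
bad <- rbind(fctrs_list[1], fctrs_list[2])
dim(bad)                  # 2 1

# rbind() on the vectors themselves gives a 2 x 3 character matrix.
good <- rbind(fctrs_list[[1]], fctrs_list[[2]])
dim(good)                 # 2 3
```

So "[" always returns a (sub)list, and "[[" is what pulls out the character vector that rbind can lay out as a row.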

Thanks for the help, both to you and to Dimitris.

Regards,
TL

On Thu, Feb 7, 2008 at 7:02 PM, Greg Snow <Greg.Snow at imail.org> wrote:
> The essence of do.call is to call the named function (rbind in this
>  case) with the elements of the list as its arguments.
>
>  In this case with a list without named elements the following:
>
>  > do.call('myfunction',mylist)
>
>  Is equivalent to
>
>  > myfunction( mylist[[1]], mylist[[2]], mylist[[3]], ..., mylist[[n]] )
>
>  With the ... replaced by however many additional elements there are (you
>  can see how it can save lots of typing).
>
>  So using rbind, it just rbinds together the elements of the list, or
>  uses each element (the split from the original strings) as a row of a
>  new object, in this case a matrix.  The as.data.frame then converts the
>  columns to factors.
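
The equivalence described above can be checked directly at the prompt. The following is a minimal sketch using the first three strings from the original question. The column names and the explicit stringsAsFactors = TRUE are assumptions added here: the default for stringsAsFactors has differed between R versions (it is FALSE since R 4.0.0), so it may be safer to ask for factors explicitly.

```r
# Example input taken from the original question.
vals <- strsplit(c("sample1_condition1_place1",
                   "sample2_condition1_place1",
                   "sample3_condition1_place1"), "_")

# do.call() passes each list element as a separate argument ...
via_do_call <- do.call("rbind", vals)

# ... so it should match the call written out by hand.
by_hand <- rbind(vals[[1]], vals[[2]], vals[[3]])
identical(via_do_call, by_hand)

# Convert the character matrix to a data frame with factor columns.
# The column names below are illustrative, not from the thread's code.
df <- as.data.frame(via_do_call, stringsAsFactors = TRUE)
names(df) <- c("sample", "condition", "place")
str(df)
```

With a list of n elements the hand-written call would need n arguments, which is where do.call saves the typing.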
>
>  Does this help the understanding?
>
>  --
>  Gregory (Greg) L. Snow Ph.D.
>  Statistical Data Center
>  Intermountain Healthcare
>  greg.snow at imail.org
>  (801) 408-8111
>
>
>
>
>
>  > -----Original Message-----
>  > From: r-help-bounces at r-project.org
>  > [mailto:r-help-bounces at r-project.org] On Behalf Of Tribo Laboy
>  > Sent: Thursday, February 07, 2008 2:33 AM
>  > To: Dimitris Rizopoulos
>  > Cc: r-help at r-project.org
>  > Subject: Re: [R] How to split a factor (unique identifier)
>  > into several others?
>  >
>  > Hi Dimitris,
>  >
>  >
>  > Your code works like a charm, but I don't really understand
>  > how. If you have some time, I'd appreciate it if you could explain
>  > some more.
>  >
>  > The contents of "vals" in your example is equivalent to the
>  > contents of "splitfctr" in mine.
>  >
>  > "as.data.frame" is quite clear, but "do.call("rbind", vals)"
>  > has me puzzled.
>  >
>  > I checked the "do.call" help, but I could not replicate the
>  > results on the command line by directly using "rbind".
>  >
>  > If I had to do it by directly using "rbind" can you show me
>  > how to do it?
>  >
>  >
>  > I really appreciate your help.
>  >
>  >
>  > In the meantime I came up with another solution, which is
>  > much more clunky than yours, but at least I can understand
>  > how it works. I am putting it here, just as an additional
>  > thing for the archives.
>  >
>  > After "splitfctr" (or "vals" in Dimitris' example) is obtained,
>  >
>  > I use the "unlist" function on the list and then make new
>  > factors like this:
>  >
>  > all_fctrs <- unlist(splitfctr)
>  > sample_fctr <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
>  > condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
>  > place_fctr <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])
>  >
>  > then I bundle the factors into the data frame by "cbind".
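
One caveat on the bundling step, as a hedged sketch rather than a correction: if cbind is given an existing data frame as one of its arguments, the factors are kept as factors, but cbind on bare factors alone returns a matrix of their integer codes. Building the result with data.frame avoids that. The column names below are assumptions chosen for illustration.

```r
# Rebuild the three factors from the thread's example data.
splitfctr <- strsplit(c("sample1_condition1_place1",
                        "sample2_condition1_place1",
                        "sample3_condition1_place1"), "_")
all_fctrs <- unlist(splitfctr)
sample_fctr    <- factor(all_fctrs[seq(1, length(all_fctrs), 3)])
condition_fctr <- factor(all_fctrs[seq(2, length(all_fctrs), 3)])
place_fctr     <- factor(all_fctrs[seq(3, length(all_fctrs), 3)])

# cbind() on bare factors drops the labels: the result is an
# integer matrix holding the underlying level codes.
codes <- cbind(sample_fctr, condition_fctr, place_fctr)
is.matrix(codes)

# data.frame() keeps each column as a factor with its labels.
df <- data.frame(sample    = sample_fctr,
                 condition = condition_fctr,
                 place     = place_fctr)
str(df)
```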
>  >
>  >
>  > Thanks for the help.
>  >
>  > TL
>  >
>  >
>  >
>  > On Thu, Feb 7, 2008 at 5:20 PM, Dimitris Rizopoulos
>  > <dimitris.rizopoulos at med.kuleuven.be> wrote:
>  > > try the following:
>  > >
>  > >  dat <- data.frame(x = c("sample1_condition1_place1",
>  > >     "sample2_condition1_place1", "sample3_condition1_place1",
>  > >     "sample1_condition2_place1", "sample1_condition2_place1"))
>  > >
>  > >  vals <- strsplit(as.character(dat$x), "_")
>  > > as.data.frame(do.call("rbind", vals))
>  > >
>  > >
>  > >  I hope it helps.
>  > >
>  > >  Best,
>  > >  Dimitris
>  > >
>  > >  ----
>  > >  Dimitris Rizopoulos
>  > >  Ph.D. Student
>  > >  Biostatistical Centre
>  > >  School of Public Health
>  > >  Catholic University of Leuven
>  > >
>  > >  Address: Kapucijnenvoer 35, Leuven, Belgium
>  > >  Tel: +32/(0)16/336899
>  > >  Fax: +32/(0)16/337015
>  > >  Web: http://med.kuleuven.be/biostat/
>  > >      http://www.student.kuleuven.be/~m0390867/dimitris.htm
>  > >
>  > >
>  > >
>  > >
>  > >  ----- Original Message -----
>  > >  From: "Tribo Laboy" <tribolaboy at gmail.com>
>  > >  To: <r-help at r-project.org>
>  > >  Sent: Thursday, February 07, 2008 7:44 AM
>  > >  Subject: [R] How to split a factor (unique identifier) into several
>  > >  others?
>  > >
>  > >
>  > >  > Hello,
>  > >  >
>  > >  > I have a data frame with a factor column, which uniquely identifies
>  > >  > the observations in the data frame and it looks like this:
>  > >  >
>  > >  > sample1_condition1_place1
>  > >  > sample2_condition1_place1
>  > >  > sample3_condition1_place1
>  > >  > .
>  > >  > .
>  > >  > .
>  > >  > sample3_condition3_place3
>  > >  >
>  > >  > I want to turn it into three separate factor columns "sample",
>  > >  > "condition" and "place".
>  > >  >
>  > >  > This is what I did so far:
>  > >  >
>  > >  > # generate a factor column for the example
>  > >  > fctr <- factor(c("sample1_condition1_place1",
>  > >  > "sample2_condition1_place1", "sample3_condition1_place1"))
>  > >  > splitfctr <- strsplit(as.character(fctr),"_")
>  > >  >
>  > >  > > splitfctr
>  > >  > [[1]]
>  > >  > [1] "sample1"    "condition1" "place1"
>  > >  >
>  > >  > [[2]]
>  > >  > [1] "sample2"    "condition1" "place1"
>  > >  >
>  > >  > [[3]]
>  > >  > [1] "sample3"    "condition1" "place1"
>  > >  >
>  > >  >
>  > >  > Now this is all fine, but how do I make three separate factors of
>  > >  > this?
>  > >  > The object "splitfctr" is a list of character vectors, each character
>  > >  > vector being composed of the words after splitting the long original
>  > >  > word.
>  > >  > Now I want to form new character vectors, which contain the first
>  > >  > component of each list entry, then another vector for the second
>  > >  > component, etc.
>  > >  > I don't want to use loops, unless that's the only way to do it. I guess
>  > >  > I have some difficulty with understanding how R indexing works...
>  > >  >
>  > >  > ______________________________________________
>  > >  > R-help at r-project.org mailing list
>  > >  > https://stat.ethz.ch/mailman/listinfo/r-help
>  > >  > PLEASE do read the posting guide
>  > >  > http://www.R-project.org/posting-guide.html
>  > >  > and provide commented, minimal, self-contained, reproducible code.
>  > >  >
>  > >
>  > >
>  > >  Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
>  > >
>  > >
>  >
>  >
>
>


