[R] problem extracting data from a set of list vectors
    Vining, Kelly 
    Kelly.Vining at oregonstate.edu
       
    Wed Apr 18 23:19:36 CEST 2012
    
    
  
Thanks for your help, Milan, in showing me a way to exhibit the structure of my data. 
# Again, my data sets are:
> [1] "res.Callus.Explant" "res.Callus.Regen"   "res.Explant.Regen"
# Structure is:
> str(res.Callus.Explant)
List of 18
 $ name         : chr "two group comparison"
 $ group1       : chr "Callus"
 $ group2       : chr "Explant"
 $ alternative  : chr "two.sided"
 $ rows         : int [1:39009] 1 2 3 4 5 6 7 8 9 10 ...
 $ counts       : num [1:39009, 1:6] 0 121 237 6 7 116 6 2 860 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
  .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
 $ eff.lib.sizes: Named num [1:6] 3120288 2788297 2425164 3653109 3810261 ...
  ..- attr(*, "names")= chr [1:6] "V3" "V4" "V5" "V6" ...
 $ dispersion   : num [1:39009, 1:6] NA 0.0743 0.0434 0.6423 0.3554 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:39009] "POPTR_0018s00200" "POPTR_0008s00200" "POPTR_0004s00200" "POPTR_0019s00200" ...
  .. ..$ : chr [1:6] "Callus_BiolRep1" "Callus_BiolRep2" "Callus_BiolRep3" "Explant_BiolRep1" ...
 $ x            : num [1:6, 1:2] 1 1 1 1 1 1 1 1 1 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:6] "Callus" "Callus" "Callus" "Explant" ...
  .. ..$ : chr [1:2] "Intercept" "Callus-Explant"
 $ beta0        : num [1:2] NA 0
 $ beta.hat     : num [1:39009, 1:2] NA -10.13 -9.65 -13 -12.2 ...
 $ beta.tilde   : num [1:39009, 1:2] NA -10.26 -9.74 -13.11 -12.33 ...
 $ e            : num [1:39009] NA 35.08 58.82 2.03 4.43 ...
 $ e1           : num [1:39009] NA 30.23 53.77 1.78 3.89 ...
 $ e2           : num [1:39009] NA 39.83 64.46 2.27 5.01 ...
 $ log.fc       : num [1:39009] NA 0.398 0.262 0.353 0.366 ...
 $ p.values     : num [1:39009] NA 0.246 0.33 0.748 0.645 ...
 $ q.values     : num [1:39009] NA 1 1 1 1 1 1 1 1 1 ...
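The str() output above shows a plain list whose elements differ in length (single strings, per-gene vectors, per-gene matrices, a 6 x 2 design matrix), so as.data.frame() on the whole object would not work. As a minimal sketch, assuming a small made-up object (toy.res, hypothetical, with 3 genes instead of 39009), one way to see which elements are per-gene and could go into a data frame is:

```r
# Hypothetical toy object mimicking the structure shown by str() above
# (3 genes instead of 39009; all values are made up for illustration).
toy.res <- list(
  name     = "two group comparison",
  group1   = "Callus",
  counts   = matrix(c(0, 121, 237, 6, 7, 116), nrow = 3,
                    dimnames = list(c("geneA", "geneB", "geneC"),
                                    c("Callus_BiolRep1", "Explant_BiolRep1"))),
  log.fc   = c(NA, 0.398, 0.262),
  p.values = c(NA, 0.246, 0.033),
  q.values = c(NA, 1, 0.08)
)

# NROW() treats a plain vector as a one-column matrix, so it returns the
# number of genes for per-gene vectors and per-gene matrices alike.
n.genes  <- nrow(toy.res$counts)
per.gene <- names(toy.res)[sapply(toy.res, NROW) == n.genes]
per.gene
```

Only the elements listed in per.gene have one entry (or one row) per gene; the others describe the comparison as a whole.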
_______________________________________
From: Milan Bouchet-Valat [nalimilan at club.fr]
Sent: Wednesday, April 18, 2012 2:04 PM
To: Vining, Kelly
Cc: r-help at r-project.org
Subject: Re: [R] problem extracting data from a set of list vectors
On Wednesday, 18 April 2012 at 13:13 -0700, Vining, Kelly wrote:
> Dear useRs,
>
> A colleague has sent me several batches of output I need to process, and I'm struggling with the format to the point that I don't even know how to extract a test set to upload here. My apologies, but I think that my issue is straightforward enough (for some of you, not for me!) that you can help in the absence of a test set. Here is the scenario:
>
> # Data sets are lists:
> > ls()
> [1] "res.Callus.Explant" "res.Callus.Regen"   "res.Explant.Regen"
> > is.list(res.Callus.Explant)
> [1] TRUE
>
> # The elements of each list look like this:
> > names(res.Callus.Explant)
>  [1] "name"          "group1"        "group2"        "alternative"   "rows"          "counts"
>  [7] "eff.lib.sizes" "dispersion"    "x"             "beta0"         "beta.hat"      "beta.tilde"
> [13] "e"             "e1"            "e2"            "log.fc"        "p.values"      "q.values"
>
> I want to 1) extract specific fields from this data structure into a data frame, 2) subset from this data frame into a new data frame based on selection criteria. What I've done is this:
>
> all.comps <- ls(pattern="^res")
> for(i in all.comps){
> obj = i;
> gene.ids = rownames(obj$counts);
> x = data.frame(gene.ids = gene.ids, obj$counts, obj$e1, obj$e2, obj$log.fc,
> obj$p.value, obj$q.value);
> DiffGenes.i = subset(x, x$obj.p.value<0.05 | x$obj.q.value<=0.1)
> }
>
> Obviously, this doesn't work because pattern searching in the first line is not feeding the entire data structure into the all.comps variable. But how can I accomplish feeding the whole data structure for each one of these lists into the loop?  Should I be able to use sapply here? If so, how? Also, I suspect that "DiffGenes.i" is not going to give me the data frame I want, which in the example I'm showing would be "DiffGenes.res.Callus.Explant." How should I name output data frames from a loop like this (if a loop is even the best way to do this)?
>
> Any help with this will be greatly appreciated.
You did not tell us exactly how you imported your data, and how your
data sets are structured. str(res.Callus.Explant) would help.
Specifically, I suspect your objects are already data frames, which are
a special case of lists. You can check that with is.data.frame(). If
they aren't, but their elements are all of the same length, you can use
as.data.frame() to convert them to data frames.
Then, you would simply do something like:
sets <- list(res.Callus.Explant, res.Callus.Regen, res.Explant.Regen)
sets <- lapply(sets, subset, p.values<0.05 | q.values<=0.1)
# When I take your suggestion, I get the following error:
> sets <- list(res.Callus.Explant, res.Callus.Regen, res.Explant.Regen)
> sets <- lapply(sets, subset, p.values<0.05 | q.values<=0.1)
Error in subset.default(X[[1L]], ...) : object 'p.values' not found
Merci for your help, but I'm still not quite there...
--Kelly
Hope this helps
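A note on the error above: since the objects are plain lists rather than data frames, subset() falls through to subset.default(), which, as far as I can tell, evaluates the condition in the calling environment instead of inside the list, hence "object 'p.values' not found". The following is only a sketch under stated assumptions, not a tested solution for the real data: the toy objects, make.toy() and diff.genes() are hypothetical names, and the numbers are made up. It fetches the objects themselves with mget(), builds a data frame from the per-gene fields of each, and keeps the results in a named list rather than creating "DiffGenes.*" variables in a loop.

```r
# Hypothetical toy objects standing in for the real res.* lists
# (two genes each; all numbers are made up for illustration).
make.toy <- function(p, q) {
  list(name = "two group comparison",
       counts = matrix(1:4, nrow = 2,
                       dimnames = list(c("geneA", "geneB"), c("rep1", "rep2"))),
       e1 = c(30.2, 53.8), e2 = c(39.8, 64.5),
       log.fc = c(0.398, 0.262),
       p.values = p, q.values = q)
}
res.Callus.Explant <- make.toy(p = c(0.01, 0.60), q = c(0.05, 1))
res.Callus.Regen   <- make.toy(p = c(0.30, 0.70), q = c(1, 1))

# ls() only returns the object names as character strings;
# mget() returns the objects themselves, in a list named after them.
all.comps <- mget(ls(pattern = "^res\\."))

# Helper (hypothetical name): collect the per-gene fields of one result
# list into a data frame, then keep the rows passing either cutoff.
# subset() on a data frame drops rows where the condition is NA.
diff.genes <- function(obj) {
  x <- data.frame(gene.ids = rownames(obj$counts), obj$counts,
                  e1 = obj$e1, e2 = obj$e2, log.fc = obj$log.fc,
                  p.values = obj$p.values, q.values = obj$q.values)
  subset(x, p.values < 0.05 | q.values <= 0.1)
}

# One filtered data frame per comparison, kept together in a named list.
DiffGenes <- lapply(all.comps, diff.genes)
names(DiffGenes)
```

Each result can then be reached by name, for example DiffGenes$res.Callus.Explant or DiffGenes[["res.Callus.Regen"]].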
    
    