[Bioc-sig-seq] Genominator: strategy for combining multiple AlignedRead objects

joseph franklin joseph.franklin at yale.edu
Mon Apr 19 17:06:26 CEST 2010


I'm addressing this to Jim Bullard, who has been really helpful in answering some of my questions, and also to the list, in case anyone else has advice for me.

I've started using Genominator (I'm using the release version right now) to quantitate and analyze RNA-seq data, and have been really successful aggregating AlignedRead objects with my own annotation tables to produce per-gene counts.  I've done this with sets of 2-3 AlignedRead objects (each representing an Illumina lane), but I'd like to extend the approach to a few dozen lanes.  Since this is far too much data to fit in memory, I need an efficient way to combine many AlignedRead objects that doesn't rely on all of them being loaded in memory at the same time.

I imagine that I need to load the objects into tables using importFromAlignedReads, and then join the appropriate columns, either before or after aggregation (the manual hints that joining afterwards is preferable).  However, there are a few points I'm confused about (probably resulting from my limited experience with SQLite):

- I've been unable to load a SQLite database file that was previously created with importFromAlignedReads.  What is the best way to re-establish the database connection, for instance in a new R session?

- Can AlignedRead objects only be imported (via importFromAlignedReads) as named lists of two or more objects?  What about single AlignedRead objects?  I would imagine that a solution to my problem would be to create a separate table in a database file for each of my AlignedRead objects (I made a loop to do this), and then join these tables (as long as I can create a connection to the database).

I think my problems could be solved if I could load the AlignedRead objects from multiple lanes into tables in a single database file, reopen that file later, and join the appropriate columns from the various tables (and then aggregate with the annotations in a single step--this would seem to be the most straightforward).  Any advice on accomplishing these steps would be much appreciated.
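
For concreteness, the workflow described above can be sketched at the plain SQLite level. This is only an illustration under stated assumptions: it is written in Python because its standard library bundles sqlite3, it does not use Genominator, and the table names, column names, and toy reads are all hypothetical rather than Genominator's actual schema. It writes one table per lane to a database file, closes the connection, reopens the file as a new session would, aggregates each lane against the annotation, and joins only the small per-gene results.

```python
import os
import sqlite3
import tempfile

# Hypothetical toy data: each lane is a list of (chr, strand, location) reads.
lanes = {
    "lane1": [(1, 1, 100), (1, 1, 100), (1, -1, 250), (2, 1, 40)],
    "lane2": [(1, 1, 100), (2, 1, 40), (2, 1, 40)],
}
# Hypothetical annotation: (gene, chr, gene_start, gene_end).
genes = [("geneA", 1, 50, 300), ("geneB", 2, 1, 100)]

db_path = os.path.join(tempfile.mkdtemp(), "reads.db")

# "Session 1": write one table per lane, one lane at a time, then close.
con = sqlite3.connect(db_path)
for name, reads in lanes.items():
    con.execute(
        f"CREATE TABLE {name} (chr INTEGER, strand INTEGER, location INTEGER)"
    )
    con.executemany(f"INSERT INTO {name} VALUES (?, ?, ?)", reads)
con.execute(
    "CREATE TABLE genes (gene TEXT, chr INTEGER, gene_start INTEGER, gene_end INTEGER)"
)
con.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", genes)
con.commit()
con.close()

# "Session 2": reconnect to the existing file and discover the lane tables.
con = sqlite3.connect(db_path)
lane_tables = [
    row[0]
    for row in con.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name LIKE 'lane%' ORDER BY name"
    )
]


def gene_counts(con, lane):
    """Count the reads of one lane table that fall inside each annotated gene."""
    query = f"""
        SELECT g.gene, COUNT(r.location)
        FROM genes AS g
        LEFT JOIN {lane} AS r
          ON r.chr = g.chr
         AND r.location BETWEEN g.gene_start AND g.gene_end
        GROUP BY g.gene
    """
    return dict(con.execute(query).fetchall())


# Aggregate each lane separately, then join the per-gene results in memory.
counts = {lane: gene_counts(con, lane) for lane in lane_tables}
con.close()

per_gene = {
    gene: [counts[lane][gene] for lane in lane_tables]
    for gene in sorted(counts[lane_tables[0]])
}
print(per_gene)
```

Aggregating each lane first and joining afterwards keeps the joined result at one row per gene, which is in line with the manual's hint that joining after aggregation is preferable.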

Thanks again,
Joe Franklin

________________________________
Joseph Franklin
Department of Cell Biology
Yale University
295 Congress Ave, BCMM 137
New Haven, CT 06519
USA


