[Bioc-sig-seq] size of DNAString object
Hans-Ulrich Klein
h.klein at uni-muenster.de
Thu May 27 17:36:30 CEST 2010
Thank you! That helped me a lot.
A further question: Is there any way to access the complete DNAStringSet
("ss" in my example) after I have removed it using the rm() function? If
not, keeping the complete DNAStringSet in memory does not make much sense
to me.
Thank you,
Hans-Ulrich
Vincent Carey wrote:
> See the information on the compact() method in the XStringSet-class
> help page (package:Biostrings R documentation).
>
> To rationalize this, you need to think about the difference between a
> view and a concrete instance. Typically you do not want a copy to be
> made for each view.
>
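To make the view versus concrete-instance distinction tangible, here is a
minimal sketch, assuming a current Biostrings installation. The random reads
are a hypothetical stand-in for the sequences returned by scanBam(), and the
variable name "small" is only illustrative. Exact sizes depend on the R and
Biostrings versions, so none are shown.

```r
library(Biostrings)

# Hypothetical stand-in for the BAM reads: 20000 random 128-letter sequences.
set.seed(1)
reads <- replicate(
    20000,
    paste(sample(c("A", "C", "G", "T"), 128, replace = TRUE), collapse = "")
)
ss <- DNAStringSet(reads)

# Extracting one element gives a view onto the data shared with 'ss',
# so object.size() is expected to count the whole shared block.
dnaS <- ss[[5]]
print(object.size(dnaS), units = "Kb")

# compact() copies only the letters that 'dnaS' actually uses into a
# new, self-contained object.
small <- compact(dnaS)
print(object.size(small), units = "Kb")

# The sequence itself is unchanged by compacting.
stopifnot(as.character(small) == as.character(dnaS))
```

If only a few sequences are needed, compacting them (or converting them with
as.character()) before saving should keep the file on disk small as well.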
> On Thu, May 27, 2010 at 10:21 AM, Hans-Ulrich Klein
> <h.klein at uni-muenster.de> wrote:
>
> Hi all,
>
> I observed that some DNAString (and also DNAStringSet) objects
> are too large after subsetting:
>
> > library("Rsamtools")
> > parameters = ScanBamParam()
> > bam = scanBam("data/N01.bam", param=parameters)
> > ss = bam[[1]]$seq
> > ss
> A DNAStringSet instance of length 230980
> [...]
> > print(object.size(ss), units="Mb")
> 83.3 Mb
> > dnaS = ss[[5]]
> > dnaS
> 128-letter "DNAString" instance
> seq:
> TAGCGTGGATACAGAGGGACATCTATTGACCAGCTA...AAAGTTGTGCTTTATTTGATGAATAAGTATTGAACA
> > print(object.size(dnaS), units="Mb")
> 80.7 Mb
> > print(object.size(as.character(dnaS)), units="Kb")
> 0.2 Kb
>
> When I write the 128-letter DNAString to disk, the file is still
> quite large (~20 Mb).
>
> Best wishes,
> Hans-Ulrich
>
>
>
>
> > sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-pc-linux-gnu
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] Rsamtools_1.0.1 Biostrings_2.16.2 GenomicRanges_1.0.1
> [4] IRanges_1.6.4
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.8.0
>
>
> --
> Hans-Ulrich Klein
> Department of Medical Informatics and Biomathematics
> University of Münster
> Domagkstrasse 9
> 48149 Münster, Germany
> Tel.: +49 (0)251 83-58405
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
--
Hans-Ulrich Klein
Department of Medical Informatics and Biomathematics
University of Münster
Domagkstrasse 9
48149 Münster, Germany
Tel.: +49 (0)251 83-58405