[Bioc-sig-seq] scanBam Error

Dario Strbenac D.Strbenac at garvan.org.au
Mon Dec 13 23:00:37 CET 2010


Hi,

Yes, that works fine, thanks. It must've been a size issue I was having.

---- Original message ----
>Date: Mon, 13 Dec 2010 17:31:24 +1000
>From: Paul Leo <p.leo at uq.edu.au>  
>Subject: Re: [Bioc-sig-seq] scanBam Error  
>To: D.Strbenac at garvan.org.au
>Cc: bioc-sig-sequencing at r-project.org
>
>   Do you need all the sequence data at once?
>
>   Instead of using a smaller BAM file, can you read in
>   a smaller portion of your large BAM file?
>
>   data.gr <- GRanges(seqnames = paste("chr", 13, sep = ""),
>                      ranges = IRanges(start = 28608234, end = 28608363),
>                      strand = "+")
>
>   which <- data.gr
>   params <- ScanBamParam(
>       which = which,
>       flag = scanBamFlag(isUnmappedQuery = FALSE, isDuplicate = NA,
>                          isValidVendorRead = TRUE),
>       simpleCigar = FALSE, reverseComplement = FALSE,
>       what = c("qname", "flag", "rname", "seq", "strand", "pos", "mpos",
>                "qwidth", "cigar", "qual", "mapq", "isize", "mrnm"),
>       tag = "RG")  # change to what you want
>   aln1 <- scanBam("HS1808.bam", param = params)
>
>   aln1[[1]]
>
>   That should work fine?
>
>--                                                                     
>Dr Paul Leo                                                            
>Bioinformatician                                                       
>UQ Diamantina Institute for Cancer, Immunology and Metabolic Medicine  
>---------------------------------------------------------------------  
>Level 4, R Wing                                                        
>Princess Alexandra Hospital                                            
>Ipswich Rd                                                             
>Woolloongabba QLD 4102                                                 
>Tel: +61 7 3240 7740  Mob: 041 303 8691  Fax: +61 7 3240 5946          
>Email: p.leo at uq.edu.au   Web: http://www.di.uq.edu.au                  
>
>   -----Original Message-----
>   From: Dario Strbenac <D.Strbenac at garvan.org.au>
>   Reply-to: D.Strbenac at garvan.org.au
>   To: bioc-sig-sequencing at r-project.org
>   Subject: Re: [Bioc-sig-seq] scanBam Error
>   Date: Mon, 13 Dec 2010 17:15:38 +1100
>
> I tried it out by making a smaller BAM file with only reads from one chromosome, and it worked fine. The full BAM file is 4 GB and has 75 million reads in it. Could the size be a problem? Could you test a BAM file of this size on your end, without me sending you one that big? Also, the error is different after I put the ScanBamParam in the right spot:
>
> Error in .Call(func, file, index, "rb", NULL, flag, simpleCigar, ...) :
>   negative length vectors are not allowed
>
> Integer overflow somewhere, maybe?
>
> - Dario.
>
> ---- Original message ----
> >Date: Sun, 12 Dec 2010 20:59:23 -0800
> >From: Martin Morgan <mtmorgan at fhcrc.org> 
> >Subject: Re: [Bioc-sig-seq] scanBam Error 
> >To: D.Strbenac at garvan.org.au
> >Cc: bioc-sig-sequencing at r-project.org
> >
> >On 12/12/2010 08:00 PM, Dario Strbenac wrote:
> >> Hello,
> >>
> >
> >> I'm having trouble reading in a BAM file when "seq" is one of the
> >> strings passed to the what argument of ScanBamParam. If it's not, then
> >> the reading completes successfully. I don't understand what the
> >> error means. It is:
> >>
> >> Error in .io_bam(.scan_bam, file, index, reverseComplement, tmpl, param = param) :
> >>   INTEGER() can only be applied to a 'integer', not a 'closure'
> >>
> >> The traceback is:
> >>
> >>> traceback()
> >> 4: .Call(func, file, index, "rb", NULL, flag, simpleCigar, ...)
> >> 3: .io_bam(.scan_bam, file, index, reverseComplement, tmpl, param = param)
> >> 2: scanBam("HS1808.bam", flag = ScanBamFlag(isDuplicate = FALSE),
> >>        param = ScanBamParam(reverseComplement = TRUE, what = c("rname",
> >>            "strand", "pos", "seq")))
> >> 1: scanBam("HS1808.bam", flag = ScanBamFlag(isDuplicate = FALSE),
> >>        param = ScanBamParam(reverseComplement = TRUE, what = c("rname",
> >>            "strand", "pos", "seq")))
> >>
> >> and the environment is:
> >>
> >> R version 2.12.0 (2010-10-15)
> >> Platform: x86_64-pc-mingw32/x64 (64-bit)
> >>
> >> locale:
> >> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252 LC_NUMERIC=C                       LC_TIME=English_Australia.1252   
> >>
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base    
> >>
> >> other attached packages:
> >> [1] Rsamtools_1.2.1     Biostrings_2.18.0   GenomicRanges_1.2.0 IRanges_1.8.2     
> >>
> >> loaded via a namespace (and not attached):
> >> [1] Biobase_2.8.0
> >
> >Hi Dario -- this is some kind of error in Rsamtools' C code, but I'm not
> >able to reproduce it on my end so can't track it down. Is there any way
> >of producing and sharing with me an example file that has this problem?
> >
> >One thing (not causing the bug) in your traceback is that 'flag' should
> >be an argument to ScanBamParam; as it is I think it is being silently
> >ignored.
> >
> >Martin
> >
> >>
> >> --------------------------------------
> >> Dario Strbenac
> >> Research Assistant
> >> Cancer Epigenetics
> >> Garvan Institute of Medical Research
> >> Darlinghurst NSW 2010
> >> Australia
> >>
> >> _______________________________________________
> >> Bioc-sig-sequencing mailing list
> >> Bioc-sig-sequencing at r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> >
> >
> >--
> >Computational Biology
> >Fred Hutchinson Cancer Research Center
> >1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
> >
> >Location: M1-B861
> >Telephone: 206 667-2793
>


--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia


