[Bioc-sig-seq] Fwd: Re: Question about scanBam() with AB SOLID data

Martin Morgan mtmorgan at fhcrc.org
Fri Nov 12 00:03:05 CET 2010


On 11/11/2010 02:25 PM, ivan.borozan at utoronto.ca wrote:
> The Bioscope pipeline was used to align the data. To get the reads in
> color space and their qualities, the CS and CQ tags should be
> specified, respectively:
> 
> param <- ScanBamParam(tag=c("CS"))
> 
> and
> 
> param <- ScanBamParam(tag=c("CQ"))

Hi Ivan -- I'm a little confused about where this leaves you. Can you
actually read the sequences / quality strings into R? Are they
represented in sequence space? Is there a BAM file available somewhere
to experiment with?
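
A minimal sketch of one way to check this, assuming the Rsamtools
package is installed and that the aligner wrote CS and CQ tags into the
file. The file name solid_reads.bam is a placeholder, not a file from
this thread. Both tags are requested in a single ScanBamParam, so one
scanBam() call returns them next to the nucleotide-space fields:

```r
library(Rsamtools)

# Ask for the standard fields plus the two SOLiD tags in one go:
# CS = color-space read sequence, CQ = color-space read quality.
param <- ScanBamParam(what = c("qname", "cigar", "seq", "qual"),
                      tag = c("CS", "CQ"))

# Placeholder path (assumption); point this at a real BAM file.
bam_file <- "solid_reads.bam"

if (file.exists(bam_file)) {
    res <- scanBam(bam_file, param = param)[[1]]
    print(head(res$tag$CS))        # color-space reads, if the tag is present
    print(head(res$tag$CQ))        # color-space qualities, if present
    print(head(nchar(res$tag$CS))) # lengths of the color-space strings
    print(head(width(res$seq)))    # lengths of the nucleotide-space SEQ
    print(head(res$cigar))         # CIGAR strings, e.g. to look for clipping
}
```

If the SEQ widths come back shorter than the CS strings, the CIGAR
strings are one place to look for clipping; whether that is what
happens with these data is not established in this thread.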

Martin

> 
> Best,
> 
> Ivan
> 
> Quoting James MacDonald <jmacdon at med.umich.edu>:
> 
>> What aligner are you using that returns the sequences in color  
>> space? The SAM format specifies that:
>>
>> "Color alignments are stored as normal nucleotide alignments with  
>> additional tags describing the raw color sequences, ..."
>>
>> So in general I wouldn't expect the seq to be color space, but
>> nucleotide space. Depending on the aligner, you may get a CS:Z: tag
>> of color space sequence, but I don't believe scanBam will parse that.
>>
>> Best,
>>
>> Jim
>> -- 
>>
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>>
>>>>> <ivan.borozan at utoronto.ca> wrote:
>>> Hello list,
>>>
>>> Can scanBam() be used with AB SOLID data (bam files) so that it can
>>> return sequences in color space and with the right lengths?
>>>
>>> My reads are 50 bp long; however, scanBam() is returning sequences
>>> of length 25 to 27 (they seem to be clipped) that are not in
>>> color space.
>>>
>>> Many thanks for any suggestions,
>>>
>>> Ivan
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
> 
> ----- End forwarded message -----


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


