[Bioc-sig-seq] Fwd: Re: Question about scanBam() with AB SOLiD data
Martin Morgan
mtmorgan at fhcrc.org
Fri Nov 12 00:03:05 CET 2010
On 11/11/2010 02:25 PM, ivan.borozan at utoronto.ca wrote:
> The BioScope pipeline was used to align the data. To get the reads in
> color space and their qualities, the CQ (color quality) and CS (color
> sequence) tags should be requested:
>
> param <- ScanBamParam(tag="CQ")
>
> and
>
> param <- ScanBamParam(tag="CS")
>
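
The two tag requests above can also be combined into one parameter
object. What follows is only a minimal sketch, assuming the Rsamtools
package is installed; the helper name read_color_tags and the file name
in the example call are hypothetical and not part of the thread. Whether
the CS and CQ tags are present at all depends on the aligner.

```r
# Sketch only: read the SOLiD color-space tags from a BAM file.
# 'read_color_tags' is a hypothetical helper name, not an Rsamtools function.
read_color_tags <- function(bam_path) {
    stopifnot(file.exists(bam_path))
    # Request both optional tags in one parameter object:
    # CS = color read sequence, CQ = color read quality (SAM format tags).
    param <- Rsamtools::ScanBamParam(
        what = c("qname", "seq"),
        tag = c("CS", "CQ")
    )
    # scanBam() returns a list with one element per requested range;
    # without a 'which' argument there is a single element.
    res <- Rsamtools::scanBam(bam_path, param = param)[[1]]
    list(
        qname = res$qname,
        seq   = res$seq,     # nucleotide-space sequence
        CS    = res$tag$CS,  # may be empty if the aligner wrote no CS tags
        CQ    = res$tag$CQ   # may be empty if the aligner wrote no CQ tags
    )
}

# Example call (hypothetical path):
# tags <- read_color_tags("solid_reads.bam")
# head(tags$CS)
```

Comparing the widths of tags$seq with the lengths of the CS strings
would show whether the nucleotide sequences are shorter than the raw
color reads.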
Hi Ivan -- I'm a little confused about where this leaves you. Can you
actually read the sequences / quality strings into R? Are they
represented in sequence space? Is there a BAM file available somewhere
to experiment with?
Martin
>
> Best,
>
> Ivan
>
> Quoting James MacDonald <jmacdon at med.umich.edu>:
>
>> What aligner are you using that returns the sequences in color
>> space? The SAM format specifies that:
>>
>> "Color alignments are stored as normal nucleotide alignments with
>> additional tags describing the raw color sequences, ..."
>>
>> So in general I wouldn't expect the seq to be color space, but
>> nucleotide space. Depending on the aligner, you may get a CS:Z: tag
>> of color space sequence, but I don't believe scanBam will parse that.
>>
>> Best,
>>
>> Jim
>> --
>>
>> James W. MacDonald, M.S.
>> Biostatistician
>> Douglas Lab
>> University of Michigan
>> Department of Human Genetics
>> 5912 Buhl
>> 1241 E. Catherine St.
>> Ann Arbor MI 48109-5618
>> 734-615-7826
>>
>>
>>>>> <ivan.borozan at utoronto.ca> wrote:
>>> Hello list,
>>>
>>> Can scanBam() be used with AB SOLiD data (BAM files) so that it
>>> returns sequences in color space and with the right lengths?
>>>
>>> My read sequences are 50 bp in length; however, scanBam() is returning
>>> sequences of length 25 to 27 (they seem to be clipped) that are
>>> not in color space.
>>>
>>> Many thanks for any suggestions,
>>>
>>> Ivan
>>>
>>> _______________________________________________
>>> Bioc-sig-sequencing mailing list
>>> Bioc-sig-sequencing at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
> ----- End forwarded message -----
--
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793