[Bioc-sig-seq] readBamGappedAlignments() cigar error

Hervé Pagès hpages at fhcrc.org
Fri Jun 25 18:34:06 CEST 2010


Hi Craig,

Sorry for taking so long to answer this. Yes, some tools seem to generate
SAM/BAM output with zero-length operations in the CIGAR field,
and the official SAM Format Spec 0.1.2-draft apparently has no
problem with that:

   http://samtools.sourceforge.net/SAM1.pdf

(  see regular expression for the CIGAR string: ([0-9]+[MIDNSHP])+|\*  )
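
To make the point concrete, here is a minimal sketch in Python (not the
GenomicRanges code; the names CIGAR_RE, OP_RE and parse_cigar are made up
for illustration) that checks a CIGAR string against the regular expression
quoted above and lists any zero-length operations. Because [0-9]+ accepts
"0", a string like 9M0N32M9H is valid under that pattern:

```python
import re

# Pattern as quoted from the SAM spec draft: one or more <length><op> pairs, or "*".
CIGAR_RE = re.compile(r"([0-9]+[MIDNSHP])+|\*")
# Hypothetical helper pattern used to split a valid CIGAR into its operations.
OP_RE = re.compile(r"([0-9]+)([MIDNSHP])")


def parse_cigar(cigar):
    """Return a list of (length, op) pairs, or [] for the '*' placeholder."""
    if CIGAR_RE.fullmatch(cigar) is None:
        raise ValueError(f"invalid CIGAR string: {cigar!r}")
    if cigar == "*":
        return []
    return [(int(n), op) for n, op in OP_RE.findall(cigar)]


# The CIGAR from the failing record in Craig's file.
ops = parse_cigar("9M0N32M9H")
# Operations with length 0 are valid under the pattern but span no bases.
zero_length = [(n, op) for n, op in ops if n == 0]
print(ops)
print(zero_length)
```

A parser that follows the pattern literally therefore has to tolerate a
length of 0, for example by treating such an operation as spanning no bases.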

I've modified the CIGAR handling code in GenomicRanges to
allow zero-length operations in CIGAR strings.

This will be available via biocLite() in the next 24 hours in
GenomicRanges 1.0.5 (release) and GenomicRanges 1.1.15 (devel).

Cheers,
H.

On 05/27/2010 03:29 PM, Craig Johnson wrote:
> I have two BAM files generated from ABI's Bioscope aligner that I want to import using readBamGappedAlignments(). One of the files imports without issue but the second gives me this error:
>
> Error in cigarToIRangesListByAlignment(x@cigar, x@start) :
>    in 'cigar' element 3315124: invalid CIGAR operation length at char 5
>
> The cigar at line 3315124 of the SAM file is 9M0N32M9H
>
> Can anyone suggest what causes this error? I have reads earlier in the SAM/BAM file that have '0N' in them, so even though I don't know what '0N' means, that doesn't seem to be the issue.
>
>> sessionInfo()
> R version 2.11.0 (2010-04-22)
> x86_64-unknown-linux-gnu
>
> locale:
>   [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>   [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>   [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>   [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>   [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
>   [1] org.Hs.eg.db_2.4.1    RSQLite_0.9-0         DBI_0.2-5
>   [4] AnnotationDbi_1.10.1  Biobase_2.8.0         rtracklayer_1.8.1
>   [7] RCurl_1.4-2           bitops_1.0-4.1        GenomicFeatures_1.0.0
> [10] Rsamtools_1.0.1       Biostrings_2.16.0     GenomicRanges_1.0.1
> [13] IRanges_1.6.2
>
> loaded via a namespace (and not attached):
> [1] biomaRt_2.4.0   BSgenome_1.16.1 tools_2.11.0    XML_3.1-0
>
> Thank you,
> Craig