[Bioc-sig-seq] Biostrings: problem accessing indel details from pairwiseAlignment()
Patrick Aboyoun
paboyoun at fhcrc.org
Tue Jul 21 18:16:55 CEST 2009
Wolfgang,
Below is code that retrieves the indel locations you are looking for. I
like your attempt to call indel, insertion, and deletion directly on
PairwiseAlignment objects, and I'll add those methods to BioC 2.5 (devel)
shortly, following the conventions shown below: gaps in the aligned
pattern are reported as deletions, and gaps in the aligned subject as
insertions.
> suppressMessages(library(Biostrings))
> ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
> samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG"))
> # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
>
> align <- pairwiseAlignment(samp1,ref1)
>
> nindel(align)
An object of class “InDel”
Slot "insertion":
     Length WidthSum
[1,]      0        0
[2,]      1        1
[3,]      0        0
Slot "deletion":
     Length WidthSum
[1,]      0        0
[2,]      0        0
[3,]      0        0
> deletions <- indel(pattern(align))
> deletions
CompressedIRangesList: 3 elements
> insertions <- indel(subject(align))
> insertions
CompressedIRangesList: 3 elements
> insertions[[2]]
IRanges instance:
    start end width
[1]    10  10     1
> sessionInfo()
R version 2.10.0 Under development (unstable) (2009-06-28 r48863)
i386-apple-darwin9.7.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] Biostrings_2.13.26 IRanges_1.3.41
loaded via a namespace (and not attached):
[1] Biobase_2.5.4
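
For readers who also want the inserted bases, and not only their position, here is a minimal base-R sketch. It is not part of Biostrings: the helper name gap_runs is hypothetical, and the two gapped strings are an assumption, written out by hand for the second sample so that they agree with the insertion reported at position 10 above. In practice they would have to be taken from the alignment object as plain character strings of equal length.

```r
# Hypothetical helper: locate runs of "-" in one gapped alignment string
# and return them as a start/end/width table.
gap_runs <- function(gapped) {
  m <- gregexpr("-+", gapped)[[1]]
  if (m[1] == -1L) {
    # No gap characters at all: return an empty table.
    return(data.frame(start = integer(0), end = integer(0),
                      width = integer(0)))
  }
  start <- as.integer(m)
  width <- attr(m, "match.length")
  data.frame(start = start, end = start + width - 1L, width = width)
}

# Gapped strings for the second sample (assumption: typed in by hand).
aligned_pattern <- "GGGATACTTACACCAGCTCCCTGGC"
aligned_subject <- "GGGATACTT-CACCAGCTCCCTGGC"

# Gaps in the subject mark bases that are inserted in the pattern.
ins <- gap_runs(aligned_subject)

# Read the inserted bases off the pattern at those alignment positions.
ins$bases <- vapply(
  seq_len(nrow(ins)),
  function(i) substr(aligned_pattern, ins$start[i], ins$end[i]),
  character(1)
)
print(ins)
```

For this pair the table has a single row (start 10, end 10, width 1, base "A"), which agrees with insertions[[2]] above. The positions are alignment coordinates, so with several gaps in one alignment they would need to be shifted before being compared with positions in the ungapped sequences.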
Wolfgang Raffelsberger wrote:
> Dear list,
>
> previously I've been extracting indel information from sequences
> aligned by the Biostrings function pairwiseAlignment(), which is
> probably not the best way, since the class
> 'PairwiseAlignedFixedSubject' has evolved and changed and my old code
> no longer works. I am now trying to use the functions provided by the
> library to access the details about indels (i.e. their location on
> the pattern and possibly the indel sequence). However, I can't find a
> function to extract this information, which is (to the best of my
> knowledge) part of the aligned object.
>
> ## here is an example:
> library(Biostrings)
> ref1 <- DNAString("GGGATACTTCACCAGCTCCCTGGC") # my pattern
> samp1 <- DNAStringSet(c("GGGATACTACACCAGCTCCCTGGC","GGGATACTTACACCAGCTCCCTGGC","ATACTTCACCAGCTCCCTG"))
>
> # 1st has a mutation, 2nd has an insertion, the 3rd is simply shorter ...
>
> align <- pairwiseAlignment(samp1,ref1)
>
> nindel(align) # the insertion was found properly, but I can't see at which
> # nt position the indel was found (nor whether it's an insertion or a deletion)
> indel(align) # Error in function (classes, fdef, mtable) unable to
> # find an inherited method for function...
> insertion(align) # Error in function (classes, fdef, mtable) unable to
> # find an inherited method for function ...
> deletion(align) # does not work either ...
> ?AlignedXStringSet # says under 'Accessor methods' that indel() exists ..
>
> ## ideally I'd be looking for something like
> mismatchTable(align) # but addressing indels ...
>
>
> ## for completeness :
> > sessionInfo()
> R version 2.9.1 (2009-06-26)
> i386-pc-mingw32
>
> locale:
> LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
>
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
> other attached packages:
> [1] ShortRead_1.2.1 lattice_0.17-25 BSgenome_1.12.3 Biostrings_2.12.7
> IRanges_1.2.3
> loaded via a namespace (and not attached):
> [1] Biobase_2.4.1 grid_2.9.1 hwriter_1.1
>
> Thanks in advance,
> Wolfgang Raffelsberger
>
> . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
> Wolfgang Raffelsberger, PhD
> Laboratoire de BioInformatique et Génomique Intégratives
> CNRS UMR7104, IGBMC, 1 rue Laurent Fries, 67404 Illkirch Strasbourg,
> France
> Tel (+33) 388 65 3300 Fax (+33) 388 65 3276
> wolfgang.raffelsberger (at) igbmc.fr
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> Bioc-sig-sequencing at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing