[Bioc-sig-seq] BLAST from within R

Ivan Gregoretti ivangreg at gmail.com
Thu Sep 30 21:39:58 CEST 2010


Hello Martin,

You have answered my question. I'll give it a try now.

Regarding your comment on how to "cast the alignments onto common
reference coordinates", there is a start: read.blast() from the
RFLPtools package (http://cran.r-project.org/web/packages/RFLPtools).

Thank you,

Ivan





On Thu, Sep 30, 2010 at 3:25 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 09/30/2010 10:53 AM, Ivan Gregoretti wrote:
>> Hello everybody,
>>
>> How do you perform a BLAST alignment of 1 sequence within R?
>> Any pointer would be appreciated.
>>
>> The sequence in question is about 40kb and the pertinent genome is mm9.
>
> Hi Ivan
>
> One possibility comes from
>
>  http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
>
> where one could load XML
>
>  library(XML)
>
> then formulate a query
>
>  baseUrl <- "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi"
>  query <- paste("QUERY=555&DATABASE=nr&HITLIST_SIZE=10&FILTER=L",
>                 "&EXPECT=10&PROGRAM=blastn", sep="")
>  url0 <- sprintf("%s?%s&CMD=Put", baseUrl, query)
>
> Then submit the query and extract the return result id (RID) using XML
> and XPATH queries (see http://www.w3.org/TR/xpath/, especially section 2.5)
>
>  post <- htmlTreeParse(url0, useInternalNodes=TRUE)
>  rid <- xpathApply(post, '//input[@name="RID"][@id="rid"]',
>                    xmlAttrs)[[1]][["value"]]
>
> and then finally retrieve the result
>
>  url1 <- sprintf("%s?RID=%s&FORMAT_TYPE=XML&CMD=Get", baseUrl, rid)
>  result <- xmlTreeParse(url1, useInternalNodes=TRUE)
>
> what to do now depends on the info you want out, e.g.,
>
>  qseq <- xpathApply(result, "//Hsp_qseq", xmlValue)
>  hseq <- xpathApply(result, "//Hsp_hseq", xmlValue)
>  midline <- xpathApply(result, "//Hsp_midline", xmlValue)
>  for (i in 1:5)
>      cat(qseq[[i]], hseq[[i]], midline[[i]], sep="\n")
>
> Might be fun (though a bit of work, to cast the alignments onto common
> reference coordinates) to use this in the new
> Biostrings::DNAMultipleAlignment constructor.
>
> Martin
>
>>
>> Thank you,
>>
>> Ivan
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>



More information about the Bioc-sig-sequencing mailing list