[Bioc-sig-seq] BLAST from within R
Ivan Gregoretti
ivangreg at gmail.com
Thu Sep 30 21:39:58 CEST 2010
Hello Martin,
You have answered my question. I'll give it a try now.
Regarding your comment on how to "cast the alignments onto common
reference coordinates", there is a start: read.blast() from the
RFLPtools package (http://cran.r-project.org/web/packages/RFLPtools).
Thank you,
Ivan
On Thu, Sep 30, 2010 at 3:25 PM, Martin Morgan <mtmorgan at fhcrc.org> wrote:
> On 09/30/2010 10:53 AM, Ivan Gregoretti wrote:
>> Hello everybody,
>>
>> How do you perform a BLAST alignment of 1 sequence within R?
>> Any pointer would be appreciated.
>>
>> The sequence in question is about 40kb and the pertinent genome is mm9.
>
> Hi Ivan
>
> One possibility comes from
>
> http://www.ncbi.nlm.nih.gov/BLAST/Doc/urlapi.html
>
> where one could load XML
>
> library(XML)
>
> then formulate a query
>
> baseUrl <- "http://www.ncbi.nlm.nih.gov/blast/Blast.cgi"
> query <- paste("QUERY=555&DATABASE=nr&HITLIST_SIZE=10&FILTER=L",
> "&EXPECT=10&PROGRAM=blastn", sep="")
> url0 <- sprintf("%s?%s&CMD=Put", baseUrl, query)
>
> Then submit the query and extract the return result id (RID) using XML
> and XPATH queries (see http://www.w3.org/TR/xpath/, especially section 2.5)
>
> post <- htmlTreeParse(url0, useInternalNodes=TRUE)
> rid <- xpathApply(post, '//input[@name="RID"][@id="rid"]',
> xmlAttrs)[[1]][["value"]]
>
> and then finally retrieve the result
>
> url1 <- sprintf("%s?RID=%s&FORMAT_TYPE=XML&CMD=Get", baseUrl, rid)
> result <- xmlTreeParse(url1, useInternalNodes=TRUE)
>
> what to do now depends on the info you want out, e.g.,
>
> qseq <- xpathApply(result, "//Hsp_qseq", xmlValue)
> hseq <- xpathApply(result, "//Hsp_hseq", xmlValue)
> midline <- xpathApply(result, "//Hsp_midline", xmlValue)
> for (i in 1:5)
> cat(qseq[[i]], hseq[[i]], midline[[i]], sep="\n")
>
> Might be fun (though a bit of work, to cast the alignments onto common
> reference coordinates) to use this in the new
> Biostrings::DNAMultipleAlignment constructor.
>
> Martin
>
>>
>> Thank you,
>>
>> Ivan
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
More information about the Bioc-sig-sequencing
mailing list