[BioC] extracting sequence from a genome
Iain Gallagher
iaingallagher at btopenworld.com
Thu Mar 15 16:23:10 CET 2012
Hello List
I have a data frame of miRNA genomic positions and would like to get the sequence 200 bp upstream of each microRNA.
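library(BSgenome.Rnorvegicus.UCSC.rn4) # rat genome (rn4)
ftpAddr <- "ftp://mirbase.org/pub/mirbase/CURRENT/genomes/rno.gff" # miRBase miRNA coordinates
mirInfo <- read.table(ftpAddr) # read as a data frame; column 1 = chromosome, column 4 = start
seqs <- list() # holder for the extracted sequences
for (i in seq_len(nrow(mirInfo))) {
  # window as written: start to start + 200 (201 bases), strand not considered
  s <- getSeq(Rnorvegicus, paste('chr', mirInfo[i, 1], sep = ''), start = mirInfo[i, 4], end = mirInfo[i, 4] + 200)
  seqs <- c(seqs, s) # the list grows on every iteration
}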
This works, but it seems pretty inefficient in terms of computing power: my PC locks up during the loop.
Could someone point me to a better way?
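One possible direction, as a minimal sketch only: compute all of the upstream windows as vectors first, taking strand into account, so that no per-row loop is needed. The toy table `mirs` and the names `flank`, `upStart`, `upEnd` and `chrNames` are made up for illustration and stand in for the chromosome, start, end and strand columns of the GFF. The commented-out `getSeq()` call at the end assumes that `getSeq()` accepts vectors for `names`, `start`, `end` and `strand`; clipping at the far end of a chromosome is not handled.

```r
# Hypothetical toy table standing in for the relevant GFF columns of mirInfo
mirs <- data.frame(
  chr    = c("1", "1", "X"),
  start  = c(1000, 5000, 300),
  end    = c(1080, 5090, 390),
  strand = c("+", "-", "+"),
  stringsAsFactors = FALSE
)

flank <- 200  # number of bases to take upstream of each miRNA

plus <- mirs$strand == "+"

# On the + strand, upstream lies before the start coordinate;
# on the - strand, upstream lies after the end coordinate.
upStart <- ifelse(plus, mirs$start - flank, mirs$end + 1)
upEnd   <- ifelse(plus, mirs$start - 1,     mirs$end + flank)

# Do not run off the beginning of the chromosome
upStart <- pmax(upStart, 1)

# UCSC-style chromosome names, built for all rows at once
chrNames <- paste0("chr", mirs$chr)

# Assumption: a single vectorised call could then replace the loop, e.g.
# seqs <- getSeq(Rnorvegicus, names = chrNames,
#                start = upStart, end = upEnd, strand = mirs$strand)
```

Each window in this toy example is exactly 200 bases wide, and the coordinates are computed in one pass rather than row by row.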
Thanks
iain