[BioC] R: How can I trace back transcription target genes from the miRNAs file downloadable from miRbase ?

Chao-Jen Wong cwon2 at fhcrc.org
Fri Oct 2 20:15:45 CEST 2009


Hi, Maura,

You may find the microRNA package useful for your task.  It contains
data from miRBase along with the target genes. You can then use the
target identifiers from it to retrieve the sequences with biomaRt.  The
package does not have a vignette. To see the help pages and the list of
target genes for hsa-miR-xxx, you can do the following:

> library(microRNA)
> help(package="microRNA")
> ?hsTargets
> ?s3utr
> ?hsSeqs
> data(hsTargets)
> head(hsTargets)
            name          target chrom     start       end strand
1    hsa-miR-647 ENST00000295228     2 120824263 120824281      +
2   hsa-miR-130a ENST00000295228     2 120825363 120825385      +
3 hsa-miR-423-3p ENST00000295228     2 120825191 120825213      +
4 hsa-miR-423-5p ENST00000295228     2 120824821 120824843      +
5   hsa-miR-130b ENST00000295228     2 120825364 120825385      +
6 hsa-miR-767-3p ENST00000295228     2 120824258 120824278      +
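
As a rough sketch of the lookup described above (not a tested recipe), the
code below filters the target table for one miRNA and defines a helper that
would ask biomaRt for the 3'UTR sequences. The data frame is only a stand-in
built from the six rows printed above; with the package installed,
data(hsTargets) provides the full table. The helper names targets_of and
fetch_3utr are made up for illustration, and the Ensembl query assumes
biomaRt is installed and the network is reachable, so it is defined but not
called.

```r
# Stand-in for the microRNA package's hsTargets table, built from the six
# rows printed above; with the package installed, data(hsTargets) gives
# the full table instead.
hsTargets <- data.frame(
  name   = c("hsa-miR-647", "hsa-miR-130a", "hsa-miR-423-3p",
             "hsa-miR-423-5p", "hsa-miR-130b", "hsa-miR-767-3p"),
  target = rep("ENST00000295228", 6),
  chrom  = rep("2", 6),
  start  = c(120824263, 120825363, 120825191,
             120824821, 120825364, 120824258),
  end    = c(120824281, 120825385, 120825213,
             120824843, 120825385, 120824278),
  strand = rep("+", 6),
  stringsAsFactors = FALSE
)

# Hypothetical helper: unique target transcript IDs for one miRNA name.
targets_of <- function(mirna, tab = hsTargets) {
  unique(tab$target[tab$name == mirna])
}

# Hypothetical helper: fetch 3'UTR sequences from Ensembl via biomaRt.
# Requires biomaRt and network access, so it is only defined here.
fetch_3utr <- function(transcript_ids) {
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getSequence(id = transcript_ids,
                       type = "ensembl_transcript_id",
                       seqType = "3utr",
                       mart = mart)
}

ids <- targets_of("hsa-miR-130a")
print(ids)
# utr <- fetch_3utr(ids)   # uncomment to query Ensembl
```

The target column holds Ensembl transcript IDs, which is why the sketch
queries by ensembl_transcript_id rather than by RefSeq (NM_) accession.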

As for the discrepancy between miRBase and miRecords (e.g., hsa-miR-26a*),
unless miRecords updates its database there is no way to resolve it,
because hsa-miR-26a-1* and -2* are entirely different microRNAs located
on different chromosomes.
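
To illustrate why the name alone cannot settle this, here is a minimal
sketch using only the two records quoted in this thread. The variable names
are made up for the example, and prefix matching is just one possible way to
relate the identifiers, not something miRecords or miRBase prescribes.

```r
# miRBase star-sequence records quoted in this thread.
mirbase <- c(
  "hsa-miR-26a-1*" = "CCUAUUCUUGGUUACUUGCACG",
  "hsa-miR-26a-2*" = "CCUAUUCUUGAUUACUUGUUUC"
)

query <- "hsa-miR-26a"   # identifier as it appears in the miRecords XLS

# Exact match: neither record carries this name.
exact <- names(mirbase)[names(mirbase) == query]

# Prefix match: treat the query as a literal prefix of the miRBase name.
prefix <- names(mirbase)[startsWith(names(mirbase), query)]

# The candidates have different sequences, so a prefix hit cannot be
# resolved to a single record automatically.
ambiguous <- length(unique(mirbase[prefix])) > 1

print(exact)
print(prefix)
print(ambiguous)
```

An exact match finds nothing and a prefix match finds both records, so the
choice between them has to come from the source database.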

Regards,
Chao-Jen

mauede at alice.it wrote:
> My first task was to download as large a set as possible of experimentally validated miRNAs from miRecords, together with their target genes
> and 3'UTR sequences, limited to Homo sapiens.
> The XLS file from miRecords relates each miRNA identifier ("hsa-miR-xxx") to its target gene identifiers. I never found a clear way to download
> the miRNA sequences and the corresponding target 3'UTR sequences from miRecords. The many different links lead to pages of sequences that
> are not explicitly stated to be what I need.  Therefore I downloaded the validated miRNAs file from miRBase and matched the miRNA identifiers
> with miRecords to get the miRNA sequences. Then I used the gene identifiers (NM_yyyy) from miRecords to query BioMart and get the 3'UTR sequences.
> Many miRNAs remain unresolved because I cannot find an exact match between miRecords and miRBase. For example, in miRBase I
> found two miRNAs whose identifiers differ only by the last digit, but their sequences differ beyond the seed region, so they are (I think)
> two different entities:
>
> hsa-miR-26a-1* MIMAT0004499 Homo sapiens miR-26a-1*
>                            "CCUAUUCUUGGUUACUUGCACG"
>   
>> val_miRNA[830]
>>     
> hsa-miR-26a-2* MIMAT0004681 Homo sapiens miR-26a-2*
>                            "CCUAUUCUUGAUUACUUGUUUC"
>
> The miRecords XLS file only contains "hsa-miR-26a", which I cannot match to either of the above.
> I can only use the validated miRNAs from miRecords for which I find a match in miRBase.
>
> My question is: if I restrict my search to miRBase, where can I find the experimentally validated (not just predicted)
> target genes associated with the miRNAs in the downloadable files containing records like those shown above?
> The MIMATsssss accession does not seem to lead me anywhere ....
> To complete my task I have to find the validated target identifiers (for instance NM_xxxxxx) and then use them to
> query BioMart and get the 3'UTR sequences.
>
> Thank you in advance,
> Maura 
>
>
> -----Original Message-----
> From: Sean Davis [mailto:seandavi at gmail.com]
> Sent: Fri 02/10/2009 4.22
> To: mauede at alice.it
> Cc: Bioconductor List
> Subject: Re: [BioC] How can I trace back transcription target genes from the miRNAs file downloadable from miRbase ?
>  
> On Thu, Oct 1, 2009 at 9:27 PM,  <mauede at alice.it> wrote:
>   
>> I downloaded the validated miRNA files (mirbase/CURRENT/mature.fa.gz, maturestar.fa).
>> How can I trace back to the transcript sequences of the genes targeted by any specific miRNA?
>> Thank you in advance,
>> Maura
>>     
>
> Hi, Maura.  Are you asking to find the targets of miRNAs?  Or are you
> asking for the sequences of transcripts?
>
> Sean


-- 
Chao-Jen Wong
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Avenue N., M2-B876
PO Box 19024
Seattle, WA 98109
206.667.4485
cwon2 at fhcrc.org


