[BioC] BioPAX parsing

Oliver Ruebenacker curoli at gmail.com
Sat Jun 16 17:03:21 CEST 2012


     Hello Michael,

  I'm planning to use rJava to drive OpenRDF Sesame, with which I am
very familiar.

     Take care
     Oliver

On Sat, Jun 16, 2012 at 9:54 AM, Michael Lawrence
<lawrence.michael at gene.com> wrote:
> Were you guys planning on using Rredland for this?
>
>
> On Sat, Jun 16, 2012 at 3:10 AM, Oliver Ruebenacker <curoli at gmail.com>
> wrote:
>>
>>     Hello,
>>
>>  Thanks a lot for the endorsement!
>>
>>  I will try to create a prototype in the next few days, and then you can
>> probably advise me on how to turn that into a package of the desired
>> quality.
>>
>>     Take care
>>     Oliver
>>
>> On Fri, Jun 15, 2012 at 6:08 PM, Paul Shannon <pshannon at fhcrc.org> wrote:
>> > Oliver and Martin,
>> >
>> > It would be very helpful to have easy access to BioPAX data in
>> > Bioconductor.
>> >
>> > Just now, at the weekly Bioconductor dev-team meeting, we discussed your
>> > ideas, and want to endorse them.  Oliver's proposal to parse the RDF triples
>> > into a data.frame has lots to recommend it.  It would be immediately useful,
>> > and yet also allow for more sophisticated uses later.  With these
>> > relationships in R, annotated as BioPAX data often are, we can imagine
>> > interested parties writing S4 classes which use the data, which might
>> > provide flexible querying capabilities, and be able to transform those
>> > triples into graphs and networks, for further computation and display.
>> >
>> > Please let us know if we can help.
>> >
>> > - Paul
>> >
>> >
>> > On Jun 15, 2012, at 12:23 PM, Oliver Ruebenacker wrote:
>> >
>> >>     Hello Martin,
>> >>
>> >>  I don't have code in R to test yet, but I do have extensive
>> >> experience handling BioPAX in Java, so I assume that reading BioPAX
>> >> via rJava should not be too difficult.
>> >>
>> >>  The best target format depends on what people would like to do with
>> >> the data. For visualization, a bi-partite graph in a popular
>> >> graph-layout package should be best. Is there a particular graph
>> >> package in Bioconductor, or in R in general, that you would recommend?
>> >>
>> >>  For actual analysis, people probably have more specific requirements.
>> >>
>> >>  BioPAX is a format based on RDF/OWL, which in turn organizes data
>> >> as subject-predicate-object triples. These could be stored in a
>> >> three-column data frame (perhaps with a fourth column for the data
>> >> type). For example (incomplete, for illustration only):
>> >>
>> >>  ex:mapPhosphorylization   rdf:type   bp:BiochemicalReaction.
>> >>  ex:atp   rdf:type   bp:SmallMolecule.
>> >>  ex:adp   rdf:type   bp:SmallMolecule.
>> >>  ex:map   rdf:type   bp:Protein.
>> >>  ex:mapPhosphorylized   rdf:type   bp:Protein.
>> >>  ex:mapPhosphorylization   bp:left   ex:atp.
>> >>  ex:mapPhosphorylization   bp:left   ex:map.
>> >>  ex:mapPhosphorylization   bp:right   ex:adp.
>> >>  ex:mapPhosphorylization   bp:right   ex:mapPhosphorylized.
>> >>
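A minimal sketch of the three-column layout described above, using only the example triples from the message. It is written in Python purely for illustration (in R the same rows would go into a data.frame). The names `Triple`, `TRIPLES`, `objects_of` and `subjects_of_type` are hypothetical helpers, not part of any BioPAX, Sesame or Bioconductor API.

```python
from collections import namedtuple

# One row per RDF triple: the three "columns" of the proposed data frame.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# The example triples from the message (incomplete, for illustration only).
TRIPLES = [
    Triple("ex:mapPhosphorylization", "rdf:type", "bp:BiochemicalReaction"),
    Triple("ex:atp", "rdf:type", "bp:SmallMolecule"),
    Triple("ex:adp", "rdf:type", "bp:SmallMolecule"),
    Triple("ex:map", "rdf:type", "bp:Protein"),
    Triple("ex:mapPhosphorylized", "rdf:type", "bp:Protein"),
    Triple("ex:mapPhosphorylization", "bp:left", "ex:atp"),
    Triple("ex:mapPhosphorylization", "bp:left", "ex:map"),
    Triple("ex:mapPhosphorylization", "bp:right", "ex:adp"),
    Triple("ex:mapPhosphorylization", "bp:right", "ex:mapPhosphorylized"),
]


def objects_of(triples, subject, predicate):
    """Return the objects of all rows matching the given subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


def subjects_of_type(triples, rdf_type):
    """Return the subjects of all rdf:type rows whose object is rdf_type."""
    return [t.subject for t in triples
            if t.predicate == "rdf:type" and t.object == rdf_type]


if __name__ == "__main__":
    # Row-wise filtering is all that is needed for simple queries.
    print(objects_of(TRIPLES, "ex:mapPhosphorylization", "bp:left"))
    print(subjects_of_type(TRIPLES, "bp:Protein"))
```

Keeping the table this flat means nothing from the RDF is lost, and more specific views can be built later by filtering on the predicate column.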
>> >>     Take care
>> >>     Oliver
>> >>
>> >> On Fri, Jun 15, 2012 at 3:03 PM, Martin Preusse
>> >> <martin.preusse at googlemail.com> wrote:
>> >>> Hi Oliver,
>> >>>
>> >>> I think there is a lot of interest in a Bioconductor package!
>> >>>
>> >>> Personally, I would like to read pathways stored in the BioPAX format
>> >>> into any kind of graph. It's a philosophical question whether reactions
>> >>> should be nodes or should sit on the edges :) So far I have not used any R
>> >>> graph package, but I assume there are some very generic packages that are
>> >>> flexible enough to support both direct and bi-partite pathway structures.
>> >>> I have, for example, used the JUNG graph API for Java extensively.
>> >>>
>> >>> I'm not sure what you mean by RDF/OWL triples. For me, BioPAX is only
>> >>> a format to store a pathway, and I would like to bring it back into its
>> >>> natural form: a network!
>> >>>
>> >>> Do you have any code to test? I have used rJava before. All this RDF
>> >>> and XML file format stuff kind of puzzles me though … :)
>> >>>
>> >>> Cheers
>> >>> Martin
>> >>>
>> >>>
>> >>>
>> >>> On Friday, June 15, 2012 at 18:32, Oliver Ruebenacker wrote:
>> >>>
>> >>>> Hello Martin,
>> >>>>
>> >>>> I'm currently looking into reading BioPAX into R using rJava and
>> >>>> OpenRDF Sesame. If there is interest, I may look into submitting
>> >>>> a package to Bioconductor.
>> >>>>
>> >>>> It would be very helpful if you could tell me what you need the
>> >>>> BioPAX data for, and in what form it would be best for you. Possible
>> >>>> options are:
>> >>>>
>> >>>> - A data frame of the RDF/OWL triples
>> >>>> - A graph of the RDF/OWL triples
>> >>>> - A data frame with one row for each reaction-participant
>> >>>> - A bi-partite graph with nodes for reactions and nodes for
>> >>>> substances
>> >>>> - A graph with nodes for substances only, with edges for interactions
>> >>>> - A genetic interaction graph
>> >>>>
>> >>>> This list is roughly sorted from the easiest to the most
>> >>>> difficult to provide.
>> >>>>
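To illustrate how the later options in this list could be derived from the first one, here is a minimal sketch that turns a few triples into a reaction-participant table and a bi-partite edge list. It is Python for illustration only. The sample triples are borrowed from the example given elsewhere in this thread. Treating `bp:left` and `bp:right` as the only participant predicates, and reading left as input and right as output, are simplifying assumptions. The function names are hypothetical and not part of any existing package.

```python
# Triples as (subject, predicate, object) rows.
TRIPLES = [
    ("ex:mapPhosphorylization", "rdf:type", "bp:BiochemicalReaction"),
    ("ex:mapPhosphorylization", "bp:left", "ex:atp"),
    ("ex:mapPhosphorylization", "bp:left", "ex:map"),
    ("ex:mapPhosphorylization", "bp:right", "ex:adp"),
    ("ex:mapPhosphorylization", "bp:right", "ex:mapPhosphorylized"),
]

# Assumption: only these two predicates link a reaction to its participants.
PARTICIPANT_PREDICATES = ("bp:left", "bp:right")


def participant_rows(triples):
    """One row per reaction-participant: (reaction, side, substance)."""
    return [
        (subject, predicate.split(":", 1)[1], obj)
        for subject, predicate, obj in triples
        if predicate in PARTICIPANT_PREDICATES
    ]


def bipartite_edges(triples):
    """Directed edges between substance nodes and reaction nodes.

    Assumption: left participants point into the reaction and the
    reaction points to right participants.
    """
    edges = []
    for reaction, side, substance in participant_rows(triples):
        if side == "left":
            edges.append((substance, reaction))
        else:
            edges.append((reaction, substance))
    return edges


if __name__ == "__main__":
    for row in participant_rows(TRIPLES):
        print(row)
    for edge in bipartite_edges(TRIPLES):
        print(edge)
```

Every edge joins a substance node to a reaction node, never two nodes of the same kind, which is what makes the graph bi-partite. A substances-only graph could be obtained from it by connecting each left substance to each right substance of the same reaction.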
>> >>>> Take care
>> >>>> Oliver
>> >>>>
>> >>>> On Thu, Jun 14, 2012 at 10:01 AM, Martin Preusse
>> >>>> <martin.preusse at googlemail.com> wrote:
>> >>>>> Many biological pathway resources provide their data in the BioPAX
>> >>>>> format (http://www.biopax.org/index.php), a special XML format for
>> >>>>> biological interaction networks. Examples are Pathway Commons
>> >>>>> (http://www.pathwaycommons.org/pc/) and Reactome
>> >>>>> (http://www.reactome.org/).
>> >>>>>
>> >>>>> A Java library for parsing BioPAX files exists:
>> >>>>> http://www.biopax.org/paxtools.php
>> >>>>>
>> >>>>> Has anybody used BioPAX files with R? Is it possible to read BioPAX
>> >>>>> files into any R-based graph structure? A solution similar to the KEGGgraph
>> >>>>> package for KEGG pathways would be great, since more and more databases are
>> >>>>> starting to use BioPAX.
>> >>>>>
>> >>>>>
>> >>>>> Any ideas are appreciated!
>> >>>>>
>> >>>>> Cheers
>> >>>>> Martin
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Bioconductor mailing list
>> >>>>> Bioconductor at r-project.org
>> >>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> >>>>> Search the archives:
>> >>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>> >>>>>
>> >>>>
>> >>>> --
>> >>>> Oliver Ruebenacker
>> >>>> Bioinformatics Consultant
>> >>>> (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
>> >>>> Knowomics, The Bioinformatics Network (http://www.knowomics.com)
>> >>>> SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)
>> >>>>



-- 
Oliver Ruebenacker
Bioinformatics Consultant (http://www.knowomics.com/wiki/Oliver_Ruebenacker)
Knowomics, The Bioinformatics Network (http://www.knowomics.com)
SBPAX: Turning Bio Knowledge into Math Models (http://www.sbpax.org)


