Missing Interactions from GetInteractions call #20

Open
RyAMiller opened this issue May 4, 2016 · 11 comments

Comments

@RyAMiller

RyAMiller commented May 4, 2016

Issue with the GetInteractions API call for pathways.
When testing, Peter noticed that, especially on Reactome pathways, only the metabolites were being returned and not the Reactome complex IDs. As a result, the GetInteractions call was only returning interactions connected to the metabolites in the pathway tested.
Pathway tested: WP2704

The data loaded should be the same as what is on the WP SPARQL endpoint, so it is probably not a data problem. Looking at the documentation for the call, I think the issue is with the SPARQL query used. I am hoping it is as simple as fixing the query.

For reference I have included a query that works on the WP endpoint and I think solves the above issue. Any feedback is appreciated.

http://sparql.wikipathways.org/

PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
# dc: is used below for dc:identifier; declared explicitly so the query
# also runs on endpoints that do not predefine it
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT DISTINCT ?pathway ?interaction ?target ?source
WHERE {

   ?pathway a wp:Pathway .
   ?pathway dc:identifier <http://identifiers.org/wikipathways/WP2704> .

   ?interaction dcterms:isPartOf ?pathway .
   ?interaction a wp:Interaction .

   ?interaction wp:target ?target .
   ?interaction wp:source ?source .
}
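
To try the same query against other pathways, it can be generated from a pathway ID. This is only a sketch: the function name `build_interactions_query` and the `WP<digits>` ID check are assumptions for illustration, and the `dc:` namespace is declared explicitly on the assumption that the endpoint does not predefine it.

```python
import re

# Prefixes used by the query. The dc: namespace is an assumption here:
# it is the standard Dublin Core elements namespace.
PREFIXES = """\
PREFIX wp:      <http://vocabularies.wikipathways.org/wp#>
PREFIX dc:      <http://purl.org/dc/elements/1.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
"""


def build_interactions_query(pathway_id: str) -> str:
    """Return a SPARQL query listing source/target pairs for one pathway.

    Hypothetical helper; it only builds the query text and sends nothing.
    """
    # Reject anything that is not a plain WikiPathways ID such as "WP2704".
    if not re.fullmatch(r"WP\d+", pathway_id):
        raise ValueError(f"not a WikiPathways ID: {pathway_id!r}")
    uri = f"http://identifiers.org/wikipathways/{pathway_id}"
    return PREFIXES + f"""
SELECT DISTINCT ?pathway ?interaction ?source ?target
WHERE {{
   ?pathway a wp:Pathway ;
            dc:identifier <{uri}> .
   ?interaction a wp:Interaction ;
                dcterms:isPartOf ?pathway ;
                wp:source ?source ;
                wp:target ?target .
}}
"""


query = build_interactions_query("WP2704")
```

The resulting text can be pasted into the query form at http://sparql.wikipathways.org/ to compare its rows with what GetInteractions returns.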
@stain
Contributor

stain commented May 5, 2016

I can confirm that I get the same results from your query on our SPARQL endpoint, so the data is loaded.

Could you help me understand this a bit more?

https://ops2.few.vu.nl/2.1/pathway/getInteractions?uri=http%3A%2F%2Fidentifiers.org%2Fwikipathways%2FWP2704&app_id=161aeb7d&app_key=333c09ae195d777b68a117bb42f29b1c&_format=ttl

returns just three interactions:

<http://rdf.wikipathways.org/Pathway/WP2704_r81439/WP/Interaction/b0e38> rdf:type ns0:DirectedInteraction ;
  void:inDataset <http://www.wikipathways.org> ;
  ns0:source <http://identifiers.org/chebi/CHEBI:15422> ;
  ns0:target <http://identifiers.org/chebi/CHEBI:16761> .

<http://rdf.wikipathways.org/Pathway/WP2704_r81439/WP/Interaction/f460a> rdf:type ns0:DirectedInteraction ;
   void:inDataset <http://www.wikipathways.org> ;
   ns0:source <http://identifiers.org/chebi/CHEBI:15422> ;
   ns0:target <http://identifiers.org/chebi/CHEBI:16761> .

<http://rdf.wikipathways.org/Pathway/WP2704_r81439/WP/Interaction/f6b8d> rdf:type ns0:DirectedInteraction ;
  void:inDataset <http://www.wikipathways.org> ;
  ns0:source <http://identifiers.org/chebi/CHEBI:15422> ;
  ns0:target <http://identifiers.org/chebi/CHEBI:16761> .

but you would have hoped for additional interactions with other sources/targets, as found in sparql.wikipathways.org ?

e.g. you would want also:

<http://rdf.wikipathways.org/Pathway/WP2704_r81439/WP/Interaction/d8c8a> rdf:type ns0:DirectedInteraction ;
  void:inDataset <http://www.wikipathways.org> ;
  ns0:source <http://identifiers.org/uniprot/P40189-2> ;
  ns0:target <http://identifiers.org/reactome/R-HSA-1067674> .

etc?

I think the API's query filters out all the http://identifiers.org/reactome/* identifiers -- it seems they are generally duplicates on the ?source and ?target - or do the interactions have multiple sources and multiple targets? This is a bit confusing to me.
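
One way to check the filtering suspicion is to tally identifier namespaces in each result set and compare them. The sketch below uses two of the interactions quoted above as sample rows; the helper names `namespace_of` and `namespace_counts` are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def namespace_of(uri: str) -> str:
    """Return the first path segment of an identifiers.org URI, e.g. 'chebi'."""
    return urlparse(uri).path.strip("/").split("/")[0]


def namespace_counts(rows):
    """Count how often each namespace appears as a source or a target."""
    counts = Counter()
    for _interaction, source, target in rows:
        counts[namespace_of(source)] += 1
        counts[namespace_of(target)] += 1
    return counts


# (interaction id, source, target), copied from the Turtle quoted above.
rows = [
    ("b0e38",
     "http://identifiers.org/chebi/CHEBI:15422",
     "http://identifiers.org/chebi/CHEBI:16761"),
    ("d8c8a",
     "http://identifiers.org/uniprot/P40189-2",
     "http://identifiers.org/reactome/R-HSA-1067674"),
]

counts = namespace_counts(rows)
```

If `reactome` shows up in the counts for the endpoint result but never in the counts for the API response, that would support the idea that the API query drops those identifiers.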

@stain
Contributor

stain commented May 5, 2016

BTW - all of the reactome identifiers fail in the browser - e.g. http://identifiers.org/reactome/R-HSA-1067691 says:

The data in the URL can't fit into a state

@PeterWoollard

Yes, Ryan has flagged that the REACTOME identifiers fail.
What is the action? Should we flag this to the REACTOME teams at EBI and Toronto and ask them to fix it?

@PeterWoollard

In many pathways, the "entity" is actually a complex, e.g. several proteins, metal ions, small molecule ligands (e.g. short peptides or simple ), ATP etc. In REACTOME this is represented using sets (= lists) and recursive subsets. In pathways we typically simplify this for our overloaded brains as binary interactions. In simple terms, if complex A directly causes the phosphorylation of protein B, then we simplify it by treating all the proteins in complex A (the source) as directly interacting with protein B (the target). Does this help or confuse you further? Ryan, Chris or I can explain more.
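
The simplification described above can be sketched as a cross product of the members on each side. The member names below are placeholders, not real data, and `expand_to_binary` is a hypothetical helper.

```python
from itertools import product


def expand_to_binary(source_members, target_members):
    """Pair every member of the source with every member of the target."""
    return list(product(source_members, target_members))


# Placeholder members: complex A (the source) acting on protein B (the target).
complex_a = ["protein1", "protein2", "ATP"]
protein_b = ["proteinB"]

pairs = expand_to_binary(complex_a, protein_b)
```

Each pair in `pairs` is one simplified binary interaction, so a complex with three members and a single target yields three pairs.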

@PeterWoollard

I believe there are missing linksets for the REACTOME data. When querying the WikiPathways SPARQL endpoint directly for interactions, the REACTOME results are missing proteins; only metabolites come back.

@RyAMiller
Author

RyAMiller commented May 9, 2016

Yes, Stian. We can have multiple sources and targets. This is correct behavior. For example...

A           C
 \          ^
   \      /
    ----
   /      \
  /         ∨
B           D

Something like this is possible and shows up quite often in reactome pathways.
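
Because a SELECT over ?source and ?target returns one row per combination, an interaction shaped like the diagram comes back as several rows that can look like duplicates. A minimal sketch of regrouping them, with a made-up interaction ID `i1` and the hypothetical helper `group_rows`:

```python
from collections import defaultdict


def group_rows(rows):
    """Collapse (interaction, source, target) rows into one entry per interaction."""
    grouped = defaultdict(lambda: {"sources": set(), "targets": set()})
    for interaction, source, target in rows:
        grouped[interaction]["sources"].add(source)
        grouped[interaction]["targets"].add(target)
    return dict(grouped)


# The A, B -> C, D case from the diagram, as a query would return it.
rows = [
    ("i1", "A", "C"),
    ("i1", "A", "D"),
    ("i1", "B", "C"),
    ("i1", "B", "D"),
]

grouped = group_rows(rows)
```

Here the four rows collapse into a single interaction with sources A and B and targets C and D.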

@RyAMiller
Author

Peter, I am not sure which data you are missing. An interaction connects, say, complex A to complex B, and you are right: you have to examine what complex A and complex B are individually since, as you said, we simplify things to say that one group as a whole affects another group as a whole.

@RyAMiller
Author

As for the REACTOME IDs, I do think we need to raise this as an issue. The ID is valid and it is the right entity, but for some reason it does not work in the pathway viewer.
For example, the one that Stian gave earlier, R-HSA-1067691, resolves via identifiers.org to http://www.reactome.org/PathwayBrowser/#R-HSA-1067691. This should be the correct pattern but, interestingly enough, it does not work. I am not sure why this is happening.

http://www.reactome.org/content/detail/R-HSA-1067691 does work, but it is not in the context of the pathway browser. I think we want it in the context of the pathway browser, correct? If we use the 'content/detail/ID' link, it is a real pain to get to the correct place in the pathway viewer (you have to expand the pathway viewer section, go to the last link, and then dig around the pathway to find the right ID). I think this is an issue for the Reactome team to address.
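
For reference when reporting this, the two URL patterns mentioned here can be written side by side. The patterns are copied from the links in this comment and may not match what Reactome serves later; the function names are hypothetical.

```python
def pathway_browser_url(stable_id: str) -> str:
    """Pattern that identifiers.org resolves to (reported as failing in this thread)."""
    return f"http://www.reactome.org/PathwayBrowser/#{stable_id}"


def content_detail_url(stable_id: str) -> str:
    """Pattern reported as working, but outside the pathway browser."""
    return f"http://www.reactome.org/content/detail/{stable_id}"


stable_id = "R-HSA-1067691"
browser_link = pathway_browser_url(stable_id)
detail_link = content_detail_url(stable_id)
```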

@stain
Contributor

stain commented May 9, 2016

So as far as I understand, we don't know anything more about the Reactome entities in the WikiPathways RDF beyond which pathways and interactions they are part of. Is there another RDF source we need to load?

You mentioned we need additional linksets? What is the source and destination of the linkset?

@RyAMiller
Author

Peter mentioned this and I am not sure I follow. I will try to ask him.

@Chris-Evelo

Two separate things in this thread, I think.

  1. Reactome content doesn't show in the browser even though the URL is correct. Yes, this should be discussed with Reactome as suggested by Peter. Ryan, can you do that?

  2. Sometimes (for Reactome, often) interactions involve complexes. That is fine, but there should be a separate call to get the content of a complex (and the reverse: get all complexes a gene product participates in). That should probably be part of the API calls that we have for the same issue with ChEBI complexes. Do we have all the content needed to make that call work?
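
The two lookups in point 2 could be sketched as a membership table plus a reverse index. This is not an existing API call: the complex and gene product names are invented, and `get_members` and `build_reverse_index` are hypothetical helpers that only show the shape of the data such a call would need.

```python
from collections import defaultdict

# Invented example data: complex ID -> gene products it contains.
complex_members = {
    "complexA": {"geneX", "geneY"},
    "complexB": {"geneY", "geneZ"},
}


def get_members(complex_id, members_by_complex=complex_members):
    """Forward lookup: the content of one complex (empty set if unknown)."""
    return members_by_complex.get(complex_id, set())


def build_reverse_index(members_by_complex):
    """Reverse lookup table: gene product -> all complexes it participates in."""
    index = defaultdict(set)
    for complex_id, members in members_by_complex.items():
        for member in members:
            index[member].add(complex_id)
    return dict(index)


reverse_index = build_reverse_index(complex_members)
```

With this data, `reverse_index["geneY"]` contains both complexes, which is the answer the reverse call would have to return.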
