support in API making the cache an #altmetrics provider #12

Open
egonw opened this issue Sep 12, 2014 · 6 comments

Comments

@egonw
Member

egonw commented Sep 12, 2014

Together with eNanoMapper partners (https://github.com/enanomapper) and @andrawaag, I started writing something up:

http://specs.enanomapper.net/altmetrics/

I think it would be great if Open PHACTS would support the following API calls:

  • number of data sets for DOI
  • number of data points for DOI (like number of measurements, compounds, ...)

The first would be more like formal citations
(cito:citesAsDataSource), the second more like "page views".

ChEMBL is, of course, a big resource, but has a mix of PubMed and DOI.
Here too, we could use BridgeDb and make a linkset... there are some
PubMed<->DOI services...

@Christian-B
Member

I agree that a PubMed<->DOI linkset is a great idea and the correct use of the IMS/BridgeDb.
Both actually point to the same paper/journal etc.

Like all IMS/BridgeDb mappings, the fact that there may be many alternative prefixes for PubMed URIs and DOI URIs is not a problem for the OPS branch of IMS/BridgeDb at all.

However, data about what is covered by a paper, like the number of measurements, chemicals described etc., is not mapping data and should not be included in IMS/BridgeDb.

@antonisloizou
Contributor

The SPARQL queries themselves are relatively easy to write, once we specify which API calls we want, and what they should return.

One way to go would be generic "Entities for Document: List", "Entities for Document: Count" and inverse "Documents for Entity" methods, where:

  • Document is either a Publication or a Patent
  • Entity is one of: compound, target, pathway, disease, tissue, activity
  • There is a filter to specify which type of entity to return

This gives 4 generic methods to maintain, similar to the Hierarchy APIs.
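
As a rough illustration of the generic option, here is a minimal in-memory sketch in Python. The `LINKS` data, the identifiers, and all function names are hypothetical; real methods would presumably run SPARQL against the cache rather than scan a list.

```python
# Hypothetical (document, entity type, entity) links with made-up identifiers.
LINKS = [
    ("doi:10.1000/example1", "compound", "compound:A"),
    ("doi:10.1000/example1", "compound", "compound:B"),
    ("doi:10.1000/example1", "target", "target:X"),
    ("patent:EXAMPLE-1", "compound", "compound:A"),
]

# Entity types as listed in the comment above.
ENTITY_TYPES = {"compound", "target", "pathway", "disease", "tissue", "activity"}


def entities_for_document(document, entity_type=None):
    """Entities for Document: List, optionally filtered by entity type."""
    if entity_type is not None and entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type}")
    return sorted(
        {
            entity
            for doc, etype, entity in LINKS
            if doc == document and (entity_type is None or etype == entity_type)
        }
    )


def count_entities_for_document(document, entity_type=None):
    """Entities for Document: Count."""
    return len(entities_for_document(document, entity_type))


def documents_for_entity(entity):
    """Documents for Entity: List (the inverse direction)."""
    return sorted({doc for doc, _, ent in LINKS if ent == entity})


def count_documents_for_entity(entity):
    """Documents for Entity: Count."""
    return len(documents_for_entity(entity))
```

The single `entity_type` argument plays the role of the filter, which is what keeps the number of methods at four regardless of how many entity types exist.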

At the other extreme we could have 4 (List + Count, both ways) methods per entity type pair, e.g.

  • "Compounds for Patent: List", "Compounds for Patent: Count", "Patents for Compound: List" and "Patents for Compound: Count"
    and also
  • "Compounds for Publication: List", "Compounds for Publication: Count", "Publications for Compound: List" and "Publications for Compound: Count"

Here we end up with over 40(!!!) individual new methods. Obviously my preference is for the generic methods; however, the specific ones will end up executing faster by definition.
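
To make the count concrete, a small Python sketch that enumerates the specific method names. It assumes every one of the six entity types is paired with both document types, which gives 6 x 2 x 4 = 48; the naming pattern is copied from the examples above, and the helper name is hypothetical.

```python
from itertools import product

# Singular -> plural forms used in the method names.
DOCUMENT_TYPES = {"Publication": "Publications", "Patent": "Patents"}
ENTITY_TYPES = {
    "Compound": "Compounds",
    "Target": "Targets",
    "Pathway": "Pathways",
    "Disease": "Diseases",
    "Tissue": "Tissues",
    "Activity": "Activities",
}


def specific_method_names():
    """List + Count in both directions for every entity/document pair."""
    names = []
    for entity, document in product(ENTITY_TYPES, DOCUMENT_TYPES):
        for kind in ("List", "Count"):
            names.append(f"{ENTITY_TYPES[entity]} for {document}: {kind}")
            names.append(f"{DOCUMENT_TYPES[document]} for {entity}: {kind}")
    return names


if __name__ == "__main__":
    methods = specific_method_names()
    print(len(methods))
    for name in methods[:4]:
        print(name)
```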

We can also have a mix of the generic ones + a subset of the specific methods we expect to be used more frequently, to allow those to be quicker.

I'll look into putting some first versions of the generic queries on the dev API, so we can get a feel for performance. Of course, when patents come in we'll have to re-evaluate.

@antonisloizou
Contributor

...sorry, clicked "Closed and comment" rather than just comment ...

@AlasdairGray
Member

I agree with @Christian-B's division as to what should be in the Cache and what should be in the IMS here. This is as we discussed at the SureChEMBL meeting.

@egonw
Member Author

egonw commented Sep 12, 2014

It was not my intent to mix this with the discussions around the patent-$foo links. While that is important, and more important than this request, I here really just wanted to request two #altmetric calls, with their own #altmetrics use case: the two listed in the request.

BTW, the query for the WPRDF seems to be something like:

prefix wp:  <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>

SELECT distinct ?object WHERE {
  <http://identifiers.org/pubmed/11252892> a wp:PublicationReference ;
    dcterms:isPartOf ?object .
} ORDER BY ?object

Note that it needs a PubMed...
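
For the "number of data sets" call, the same pattern could be turned into a count. Below is a minimal Python sketch that only builds the query string for a given PubMed ID; the function name is hypothetical, and it assumes the endpoint accepts SPARQL 1.1 aggregates, which I have not checked for the WPRDF.

```python
import re

# Count variant of the query above; {pmid} is filled in per request.
COUNT_QUERY_TEMPLATE = """\
prefix wp:  <http://vocabularies.wikipathways.org/wp#>
prefix dcterms: <http://purl.org/dc/terms/>

SELECT (COUNT(DISTINCT ?object) AS ?count) WHERE {{
  <http://identifiers.org/pubmed/{pmid}> a wp:PublicationReference ;
    dcterms:isPartOf ?object .
}}"""


def count_query_for_pubmed(pmid: str) -> str:
    """Return a SPARQL query counting the distinct objects that a
    PubMed-identified publication reference is part of."""
    # Only accept plain digits so nothing else can be injected into the IRI.
    if not re.fullmatch(r"[0-9]+", pmid):
        raise ValueError(f"not a PubMed ID: {pmid!r}")
    return COUNT_QUERY_TEMPLATE.format(pmid=pmid)


if __name__ == "__main__":
    print(count_query_for_pubmed("11252892"))
```

A DOI-based call would still need the PubMed<->DOI mapping step first, since this pattern matches on the PubMed IRI.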

@egonw
Member Author

egonw commented Sep 12, 2014

Some follow-up discussions reminded me that we used this before for ChEMBL already, with Andra's CitedIn. See the relevant section in our ChEMBL-RDF paper: http://www.jcheminf.com/content/5/1/23

4 participants