Thanks for making the MeSH RDF SPARQL API. It's been convenient for quick access to MeSH.
I'd like to run a query that returns over 1000 results, and therefore need to figure out how to use pagination with the SPARQL API at https://id.nlm.nih.gov/mesh/sparql. Here's my query to return a table of descriptors:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>
SELECT *
FROM <http://id.nlm.nih.gov/mesh/2020>
WHERE {
?mesh_uri a meshv:Descriptor .
?mesh_uri meshv:identifier ?mesh_id.
?mesh_uri rdfs:label ?mesh_label .
}
ORDER BY ?mesh_uri
But I'm having trouble incrementing limit and offset to retrieve all results.
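For context, here is a minimal sketch of the kind of pagination loop I'm aiming for. It assumes the endpoint honors the offset API parameter and returns an empty bindings list once the offset runs past the last row; neither is confirmed, and the helper names fetch_page and iter_rows are hypothetical.

```python
from typing import Callable, Iterator

API_URL = "https://id.nlm.nih.gov/mesh/sparql"


def fetch_page(query: str, offset: int) -> list[dict]:
    """Request one page of bindings, passing offset as an API parameter."""
    # Imported lazily so iter_rows stays usable with any page source.
    import requests

    # Assumption: the endpoint reads "offset" from the query string.
    params = {"query": query, "format": "json", "offset": offset}
    response = requests.get(API_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()["results"]["bindings"]


def iter_rows(get_page: Callable[[int], list[dict]]) -> Iterator[dict]:
    """Yield rows page by page until an empty page comes back."""
    offset = 0
    while True:
        page = get_page(offset)
        if not page:
            break
        yield from page
        # Advance by the rows actually received, so the page size
        # does not need to be known in advance.
        offset += len(page)
```

With the query above stored in a string named query, the rows could be collected with list(iter_rows(lambda offset: fetch_page(query, offset))).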
In search of a more reproducible example, I've simplified it to this API call, generated by this Python code:
import requests
query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX meshv: <http://id.nlm.nih.gov/mesh/vocab#>
SELECT *
FROM <http://id.nlm.nih.gov/mesh/2020>
WHERE {
?mesh_uri a meshv:Descriptor .
?mesh_uri meshv:identifier ?mesh_id.
?mesh_uri rdfs:label ?mesh_label .
}
ORDER BY ?mesh_uri
LIMIT 5
"""
params = {
    "query": query,
    "format": "json",
    "inference": True,
    "limit": 10,
    "offset": 4,
    "year": 2020,
}
api_url = "https://id.nlm.nih.gov/mesh/sparql"
response = requests.get(api_url, params)
print(response.url)
len(response.json()["results"]["bindings"])
The expected result is to receive a single record (the 5th record), because the query should return 5 records, and the offset is 4. Instead, 5 records are returned. The returned records under results.bindings start with:
{
"mesh_uri": { "type": "uri" , "value": "http://id.nlm.nih.gov/mesh/2020/D000005" } ,
"mesh_id": { "type": "literal" , "value": "D000005" } ,
"mesh_label": { "type": "literal" , "xml:lang": "en" , "value": "Abdomen" }
} ,
So it looks like offset was respected, but something about the SPARQL LIMIT 5 or the API parameter limit=10 does not work.
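One possible reading of this output is that the offset is applied before the limit rather than after it. The sketch below only models the two orderings with list slices over placeholder row numbers; it does not call the API, and whether the server actually combines the parameters this way is an assumption. (Within SPARQL itself, OFFSET is applied before LIMIT regardless of the order in which they are written.)

```python
# Stand-ins for the 1st..10th rows of the ordered result (not real MeSH data).
rows = list(range(1, 11))

limit, offset = 5, 4

# What I expected: LIMIT 5 runs first, then offset=4 skips four of those rows.
limit_then_offset = rows[:limit][offset:]

# Consistent with what I observed: skip four rows first, then keep five.
offset_then_limit = rows[offset:][:limit]

print(limit_then_offset)   # [5]
print(offset_then_limit)   # [5, 6, 7, 8, 9]
```

Under both orderings the first returned row is the 5th one, which matches D000005 appearing first; only the number of rows differs.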