This is a continuous effort toward enabling automatic support for executing SPARQL queries over graph systems via the Gremlin query language. This is achieved by converting SPARQL queries to Gremlin pattern-matching traversals.
We would like to acknowledge Daniel Kupitz, who laid the early foundation for the work that follows. Many thanks for getting us started, three cheers :)
This work is a sub-task of a bigger goal: LITMUS, an open, extensible framework for benchmarking diverse data management solutions. Proposal: [https://arxiv.org/pdf/1608.02800.pdf] | First working prototype: [https://hub.docker.com/r/litmusbenchmarksuite/litmus/]
## The proposed extensions are listed as follows:
- enable support for Union queries [Done]
- enable support for Order-By queries [Done]
- enable support for Group-By queries [Done]
- enable support for LIMIT-OFFSET modifiers [Done]
- add support for ASK queries [Pending, postponed temporarily]
- enable support (translation) for the BSBM dataset [http://wifo5-03.informatik.uni-mannheim.de/bizer/berlinsparqlbenchmark/] (executing SPARQL queries over the BSBM dataset [property-graph]) [Done]
- enable support (translation) for the Northwind dataset (SPARQL queries over the Northwind dataset [property-graph]) [Done]
- enable support for dataset-independent query translation [Work in progress] (this will allow a user to execute SPARQL queries over almost any dataset without having to deal with the internal mappings and configuration settings)
- enable support (translation) for the QALD datasets (SPARQL queries over DBpedia) [Work in progress]
Note: the SPARQL-to-Gremlin work is currently in progress.
SPARQL-Gremlin is a compiler used to transform SPARQL queries into Gremlin traversals. It is based on the Apache Jena SPARQL processor ARQ, which provides access to a syntax tree of a SPARQL query.
The current version of SPARQL-Gremlin only uses a subset of the features provided by Apache Jena. The examples below show each implemented feature.
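To make the idea of translating triple patterns into a pattern-matching traversal more concrete, the sketch below pairs a simple query with a hand-written Gremlin `match()` traversal that expresses the same pattern. The traversal in the comments is only an illustration of the general approach. It is not the literal output of the compiler, which may differ in step order and labels.

```sparql
# Illustrative only: each triple pattern becomes one pattern inside match().
# A hand-written Gremlin traversal expressing the same pattern could be:
#   g.V().match(
#       __.as('person').values('name').as('name'),
#       __.as('person').values('age').as('age')
#   ).select('name', 'age')
SELECT ?name ?age
WHERE {
  ?person v:name ?name .
  ?person v:age ?age
}
```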
The project contains a console application that can be used to compile SPARQL queries and evaluate the resulting Gremlin traversals. For usage examples, simply run `${PROJECT_HOME}/bin/sparql-gremlin.sh`.
To use SPARQL-Gremlin as a Gremlin shell plugin, run the following commands (be sure `sparql-gremlin-xyz.jar` is in the classpath):
gremlin> :install com.datastax sparql-gremlin 0.1
==>Loaded: [com.datastax, sparql-gremlin, 0.1]
gremlin> :plugin use datastax.sparql
==>datastax.sparql activated
Once the plugin is installed and activated, establish a remote connection to execute SPARQL queries:
gremlin> :remote connect datastax.sparql graph
==>SPARQL[graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]]
gremlin> :> SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }
==>[name:marko, age:29]
==>[name:vadas, age:27]
==>[name:josh, age:32]
==>[name:peter, age:35]
SPARQL-Gremlin supports the following prefixes to traverse the graph:
Prefix | Purpose |
---|---|
`e:` | out-edge traversal |
`p:` | property traversal |
`v:` | property-value traversal |
Note that element IDs and labels are treated like normal properties, hence they can be accessed using the same pattern:
SELECT ?name ?id ?label WHERE { ?element v:name ?name . ?element v:id ?id . ?element v:label ?label }
Select all vertices with the label `person`:
SELECT * WHERE {
?person v:label "person"
}
Select the values of the properties `name` and `age` for each `person` vertex:
SELECT ?name ?age
WHERE {
?person v:label "person" .
?person v:name ?name .
?person v:age ?age
}
SELECT ?name ?age
WHERE {
?person v:label "person" .
?person v:name ?name .
?person v:age ?age .
?person e:created ?project
}
SELECT ?name ?age
WHERE {
?person v:label "person" .
?person v:name ?name .
?person v:age ?age .
?person e:created ?project .
FILTER (?age > 30)
}
SELECT DISTINCT ?name
WHERE {
?person v:label "person" .
?person v:age ?age .
?person e:created ?project .
?project v:name ?name .
FILTER (?age > 30)
}
SELECT DISTINCT ?name
WHERE {
?person v:label "person" .
?person v:age ?age .
?person e:created ?project .
?project v:name ?name .
?project v:lang ?lang .
FILTER (?age > 30 && ?lang = "java")
}
SELECT ?name
WHERE {
?person v:label "person" .
?person v:name ?name .
FILTER EXISTS { ?person e:created ?project }
}
SELECT ?name
WHERE {
?person v:label "person" .
?person v:name ?name .
FILTER NOT EXISTS { ?person e:created ?project }
}
SELECT ?name ?startTime
WHERE {
?person v:name "daniel" .
?person p:location ?location .
?location v:value ?name .
?location v:startTime ?startTime
}
SELECT * WHERE {
{?person e:created ?software .}
UNION
{?software v:lang "java" .}
}
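The extension list marks Group-By queries as done. A minimal sketch of such a query is given here, assuming the translator accepts the standard SPARQL `GROUP BY` clause with the grouping variable projected in the `SELECT`. Support for aggregates such as `COUNT` is not confirmed by this document, so none is used.

```sparql
# Sketch: group person vertices by their age.
# Assumes standard SPARQL GROUP BY syntax is accepted by the translator.
SELECT ?age
WHERE {
  ?person v:label "person" .
  ?person v:age ?age
}
GROUP BY (?age)
```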
Select all vertices with the label `person` and order them by their age:
SELECT * WHERE {
?person v:label "person" .
?person v:age ?age .
} ORDER BY (?age)
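The extension list also marks LIMIT-OFFSET modifiers as done. As a sketch, assuming the translator accepts the standard SPARQL solution modifiers in their usual position after `ORDER BY`, the ordered query can be paged like this:

```sparql
# Sketch: skip the youngest person, then return the next two by age.
# Assumes standard SPARQL LIMIT/OFFSET syntax is accepted by the translator.
SELECT * WHERE {
  ?person v:label "person" .
  ?person v:age ?age .
}
ORDER BY (?age)
LIMIT 2
OFFSET 1
```

Combining `LIMIT` and `OFFSET` with `ORDER BY` keeps the paging deterministic, since without an explicit order the sequence of results is not guaranteed.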