Building Neo4j Applications with Go (my implementation here)
To finish:
Pattern:
- nodes with parentheses (), e.g. (Person)
- labels with a colon (:), e.g. (:Person)
- relationships with two dashes (--), adding > or < for direction (-->, <--), e.g. (:Person)--(:Movie) or (:Person)-->(:Movie)
- type of relationship with square brackets [], e.g. [:ACTED_IN]
- properties are specified in JSON-like syntax, e.g. {name: 'Tom Hanks'}
Example of pattern: (m:Movie {title: 'Cloud Atlas'})<-[:ACTED_IN]-(p:Person)
- labels, property keys and variables are case-sensitive
- cypher keywords are not case-sensitive
- best practices:
  - name labels with CamelCase
  - property keys and variables with camelCase
  - cypher keywords with UPPERCASE
  - relationships are UPPERCASE with _ characters
  - have at least one label for a node but no more than four (labels should help with most of the use cases)
  - labels should have nothing to do with one another
  - better not to use the same type of label in different contexts
  - don't label the nodes to represent hierarchies
  - eliminate duplicate data. Create new nodes and relationships if necessary. Queries related to the information in the nodes require that all nodes be retrieved.
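The naming conventions above can be checked mechanically. Below is a minimal Go sketch, assuming that simple regular expressions are an acceptable approximation of CamelCase, camelCase and UPPERCASE with _ characters. The function names are hypothetical and not part of any Neo4j driver.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns for the conventions in these notes.
// They are an assumption of this sketch, not an official Neo4j rule set.
var (
	labelRe        = regexp.MustCompile(`^[A-Z][A-Za-z0-9]*$`)           // CamelCase
	propertyKeyRe  = regexp.MustCompile(`^[a-z][A-Za-z0-9]*$`)           // camelCase
	relationshipRe = regexp.MustCompile(`^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$`) // UPPERCASE_WITH_UNDERSCORES
)

// isValidLabel reports whether s starts with an upper-case letter.
func isValidLabel(s string) bool { return labelRe.MatchString(s) }

// isValidPropertyKey reports whether s starts with a lower-case letter.
func isValidPropertyKey(s string) bool { return propertyKeyRe.MatchString(s) }

// isValidRelationshipType reports whether s is upper case with _ separators.
func isValidRelationshipType(s string) bool { return relationshipRe.MatchString(s) }

func main() {
	fmt.Println(isValidLabel("Person"))              // true
	fmt.Println(isValidLabel("person"))              // false
	fmt.Println(isValidPropertyKey("createdAt"))     // true
	fmt.Println(isValidRelationshipType("ACTED_IN")) // true
	fmt.Println(isValidRelationshipType("actedIn"))  // false
}
```

A check like this could run in a test before migrations are applied, so that naming drift is caught early.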
MATCH:
- read data
- similar to the FROM clause in an SQL statement
- need to return something
- you don't need to specify direction in the MATCH pattern, the query engine will look for all nodes that are connected, regardless of the direction of the relationship
Code examples
Return all nodes:
MATCH (n)
RETURN n

Return all nodes with the label Person:
MATCH (p:Person)
RETURN p

Return a person based on a property:
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p

Return a property:
MATCH (p:Person {name: 'Tom Hanks'})
RETURN p.born

Return a property based on a relation:
MATCH (p:Person {name: 'Tom Hanks'})-[:ACTED_IN]->(m:Movie)
RETURN m.title

WHERE:
Code examples
Filter by specifying the property value:
MATCH (p:Person)
WHERE p.name = 'Tom Hanks' OR p.name = 'Rita Wilson'
RETURN p.name, p.born

Filter by node labels:
MATCH (p)-[:ACTED_IN]->(m)
WHERE p:Person AND m:Movie AND m.title='The Matrix'
RETURN p.name

is the same as:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE m.title='The Matrix'
RETURN p.name

Filter with ranges:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE 2000 <= m.released <= 2003
RETURN p.name, m.title, m.released

Filter by existence of a property:
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p.name='Jack Nicholson' AND m.tagline IS NOT NULL
RETURN m.title, m.tagline

Filter strings:
- partial strings (STARTS WITH, ENDS WITH, CONTAINS):
MATCH (p:Person)-[:ACTED_IN]->()
WHERE p.name STARTS WITH 'Michael'
RETURN p.name
- string tests are case-sensitive, use the toLower(), toUpper() functions:
MATCH (p:Person)-[:ACTED_IN]->()
WHERE toLower(p.name) STARTS WITH 'michael'
RETURN p.name

Filter by patterns in the graph:
// Find all people who wrote a movie but did not direct it
MATCH (p:Person)-[:WROTE]->(m:Movie)
WHERE NOT exists( (p)-[:DIRECTED]->(m) )
RETURN p.name, m.title

Filter using lists:
- of numeric or string values
MATCH (p:Person)
WHERE p.born IN [1965, 1970, 1975]
RETURN p.name, p.born
- existing lists in the graph
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE 'Neo' IN r.roles AND m.title='The Matrix'
RETURN p.name, r.roles

Filter based on the existence of a relationship:
MATCH (p:Person)
WHERE exists ((p)-[:ACTED_IN]-()) // or WHERE NOT exists ((p)-[:ACTED_IN]-())
SET p:Actor

MERGE:
- the MERGE operations work by first trying to find a pattern in the graph. If the pattern is found, the data already exists and is not created. If the pattern is not found, the data is created
- when using MERGE you need to add at least one property that acts as the unique primary key for the node
Code examples
MERGE (p:Person {name: 'Michael Cain'})

Can chain multiple MERGE clauses together:
MERGE (p:Person {name: 'Katie Holmes'})
MERGE (m:Movie {title: 'The Dark Knight'})
RETURN p, m

Create a relationship based on 2 existing nodes:
MATCH (p:Person {name: 'Michael Cain'})
MATCH (m:Movie {title: 'The Dark Knight'})
MERGE (p)-[:ACTED_IN]->(m)

Create the nodes and the relationship
- using multiple clauses:
MERGE (p:Person {name: 'Chadwick Boseman'})
MERGE (m:Movie {title: 'Black Panther'})
MERGE (p)-[:ACTED_IN]-(m)
(if the direction of the relationship is not set, it is assumed to be left-to-right)
- in single clause
MERGE (p:Person {name: 'Emily Blunt'})-[:ACTED_IN]->(m:Movie {title: 'A Quiet Place'})
RETURN p, m
- use ON CREATE SET, ON MATCH SET or SET to set properties depending on whether the node is created or found
Code example
// Find or create a person with this name
MERGE (p:Person {name: 'McKenna Grace'})
// Only set the `createdAt` property if the node is created during this query
ON CREATE SET p.createdAt = datetime()
// Only set the `updatedAt` property if the node was created previously
ON MATCH SET p.updatedAt = datetime()
// Set the `born` property regardless
SET p.born = 2006
RETURN p

CREATE:
- it doesn't look up the primary key before adding the node
- provides greater speed during import
- MERGE eliminates duplication of nodes
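To make the difference between MERGE and CREATE concrete, here is a small Go sketch on an in-memory slice. It is a toy model, not Neo4j driver code: the graph and person types and their methods are invented for illustration, and it assumes a single key property (name) and at most one existing match.

```go
package main

import "fmt"

// person is a toy stand-in for a (:Person) node.
type person struct {
	name    string
	created int // how many times the ON CREATE branch ran
	matched int // how many times the ON MATCH branch ran
}

// graph is a toy in-memory store (hypothetical, not a Neo4j API).
type graph struct {
	people []*person
}

// create mimics CREATE: it always appends and never looks up existing nodes.
func (g *graph) create(name string) *person {
	p := &person{name: name}
	g.people = append(g.people, p)
	return p
}

// merge mimics MERGE: look for the key first, create only when nothing matches.
func (g *graph) merge(name string) *person {
	for _, p := range g.people {
		if p.name == name {
			p.matched++ // corresponds to ON MATCH SET
			return p
		}
	}
	p := g.create(name)
	p.created++ // corresponds to ON CREATE SET
	return p
}

func main() {
	g := &graph{}
	g.merge("McKenna Grace")
	g.merge("McKenna Grace")
	fmt.Println(len(g.people)) // 1: the second merge found the existing node

	g.create("McKenna Grace")
	fmt.Println(len(g.people)) // 2: create added a duplicate
}
```

The lookup in merge is what makes it slower than create, and also what prevents duplicates.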
Code examples
Create nodes:
CREATE (n);
CREATE (n:Person);
CREATE (n:Person {name: 'Andy', title: 'Developer'});

Create relationships:
MATCH
  (a:Person),
  (b:Person)
WHERE a.name = 'A' AND b.name = 'B'
CREATE (a)-[r:RELTYPE]->(b)
RETURN type(r)

SET:
- set a property value
- this can be done with MERGE as well
Code examples
Set one or more properties:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name = 'Michael Cain' AND m.title = 'The Dark Knight'
SET r.roles = ['Alfred Penny'], r.year = 2008
RETURN p, r, m

Update existing properties:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name = 'Michael Cain' AND m.title = 'The Dark Knight'
SET r.roles = ['Mr. Alfred Penny']
RETURN p, r, m

Add new label to a node:
MATCH (p:Person {name: 'Jane Doe'})
SET p:Developer
RETURN p

Code example
Remove property:
MATCH (p:Person)
WHERE p.name = 'Gene Hackman'
SET p.born = null
RETURN p

REMOVE:
Code examples
Remove a property:
MATCH (p:Person)-[r:ACTED_IN]->(m:Movie)
WHERE p.name = 'Michael Cain' AND m.title = 'The Dark Knight'
REMOVE r.roles
RETURN p, r, m

Remove a label from a node:
MATCH (p:Person {name: 'Jane Doe'}) // Same as MATCH (p:Person:Developer {name: 'Jane Doe'})
REMOVE p:Developer
RETURN p

DELETE:
- attempting to delete a node with a relationship will throw an error
- Neo4j prevents orphaned relationships in the graph
Code examples
Delete a node:
MATCH (p:Person)
WHERE p.name = 'Jane Doe'
DELETE p

Remove a relationship:
MATCH (p:Person {name: 'Jane Doe'})-[r:ACTED_IN]->(m:Movie {title: 'The Matrix'})
DELETE r
RETURN p, m

DETACH DELETE:
Code examples
Delete a node and all its relationships:
MATCH (p:Person {name: 'Jane Doe'})
DETACH DELETE p

Delete all nodes and all relationships in the graph:
MATCH (n)
DETACH DELETE n
(this will exhaust memory on a large db)

UNWIND:
- expand a list into a sequence of rows
- nothing is returned if the list is empty or the expression is null; a value that is not a list is treated as a single-element list
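A small Go sketch of how UNWIND turns values into rows. It only models the semantics on in-memory values and is not driver code. It assumes, following the Cypher manual, that null produces no rows and that a value that is not a list behaves like a one-element list. The unwind function is a made-up name.

```go
package main

import "fmt"

// unwind mimics a single UNWIND step applied to one value:
//   - a list yields one row per element (null elements included),
//   - nil (Cypher null) yields no rows,
//   - any other value is treated like a one-element list.
func unwind(v interface{}) []interface{} {
	switch t := v.(type) {
	case nil:
		return nil
	case []interface{}:
		return t
	default:
		return []interface{}{t}
	}
}

func main() {
	// WITH [[1, 2], [3, 4], 5] AS nested
	nested := []interface{}{
		[]interface{}{1, 2},
		[]interface{}{3, 4},
		5,
	}

	var rows []interface{}
	for _, x := range unwind(nested) { // UNWIND nested AS x
		rows = append(rows, unwind(x)...) // UNWIND x AS y
	}
	fmt.Println(rows)      // [1 2 3 4 5]
	fmt.Println(len(rows)) // 5
}
```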
Code examples
UNWIND [1, 2, 3, null] AS x // null is returned as well
RETURN x, 'val' AS y

Create a distinct list:
WITH [1, 1, 2, 2] AS coll
UNWIND coll AS x
WITH DISTINCT x
RETURN collect(x) AS setOfVals // [1,2]

Using UNWIND with any expression returning a list:
WITH
  [1, 2] AS a,
  [3, 4] AS b
UNWIND (a + b) AS x
RETURN x // the lists are concatenated and 4 rows are returned

Use multiple UNWIND clauses with a nested list:
WITH [[1, 2], [3, 4], 5] AS nested
UNWIND nested AS x
UNWIND x AS y
RETURN y // 5 rows

Replace empty list with null with CASE:
WITH [] AS list
UNWIND
  CASE
    WHEN list = [] THEN [null]
    ELSE list
  END AS emptylist
RETURN emptylist

Example of splitting the languages from movies to own nodes:
MATCH (m:Movie)
UNWIND m.languages AS language
WITH language, collect(m) AS movies
MERGE (l:Language {name:language})
WITH l, movies
UNWIND movies AS m
WITH l,m
MERGE (m)-[:IN_LANGUAGE]->(l);
MATCH (m:Movie)
SET m.languages = null

Example of splitting genres to own nodes:
MATCH (m:Movie)
UNWIND m.genres AS genre
MERGE (g:Genre {name: genre})
MERGE (m)-[:IN_GENRE]->(g)
SET m.genres = null

- keys() - get the properties of a node
MATCH (p:Person)
RETURN p.name, keys(p)
- get all node labels defined in the graph
CALL db.labels()
- get all property keys defined (even if there are no nodes or relationships with them anymore)
CALL db.propertyKeys()
- date specific uses
  - datetime() - current date and time
  - date("2019-09-30") = 2019-09-30
  - datetime({epochMillis: ms}) = 2019-09-25T06:29:39Z
  - use APOC functions for more specific needs (apoc.temporal)
- use transactions by wrapping the queries with :BEGIN and :COMMIT:
:BEGIN
MATCH (u:User)
SET u.name = "Steve"
:COMMIT
- produce a query plan showing the operations that occurred during a query:
PROFILE MATCH (p:Person)-[:ACTED_IN]-()
WHERE p.born < '1950'
RETURN p.name
- use APOC for creating new and specialized relationships
MATCH (n:Actor)-[r:ACTED_IN]->(m:Movie)
CALL apoc.merge.relationship(n,
  'ACTED_IN_' + left(m.released,4),
  {},
  {},
  m,
  {}) YIELD rel
RETURN COUNT(*) AS `Number of relationships merged`
- view the schema with :schema
- visualize:
CALL db.schema.visualization
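The relationship type built in the APOC call above ('ACTED_IN_' + left(m.released,4)) can also be computed in application code, for example when the type is chosen before the query is sent. A minimal Go sketch, assuming the released value is a string that starts with a four-digit year. Both helper names are hypothetical.

```go
package main

import "fmt"

// left mirrors Cypher's left(): the first n characters of s,
// or all of s when it has fewer than n characters.
func left(s string, n int) string {
	r := []rune(s)
	if n < 0 {
		n = 0
	}
	if n > len(r) {
		n = len(r)
	}
	return string(r[:n])
}

// specializedType builds a year-specific relationship type such as ACTED_IN_1999.
// Assumption: released starts with a four-digit year, e.g. "1999-03-31".
func specializedType(base, released string) string {
	return base + "_" + left(released, 4)
}

func main() {
	fmt.Println(specializedType("ACTED_IN", "1999-03-31")) // ACTED_IN_1999
}
```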
The process to create a graph data model:
- understand the domain and define use cases
  - describe the app in detail
  - identify the users of the app (people, systems)
  - identify the use cases
  - rank them based on importance
- develop the initial model
  - model the nodes (the entities)
  - model the relationships between nodes
Types of models:
- data model - describe the labels, relationships and properties of the graph
- instance model - sample data used to test against the use cases
The node properties are used to uniquely identify a node, answer specific details of the use cases and / or return data.
They are defined based on the use cases and the steps required to answer them. Examples:
- What people acted in a movie?
  - Retrieve a movie by its title.
  - Return the names of the actors.
- What movies did a person act in?
  - Retrieve a person by their name.
  - Return the titles of the movies.
- What is the highest rated movie in a particular year according to IMDb?
  - Retrieve all movies released in a particular year.
  - Evaluate the IMDb ratings.
  - Return the movie title.
Relationships are usually between 2 different nodes, but they can also be to the same node.
Can add specialized relationships if they reduce the number of nodes a query has to filter, while keeping the original generic relationships as well. E.g., besides ACTED_IN can add ACTED_IN_2023 as well.
Can create intermediate nodes when you need to:
- connect more than 2 nodes in a single context (hyperedges, n-ary relationships)
- relate something to a relationship
- share data in the graph between entities
- test the use cases against the initial data model
- create the instance model with test data using Cypher
- test the use cases including performance against the graph
- refactor the graph data model in case of changes in the key use cases or for performance reasons
- implement the refactoring on the graph and retest using Cypher
- Cypher has a built-in clause (LOAD CSV) for importing CSV files; for importing JSON or XML you need to use the APOC library
- default field terminator is ,
- the types of data that you can store as properties in Neo4j include:
  - String
  - Long (integer values)
  - Double (decimal values)
  - Boolean
  - Date/Datetime
  - Point (spatial)
  - StringArray (comma-separated list of strings)
  - LongArray (comma-separated list of integer values)
  - DoubleArray (comma-separated list of decimal values)
- Neo4j’s Cypher statement language is optimized for node traversal so that relationships are not traversed multiple times
- each relationship must have a direction in the graph. The relationship can be queried in either direction, or the direction can be ignored completely at query time
- Neo4j stores nodes and relationships as objects that are linked to each other via pointers
- index-free adjacency - a reference to the relationship is stored with both start and end nodes
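The idea of index-free adjacency can be sketched with plain Go structs. This is only a conceptual model of "a reference to the relationship is stored with both start and end nodes" and says nothing about Neo4j's real storage format. The type and function names are invented for the example.

```go
package main

import "fmt"

// Node keeps direct references to its relationships, so traversal
// follows pointers instead of consulting a separate index.
type Node struct {
	Name string
	Out  []*Relationship // relationships that start at this node
	In   []*Relationship // relationships that end at this node
}

// Relationship always has a direction: Start -> End.
type Relationship struct {
	Type  string
	Start *Node
	End   *Node
}

// connect creates a directed relationship and stores a reference to it
// on both the start node and the end node.
func connect(start *Node, relType string, end *Node) *Relationship {
	r := &Relationship{Type: relType, Start: start, End: end}
	start.Out = append(start.Out, r)
	end.In = append(end.In, r)
	return r
}

// neighbors ignores direction, like (n)-[:TYPE]-(other) in a query.
func neighbors(n *Node, relType string) []string {
	var names []string
	for _, r := range n.Out {
		if r.Type == relType {
			names = append(names, r.End.Name)
		}
	}
	for _, r := range n.In {
		if r.Type == relType {
			names = append(names, r.Start.Name)
		}
	}
	return names
}

func main() {
	tom := &Node{Name: "Tom Hanks"}
	atlas := &Node{Name: "Cloud Atlas"}
	connect(tom, "ACTED_IN", atlas)

	fmt.Println(neighbors(tom, "ACTED_IN"))   // [Cloud Atlas]
	fmt.Println(neighbors(atlas, "ACTED_IN")) // [Tom Hanks]
}
```

Because both nodes hold the reference, the relationship can be followed from either end even though it is stored with one direction.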
go install -tags 'neo4j' github.com/golang-migrate/migrate/v4/cmd/migrate@latest
migrate -h
# ext specifies the file extension to use when creating migrations file.
# dir specifies which directory to create the migrations in.
migrate create -ext cypher -dir db/migrations <filename>
# neo4j://user:password@host:port/
export DB_URL='...'
# run migrations
migrate -database ${DB_URL} -path db/migrations up
migrate -database <db> -path db/migrations up
# rollback migrations
migrate -database <db> -path db/migrations down
# run the first two migrations
migrate -source db/migrations -database <db> up 2
# migrations hosted on github
migrate -source github://mattes:personal-access-token@mattes/migrate_test \
-database <db> down 2
# docker usage
docker run -v {{ migration dir }}:/migrations --network host migrate/migrate \
-path=/migrations/ -database <db> up
# drop everything inside the db (verbose)
migrate -database <db> -path db/migrations -verbose drop

error: Server error: [Neo.ClientError.Statement.SyntaxError] Invalid constraint syntax, ON and ASSERT should not be used. Replace ON with FOR and ASSERT with REQUIRE. (line 1, column 1 (offset: 0))
"CREATE CONSTRAINT ON (a:SchemaMigration) ASSERT a.version IS UNIQUE"
Fix:
Create the constraint manually:
CREATE CONSTRAINT FOR (a:SchemaMigration) REQUIRE a.version IS UNIQUE
Issue coming from here.
Dirty database version xxx. Fix and force version.
Check schema migration:
MATCH(sm:SchemaMigration) RETURN sm
This will return something like this with dirty = true:
{
"identity": 0,
"labels": [
"SchemaMigration"
],
"properties": {
"dirty": true,
"version": 20230120122715,
"ts": "2023-01-20T13:52:44.802000000Z"
},
"elementId": "0"
}
Fix:
Clean up the database, then clear the dirty flag on SchemaMigration and roll back the version number to the last migration that was successfully applied.
MATCH(sm:SchemaMigration) SET sm.dirty = false, sm.version = <previous-version> RETURN sm
Can set version with:
migrate force V # Set version V but don't run migration (ignores dirty state)
migrate -database <db> -path db/migrations -verbose force <version>
# print the current migration version
migrate -database <db> -path db/migrations version
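Picking the value for <previous-version> can be sketched in Go: given the versions of the migration files and the dirty version, take the highest version below it. This is a hypothetical helper, not part of golang-migrate. In the example, only 20230120122715 comes from the output above; the other version numbers are made up.

```go
package main

import "fmt"

// previousVersion returns the highest version that is lower than dirty.
// ok is false when there is no earlier migration to roll back to.
func previousVersion(versions []uint64, dirty uint64) (prev uint64, ok bool) {
	for _, v := range versions {
		if v < dirty && (!ok || v > prev) {
			prev, ok = v, true
		}
	}
	return prev, ok
}

func main() {
	// Versions as they would appear in the migration file names (made-up values,
	// except for the dirty one).
	versions := []uint64{20230120122715, 20230110090000, 20230115101500}

	prev, ok := previousVersion(versions, 20230120122715)
	fmt.Println(prev, ok) // 20230115101500 true
}
```

The result is the number to pass to migrate force, or to set as sm.version in the Cypher fix.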