Pagination on full text queries leads to OOM errors #6999

@ty-salter-bsl

Describe the bug
Full text queries paginated with first and after throw out-of-memory (OOM) errors when the database contains many nodes and the after offset is large enough (the errors seem to start at an offset of around 1000). We have around 40,000 nodes in our database.
Based on the logs, the generated Cypher also appears to be executed multiple times.

Type definitions

type Address
@node
@fulltext(
  indexes: [
    {
      indexName: "AddressSearch"
      queryName: "AddressFulltextSearch"
      fields: ["uuid", "streetAddress", "suburb", "state", "postcode", "country"]
    }
  ]
)
{
  uuid: ID!

  streetAddress: String!

  suburb: String!

  state: String!

  postcode: String!

  country: String!
}

To Reproduce

  1. Create a full text index conforming to the @fulltext directive configuration above.
  2. Insert ~40,000 Address nodes into Neo4j. The following Cypher can achieve this easily:
    UNWIND range(1, 40000) AS idx
    CREATE (c:Address { uuid: randomUUID(), streetAddress: toString(idx) + " Fake Street", suburb: "Foo", state: "Bar", postcode: "12345", country: "TEST" })
    
  3. Then run the following GraphQL query:
    query {
      AddressFulltextSearch(
        first: 1000
        after: "YXJyYXljb25uZWN0aW9uOjEwMDA="
        phrase: "*"
        sort: [{ node: { streetAddress: ASC } }]
      ) {
        edges {
          node {
            uuid
            streetAddress
            suburb
            state
            country
            postcode
          }
        }
      }
    }
    The phrase is a wildcard for demonstration purposes, but any sufficiently broad search triggers the issue.
  4. See the error in the debug output:
    @neo4j/graphql:execution executing cypher +14s
    @neo4j/graphql:execution CYPHER 5
    @neo4j/graphql:execution CALL db.index.fulltext.queryNodes("AddressSearch", $param0) YIELD node AS this0, score AS var1
    @neo4j/graphql:execution WHERE $param1 IN labels(this0)
    @neo4j/graphql:execution WITH collect({ node: this0 }) AS edges
    @neo4j/graphql:execution CALL (edges) {
    @neo4j/graphql:execution     UNWIND edges AS edge
    @neo4j/graphql:execution     WITH edge.node AS this0
    @neo4j/graphql:execution     WITH *
    @neo4j/graphql:execution     ORDER BY this0.streetAddress ASC
    @neo4j/graphql:execution     SKIP $param2
    @neo4j/graphql:execution     LIMIT $param3
    @neo4j/graphql:execution     RETURN collect({ node: { uuid: this0.uuid, streetAddress: this0.streetAddress, suburb: this0.suburb, state: this0.state, country: this0.country, postcode: this0.postcode, __resolveType: "Address" } }) AS var2
    @neo4j/graphql:execution }
    @neo4j/graphql:execution RETURN { edges: var2 } AS this +0ms
    @neo4j/graphql:execution cypher params: {
    @neo4j/graphql:execution   param0: '*',
    @neo4j/graphql:execution   param1: 'Address',
    @neo4j/graphql:execution   param2: Integer { low: 1001, high: 0 },
    @neo4j/graphql:execution   param3: Integer { low: 1000, high: 0 }
    @neo4j/graphql:execution } +0ms
    @neo4j/graphql:execution executing cypher +945ms
    ... # Repeated three or four more times.
    @neo4j/graphql:execution executing cypher +27s
    @neo4j/graphql:execution CYPHER 5
    @neo4j/graphql:execution CALL db.index.fulltext.queryNodes("AddressSearch", $param0) YIELD node AS this0, score AS var1
    @neo4j/graphql:execution WHERE $param1 IN labels(this0)
    @neo4j/graphql:execution WITH collect({ node: this0 }) AS edges
    @neo4j/graphql:execution CALL (edges) {
    @neo4j/graphql:execution     UNWIND edges AS edge
    @neo4j/graphql:execution     WITH edge.node AS this0
    @neo4j/graphql:execution     WITH *
    @neo4j/graphql:execution     ORDER BY this0.streetAddress ASC
    @neo4j/graphql:execution     SKIP $param2
    @neo4j/graphql:execution     LIMIT $param3
    @neo4j/graphql:execution     RETURN collect({ node: { uuid: this0.uuid, streetAddress: this0.streetAddress, suburb: this0.suburb, state: this0.state, country: this0.country, postcode: this0.postcode, __resolveType: "Address" } }) AS var2
    @neo4j/graphql:execution }
    @neo4j/graphql:execution RETURN { edges: var2 } AS this +0ms
    @neo4j/graphql:execution cypher params: {
    @neo4j/graphql:execution   param0: '*',
    @neo4j/graphql:execution   param1: 'Address',
    @neo4j/graphql:execution   param2: Integer { low: 1001, high: 0 },
    @neo4j/graphql:execution   param3: Integer { low: 1000, high: 0 }
    @neo4j/graphql:execution } +0ms
    @neo4j/graphql:execution Neo4jError: The allocation of an extra 9.7 MiB would use more than the limit 12.0 GiB. Currently using 12.0 GiB. dbms.memory.transaction.total.max threshold reached
    : 
      at captureStacktrace (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/result.js:624:17)
      at new Result (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/result.js:112:23)
      at newCompletedResult (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/transaction.js:528:12)
      at Object.run (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/transaction.js:360:20)
      at TransactionPromise.Transaction.run (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/transaction.js:181:34)
      at ManagedTransaction.run (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/transaction-managed.js:54:21)
      at Executor.transactionRun (/home/projects/graphql-server/node_modules/@neo4j/graphql/src/classes/Executor.ts:291:28)
      at /home/projects/graphql-server/node_modules/@neo4j/graphql/src/classes/Executor.ts:269:33
      at TransactionExecutor._safeExecuteTransactionWork (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/internal/transaction-executor.js:211:26)
      at TransactionExecutor.<anonymous> (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/internal/transaction-executor.js:198:46)
      at step (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/internal/transaction-executor.js:44:23)
      at Object.next (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/internal/transaction-executor.js:25:53)
      at fulfilled (/home/projects/graphql-server/node_modules/neo4j-driver-core/lib/internal/transaction-executor.js:16:58)
      at processTicksAndRejections (node:internal/process/task_queues:105:5) {
    constructor: [Function],
    cause: undefined,
    gqlStatus: '50N42',
    gqlStatusDescription: 'error: general processing exception - unexpected error. Unexpected error has occurred. See debug log for details.',
    diagnosticRecord: [Object],
    classification: 'UNKNOWN',
    rawClassification: undefined,
    code: 'Neo.TransientError.General.MemoryPoolOutOfMemoryError',
    retriable: true
    } +28ms
    Error response from Neo4j Server:
    The allocation of an extra 9.7 MiB would use more than the limit 12.0 GiB. Currently using 12.0 GiB. dbms.memory.transaction.total.max threshold reached
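
A note on the parameters logged above: the after cursor from step 3 is plain base64 and decodes to `arrayconnection:1000`, while the logged `param2` is 1001, which suggests SKIP is set to one past the cursor's offset. Below is a minimal sketch of that mapping, assuming the `arrayconnection:<offset>` format holds in general. `decodeOffset` and `toSkipLimit` are hypothetical helper names for illustration, not part of @neo4j/graphql.

```typescript
// Hypothetical helpers for illustration only; not the @neo4j/graphql API.
const CURSOR_PREFIX = "arrayconnection:";

// Decode a base64 cursor such as the one used in the reproduction query.
function decodeOffset(cursor: string): number {
  const decoded = atob(cursor);
  if (!decoded.startsWith(CURSOR_PREFIX)) {
    throw new Error(`Unexpected cursor format: ${decoded}`);
  }
  const offset = Number.parseInt(decoded.slice(CURSOR_PREFIX.length), 10);
  if (Number.isNaN(offset)) {
    throw new Error(`Cursor does not contain a numeric offset: ${decoded}`);
  }
  return offset;
}

// Assumption inferred from the log: SKIP is the cursor offset plus one.
function toSkipLimit(first: number, after: string): { skip: number; limit: number } {
  return { skip: decodeOffset(after) + 1, limit: first };
}

// Prints { skip: 1001, limit: 1000 }, matching param2 and param3 in the log.
console.log(toSkipLimit(1000, "YXJyYXljb25uZWN0aW9uOjEwMDA="));
```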

I would expect the Cypher to run once and return the paginated results. I'm not a Cypher expert by any means, but in my rough testing I found that moving the ORDER BY, SKIP and LIMIT clauses out of the CALL subquery, so that they run before the collect, got things working, i.e.:

CALL db.index.fulltext.queryNodes("AddressSearch", $param0) YIELD node AS this0, score AS var1
WHERE $param1 IN labels(this0)
ORDER BY this0.streetAddress ASC
SKIP $param2
LIMIT $param3
WITH collect({ node: this0 }) AS edges
CALL (edges) {
    UNWIND edges AS edge
    WITH edge.node AS this0
    WITH *
    RETURN collect({ node: { uuid: this0.uuid, streetAddress: this0.streetAddress, suburb: this0.suburb, state: this0.state, country: this0.country, postcode: this0.postcode, __resolveType: "Address" } }) AS var2
}
RETURN { edges: var2 } AS this
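
The difference between the two query shapes can be illustrated with a toy in-memory model. This is only a sketch: it does not reproduce Neo4j's execution plan or its memory accounting, and every name in it (`makeAddresses`, `pageAfterCollect`, `pageBeforeCollect`) is hypothetical. It shows that both orderings return the same page, while the reordered shape collects only the requested page rather than every match.

```typescript
// Toy in-memory model; it does not reproduce Neo4j's execution plan.
type Address = { uuid: string; streetAddress: string };

// Stand-in for the 40,000 Address nodes from the reproduction steps.
function makeAddresses(count: number): Address[] {
  return Array.from({ length: count }, (_, i) => ({
    uuid: `id-${i + 1}`,
    streetAddress: `${i + 1} Fake Street`,
  }));
}

// String comparison on streetAddress, standing in for ORDER BY ... ASC.
const byStreet = (a: Address, b: Address): number =>
  a.streetAddress < b.streetAddress ? -1 : a.streetAddress > b.streetAddress ? 1 : 0;

// Shape of the generated query: collect every match, then sort and page.
function pageAfterCollect(rows: Address[], skip: number, limit: number) {
  const edges = rows.map((node) => ({ node })); // collect({ node: this0 })
  const page = edges.map((e) => e.node).sort(byStreet).slice(skip, skip + limit);
  return { collected: edges.length, page };
}

// Shape of the reordered query: sort and page first, then collect the page.
function pageBeforeCollect(rows: Address[], skip: number, limit: number) {
  const paged = [...rows].sort(byStreet).slice(skip, skip + limit);
  const edges = paged.map((node) => ({ node })); // collect({ node: this0 })
  return { collected: edges.length, page: edges.map((e) => e.node) };
}

const allRows = makeAddresses(40000);
const afterCollect = pageAfterCollect(allRows, 1001, 1000);
const beforeCollect = pageBeforeCollect(allRows, 1001, 1000);

// Prints 40000 1000: the size of the collected list in each shape.
console.log(afterCollect.collected, beforeCollect.collected);
// Prints true: both shapes return the same page.
console.log(JSON.stringify(afterCollect.page) === JSON.stringify(beforeCollect.page));
```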

System information:

  • OS: Ubuntu 22.04 LTS
  • Version: @neo4j/graphql@7.4.1
  • Node.js version: v22.13.1

Additional context
I'm running Neo4j itself via the neo4j:5.26.19 Docker container, but I don't think that's too relevant. Increasing the memory limit just lets the Cypher query run more times before the OOM error occurs.
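
On the repeated execution: the error in the log is reported with `retriable: true` and a `Neo.TransientError` code, and the stack trace passes through the driver's `TransactionExecutor`. Managed transactions in the Neo4j JavaScript driver retry transient failures, so the repeats may simply be retries of a query that fails the same way each time. This is an inference from the log, not something confirmed in this issue. A simplified sketch of that behaviour follows. `runWithRetry`, `makeRetriableError` and the fixed attempt limit are hypothetical, and far simpler than the driver's time-bounded retry with backoff.

```typescript
// Hypothetical, simplified model of retry-on-transient-error behaviour.
type MaybeRetriable = { retriable?: unknown };

// Duck-typed check, mirroring the `retriable: true` field seen in the log.
function isRetriable(error: unknown): boolean {
  return typeof error === "object" && error !== null && (error as MaybeRetriable).retriable === true;
}

function makeRetriableError(message: string): Error & { retriable: boolean } {
  const error = new Error(message) as Error & { retriable: boolean };
  error.retriable = true;
  return error;
}

async function runWithRetry<T>(work: () => Promise<T>, maxAttempts: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await work();
    } catch (error) {
      // Only retriable errors are retried; anything else fails immediately.
      if (!isRetriable(error)) throw error;
      lastError = error;
    }
  }
  throw lastError;
}

// A query that always runs out of memory is executed once per attempt.
let executions = 0;
const alwaysOom = async (): Promise<never> => {
  executions += 1;
  throw makeRetriableError("MemoryPoolOutOfMemoryError");
};

const outcome = runWithRetry(alwaysOom, 5).then(
  () => "succeeded",
  () => `failed after ${executions} executions`,
);
// Prints "failed after 5 executions".
outcome.then((message) => console.log(message));
```

If this is what is happening, it would also fit the observation that more memory only delays the failure: each attempt can fail in the same way.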

Labels: bug, confirmed