Engine stops working between version 7.2.4 and 8.0.0 #1201

@LVerneyEC

Hi,

I've noticed a bug on my end when updating the engine past version 7.2.4.

Version 7.2.4 works perfectly fine across our VLOPs tracking declarations: https://code.europa.eu/dsa/terms-and-conditions-database/vlops-and-vloses/vlop-vlose-declarations.

When updating to version 8.0.0, the engine crashes or hangs, usually on the "Apple App Store" policies (sometimes to the point that even Ctrl+C does not stop it and it has to be force-killed). The last log lines are typically:

[...]
2025-11-12T12:04:12+00:00 info  Amazon Store — Data Access for Vetted Researchers                           No changes after filtering, did not record version
2025-11-12T12:04:13+00:00 info  Amazon Store — Marketplace Sellers Conditions                               Recorded version with id 13c5e016809d3c7a0c889ee02807e122918d8db0
2025-11-12T12:04:13+00:00 info  Apple App Store — Apple Developer Agreement                                 No changes after filtering, did not record version
2025-11-12T12:04:14+00:00 info  Apple App Store — Claims of Infringement                                    Recorded version with id c597b2c42b13315c91e187b0d6b93747b42da08f
2025-11-12T12:04:14+00:00 info  Apple App Store — Developer Program License Agreement                       Recorded version with id 93bda281b8223a097d12c324f6342a51f70ebd6f
2025-11-12T12:04:15+00:00 info  Apple App Store — Media Services Terms and Conditions                       No changes after filtering, did not record version
2025-11-12T12:04:15+00:00 info  Apple App Store — Redress Rights                                            No changes after filtering, did not record version


(The exact stopping point varies: sometimes it happens slightly earlier, sometimes slightly later, but always around the Apple App Store policies.)
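
Since the stopping point moves from run to run, it can help to extract the last terms logged in each captured log and compare them. Below is a minimal sketch in JavaScript, assuming the log was saved to a string and that each line follows the layout shown above (timestamp, level, service, a dash separator, terms type, padding, message). The names `parseLogLine` and `lastLoggedTerms` are hypothetical helpers, not part of the engine.

```javascript
// Hypothetical helpers for comparing where successive runs stall.
// Assumed line layout, taken from the excerpt above:
//   <timestamp> <level>  <service> \u2014 <terms type>   <message>
const LOG_LINE = /^(\S+)\s+(\w+)\s+(.+?) \u2014 (.+?)\s{2,}(.+)$/;

// Parse one log line; returns null for lines that do not match the layout.
function parseLogLine(line) {
  const match = LOG_LINE.exec(line.trimEnd());
  if (!match) return null;
  const [, timestamp, level, service, termsType, message] = match;
  return { timestamp, level, service, termsType, message };
}

// Return the last parsable entry of a captured log, or null if there is none.
function lastLoggedTerms(logText) {
  const entries = logText.split(/\r?\n/).map(parseLogLine).filter(Boolean);
  return entries.length > 0 ? entries[entries.length - 1] : null;
}
```

Running `lastLoggedTerms` on the logs of several failed runs gives the set of terms after which the engine stalled, which makes it easier to see whether one declaration is consistently next in line.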

I am using Node v20.19.5, which should be fully supported.

Tests done:

  • Running on a single service (including "Apple App Store") works fine.
  • Updating to the latest version (9.2.0) still exhibits the same behavior.
  • The Internet uplink should not be the issue here. In any case, a network problem should make the engine fail with a proper timeout, error, or socket hang-up rather than block the scraping process.
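
To illustrate the behavior I would expect on a stalled fetch, here is a minimal sketch of a timeout wrapper in JavaScript. It is only an assumption about how a hang could be turned into an error: `withTimeout` is a hypothetical helper, not an engine function, and I have not checked how the engine handles timeouts internally.

```javascript
// Hypothetical helper, not part of the Open Terms Archive engine API.
// Rejects with an explicit error if `promise` has not settled within `ms`
// milliseconds, so a stalled operation surfaces as a failure instead of a hang.
// Note: this does not cancel the underlying work; it only stops waiting for it.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms,
    );
  });

  // Whichever settles first wins; the timer is always cleared afterwards
  // so it cannot keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch (fetchTerms is a placeholder for whatever performs the fetch):
//   const content = await withTimeout(fetchTerms(url), 60000, url);
```

With such a wrapper, a fetch that never settles would be reported as an error for the corresponding terms, and the run could move on to the next declaration.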

Steps to reproduce:

$ git clone https://github.com/OpenTermsArchive/engine && cd engine && git checkout v8.0.0
$ nvm use lts/iron
$ node -v
v20.19.5
$ npm -v
10.8.2
$ npm ci
npm warn deprecated [email protected]: This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.
npm warn deprecated [email protected]: This package is deprecated. Use the optional chaining (?.) operator instead.
npm warn deprecated [email protected]: This package is deprecated. Use require('node:util').isDeepStrictEqual instead.
npm warn deprecated @humanwhocodes/[email protected]: Use @eslint/config-array instead
npm warn deprecated [email protected]: Rimraf versions prior to v4 are no longer supported
npm warn deprecated @humanwhocodes/[email protected]: Use @eslint/object-schema instead
npm warn deprecated [email protected]: Glob versions prior to v9 are no longer supported
npm warn deprecated [email protected]: Glob versions prior to v9 are no longer supported
npm warn deprecated [email protected]: Glob versions prior to v9 are no longer supported
npm warn deprecated [email protected]: Please upgrade to latest, formidable@v2 or formidable@v3! Check these notes: https://bit.ly/2ZEqIau
npm warn deprecated [email protected]: The querystring API is considered Legacy. new code should use the URLSearchParams API instead.
npm warn deprecated [email protected]: Use your platform's native DOMException instead
npm warn deprecated [email protected]: no longer maintained
npm warn deprecated [email protected]: Please upgrade to superagent v10.2.2+, see release notes at https://github.com/forwardemail/superagent/releases/tag/v10.2.2 - maintenance is supported by Forward Email @ https://forwardemail.net
npm warn deprecated @accordproject/[email protected]: This version of the package is deprecated
npm warn deprecated @accordproject/[email protected]: Not maintained
npm warn deprecated [email protected]: This version is no longer supported. Please see https://eslint.org/version-support for other options.
npm warn deprecated [email protected]: Package no longer supported. Contact Support at https://www.npmjs.com/support for more info.

added 946 packages in 10s

200 packages are looking for funding
  run `npm fund` for details

$ npx ota track
[... - crash as above]

The last command was run with our own set of declarations (note that you need to run the ./build.sh bash script to generate the JSON declarations) and empty data/{snapshots,versions} folders.

I don't see anything specific about the terms where it crashes. It could be PDF handling, but the terms with a PDF are not the blocking ones and are always processed. It could be fullDom/htmlOnly, but both are already used in terms processed earlier. It could be a specific website or URL, but the failure is semi-random and only happens above a specific version of the engine.

Thanks,
Best
