Description
What happens?
Hello. In my corporate environment, duckdb-wasm freezes on the first call to query/runQuery to write a parquet file.
I have opened a ticket, evidence-dev/evidence#3155, to describe my problem, but I think I have narrowed it down to a specific call.
In short, my process blocks when duckdb-wasm runs this query:
COPY (SELECT * FROM read_parquet(['/path/to/my/project/.evidence/meta/buildinfo_csv/code_coverage/tmp/code_coverage.0.parquet'])) TO '/path/to/my/project/.evidence/template/static/data/buildinfo_csv/code_coverage/code_coverage.parquet' (FORMAT 'PARQUET', CODEC 'ZSTD', USE_TMP_FILE false);
On my personal Mac, it does NOT freeze and everything works (even with the proxy). But it freezes on a Mac in our corporate environment (behind a proxy), on Windows WSL2 running Ubuntu 22, on a VM running Ubuntu 22, and in a Docker container using the official Node 22 image (Alpine, I guess).
I tried many versions of duckdb-wasm and Node (20, 22, 23), always with the same result.
I cannot find the reason why a call to query would fail. I tried to debug with strace, and the process seems to be stuck on a mutex (a deadlock?).
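One way to narrow this down further from the JavaScript side is to race the call against a timer. The sketch below is only an illustration: `withTimeout` is a hypothetical helper, not part of duckdb-wasm or Evidence, and it assumes the freezing call returns a promise. If the timer fires, the event loop is still alive and the promise simply never settles. If the timer never fires either, the thread itself is blocked, which would be consistent with what strace shows.

```typescript
// Hypothetical diagnostic helper (not part of duckdb-wasm or Evidence).
// Rejects if `work` has not settled after `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer); // work finished first: cancel the timer
        resolve(value);
      },
      (error) => {
        clearTimeout(timer); // work failed first: surface the original error
        reject(error);
      },
    );
  });
}
```

For example, wrapping the promise returned by the freezing query call in `withTimeout(promise, 30000, "COPY to parquet")` would turn a silent hang into an error with a stack trace, provided the event loop is still running.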
Is it possible that the proxy being slightly different has an impact? On my Mac (the environment that does NOT freeze), the proxy URL has the format http://localhost:3128 (there is a local reverse proxy agent). In all the other environments, which do freeze, the format is http://theusername:[email protected]:3128
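To compare the proxy settings between a working and a freezing machine without pasting credentials into a ticket, something like the sketch below could be run on both. It is only an illustration: `describeProxy` and `describeProxyEnv` are hypothetical names, the list of variables is the set mentioned in this report, and it relies only on the standard `URL` class. It reports the shape of each proxy URL (protocol, host, port, whether credentials are embedded) and never prints the password.

```typescript
// Hypothetical helpers for comparing proxy settings between machines.
interface ProxyShape {
  protocol: string;
  host: string;
  port: string;
  hasCredentials: boolean;
}

// Parse one proxy URL and keep only non-secret facts about it.
function describeProxy(raw: string): ProxyShape {
  const url = new URL(raw); // throws a TypeError if the value is not a valid URL
  return {
    protocol: url.protocol,
    host: url.hostname,
    port: url.port,
    hasCredentials: url.username !== "" || url.password !== "",
  };
}

// Summarise the proxy-related variables from an environment-like object.
function describeProxyEnv(
  env: Record<string, string | undefined>,
): Record<string, ProxyShape | "unset" | "invalid"> {
  const names = ["HTTP_PROXY", "HTTPS_PROXY", "http_proxy", "https_proxy"];
  const summary: Record<string, ProxyShape | "unset" | "invalid"> = {};
  for (const name of names) {
    const value = env[name];
    if (value === undefined || value === "") {
      summary[name] = "unset";
      continue;
    }
    try {
      summary[name] = describeProxy(value);
    } catch {
      summary[name] = "invalid"; // value could not be parsed as a URL
    }
  }
  return summary;
}

// In Node, call it as: console.log(describeProxyEnv(process.env));
```

An "invalid" entry on the freezing machines only would be worth reporting, since it would mean the proxy value does not parse as a URL at all.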
To Reproduce
On an Ubuntu 22 VM or Docker image running inside our network infrastructure (so behind a corporate proxy):
# our internal certificates
NODE_EXTRA_CA_CERTS: /etc/ssl/certs/ca-certificates.crt
HTTPS_PROXY/HTTP_PROXY/NO_PROXY/https_proxy/http_proxy/no_proxy set to our internal proxy http url
node --version
npm --version
npx degit evidence-dev/template my-project
cd my-project
npm run sources
This last command never finishes.
Here is the backtrace (using chrome inspect) to where it locks:
The query (the r variable) is:
COPY (SELECT * FROM read_parquet(['/home/jupyter-gsemet/Projects/test-evidence/my-project/.evidence/meta/needful_things/orders/tmp/orders.0.parquet'])) TO '/home/jupyter-gsemet/Projects/test-evidence/my-project/.evidence/template/static/data/needful_things/orders/orders.parquet' (FORMAT 'PARQUET', CODEC 'ZSTD', USE_TMP_FILE false);
Evidence seems to first convert the CSV, or the result of its own query on a DuckDB example database, into a parquet file, and then call duckdb-wasm to build a single parquet file from it.
The exact line in Evidence source code where it freezes is https://github.com/evidence-dev/evidence/blob/f461d6d63a09d9ee3bb149c2d9e1721c70bbc9ac/packages/lib/universal-sql/src/build-parquet.js#L191
Browser/Environment:
Not relevant (node process)
Device:
linux or mac
DuckDB-Wasm Version:
1.29.0
DuckDB-Wasm Deployment:
See evidence source code
Full Name:
Gaetan Semet
Affiliation:
Ampere Technologies