Description
Version
Version 0.3.2-37-g559b654. Commit 559b65455d7ef6b03e8e9e96a0e50fd4fe8a9c86 (current main).
Platform
Linux [server name] 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Description
I'm interested in collecting address_appearances. I have a synced Erigon 3 node running with --prune.mode archive. I believe I have all of the JSON-RPC APIs enabled: --http.api "web3,eth,erigon,trace,ots,net,debug,txpool"
I ran: cryo address_appearances --rpc http://127.0.0.1:8545 --verbose and expected that I would either get all of the parquet files that the documentation says would be created, or an error output.
Instead, I get this output, but only after it finishes:
cryo parameters
───────────────
- version: 0.3.2-37-g559b654
- data:
- datatypes: address_appearances
- blocks: n=21,876,375 min=0 max=21,876,374 align=no reorg_buffer=0
- exclude failed items: false
- source:
- network: ethereum
- rpc url: http://127.0.0.1:8545
- max requests per second: unlimited
- max concurrent requests: unlimited
- max concurrent chunks: 4
- max retries: 5
- initial backoff: 500
- output:
- chunk size: 1,000
- chunks to collect: 21,778 / 21,877
- output format: parquet
- output dir: /home/liam/data
- report file: $OUTPUT_DIR/.cryo/reports/2025-02-20_06-01-13.575391.json
schema for address_appearances
──────────────────────────────
- block_number: uint32
- transaction_hash: binary
- address: binary
- relationship: string
- chain_id: uint64
sorting address_appearances by: block_number, transaction_hash, address, relationship
other available columns: block_hash
collecting data
───────────────
started at 2025-02-20 06:01:13.575
done at 2025-02-20 06:18:06.368
error summary
─────────────
(errors in 21778 chunks)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 28929 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 4252 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 10193 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 16743 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 3210 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 3011 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 6449 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 1925 (4x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 4948 (1x)
- Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 6427 (1x)
...
collection summary
──────────────────
- total duration: 1012.793 seconds
- total chunks: 21,877
- chunks errored: 21,778 / 21,877 (99.0%)
- chunks skipped: 99 / 21,877 (0.0%)
- chunks collected: 0 / 21,877 (0.0%)
- blocks collected: 0
- blocks per second: 0.0
- blocks per minute: 0.0
- blocks per hour: 0.0
- blocks per day: 0.0
- rows written: 0
The error "Failed to get block: deserialization error: missing field `creationMethod` at line 1 column 6427" suggests that Erigon may not be including a field that cryo requires. My problem would be easier to break down if I had answers to a few questions:
- Is cryo tested against Erigon? If not, which JSON-RPC servers is it tested against?
- https://github.com/paradigmxyz/cryo?tab=readme-ov-file#json-rpc shows which methods are used, but not for all datasets. From crates/freeze/src/datasets/address_appearances.rs, it seems to me that this dataset uses `eth_getBlockByNumber` (second argument set to `false` to get just the hashes of transactions), `eth_getLogs`, `eth_getTransactionByHash`, `eth_getTransactionReceipt`, and `trace_transaction`. Is that right? I'll try to update the README in a PR.
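To check whether the node answers each of the methods guessed above, a minimal sketch like the following could be used. The method list is the unconfirmed guess from this issue, not something taken from cryo's documentation, and the helper names (`build_request`, `candidate_requests`, `send`) are hypothetical.

```python
import json
import urllib.request

RPC_URL = "http://127.0.0.1:8545"  # node URL used in this report


def build_request(method, params, request_id=1):
    """Return a JSON-RPC 2.0 request body as a dict."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


def candidate_requests(block_number, tx_hash):
    """One request per method guessed in this issue (unconfirmed list)."""
    block_hex = hex(block_number)
    return [
        # False asks for transaction hashes only, not full transaction objects.
        build_request("eth_getBlockByNumber", [block_hex, False]),
        build_request("eth_getLogs", [{"fromBlock": block_hex, "toBlock": block_hex}]),
        build_request("eth_getTransactionByHash", [tx_hash]),
        build_request("eth_getTransactionReceipt", [tx_hash]),
        build_request("trace_transaction", [tx_hash]),
    ]


def send(request, url=RPC_URL):
    """POST one request to the node and return the decoded JSON response."""
    body = json.dumps(request).encode()
    http_request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(http_request) as response:
        return json.load(response)
```

Calling `send(r)` for each `r` in `candidate_requests(block_number, tx_hash)`, with a real block number and a transaction hash from that block, should show whether any method returns an `error` object instead of a `result`.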
When I collect the traces dataset, it succeeds only on the first 99 chunks and then fails on every one thereafter, so I think this is some compatibility issue in trace_block. I will update this shortly with what I find.
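One way to narrow this down is to scan a saved trace_block response for entries that lack the field named in the error. The sketch below assumes, without confirmation from the cryo or Erigon sources, that `creationMethod` is expected inside the `action` object of traces whose `type` is `create`. The function name and the sample data are hypothetical.

```python
def missing_creation_method(traces):
    """Return (transaction hash, trace address) for each create trace
    whose action object has no creationMethod key.

    Assumption: the field belongs in `action` of type "create" traces.
    """
    missing = []
    for trace in traces:
        if trace.get("type") != "create":
            continue
        if "creationMethod" not in trace.get("action", {}):
            missing.append((trace.get("transactionHash"), trace.get("traceAddress")))
    return missing


# Hypothetical sample shaped like a trace_block result, not real node output.
sample = [
    {"type": "call", "action": {"callType": "call"},
     "transactionHash": "0x01", "traceAddress": []},
    {"type": "create", "action": {"init": "0x60"},
     "transactionHash": "0x02", "traceAddress": [0]},
    {"type": "create", "action": {"init": "0x60", "creationMethod": "create"},
     "transactionHash": "0x03", "traceAddress": []},
]

print(missing_creation_method(sample))
```

If every create trace from the node shows up in this list, that would be consistent with collection failing from the first chunk that contains a contract creation.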