Blockchain spiders aim to collect on-chain data, including:
- Transfer subgraph: the money flow centered on a specific address or transaction.
- Transaction: on-chain transaction data, e.g., receipts, logs, traces, etc.
- Label data: the labels of address/transaction.
- ...
For more details, see our documentation.
Let's start with the following command:
git clone https://github.com/wuzhy1ng/BlockchainSpider.git
And then install the dependencies:
pip install -r requirements.txt
We demonstrate how to crawl the transaction subgraph of the KuCoin hacker on Ethereum and trace the hacker's illicit funds.
Run the following command:
scrapy crawl txs.blockscan -a source=0xeb31973e0febf3e3d7058234a5ebbae1ab4b8c23 -a apikeys=7MM6JYY49WZBXSYFDPYQ3V7V3EMZWE4KJK
You can find the money transfer data in ./data/AccountTransferItem.csv.
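Once the crawl finishes, the transfer CSV can be turned into a simple adjacency map for tracing. The following is a minimal sketch, not part of BlockchainSpider itself. The column names `address_from` and `address_to` are assumptions made for illustration; check the header of your own CSV and pass the actual names if they differ.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_transfer_graph(
    lines: Iterable[str],
    src_col: str = "address_from",  # assumed column name, verify against your CSV
    dst_col: str = "address_to",    # assumed column name, verify against your CSV
) -> Dict[str, Set[str]]:
    """Return {sender: {receivers}} built from the lines of a transfer CSV."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for row in csv.DictReader(lines):
        src, dst = row.get(src_col), row.get(dst_col)
        if src and dst:  # skip rows that lack either endpoint
            graph[src].add(dst)
    return dict(graph)


# Usage (an open file object is an iterable of lines):
# with open("./data/AccountTransferItem.csv", newline="", encoding="utf-8") as f:
#     graph = build_transfer_graph(f)
#     print(len(graph), "sending addresses")
```

An adjacency map keeps only who sent to whom; amounts, tokens, and timestamps would need extra columns, whose names are likewise not specified here.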
In this section, we will demonstrate how to collect transaction data.
The following command will continuously collect transactions on Ethereum from block number 19000000 to the latest block:
scrapy crawl trans.block.evm -a start_blk=19000000 -a providers=https://eth.llamarpc.com
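Before starting a long crawl, it can help to confirm that the provider answers and to see how far the latest block is from `start_blk`. The sketch below uses the standard Ethereum JSON-RPC method `eth_blockNumber`, which returns the block number as a hex string. The helper names are hypothetical and not part of BlockchainSpider; some public endpoints may also require extra headers or an API key.

```python
import json
import urllib.request


def build_block_number_request(request_id: int = 1) -> bytes:
    """Encode a JSON-RPC 2.0 request for eth_blockNumber."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_blockNumber",
        "params": [],
        "id": request_id,
    }
    return json.dumps(payload).encode("utf-8")


def parse_block_number(response_body: bytes) -> int:
    """Decode the hex-encoded block number from a JSON-RPC response."""
    return int(json.loads(response_body)["result"], 16)


def latest_block(provider_url: str, timeout: float = 10.0) -> int:
    """Ask the provider for its latest block number (performs a network call)."""
    request = urllib.request.Request(
        provider_url,
        data=build_block_number_request(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_block_number(response.read())


# Usage:
# head = latest_block("https://eth.llamarpc.com")
# print("blocks to collect:", head - 19000000)
```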
You can find the collected data in ./data, in which:
- BlockItem.csv saves the metadata for blocks, such as miner, timestamp, and so on.
- TransactionItem.csv saves the external transactions of blocks.
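The exact columns of these files are not listed here, so a quick way to get oriented is to print each file's header and row count. The helper below is a small illustrative sketch using only the standard library; `summarize_csv` is a hypothetical name.

```python
import csv
from typing import Iterable, List, Tuple


def summarize_csv(lines: Iterable[str]) -> Tuple[List[str], int]:
    """Return (column names, number of data rows) for the lines of a CSV file."""
    reader = csv.reader(lines)
    header = next(reader, [])  # empty input yields an empty header
    row_count = sum(1 for _ in reader)
    return header, row_count


# Usage:
# for name in ("BlockItem.csv", "TransactionItem.csv"):
#     with open(f"./data/{name}", newline="", encoding="utf-8") as f:
#         columns, rows = summarize_csv(f)
#         print(name, rows, "rows; columns:", columns)
```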
BlockchainSpider also supports collecting transaction receipts, logs, token transfers, etc. Moreover, collecting block data from EVM-compatible chains (e.g., BNBChain, Polygon, etc.) is also available; see our documentation.
The following command will continuously collect transaction data on Solana from slot 270000000 to the latest block:
scrapy crawl trans.block.solana -a start_slot=270000000 -a providers=https://solana-mainnet.g.alchemy.com/v2/UOD8HE4CVqEiDY5E_9XbKDFqYZzJE3XP
In this section, we demonstrate how to collect labeled addresses from the darknet.
Run the following command:
scrapy crawl labels.tor -a source=http://6nhmgdpnyoljh5uzr5kwlatx2u3diou4ldeommfxjz3wkhalzgjqxzqd.onion
You can find the label data in ./data/LabelReportItem; each row of this file is a JSON object.
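Because each row is a standalone JSON object, the file can be read line by line without loading it all into memory. The sketch below shows one way to do that. It makes no assumption about the fields inside each object, and `iter_json_rows` is a hypothetical helper name.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_json_rows(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:  # tolerate blank lines between records
            continue
        yield json.loads(line)


# Usage:
# with open("./data/LabelReportItem", encoding="utf-8") as f:
#     for report in iter_json_rows(f):
#         print(sorted(report.keys()))
```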
The following papers support BlockchainSpider. Here are the BibTeX references:
@article{tracer23wu,
author={Wu, Zhiying and Liu, Jieli and Wu, Jiajing and Zheng, Zibin and Chen, Ting},
journal={IEEE Transactions on Information Forensics and Security},
title={TRacer: Scalable Graph-Based Transaction Tracing for Account-Based Blockchain Trading Systems},
year={2023},
volume={18},
number={},
pages={2609-2621}
}
@inproceedings{mots23wu,
author = {Wu, Zhiying and Liu, Jieli and Wu, Jiajing and Zheng, Zibin and Luo, Xiapu and Chen, Ting},
title = {Know Your Transactions: Real-time and Generic Transaction Semantic Representation on Blockchain \& Web3 Ecosystem},
year = {2023},
publisher = {Association for Computing Machinery},
address = {Austin, TX, USA},
doi = {10.1145/3543507.3583537},
pages = {1918–1927},
numpages = {10},
series = {WWW '23}
}
Please refer to the old version of this project.