Commit 409df51

Merge pull request #15 from Adamant-im/feat/add-indexes
Additional optional indexes
2 parents 94ebc9a + e3a8ebf commit 409df51

7 files changed

+142
-93
lines changed

README.md

Lines changed: 71 additions & 42 deletions
@@ -1,30 +1,34 @@
11
# Indexer for Ethereum to get transaction list by ETH address
22

3-
Known Ethereum nodes lack functionality to get transaction list for ETH address (account). This Indexer allows to explore ETH and ERC20 transactions by Ethereum address and obtain a history of any user|wallet in just a move, like Etherscan does.
3+
Known Ethereum nodes lack the functionality to get a transaction list for an ETH address (account). This Indexer lets you explore ETH and ERC20 transactions by Ethereum address and obtain the history of any user or wallet in one step, as Etherscan does.
44

5-
Indexer is written in Python. It works as a service in background:
5+
Indexer is written in Python. It works as a service in the background:
66

7-
- Connects to Ethereum node (works well with Geth, Nethermind or other node, which provides http/ws/ipc API)
8-
- Stores all transactions in Postgres database
7+
- Connects to Ethereum node (works well with Geth, Nethermind, or other node, which provides http/ws/ipc API)
8+
- Stores all transactions in the Postgres database
99
- Provides data for API to get transactions by address with postgrest
1010

11+
Sample request:
12+
13+
![Indexer's request example](./assets/indexer-request.png)
14+
1115
## Stored information
1216

13-
All indexed transactions includes (database field names shown):
17+
All indexed transactions include (database field names shown):
1418

1519
- `time` is a transaction's timestamp
1620
- `txfrom` sender's Ethereum address
1721
- `txto` recipient's Ethereum address
18-
- `value` stores amount of ETH transferred
22+
- `value` stores the amount of ETH transferred
1923
- `gas` indicates `gasUsed`
2024
- `gasprice` indicates `gasPrice`
2125
- `block` is a transaction's block number
2226
- `txhash` is a transaction's hash
23-
- `contract_to` indicates recipient's Ethereum address in case of contract
27+
- `contract_to` indicates the recipient's Ethereum address in case of a token transfer
2428
- `contract_value` stores amount of ERC20 transaction in its tokens
2529
- `status` tx status
2630

27-
To reduce storage requirements, Indexer stores only token transfer ERC20 transaction, started with `0xa9059cbb` in raw tx input.
31+
To reduce storage requirements, Indexer stores only token transfer ERC20 transactions, started with `0xa9059cbb` in raw tx input.
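As a rough illustration of the `0xa9059cbb` filter, here is a minimal Python sketch that recognizes an ERC-20 `transfer(address,uint256)` call in raw tx input and splits it into the recipient word and the amount. The function name `parse_erc20_transfer` is hypothetical and is not taken from `ethsync.py`; the layout it assumes (a 4-byte selector followed by two 32-byte words) is the standard ABI encoding. How the Indexer itself parses and stores these values is not shown here.

```python
# First 4 bytes of the ERC-20 transfer(address,uint256) call, as hex
TRANSFER_SELECTOR = "a9059cbb"


def parse_erc20_transfer(tx_input):
    """Return (recipient_word, amount) for an ERC-20 transfer, else None.

    Hypothetical helper for illustration only.
    """
    data = tx_input[2:] if tx_input.startswith("0x") else tx_input
    data = data.lower()
    # 8 hex chars of selector plus two 64-hex-char arguments
    if not data.startswith(TRANSFER_SELECTOR) or len(data) < 8 + 64 + 64:
        return None
    recipient_word = data[8:72]     # 32-byte word: address left-padded with zeros
    amount = int(data[72:136], 16)  # uint256 amount in the token's smallest units
    return recipient_word, amount
```

Any input that does not start with the selector, such as a plain ETH transfer with empty input, yields `None`.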
2832

2933
An example:
3034

@@ -48,15 +52,15 @@ Refers to transaction 0xcf56a031dfc89f5a3686cd441ea97ae96a66f5809a4c8c1b370485a0
4852

4953
## Ethereum Indexer's API
5054

51-
To get Ethereum transactions by address, Postgrest is used. It provides RESTful API to Postgres index database.
55+
To get Ethereum transactions by address, Postgrest is used. It provides RESTful API to the Postgres index database.
5256

53-
After index is created, you can use requests like
57+
After an index is created, you can use requests like
5458

5559
```
5660
curl -k -X GET "http://localhost:3000/?and=(contract_to.eq.,or(txfrom.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98,txto.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98))&order=time.desc&limit=25"
5761
```
5862

59-
The request will show 25 last transactions for Ethereum address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98 (Bittrex), ordered by timestamp. For API reference, see [Postgrest](https://postgrest.org/en/stable/api.html).
63+
The request will show the 25 last transactions for Ethereum address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98 (Bittrex), ordered by timestamp. For API reference, see [Postgrest](https://postgrest.org/en/stable/api.html).
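The same request can be assembled from Python. This is a minimal sketch using only the standard library; it assumes Postgrest listens on `localhost:3000` and exposes the `ethtxs` table at `/ethtxs`, as in the request examples later in this README. The helper name `build_history_url` is hypothetical.

```python
from urllib.parse import quote


def build_history_url(address, base="http://localhost:3000/ethtxs", limit=25):
    """Build a Postgrest URL for the latest ETH-only transactions of an address.

    Hypothetical helper; `base` is an assumed local Postgrest endpoint.
    """
    # An empty contract_to keeps plain ETH transfers and drops token transfers
    flt = f"(contract_to.eq.,or(txfrom.eq.{address},txto.eq.{address}))"
    # Leave Postgrest's filter punctuation readable instead of percent-encoding it
    return f"{base}?and={quote(flt, safe='(),.')}&order=time.desc&limit={limit}"
```

The resulting string can be passed to `curl` or to `urllib.request.urlopen`.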
6064

6165
# Ethereum Indexer Setup
6266

@@ -72,7 +76,7 @@ The request will show 25 last transactions for Ethereum address 0xFBb1b73C4f0BDa
7276

7377
### Ethereum Node
7478

75-
Make sure your Ethereum node is installed and is fully synced. You can check its API and best block height with the command:
79+
Make sure your Ethereum node is installed and fully synced. You can check its API and best block height with the command:
7680

7781
```
7882
curl --data '{"method":"eth_blockNumber","params":[],"id":1,"jsonrpc":"2.0"}' -H "Content-Type: application/json" -X POST localhost:8545
@@ -90,31 +94,48 @@ pip3 install psycopg2
9094

9195
### PostgreSQL
9296

93-
Install Postgres. Create Postgres user:
97+
Install Postgres. Create Postgres user/role:
9498

95-
```
99+
``` bash
100+
su - postgres #switch to psql admin user
96101
createuser -s api_user
97102
```
98103

99-
Where `api_user` is a user who will run indexer service. (As example we create superuser. You can use your own grants.)
104+
Where `api_user` is a user who will run the indexer service. (As an example, we create a superuser. You can use your own grants.)
100105

101106
Create database `index` for Ethereum transaction index:
102107

103-
```
108+
``` sql
104109
CREATE DATABASE index;
105110
```
106111

107-
Add tables into `index` using SQL script `create_table.sql`:
112+
Add tables into `index` using SQL script `create_tables.sql`:
108113

109-
```
110-
psql -f create_table.sql index
114+
``` bash
115+
psql -f create_tables.sql index
111116
```
112117

113-
Note, for case insensitive comparisons we use `citex` data type instead of `text`.
118+
For case-insensitive comparisons, we use the `citext` data type instead of `text`.
119+
120+
Create database indexes to request tx data fast. **It's better to allow this tool to store initial tx data until the current block first, and then create these indexes. Filling initial tx data will be faster this way.**
121+
122+
Create recommended database indexes:
123+
124+
``` bash
125+
psql -f create_indexes.sql index
126+
```
114127

115-
Remember to grant privileges to psql database and tables for users you need. Example:
128+
Create additional database indexes:
116129

130+
``` bash
131+
psql -f create_indexes_add.sql index
117132
```
133+
134+
Additional indexes cover more complex requests, such as getting Ethereum-only or specific token transactions for an address. [See Request examples](#api-request-examples).
135+
136+
Remember to grant privileges to psql database `index` and tables for users you need. Example:
137+
138+
``` sql
118139
\c index
119140
GRANT ALL ON ethtxs TO api_user;
120141
GRANT ALL ON aval TO api_user;
@@ -125,54 +146,55 @@ GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO api_user;
125146

126147
### Ethereum transaction Indexer
127148

128-
`ethsync.py` is a script which makes Ethereum transaction index. It accepts the following env variables:
149+
`ethsync.py` is a script that makes an Ethereum transaction index. It accepts the following env variables:
129150

130151
- DB_NAME: Postgres database name. Example: `index`.
131152
- ETH_URL: Ethereum node url to reach the node. Supports websocket, http and ipc. See examples in `ethsync.py`.
132153
- START_BLOCK: the first block to synchronize from. Default is 1.
133154
- CONFIRMATIONS_BLOCK: the number of blocks to leave out of the synch from the end. I.e., last block is current `blockNumber - CONFIRMATIONS_BLOCK`. Default is 0.
134-
- PERIOD: Number of seconds between to synchronization. Default is 20 sec.
155+
- PERIOD: Number of seconds between synchronizations. Default is 20 sec.
135156
- LOG_FILE: optional file path and name where to save logs. If not provided, use StreamHandler.
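To make the defaults in this list concrete, here is a minimal sketch of reading these variables in Python. The function `load_settings` and the dictionary keys are hypothetical; `ethsync.py` may read its environment differently.

```python
import os


def load_settings(env=os.environ):
    """Collect Indexer settings from env variables, using the documented defaults.

    Hypothetical helper for illustration only.
    """
    return {
        "db_name": env.get("DB_NAME"),
        "eth_url": env.get("ETH_URL"),
        "start_block": int(env.get("START_BLOCK", "1")),
        "confirmations_block": int(env.get("CONFIRMATIONS_BLOCK", "0")),
        "period": int(env.get("PERIOD", "20")),  # seconds between sync runs
        "log_file": env.get("LOG_FILE"),         # None: log to a stream instead
    }
```

Passing a plain dictionary instead of `os.environ` makes the defaults easy to check without touching the real environment.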
136157

137-
Indexer can fetch transactions not from the beginning, but from special block number `START_BLOCK`. It will speed up indexing process and reduce database size. For a reference:
158+
The indexer can fetch transactions not from the beginning, but from a particular block number `START_BLOCK`. It will speed up the indexing process and reduce database size. For a reference:
138159

139-
- index size starting from 5,555,555 block to 9,000,000 is about 190 GB
140-
- index size starting from 11,000,000 block to 12,230,000 is about 83 GB
141-
- index size starting from 14,600,000 block to 15,100,000 is about 27 GB
160+
- index size starting from 5,555,555 block to 9,000,000 (3.5 mln blocks) is about 190 GB
161+
- index size starting from 11,000,000 block to 12,230,000 (1 mln blocks) is about 83 GB
162+
- index size starting from 14,600,000 block to 15,100,000 (0.5 mln blocks) is about 27 GB
163+
- index size starting from 14,600,000 block to 18,100,000 (3.5 mln blocks) with additional indexes is about 289 GB
142164

143-
At first start, Indexer will store transactions starting from the block you set. It will take a time. After that, it will check for new blocks every `PERIOD` seconds and update the index.
165+
At first start, the Indexer will store transactions starting from the block you set. It will take time. After that, it will check for new blocks every `PERIOD` seconds and update the index.
144166

145167
Sample run string:
146168

147169
```
148170
DB_NAME=index ETH_URL=http://127.0.0.1:8545 START_BLOCK=14600000 LOG_FILE=/home/api_user/ETH-transactions-storage/ethsync.log python3 /home/api_user/ETH-transactions-storage/ethsync.py
149171
```
150172

151-
We recommend to run Indexer script `ethsync.py` as a background service to make sure it will be restarted in case of failure. See `ethsync.service` as an example. Copy it to /lib/systemd/system/ethsync.service, update according to your settings, then register a service:
173+
We recommend running the Indexer script `ethsync.py` as a background service to ensure it will be restarted in case of failure. See `ethsync.service` as an example. Copy it to /lib/systemd/system/ethsync.service, update according to your settings, then register a service:
152174

153175
```
154176
systemctl start ethsync.service
155177
systemctl enable ethsync.service
156178
```
157179

158-
Note, indexing takes time. To check indexing process, get the last indexed block:
180+
Note that indexing takes time. To check the indexing process, get the last indexed block:
159181

160182
```
161183
psql -d index -c 'SELECT MAX(block) FROM ethtxs;'
162184
```
163185

164-
And compare to Ethereum node's best block.
186+
And compare it to the Ethereum node's best block.
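The comparison can be sketched in Python as follows. The sketch assumes you already have the hex string returned by `eth_blockNumber` and the integer returned by the `MAX(block)` query; the helper name `sync_lag` is hypothetical.

```python
def sync_lag(node_block_hex, indexed_block, confirmations=0):
    """Return how many blocks the index is behind the node.

    node_block_hex: "result" field of eth_blockNumber, e.g. "0xe4e1c0"
    indexed_block:  value of SELECT MAX(block) FROM ethtxs
    confirmations:  CONFIRMATIONS_BLOCK, blocks deliberately left unindexed
    """
    target = int(node_block_hex, 16) - confirmations
    return max(target - indexed_block, 0)
```

A result of 0 means the index has caught up with the node, allowing for the configured confirmations.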
165187

166188
### Troubleshooting
167189

168-
To test connection from script, set a connection line in `ethtest.py`, and run it. In case of success, it will print current Ethereum's last block.
190+
To test the connection from the script, set a connection line in `ethtest.py`, and run it. In case of success, it will print the current Ethereum's last block.
169191

170192
To test a connection to a Postgres database `index`, run `pgtest.py`.
171193

172194
### Transaction API with Postgrest
173195

174196
[Install and configure](https://postgrest.org/en/stable/install.html) Postgrest.
175-
Here is an example to run API for user `api_user` connected to `index` database on 3000 port:
197+
Here is an example of running the API for user `api_user` connected to the `index` database on port 3000:
176198

177199
```
178200
db-uri = "postgres://api_user@/index"
@@ -191,7 +213,7 @@ Make sure you add Postgrest in crontab for autostart on reboot:
191213

192214
### Make Indexer's API public
193215

194-
If you need to provide public API, use any web server like nginx and setup proxy to Postgrest port in config:
216+
If you need to provide public API, use any web server like nginx and set a proxy to Postgrest port in config:
195217

196218
```
197219
location /ethtxs {
@@ -206,41 +228,48 @@ location /max_block {
206228
207229
```
208230

209-
This way endpoints will be available:
231+
This way, endpoints will be available:
210232

211233
- `/ethtxs` used to fetch Ethereum transactions by address
212-
- `/aval` returns status of service. Endpoint `aval` is a table with `status` field just to check API availability.
213-
- `/max_block` returns max Ethereum indexed block
234+
- `/aval` returns the status of service. Endpoint `aval` is a table with `status` field just to check API availability.
235+
- `/max_block` returns max Ethereum-indexed block
214236

215237
Example:
216238

217239
```
218240
https://yourdomain.com/max_block
219241
```
220242

221-
## Dockerized and docker compose
243+
## Dockerized and docker-compose
222244

223245
by Guénolé de Cadoudal ([email protected])
224246

225-
In the `docker-compose.yml` you find a configuration that show how this tool can be embedded in a docker configuration with the following processes:
247+
In the `docker-compose.yml`, you find a configuration that shows how this tool can be embedded in a docker configuration with the following processes:
226248

227249
- postgres db: to store the indexed data
228250
- postgREST tool to expose the data as a REST api (see above comments)
229-
- GETH node in POA mode. Can be Openethereum, or another node, but not tested
251+
- GETH node in POA mode. It can be Nethermind or another node, but it has not been tested
230252
- EthSync tool (this tool)
231253

232254
[Set env variables](#ethereum-transaction-indexer).
233255

234256
# API request examples
235257

236-
Get last 25 Ethereum transactions without ERC-20 transactions for address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98:
258+
Get the last 25 Ethereum transactions without ERC-20 transactions for address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98:
237259

238260
```
239261
curl -k -X GET "http://localhost:3000/ethtxs?and=(contract_to.eq.,or(txfrom.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98,txto.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98))&order=time.desc&limit=25"
240262
241263
```
242264

243-
Get last 25 ERC-20 transactions without Ethereum transactions for address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98:
265+
Get the last 25 USDT transactions for address 0xabfDF505fFd5587D9E7707dFB47F45AF1f03E275:
266+
267+
```
268+
curl -k -X GET "http://localhost:3000/ethtxs?and=(txto.eq.0xdac17f958d2ee523a2206206994597c13d831ec7,or(txfrom.eq.0xabfDF505fFd5587D9E7707dFB47F45AF1f03E275,contract_to.eq.000000000000000000000000abfDF505fFd5587D9E7707dFB47F45AF1f03E275))&order=time.desc&limit=25"
269+
270+
```
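In the USDT request, `txto` is the token contract and the `contract_to` filter uses the address without `0x`, left-padded with zeros to 64 hex characters. A minimal Python sketch of building such a URL is below; the helper names `pad_address_word` and `build_token_url` are hypothetical, and the base URL assumes a local Postgrest on port 3000.

```python
def pad_address_word(address):
    """Left-pad a 20-byte hex address to a 32-byte word (64 hex chars, no 0x)."""
    hex_part = address[2:] if address.startswith("0x") else address
    if len(hex_part) != 40:
        raise ValueError("expected a 20-byte hex address")
    return hex_part.rjust(64, "0")


def build_token_url(token, address, base="http://localhost:3000/ethtxs", limit=25):
    """Build a Postgrest URL for the latest transfers of one token for an address.

    Hypothetical helper; `token` is the token contract address.
    """
    flt = (
        f"(txto.eq.{token},or(txfrom.eq.{address},"
        f"contract_to.eq.{pad_address_word(address)}))"
    )
    return f"{base}?and={flt}&order=time.desc&limit={limit}"
```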
271+
272+
Get the last 25 ERC-20 transactions without Ethereum transactions for address 0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98:
244273

245274
```
246275
curl -k -X GET "http://localhost:3000/ethtxs?and=(contract_to.neq.,or(txfrom.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98,txto.eq.0xFBb1b73C4f0BDa4f67dcA266ce6Ef42f520fBB98))&order=time.desc&limit=25"

assets/indexer-request.png

368 KB

create_indexes.sql

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+CREATE INDEX block_index
+    ON public.ethtxs USING btree
+    (block);
+
+CREATE INDEX contract_to_index
+    ON public.ethtxs USING btree
+    (contract_to);
+
+CREATE INDEX txfrom_index
+    ON public.ethtxs USING btree
+    (txfrom);
+
+CREATE INDEX txto_index
+    ON public.ethtxs USING btree
+    (txto);

create_indexes_add.sql

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+CREATE INDEX time_index
+    ON public.ethtxs USING btree
+    (time);
+
+CREATE INDEX txto_txfrom_index
+    ON public.ethtxs USING btree
+    (txto, txfrom);
+
+CREATE INDEX txto_contract_to_index
+    ON public.ethtxs USING btree
+    (txto, contract_to);
+
+CREATE INDEX txto_w_empty_contract_to_index
+    ON public.ethtxs USING btree
+    (txto)
+    WHERE contract_to = '';
+
+/*
+You can replace txto_w_empty_contract_to_index with the next one.
+It uses more disk space but generally it's faster.
+*/
+
+/*
+CREATE INDEX complex_index
+    ON public.ethtxs USING btree
+    (contract_to, txfrom, txto);
+*/

create_table.sql

Lines changed: 0 additions & 50 deletions
This file was deleted.

create_tables.sql

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+CREATE EXTENSION citext;
+
+CREATE TABLE public.ethtxs
+(
+    time integer,
+    txfrom citext,
+    txto citext,
+    gas bigint,
+    gasprice bigint,
+    block integer,
+    txhash citext,
+    value numeric,
+    contract_to citext,
+    contract_value citext,
+    status boolean
+);
+
+CREATE TABLE public.aval
+(
+    "status" INTEGER
+);
+
+INSERT INTO public.aval(status) VALUES (1);
+
+CREATE VIEW max_block as
+    SELECT
+        MAX(block)
+    FROM public.ethtxs;

docker-compose.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ services:
1111
POSTGRES_PASSWORD: postgres!secret
1212
volumes:
1313
- ./data/postgres:/var/lib/postgresql/data
14-
- ./create_table.sql:/docker-entrypoint-initdb.d/init.sql
14+
- ./create_tables.sql:/docker-entrypoint-initdb.d/init.sql
1515
healthcheck:
1616
test: ["CMD-SHELL", "pg_isready -U app_user -d app_db"]
1717
interval: 5s
