This lightweight app provides robust and resilient data transfer.
It is designed for large data volumes, where one wants to keep track of completed transactions. If a transfer fails, the job can then be resumed from where it left off.
We support the local file system as source and an S3 bucket as destination, but this can easily be modified if needed.
Under the hood, the app uses an SQLite database that stores transactions. In particular, each upload job comprises one or more data items (or files).
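As an illustration only, the job/item layout could look roughly like the sketch below; the table and column names here are assumptions, not the app's actual schema.

```python
# Illustrative sketch of a job/item layout in SQLite; the real schema may differ.
import sqlite3

conn = sqlite3.connect("forwarding_service.db")
conn.executescript(
    """
    CREATE TABLE IF NOT EXISTS jobs (
        id     INTEGER PRIMARY KEY,
        source TEXT,   -- local directory being uploaded
        status TEXT    -- e.g. pending / done / failed
    );
    CREATE TABLE IF NOT EXISTS items (
        id     INTEGER PRIMARY KEY,
        job_id INTEGER REFERENCES jobs(id),
        path   TEXT,   -- individual file belonging to the job
        status TEXT    -- per-item status is what makes resuming possible
    );
    """
)
conn.commit()
conn.close()
```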
Finally, we support threaded jobs to improve upload performance.
This app is packaged with Poetry, which you must install first. Then run the following command from the project's root:
```
poetry install
```
Set the location of the local database file with the environment variable `FORW_SERV_DB_PATH`, e.g. `FORW_SERV_DB_PATH=/path/to/forwarding_service.db`. The default value is `$HOME/.cache/forwarding_service.db`.
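For reference, the path resolution amounts to something like the following sketch (illustrative only; the app's actual code may differ):

```python
# Sketch of resolving the database location from the environment.
import os
from pathlib import Path

DEFAULT_DB_PATH = Path.home() / ".cache" / "forwarding_service.db"
db_path = Path(os.environ.get("FORW_SERV_DB_PATH", DEFAULT_DB_PATH))
```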
We provide a simple CLI that should be self-explanatory:
```
python main.py --help
```
There are two parameters that concern threaded uploads:
`--n-threads` defines the number of threads. `--split-ratio` defines how the full set of items to be sent is split into batches. The rationale for this parameter is that multiple threads cannot write to the database concurrently. We therefore split the whole set into smaller batches, send each batch one by one using multi-threading, and update the database after each batch. This allows the job to be resumed from the last completed batch (see the sketch below).
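The sketch below illustrates this batching strategy. It is not the app's actual implementation: `upload_item` and `record_batch` are stand-ins for the real upload and database-update steps, and it assumes `--split-ratio` roughly corresponds to the number of batches.

```python
# Minimal sketch of batched, threaded uploads; illustrative only.
from concurrent.futures import ThreadPoolExecutor

def upload_item(item):
    """Stand-in for the real S3 upload of a single file."""
    return item

def record_batch(batch):
    """Stand-in for the single-threaded SQLite update after a batch completes."""
    pass

def run_job(items, n_threads=4, split_ratio=10):
    # Assume split_ratio controls how many batches the item set is split into.
    batch_size = max(1, len(items) // split_ratio)
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

    for batch in batches:
        # Threads only perform network I/O; nothing writes to SQLite here,
        # so the database never sees concurrent writers.
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            list(pool.map(upload_item, batch))  # consume to surface exceptions
        # The database is updated only after the whole batch has been sent,
        # which is what allows resuming from the last completed batch.
        record_batch(batch)
```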