Data preparation for Filecoin

This repository collects the tooling, documentation, and performance benchmarks for the data preparation that is required before onboarding data to Filecoin.

The repository is split into four main sections:

Docs: documentation explaining how data onboarding to Filecoin works, along with best practices and common pitfalls. It also links to available tools in the ecosystem.
Modules: the individual data onboarding steps, implemented as modules (written in Python and Bash) that can be imported and used in any data onboarding pipeline.
Orchestrators: example scripts demonstrating how to import the modules from the Modules section and combine them to orchestrate data onboarding.
Performance benchmarks: performance benchmarks for the different available tools.
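
The relationship between modules and orchestrators can be illustrated with a minimal Python sketch. It is not taken from this repository: the step functions `chunk` and `dedupe`, the `Step` type, and the `orchestrate` helper are hypothetical stand-ins for real modules, and the sketch assumes each step takes a list of byte strings and returns a new list.

```python
import hashlib
from typing import Callable, List

# Hypothetical step signature (an assumption, not this repository's API):
# every step maps a list of byte strings to a new list of byte strings.
Step = Callable[[List[bytes]], List[bytes]]


def chunk(size: int) -> Step:
    """Build a step that re-splits the input into pieces of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")

    def run(blobs: List[bytes]) -> List[bytes]:
        data = b"".join(blobs)
        return [data[i:i + size] for i in range(0, len(data), size)]

    return run


def dedupe(blobs: List[bytes]) -> List[bytes]:
    """Drop pieces whose SHA-256 digest was already seen, keeping first occurrences."""
    seen = set()
    unique = []
    for blob in blobs:
        digest = hashlib.sha256(blob).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(blob)
    return unique


def orchestrate(blobs: List[bytes], steps: List[Step]) -> List[bytes]:
    """Run the steps in order, feeding each step's output into the next."""
    for step in steps:
        blobs = step(blobs)
    return blobs


if __name__ == "__main__":
    pieces = orchestrate([b"aaaabbbbaaaa"], [chunk(4), dedupe])
    print(pieces)
```

Keeping every step behind the same signature is one possible design choice: an orchestrator can then reorder, add, or drop steps by editing a single list.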

Other tools in the ecosystem

banyancomputer/dataprep -- this tool handles encryption, compression, deduplication, and chunking. Its output can then be packed into CAR files (and processed further as needed) and used for deal making.

Lead Maintainer

Anjor
