A data transformation pipeline built with dbt (data build tool) and DuckDB.
This project uses dbt to transform and model data, with DuckDB as the database engine. DuckDB is an embedded analytical database that runs in-process, which makes it well suited to local development and analytics.
- Python 3.7+
- dbt-duckdb
# Create a virtual environment (recommended)
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install dbt with DuckDB adapter
pip install dbt-duckdb
- Clone this repository
- Activate your virtual environment
- Install dependencies:
pip install dbt-duckdb
- Test the setup:
dbt debug --profiles-dir .
- Run the models:
dbt run --profiles-dir .
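Because `data/` is gitignored, a fresh clone may not contain it, and DuckDB generally cannot create a database file inside a directory that does not exist. A minimal sketch, assuming the database files live under `data/` as described in this README, is to create the directory before the first `dbt debug` or `dbt run`:

```shell
# Assumption: the DuckDB files live under data/, as listed in this README.
# mkdir -p does nothing if the directory already exists, so it is safe to re-run.
mkdir -p data
```

If the repository already ships the directory (for example through a placeholder file), this step changes nothing.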
This project uses DuckDB with the following configuration:
- Development database: data/dev.duckdb
- Production database: data/prod.duckdb
- Profile location: ./profiles.yml (in the project root)
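For orientation, a profile matching this configuration might look roughly like the one written by the sketch below. Only the two database paths come from this README. The profile name `my_project`, the target names `dev` and `prod`, and the thread count are assumptions for illustration, and the file is written to `profiles.example.yml` so that the real `profiles.yml` is not overwritten.

```shell
# Write an illustrative profile next to the real one; nothing existing is touched.
# Assumptions: "my_project", the target names "dev"/"prod", and threads: 4 are
# hypothetical. Only the two database paths are taken from this README.
cat > profiles.example.yml <<'EOF'
my_project:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: data/dev.duckdb
      threads: 4
    prod:
      type: duckdb
      path: data/prod.duckdb
      threads: 4
EOF
```

The top-level key of a profile has to match the `profile:` value in `dbt_project.yml`, so check that file for the name this project actually uses.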
├── models/ # dbt models
├── macros/ # dbt macros
├── seeds/ # CSV files for dbt seed
├── snapshots/ # dbt snapshots
├── tests/ # dbt tests
├── data/ # DuckDB database files (gitignored)
├── profiles.yml # dbt profile configuration
└── dbt_project.yml # dbt project configuration
# Run all models
dbt run --profiles-dir .
# Run tests
dbt test --profiles-dir .
# Generate and serve documentation
dbt docs generate --profiles-dir .
dbt docs serve --profiles-dir .
# Debug configuration
dbt debug --profiles-dir .
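Every command above repeats `--profiles-dir .`. One way to cut the repetition is a small shell helper, sketched below. The function name `dbt_here` is hypothetical, and the target name `prod` in the usage line is an assumption about how `profiles.yml` names its outputs.

```shell
# Hypothetical helper: forward any dbt subcommand and its arguments,
# always pointing dbt at the profiles.yml in the current directory.
dbt_here() {
  dbt "$@" --profiles-dir .
}

# Usage (run from the project root):
#   dbt_here run
#   dbt_here test
#   dbt_here run --target prod   # assumes a target named "prod" exists
```

dbt also reads the `DBT_PROFILES_DIR` environment variable, so exporting it once per shell session is an alternative to passing the flag.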
- No setup required: DuckDB is embedded and runs in-process, so there is no server to install or manage
- Fast analytics: Columnar, vectorized execution optimized for analytical workloads
- SQL compatible: Standard SQL with analytics extensions
- Portable: Each database is a single file that can be easily shared and backed up
- Memory efficient: Works well with limited resources
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices