|
7 | 7 | [](https://pypi.org/project/dbt-score) |
8 | 8 | [](https://makeapullrequest.com) |
9 | 9 |
|
10 | | - |
| 10 | +**A comprehensive linter for dbt metadata that helps maintain high-quality data |
| 11 | +models at scale.** |
11 | 12 |
|
12 | | -## What is `dbt-score`? |
| 13 | +```shell |
| 14 | +dbt-score lint |
| 15 | +🥉 orders (score: 2.7) |
| 16 | + WARN (medium) dbt_score.rules.generic.columns_have_description: Columns lack a description: customer_id, customer_name. |
| 17 | + WARN (high) dbt_score.rules.generic.has_description: Model lacks a description. |
| 18 | + WARN (medium) dbt_score.rules.generic.has_owner: Model lacks an owner. |
| 19 | + WARN (medium) dbt_score.rules.generic.sql_has_reasonable_number_of_lines: SQL query too long: 238 lines (> 200). |
| 20 | + WARN (medium) dbt_score_rules.custom_rules.has_test: Model lacks a test. |
| 21 | +``` |
13 | 22 |
|
14 | | -`dbt-score` is a linter for dbt metadata. |
| 23 | +## What is dbt-score? |
15 | 24 |
|
16 | | -[dbt][dbt] (Data Build Tool) is a great framework for creating, building, |
17 | | -organizing, testing and documenting _data models_, i.e. data sets living in a |
18 | | -database or a data warehouse. Through a declarative approach, it allows data |
19 | | -practitioners to build data with a methodology inspired by software development |
20 | | -practices. |
| 25 | +`dbt-score` is a powerful linting tool designed to evaluate and score [dbt][dbt] |
| 26 | +(Data Build Tool) models based on metadata quality. It helps data teams maintain |
| 27 | +consistent standards across dbt projects by programmatically enforcing best |
| 28 | +practices for documentation, testing, naming conventions, and more. |
21 | 29 |
|
22 | | -This leads to data models being bundled with a lot of metadata, such as |
23 | | -documentation, data tests, access control information, column types and |
24 | | -constraints, 3rd party integrations... Not to mention any other metadata that |
25 | | -organizations need, fully supported through the `meta` parameter. |
| 30 | +### Key Features |
26 | 31 |
|
27 | | -At scale, with hundreds or thousands of data models, all this metadata can |
28 | | -become confusing, disparate, and inconsistent. It's hard to enforce good |
29 | | -practices and maintain them in continuous integration systems. This is |
30 | | -where`dbt-score` plays its role: by allowing data teams to programmatically |
31 | | -define and enforce metadata rules, in an easy and scalable manner. |
| 32 | +- 🔍 **Comprehensive Linting**: Evaluates dbt entities against configurable |
| 33 | + rules for documentation, tests, naming, and structure |
| 34 | +- 📊 **Scoring System**: Provides numerical scores (0-10) for individual models |
| 35 | + and overall project health |
| 36 | +- 🎯 **Flexible Configuration**: Customizable rules, severity levels, and |
| 37 | + scoring thresholds via `pyproject.toml` |
| 38 | +- 🚀 **CI/CD Integration**: Fail builds when quality standards aren't met |
| 39 | +- 📈 **Progress Tracking**: Visual badges and scoring to track data quality |
| 40 | + improvements over time |
| 41 | +- 🔧 **Extensible**: Create custom rules tailored to organization-specific needs |
| 42 | + |
| 43 | +## Quick Start |
| 44 | + |
| 45 | +### Installation |
| 46 | + |
| 47 | +```shell |
| 48 | +pip install dbt-score |
| 49 | +``` |
| 50 | + |
| 51 | +> **Note**: Install `dbt-score` in the same environment as `dbt-core`. |
| 52 | +
|
| 53 | +### Basic Usage |
| 54 | + |
| 55 | +Run `dbt-score` from your dbt project root: |
| 56 | + |
| 57 | +```bash |
| 58 | +# Basic linting |
| 59 | +dbt-score lint |
| 60 | + |
| 61 | +# Also show passing rules |
| 62 | +dbt-score lint --show all |
| 63 | + |
| 64 | +# Lint specific models |
| 65 | +dbt-score lint --select +my_model+ |
| 66 | + |
| 67 | +# Auto-generate manifest (via `dbt parse`) and lint |
| 68 | +dbt-score lint --run-dbt-parse |
| 69 | +``` |
| 70 | + |
| 71 | +### Example Output |
| 72 | + |
| 73 | +``` |
| 74 | +dbt-score lint --show all |
| 75 | +🥉 orders (score: 2.7) |
| 76 | + WARN (medium) dbt_score.rules.generic.columns_have_description: Columns lack a description: customer_id, customer_name. |
| 77 | + WARN (high) dbt_score.rules.generic.has_description: Model lacks a description. |
| 78 | + WARN (medium) dbt_score.rules.generic.has_owner: Model lacks an owner. |
| 79 | + WARN (medium) dbt_score.rules.generic.sql_has_reasonable_number_of_lines: SQL query too long: 238 lines (> 200). |
| 80 | + WARN (medium) dbt_score_rules.custom_rules.has_test: Model lacks a test. |
| 81 | +
|
| 82 | +🥇 customers (score: 10.0) |
| 83 | + OK dbt_score.rules.generic.columns_have_description |
| 84 | + OK dbt_score.rules.generic.has_description |
| 85 | + OK dbt_score.rules.generic.has_owner |
| 86 | + OK dbt_score.rules.generic.sql_has_reasonable_number_of_lines |
| 87 | + OK dbt_score_rules.custom_rules.has_test |
| 88 | +
|
| 89 | +Project score: 6.3 🥈 |
| 90 | +``` |
| 91 | + |
| 92 | +## Configuration |
| 93 | + |
| 94 | +Configure `dbt-score` via `pyproject.toml` in the dbt project root: |
| 95 | + |
| 96 | +```toml |
| 97 | +[tool.dbt-score] |
| 98 | +# Fail if the project score, or any single item's score, falls below its threshold |
| 99 | +fail_project_under = 7.5 |
| 100 | +fail_any_item_under = 8.0 |
| 101 | + |
| 102 | +# Disable specific rules |
| 103 | +disabled_rules = ["dbt_score.rules.generic.columns_have_description"] |
| 104 | + |
| 105 | +# Configure badges |
| 106 | +[tool.dbt-score.badges] |
| 107 | +first.threshold = 10.0 |
| 108 | +first.icon = "🥇" |
| 109 | +second.threshold = 8.0 |
| 110 | +second.icon = "🥈" |
| 111 | +third.threshold = 6.0 |
| 112 | +third.icon = "🥉" |
| 113 | +wip.icon = "🏗️" |
| 114 | + |
| 115 | +# Customize rule severity and parameters |
| 116 | +[tool.dbt-score.rules."dbt_score.rules.generic.sql_has_reasonable_number_of_lines"] |
| 117 | +severity = 1 |
| 118 | +max_lines = 300 |
| 119 | +``` |
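To make the thresholds above concrete, here is a minimal sketch of how they could be interpreted. It is not `dbt-score`'s implementation: the helper names `pick_badge` and `should_fail` are hypothetical, and it assumes a badge is earned when the score is at or above its threshold and that a run fails when a score is strictly below the configured limit.

```python
# Values copied from the pyproject.toml example; the logic is an assumption.
BADGES = [(10.0, "🥇"), (8.0, "🥈"), (6.0, "🥉")]
WIP_ICON = "🏗️"
FAIL_PROJECT_UNDER = 7.5
FAIL_ANY_ITEM_UNDER = 8.0


def pick_badge(score: float) -> str:
    """Return the icon of the highest badge whose threshold the score reaches."""
    for threshold, icon in sorted(BADGES, reverse=True):
        if score >= threshold:
            return icon
    # No threshold reached: fall back to the work-in-progress icon.
    return WIP_ICON


def should_fail(project_score: float, item_scores: list[float]) -> bool:
    """Return True if either configured limit is violated."""
    return project_score < FAIL_PROJECT_UNDER or any(
        score < FAIL_ANY_ITEM_UNDER for score in item_scores
    )
```

Under these assumptions, a project scoring 9.0 would still fail if a single item scored 7.9, because `fail_any_item_under` is checked per item.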
| 120 | + |
| 121 | +## Why Use dbt-score? |
| 122 | + |
| 123 | +As dbt projects grow to hundreds or thousands of models, maintaining consistent |
| 124 | +metadata becomes increasingly challenging: |
| 125 | + |
| 126 | +- **Inconsistent Documentation**: Some models are well-documented, others lack |
| 127 | + basic descriptions |
| 128 | +- **Missing Tests**: Critical models without proper data quality tests |
| 129 | +- **Naming Inconsistencies**: Models that don't follow established conventions |
| 130 | +- **Technical Debt**: Long, complex SQL queries that are hard to maintain |
| 131 | +- **Compliance Issues**: Missing ownership or governance metadata |
| 132 | + |
| 133 | +`dbt-score` addresses these challenges by: |
| 134 | + |
| 135 | +- **Automated Quality Checks**: Continuously evaluate dbt projects against best |
| 136 | + practices |
| 137 | +- **Objective Scoring**: Get clear, numerical feedback on model quality |
| 138 | +- **Team Alignment**: Establish shared standards across data teams |
| 139 | +- **CI/CD Integration**: Prevent quality regressions in production |
| 140 | + |
| 141 | +## Built-in Rules |
| 142 | + |
| 143 | +`dbt-score` comes with a small set of rules covering needs applicable to most |
| 144 | +dbt projects. |
| 145 | + |
| 146 | +## Advanced Usage |
| 147 | + |
| 148 | +### Custom Rules |
| 149 | + |
| 150 | +Create organization-specific rules by writing simple Python functions: |
| 151 | + |
| 152 | +```python |
| 153 | +from dbt_score import Model, rule, RuleViolation |
| 154 | + |
| 155 | +@rule |
| 156 | +def model_has_business_owner(model: Model) -> RuleViolation | None: |
| 157 | + if model.meta.get("business_owner") is None: |
| 158 | + return RuleViolation("Model lacks a business owner.") |
| 159 | +``` |
| 160 | + |
| 161 | +### CI/CD Integration |
| 162 | + |
| 163 | +Add `dbt-score` to CI pipelines: |
| 164 | + |
| 165 | +```yaml |
| 166 | +- name: Run dbt-score |
| 167 | + run: | |
| 168 | + dbt-score lint --run-dbt-parse |
| 169 | +``` |
| 170 | +
|
| 171 | +or equivalent in your favourite CI platform. `dbt-score` exits with 0 or 1 to |
| 172 | +signal success or failure, making integrations a breeze! |
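For pipelines driven from Python rather than YAML, the exit-code contract described above can be consumed as in the following sketch. The wrapper `lint_passed` is a hypothetical helper, not part of `dbt-score`; it relies only on the convention that exit code 0 means success.

```python
import subprocess


def lint_passed(command: list[str]) -> bool:
    """Run a command and report whether it exited with code 0."""
    # check=False: inspect the exit code ourselves instead of raising.
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


# Usage in a pipeline script, assuming dbt-score is installed and on PATH:
#   if not lint_passed(["dbt-score", "lint", "--run-dbt-parse"]):
#       raise SystemExit(1)
```

Passing the command as a list avoids shell quoting issues with selectors such as `+my_model+`.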
| 173 | + |
| 174 | +### Selective Linting |
| 175 | + |
| 176 | +Use dbt's selection syntax to lint specific parts of projects: |
| 177 | + |
| 178 | +```bash |
| 179 | +# Lint only staging models |
| 180 | +dbt-score lint --select "staging.*" |
| 181 | +
|
| 182 | +# Lint a model and its upstream dependencies |
| 183 | +dbt-score lint --select +my_important_model |
| 184 | +
|
| 185 | +# Lint recently changed models |
| 186 | +dbt-score lint --select state:modified |
| 187 | +``` |
32 | 188 |
|
33 | 189 | ## Documentation |
34 | 190 |
|
35 | | -Everything you need (and more) can be found in [`dbt-score` documentation |
36 | | -website][dbt-score]. |
| 191 | +For comprehensive documentation, including detailed rule descriptions, |
| 192 | +configuration options, and advanced usage patterns, visit the [`dbt-score` |
| 193 | +documentation website][dbt-score]. |
37 | 194 |
|
38 | 195 | ## Contributing |
39 | 196 |
|
40 | | -Would you like to contribute to `dbt-score`? That's great news! Please follow |
41 | | -[the guide on the documentation website][contributors-guide]. 🚀 |
| 197 | +Contributions are welcome! This includes: |
| 198 | + |
| 199 | +- Reporting bugs or requesting features |
| 200 | +- Improving documentation |
| 201 | +- Adding new rules or formatters |
| 202 | +- Fixing issues |
| 203 | + |
| 204 | +Check out the [contributing guide][contributors-guide] to get started. 🚀 |
| 205 | + |
| 206 | +## Requirements |
| 207 | + |
| 208 | +- Python 3.10+ |
| 209 | +- dbt-core 1.5+ |
| 210 | + |
| 211 | +## License |
| 212 | + |
| 213 | +This project is licensed under the MIT License - see the |
| 214 | +[LICENSE.txt](LICENSE.txt) file for details. |
| 215 | + |
| 216 | +--- |
42 | 217 |
|
43 | 218 | [dbt]: https://github.com/dbt-labs/dbt-core |
44 | 219 | [dbt-score]: https://dbt-score.picnic.tech/ |
|