Explore and add static checks for DAGs for early detection of common issues #43176

Open
1 of 2 tasks
omkar-foss opened this issue Oct 18, 2024 · 4 comments


@omkar-foss
Collaborator

Description

As per users' feedback in the Airflow Debugging Survey 2024, 48.3% of respondents chose early issue detection during execution as one of their top 2 choices.

Use case/motivation

Goal of this issue:

  • Enhance early detection of DAG issues to minimize dev time
  • Some sort of static analysis similar to Ruff checks for DAGs
  • Where feasible, runtime analysis of DAGs could also be added to save dev time

Related issues

Parent Issue: #40975

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@omkar-foss added the kind:feature and needs-triage labels Oct 18, 2024
@omkar-foss changed the title from "Explore and add static checks for DAGs for early detect of common issues" to "Explore and add static checks for DAGs for early detection of common issues" Oct 18, 2024
@nathadfield removed the needs-triage label Oct 23, 2024
@omkar-foss
Collaborator Author

In the context of this issue, DAG linting as described in this article (shared by @potiuk on Slack) can also be explored:

https://medium.com/@snir.isl/mastering-airflow-dag-standardization-with-pythons-ast-a-deep-dive-into-linting-at-scale-1396771a9b90

Inspired by the above article, maybe we could have commands like airflow dag lint <dagname>.py for checking common DAG issues and airflow dag grade <dagname>.py for grading DAGs based on their code quality.
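To make the linting idea concrete, here is a minimal sketch of what an AST-based check could look like. It is not an existing Airflow command or API: the rule itself (every DAG(...) call should set `catchup` and `tags`) and the names `REQUIRED_DAG_KWARGS` and `lint_dag_source` are assumptions made up for illustration. It relies only on Python's standard `ast` module, so the DAG file is parsed but never imported or executed.

```python
import ast

# Hypothetical rule set: keyword arguments we want every DAG(...) call to set.
REQUIRED_DAG_KWARGS = ("catchup", "tags")


def _call_name(node: ast.Call) -> str:
    """Return the called name for `DAG(...)` or `models.DAG(...)` style calls."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute):
        return func.attr
    return ""


def lint_dag_source(source: str, filename: str = "<dag>") -> list[str]:
    """Parse DAG source without importing it and report missing DAG kwargs."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and _call_name(node) == "DAG":
            # A `**kwargs` expansion shows up as a keyword with arg=None; skip it.
            passed = {kw.arg for kw in node.keywords if kw.arg is not None}
            for required in REQUIRED_DAG_KWARGS:
                if required not in passed:
                    findings.append(
                        f"{filename}:{node.lineno}: DAG() call does not set '{required}'"
                    )
    return findings


# The example is only parsed as text, so Airflow does not need to be installed.
EXAMPLE = '''
from airflow import DAG

with DAG(dag_id="example", schedule=None, catchup=False):
    pass
'''

for finding in lint_dag_source(EXAMPLE, "example_dag.py"):
    print(finding)
```

A grading command could plausibly be built on top of the same findings list (for instance by weighting rules), but how grades would be computed is left open here.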

@potiuk
Member

potiuk commented Oct 29, 2024

Love that idea. We could even have some way of checking for "best practices", like not using the DB while parsing. This might then also be used as part of the upgrade-check mechanism that we are planning for the Airflow 2 -> 3 migration; see #41641. cc: @Lee-W
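As a rough illustration of the "not using the DB while parsing" check, the sketch below flags `Variable.get(...)` calls that sit at module level, where they would run every time the DAG file is parsed. The names `DB_CALLS`, `ParseTimeDbCallFinder` and `find_parse_time_db_calls` are hypothetical, the list of DB-touching calls is deliberately tiny, and the sketch assumes function bodies only run at task execution time. It is not how any existing Airflow or Ruff rule is implemented.

```python
import ast

# Hypothetical list of DB-touching calls, as (object name, method name) pairs.
DB_CALLS = {("Variable", "get")}


class ParseTimeDbCallFinder(ast.NodeVisitor):
    """Collect line numbers of DB calls that execute when the module is parsed."""

    def __init__(self):
        self.lines = []

    def visit_FunctionDef(self, node):
        # Do not descend: a function body runs only when the function is called,
        # which this sketch assumes happens at task run time, not at parse time.
        return

    visit_AsyncFunctionDef = visit_FunctionDef
    visit_Lambda = visit_FunctionDef

    def visit_Call(self, node):
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and (func.value.id, func.attr) in DB_CALLS
        ):
            self.lines.append(node.lineno)
        # Keep walking so nested calls in the arguments are also checked.
        self.generic_visit(node)


def find_parse_time_db_calls(source: str) -> list[int]:
    finder = ParseTimeDbCallFinder()
    finder.visit(ast.parse(source))
    return finder.lines


SOURCE = '''
from airflow.models import Variable

env = Variable.get("env")


def task_callable():
    return Variable.get("api_key")
'''

print(find_parse_time_db_calls(SOURCE))
```

Only the module-level call is reported; the one inside `task_callable` is skipped because the visitor never enters function bodies. A real check would also need to handle aliased imports, decorators and default arguments, which this sketch ignores.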

@Lee-W
Member

Lee-W commented Oct 30, 2024

I feel it's a bit different. 🤔 But for not using DB while parsing, that's something we should check in the upgrade check. But the "best practices" thing would probably be something else. Probably integrating with ruff or building our own linter would be better.

@Dev-iL
Contributor

Dev-iL commented Jan 5, 2025

Good to see new Airflow-specific Ruff rules being added (AIR3##). Nice work @Lee-W!
