Improve dag error handling #49164

Open: wants to merge 2 commits into base: main

Yusin0903

Improve error handling for DAG access in API endpoints

This PR improves error handling when accessing DAGs via API endpoints:

  • Add consistent error handling for DAG access across API endpoints
  • Handle ImportError and SyntaxError with 422 status code and clear error messages
  • Handle generic exceptions with 500 status code
  • Provide clear error messages for different failure scenarios
  • Update API documentation to include new error response codes
  • Add comprehensive tests for error handling scenarios

This change ensures users get clear feedback when DAGs fail to load
due to import errors, syntax errors, or other unexpected issues.

closes: #48960



@boring-cyborg boring-cyborg bot added the area:API Airflow's REST/HTTP API label Apr 12, 2025

boring-cyborg bot commented Apr 12, 2025

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature add useful documentation (in docstrings or in docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
  • Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: [email protected]
Slack: https://s.apache.org/airflow-slack

@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch 3 times, most recently from 49162b4 to 49f4b74 Compare April 12, 2025 18:33
Member

@jason810496 jason810496 left a comment

Thanks for the PR!
I agree with @rawwar, and I've also left a few nits about modularizing.

Member

@pierrejeambrun pierrejeambrun left a comment

Thanks for the PR. A few adjustments needed before we can merge :)

@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch from 49f4b74 to e296a5d Compare May 1, 2025 14:53
@pierrejeambrun pierrejeambrun added this to the Airflow 3.1.0 milestone May 5, 2025
Member

@pierrejeambrun pierrejeambrun left a comment

The main problem with this is that it really only handles one specific deserialization error: dag_id missing from the encoded dag.

Any other unexpected error in the deserialization process will just crash and not provide any specific feedback.

I think we should catch everything in SerializedDAG.deserialize_dag and, based on the exception, raise a new custom DeserializationError.

Then all those deserialization errors can be caught on the API side to provide relevant feedback.
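
To make the suggestion above concrete, here is a minimal standalone sketch of the "catch everything and re-raise as one custom error" pattern. It is not the PR's code: the `DeserializationError` constructor, its message, and the toy `deserialize_dag` function are assumptions for illustration only, and the real SerializedDAG.deserialize_dag does far more.

```python
from __future__ import annotations


class DeserializationError(Exception):
    """Hypothetical wrapper for any failure while decoding a serialized DAG."""

    def __init__(self, dag_id: str | None = None) -> None:
        self.dag_id = dag_id
        label = f"DAG {dag_id!r}" if dag_id else "DAG"
        super().__init__(f"Unexpected error while deserializing {label}")


def deserialize_dag(encoded: dict) -> dict:
    """Toy stand-in for SerializedDAG.deserialize_dag (assumed shape, not the real one)."""
    try:
        # A missing "dag_id" is only one of many ways decoding can fail.
        return {"dag_id": encoded["dag_id"], "tasks": list(encoded["tasks"])}
    except Exception as err:
        # Wrap every failure so API code has a single exception type to catch;
        # "from err" keeps the original exception reachable as __cause__.
        dag_id = encoded.get("dag_id") if isinstance(encoded, dict) else None
        raise DeserializationError(dag_id) from err
```

With this shape, the API layer only needs to know about `DeserializationError`, while the original `KeyError`, `TypeError`, and so on stay attached for debugging.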

@pierrejeambrun
Member

Don't hesitate to directly resolve comments you've addressed so we know easily what's left to focus on.

@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch from e296a5d to a62f542 Compare May 13, 2025 18:33
@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch 2 times, most recently from df76944 to f1e2875 Compare May 13, 2025 18:47
@Yusin0903
Author

> The main problem with this is that it really only handles one specific deserialization error: dag_id missing from the encoded dag.
>
> Any other unexpected error in the deserialization process will just crash and not provide any specific feedback.
>
> I think we should catch everything in SerializedDAG.deserialize_dag and, based on the exception, raise a new custom DeserializationError.
>
> Then all those deserialization errors can be caught on the API side to provide relevant feedback.

Hi @pierrejeambrun,
Thanks for the review.
I’ve raised a custom DeserializationError and handled it in SerializedDAG.deserialize_dag.
I also noticed the new dependency DagBagDep, and I’ve merged it into get_dag_from_dag_bag.

Member

@pierrejeambrun pierrejeambrun left a comment

Nice, looking good. A few comments. Thanks.

@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch 3 times, most recently from c933083 to 73dad08 Compare May 25, 2025 12:18
@Yusin0903
Author

Hi @pierrejeambrun,
Thanks for the review and suggestion.
I now wrap all the runtime errors in DeserializationError and have simplified the dag_bag.get_dag handling.

Contributor

@bugraoz93 bugraoz93 left a comment

Overall looks good, thanks! Small comments here and there :)

@Yusin0903 Yusin0903 force-pushed the improve-dag-error-handling branch from 73dad08 to a66859d Compare May 26, 2025 17:57
Comment on lines 99 to +105
DatabaseErrorHandlers = [
    _UniqueConstraintErrorHandler(),
]

DAGErrorHandlers = [
    DAGErrorHandler(),
]
Member

Do we really need these to be separate? How about just having one list called ERROR_HANDLERS instead?

Or, do we really need the list at all? Why not just import the classes directly in app.py?

Member

+1, a single list of error handlers seems more appropriate for now, as there is really just one db handler.
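
As a rough illustration of the single-list idea discussed above, the sketch below registers every handler from one flat `ERROR_HANDLERS` list. All class names, the `exception_cls`/`status_code` attributes, the status codes, and the dict-based registry are hypothetical stand-ins; the actual handlers in the PR are wired into the API server app and may look quite different.

```python
from __future__ import annotations


class UniqueConstraintError(Exception):
    """Hypothetical stand-in for a database integrity error."""


class DeserializationError(Exception):
    """Hypothetical stand-in for a DAG deserialization failure."""


class BaseErrorHandler:
    """Each handler declares which exception it handles and which status it maps to."""

    exception_cls: type[Exception]
    status_code: int

    def handle(self, exc: Exception) -> tuple[int, str]:
        return self.status_code, str(exc)


class UniqueConstraintErrorHandler(BaseErrorHandler):
    exception_cls = UniqueConstraintError
    status_code = 409  # assumed mapping, for illustration only


class DagErrorHandler(BaseErrorHandler):
    exception_cls = DeserializationError
    status_code = 500  # assumed mapping, for illustration only


# One flat list instead of one list per category (database, DAG, ...).
ERROR_HANDLERS = [UniqueConstraintErrorHandler(), DagErrorHandler()]


def build_registry(handlers):
    """Map exception types to handler callables, mimicking app setup."""
    return {h.exception_cls: h.handle for h in handlers}


def dispatch(registry, exc: Exception) -> tuple[int, str]:
    """Find the first registered handler matching the exception, else re-raise."""
    for exc_cls, handle in registry.items():
        if isinstance(exc, exc_cls):
            return handle(exc)
    raise exc
```

Adding a new handler then means appending one instance to the list, with no change to the registration loop.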


    return dag
except (RuntimeError, ValueError, KeyError) as err:
Member

Does RuntimeError here only intend to catch the case we raise ourselves? I’d rather avoid it if possible; it’s used too often for built-in errors that we may not want to mask.

Member

I don't know. Basically, any unexpected error there is a deserialization error, and we prefer to wrap it in a DeserializationError (with the original stack trace) so it can be shown to users of the API, instead of surfacing a plain ValueError/KeyError as a 500. (By default, ValueError and KeyError cannot be handled globally by the API server exception handler.)

Raising anything else not handled by the server will end up in 500 and the error buried in the stack trace.
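
The trade-off in this thread can be sketched in plain Python. The example below is an assumption-laden illustration, not the PR's code: it takes the narrower route raised in the review (wrapping only `ValueError` and `KeyError`, leaving `RuntimeError` alone) and shows that `raise ... from err` keeps the original stack trace attached to the wrapper. The helper names `wrap_deserialization` and `describe` are hypothetical.

```python
import traceback


class DeserializationError(Exception):
    """Hypothetical wrapper raised when a serialized DAG cannot be decoded."""


def wrap_deserialization(func, *args):
    """Run a decoding step and convert selected failures into DeserializationError."""
    try:
        return func(*args)
    except DeserializationError:
        # Already wrapped deeper in the call stack: pass it through unchanged.
        raise
    except (ValueError, KeyError) as err:
        # RuntimeError is deliberately not listed here, so unrelated built-in
        # failures are not masked; widening the tuple is the alternative
        # argued for in the thread.
        raise DeserializationError(f"Failed to deserialize DAG: {err!r}") from err


def describe(exc: BaseException) -> str:
    """Render the full chained traceback, including the original cause."""
    return "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
```

Because the wrapper is raised with `from err`, the rendered traceback contains both the original error and the wrapper, so nothing is lost by converting the exception type.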

Comment on lines +1796 to +1797
except:
    raise
Member

Suggested change
except:
    raise

def exception_handler(self, request: Request, exc: DeserializationError):
    """Handle DAG deserialization exceptions."""
    raise HTTPException(
        status_code=status.HTTP_400_BAD_REQUEST,
Member

@uranusjr uranusjr May 27, 2025

400 does not sound right to me. This is not something the client can resolve; a 500 would be more appropriate IMO.

Member

+1. 400 could maybe work if we consider that a DAG authoring mistake caused the deserialization error and the user needs to fix the DAG itself?

Author

@pierrejeambrun Just to confirm, should we still raise a 400 status code in this case?

Member

Let's try with a 500, as TP suggested.

Member

I feel Airflow should catch authoring mistakes and prevent un-deserialisable content from entering the database; in that sense 500 is more appropriate. 400 still does not feel appropriate even if the input is fixable, since the code implies the user can get a successful result simply by tweaking the request parameters.
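
The status-code reasoning in this thread could be sketched as follows. This is only an illustration under assumptions: `DagNotFound`, `to_http_error`, and the message strings are hypothetical, and the real handler in the PR is built on the API server's exception-handler mechanism rather than a plain function. The idea is that a 4xx is reserved for cases where changing the request can succeed, while a stored DAG that cannot be decoded maps to a 500.

```python
from __future__ import annotations

from http import HTTPStatus


class DagNotFound(Exception):
    """Hypothetical: the requested dag_id does not exist (the client can fix the request)."""


class DeserializationError(Exception):
    """Hypothetical: the stored DAG cannot be decoded (the client cannot fix this)."""


def to_http_error(exc: Exception) -> tuple[int, str]:
    """Translate an exception into a (status code, detail) pair."""
    if isinstance(exc, DagNotFound):
        # A different dag_id in the request could succeed, so a 4xx fits.
        return HTTPStatus.NOT_FOUND.value, f"Dag not found: {exc}"
    if isinstance(exc, DeserializationError):
        # No change to the request parameters can make this succeed.
        return (
            HTTPStatus.INTERNAL_SERVER_ERROR.value,
            f"An error occurred while trying to deserialize Dag: {exc}",
        )
    # Anything unexpected also ends up as a generic server error.
    return HTTPStatus.INTERNAL_SERVER_ERROR.value, "Internal server error"
```

Even with the same 500 status as an unhandled crash, the dedicated branch lets the response carry a specific message instead of burying the cause in a stack trace.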

Labels
area:API Airflow's REST/HTTP API
Development

Successfully merging this pull request may close these issues.

API - Improve error response on missing dag in the dagbag.
6 participants