-
@adartaud
-
We are also working on something new that might make this much more configurable. Would you be open to having a chat?
-
Hello,
From what I understand, we can set up task retry policies that depend on whether an exception is "System" or "User" scoped.
We mostly run Spark tasks in our K8s cluster, and we have identified some "transient" exceptions that can occur during our Spark jobs and that can simply be "recovered" from the Flyte UI. Flyte reports these exceptions as "User Errors" because they are raised from within our code, although they should be treated as system errors.
We would like our tasks to retry on these specific exceptions without setting a retry on all User Exceptions. I did not find any configuration that would allow this. One possible workaround I thought of was to catch these exceptions ourselves and re-raise them as `FlyteScopedSystemException` or `FlyteSystemException`, so that they would be treated as System exceptions with their retry policy here: https://github.com/flyteorg/flytekit/blob/5503ee5e232fdbc633af39c7f4539a04906102fc/flytekit/exceptions/scopes.py#L203

Does this workaround seem reasonable? Do you have a recommended way of handling this?
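For concreteness, here is a minimal sketch of the catch-and-re-raise idea as a decorator. It uses only the standard library: `TransientSparkError` and `SystemScopedError` are hypothetical stand-ins (the latter for flytekit's `FlyteSystemException`), and `reraise_as` and `run_job` are illustrative names, not part of flytekit. It also assumes the target exception class accepts a single message argument. Whether Flyte then applies the system retry policy to the re-raised exception is exactly the open question above.

```python
import functools


class TransientSparkError(Exception):
    """Hypothetical stand-in for a transient error raised inside a Spark job."""


class SystemScopedError(Exception):
    """Hypothetical stand-in for flytekit's FlyteSystemException."""


def reraise_as(target_exc, transient):
    """Re-raise the listed transient exception types as target_exc.

    Any exception type not listed in `transient` propagates unchanged,
    so genuine user errors keep their original classification.
    """

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except transient as exc:
                # Chain the original error so its traceback stays visible.
                raise target_exc(f"transient failure: {exc!r}") from exc

        return wrapper

    return decorator


@reraise_as(SystemScopedError, transient=(TransientSparkError,))
def run_job(fail_with=None):
    # Placeholder for the body of a Spark task.
    if fail_with is not None:
        raise fail_with
    return "ok"
```

In a real task, the stand-in classes would be replaced by the actual transient exception types and the flytekit exception class, with the decorator applied beneath the task decorator so that it wraps the task body.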
Thanks,