Cached task raising tenacity exception crashes #14417
Comments
Thanks for raising this one @thundercat1 and for providing workarounds in case others come across it. I've added this issue to our backlog.
hi @thundercat1 - I think this issue has mostly to do with the fact that you cannot pickle a `threading.RLock`:

```python
import pickle
import threading

def minimal_repro():
    lock = threading.RLock()
    try:
        pickle.dumps(lock)
    except TypeError as e:
        print(f"Pickling error: {e}")

if __name__ == "__main__":
    minimal_repro()  # Pickling error: cannot pickle '_thread.RLock' object
```
I imagine the common usage pattern would be to define a serializer that works for your expected/successful return value (since that's what you want to cache). The fact that your serializer needs to handle all possible unhandled exceptions, even though you have no intent to cache them, is very surprising behavior. What if some other situation arises where an exception is raised that isn't JSON-serializable? I agree that tenacity-specific handling probably isn't warranted, but I wonder if it's possible to implement more generic handling that allows retries to be executed, and task state to be correctly assigned (failed), even when the exception isn't serializable?
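To make the scenario concrete, here is a minimal sketch of the setup being described, assuming Prefect 2.x task options (`cache_key_fn`, `retries`); `library_call`, `my_task`, and the failure shown are illustrative, not taken from the original report.

```python
from prefect import task
from prefect.tasks import task_input_hash
from tenacity import retry, stop_after_attempt

# Illustrative stand-in for a third-party call that retries internally with
# tenacity; once its attempts are exhausted, tenacity raises RetryError.
@retry(stop=stop_after_attempt(3))
def library_call() -> dict:
    raise ConnectionError("transient failure")

# Because cache_key_fn is set, the task's outcome is persisted, so a raised
# RetryError also ends up going through the result serializer -- even though
# only the successful return value was ever meant to be cached.
@task(cache_key_fn=task_input_hash, retries=2)
def my_task() -> dict:
    return library_call()
```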
I've implemented some other handling here - now that I know of this particular issue, I can catch this specific exception and replace it with something that's serializable by default. But I also think it's something Prefect should probably be able to handle by default.
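A rough sketch of that kind of workaround (the wrapper name and message are illustrative, and `library_call` is the same stand-in as above): catch tenacity's `RetryError` at the task boundary and re-raise a plain, picklable exception.

```python
from tenacity import RetryError

def call_library_safely(*args, **kwargs):
    try:
        return library_call(*args, **kwargs)
    except RetryError as exc:
        # Re-raise a plain exception that pickles cleanly; `from None` drops the
        # chained RetryError so none of its non-serializable state tags along.
        raise RuntimeError(f"library_call failed after retries: {exc}") from None
```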
The core problem here appears to be that we are persisting (and therefore serializing) exceptions; in general I don't like this and don't think it's necessary - I'll look at whether any functionality depends on that and whether we can safely remove it. That being said, I want to call out that if a user uses a distributed system (like Dask or Ray), the same error would occur because those systems need to serialize the exception to send it across the network. I think we can look at opening an issue in tenacity to avoid maintaining a lock reference on raised exceptions.
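To illustrate that point, a small sketch (with a hypothetical `LibraryError`) of why any system that ships exceptions across process boundaries hits the same wall: an exception object that holds an `RLock` cannot be pickled.

```python
import pickle
import threading

class LibraryError(Exception):
    """Hypothetical exception that keeps a lock as instance state."""
    def __init__(self, msg: str) -> None:
        super().__init__(msg)
        self._lock = threading.RLock()

try:
    pickle.dumps(LibraryError("boom"))
except TypeError as e:
    print(f"Pickling error: {e}")  # cannot pickle '_thread.RLock' object
```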
Ha, looks like someone has raised this issue with tenacity already: jd/tenacity#429
First check

Bug summary

I have a task that calls a library function. Inside the library, `tenacity` is used for retries. But when a tenacity retry exception is raised, the task crashes without executing its own retries.

Reproduction

Error

Versions (`prefect version` output)

Additional context

This is only a problem if a `cache_key_fn` is included in the task decorator. Easy workarounds exist - `reraise=True` in the tenacity retry. I'll be implementing one of these workarounds, but it would be nice if the core library included handling for tenacity exceptions by default.
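A minimal sketch of the `reraise=True` workaround mentioned above, assuming you control (or can wrap) the tenacity-decorated call; `library_call` is an illustrative name.

```python
from tenacity import retry, stop_after_attempt

# With reraise=True, tenacity re-raises the final underlying exception instead
# of wrapping it in RetryError, so only the original (picklable) error ever
# needs to be serialized.
@retry(stop=stop_after_attempt(3), reraise=True)
def library_call() -> dict:
    raise ConnectionError("transient failure")
```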