Closed
Description
Dask is weird: it will re-run a whole batch of tasks if one of them fails (or if the worker needs to restart for any reason).
On a recent very large import run, this produced a bunch of error messages that ultimately were not real errors:
Task: <Task 'reduce_pixel_shards-c53b3b8c9aa4e70e44dcdfa7b7ff9801' reduce_pixel_shards(, ...)>
Exception: "FileNotFoundError('/data3/epyc/data3/hats/skymapper/photometry/intermediate/order_6/dir_20000/pixel_21738')"
The output file (dataset/Norder=6/Dir=20000/Npix=21738.parquet) does in fact exist in the final catalog, and everything else about the import pipeline succeeded. This was mostly just annoying, and it made me think that the pipeline had failed when it hadn't.
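One possible way to quiet this kind of false alarm is to make the reduce step tolerant of being re-run. The following is only a rough sketch, not the pipeline's actual code: `reduce_is_already_done`, `reduce_with_rerun_guard`, and the `do_reduce` callback are hypothetical names, and it assumes that a missing intermediate directory combined with an existing output file means an earlier attempt already finished the work.

```python
from pathlib import Path


def reduce_is_already_done(intermediate_dir: Path, output_file: Path) -> bool:
    """Return True when a re-run reduce task has nothing left to do.

    Hypothetical helper. Assumption: the intermediate shard directory is
    cleaned up only after the final output file has been written, so
    "directory gone + output present" means an earlier attempt succeeded.
    """
    return not intermediate_dir.exists() and output_file.exists()


def reduce_with_rerun_guard(intermediate_dir, output_file, do_reduce):
    """Run do_reduce unless an earlier attempt already produced the output.

    do_reduce is a hypothetical callable taking (intermediate_dir, output_file).
    If the shards are missing AND the output is missing, do_reduce still runs,
    so a genuine FileNotFoundError is raised as before.
    """
    intermediate_dir = Path(intermediate_dir)
    output_file = Path(output_file)
    if reduce_is_already_done(intermediate_dir, output_file):
        return "skipped"
    do_reduce(intermediate_dir, output_file)
    return "reduced"
```

With a guard like this, a task re-run by Dask after its output was already written would return quietly instead of reporting a FileNotFoundError, while a real missing-input failure would still surface.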