Description
We are getting the MD5 checksum error again across multiple pipelines. This error is causing our data pipelines to fail.
Earlier, we received this error and Google provided the following findings:
"GCS doesn't store MD5 checksums for composite objects. A composite object isn't a single entity; it's a collection of individual objects. Verifying the integrity of the composite object itself isn't possible using MD5. This behavior was already in place and there haven’t been any recent changes on the GCS side.
There are chances that earlier object uploaded were not the composite objects and that's why the MD5 checksum consistency check was not the issue for them. From the error, it is confirmed that MD5 checksum verification is done for the composite objects and for composite objects GCS doesn't add any MD5 checksum values.
Since, updating the gcsfs helped here, I believe it might be related to gcsfs releases, please note that gcsfs is not the GCP tool it's a third-party tool. I believe GCS service doesn't have any role in the error you observed."
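To make the point in Google's response concrete, here is a minimal sketch of a download check that only uses MD5 when the object metadata actually carries one, and otherwise falls back to CRC32C. It assumes the metadata is a dict shaped like the GCS JSON API object resource, where `md5Hash` and `crc32c` are base64 strings and `componentCount` appears only on composite objects. The function names `crc32c` and `verify_payload` are hypothetical and this is not how gcsfs implements its consistency check.

```python
import base64
import hashlib
import struct
from typing import Optional, Tuple


def _build_crc32c_table() -> list:
    # Reflected Castagnoli polynomial used by CRC32C.
    poly = 0x82F63B78
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _build_crc32c_table()


def crc32c(data: bytes) -> int:
    """Pure-Python CRC32C (slow, for illustration only)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def verify_payload(data: bytes, metadata: dict) -> Tuple[str, Optional[bool]]:
    """Return (method, ok). ok is None when no checksum is available."""
    md5_b64 = metadata.get("md5Hash")
    if md5_b64:
        # Non-composite objects: compare base64-encoded MD5 digests.
        actual = base64.b64encode(hashlib.md5(data).digest()).decode()
        return "md5", actual == md5_b64

    crc_b64 = metadata.get("crc32c")
    if crc_b64:
        # Composite objects have no md5Hash; CRC32C is stored as
        # base64 of the 4-byte big-endian value.
        actual = base64.b64encode(struct.pack(">I", crc32c(data))).decode()
        return "crc32c", actual == crc_b64

    return "none", None


if __name__ == "__main__":
    payload = b"123456789"
    composite_meta = {
        "componentCount": 2,  # present only on composite objects
        "crc32c": base64.b64encode(struct.pack(">I", crc32c(payload))).decode(),
    }
    print(verify_payload(payload, composite_meta))
```

A check that insists on MD5 for every object would fail on the composite case above, which matches the behaviour described in the quoted response.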
Previous solution: We upgraded the libraries to the latest versions, which fixed the issue for some pipelines. This time, however, upgrading alone does not fix it.
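Since the behaviour appears to depend on library versions, a small sketch like the following could be run in both a working and a failing pipeline to compare what is actually installed. The package list is an assumption (only gcsfs is named in this report; `fsspec` is included as its usual companion), and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed package names; adjust to the pipeline's actual dependencies.
    for name, version in installed_versions(["gcsfs", "fsspec"]).items():
        print(f"{name}: {version or 'not installed'}")
```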
We would like to understand why this error is being triggered again. Has there been any change to the environment, or a new release?
Please let us know as soon as possible.