push: "Checksum Type mismatch occurred" when pushing to Wasabi (S3 compatible) #10695

Closed
amarburg opened this issue Feb 27, 2025 · 2 comments
Labels: bug (Did we break something?), fs: s3 (Related to the S3 filesystem), upstream (Issues which need to be resolved in an upstream dependency)

amarburg commented Feb 27, 2025

Bug Report

Description

When pushing individual files to our established bucket at Wasabi (an S3-compatible cloud store), we get:

(S3 ACCESS and SECRET KEY defined in environment variables)
$ dvc push <push some_large_file>
Collecting                                                                          |0.00 [00:00,    ?entry/s]
ERROR: failed to transfer '4041c2812ed00efb5a0d57de2b8a9c4d' - [Errno 22] Checksum Type mismatch occurred, expected checksum Type: null, actual checksum Type: crc32: An error occurred (InvalidRequest) when calling the UploadPart operation: Checksum Type mismatch occurred, expected checksum Type: null, actual checksum Type: crc32
Pushing
ERROR: failed to push data to the cloud - 1 files failed to upload 

This is a bucket we have used consistently over the last 2-3 years. We can still pull from the bucket, so it is not an access key issue.

Reproduce

  1. Set up S3 bucket at Wasabi.
  2. git init my_repo
  3. dvc init
  4. dvc remote add -d wasabi s3://bucket-name/
  5. dvc remote modify wasabi endpointurl https://s3.us-west-1.wasabisys.com
  6. dvc add big_file.txt
  7. AWS_ACCESS_KEY_ID="user" AWS_SECRET_ACCESS_KEY="secret" dvc push -r wasabi big_file.txt

Expected

File should be pushed to S3 remote and available for pulling by other users.

Environment information

Tested on both Ubuntu 24.04 and 20.04. DVC from snap:

$ snap info dvc
name:      dvc
summary:   Data Version Control
publisher: Casper (casper-dcl)
store-url: https://snapcraft.io/dvc
contact:   [email protected]
license:   Apache-2.0
description: |
  Git for Data & Models https://dvc.org
commands:
  - dvc
snap-id:      ceYKZQ2pf75cN9OVM33Bk36vVEwz3HaP
tracking:     v2/stable
refresh-date: yesterday at 13:58 PST
channels:
  latest/stable:    3.59.1  2025-02-16 (1488) 404MB classic
...                            
  v2/stable:        3.59.1  2025-02-16 (1488) 404MB classic
$ dvc doctor
DVC version: 3.59.1 (snap)
--------------------------
Platform: Python 3.12.9 on Linux-6.8.0-54-generic-x86_64-with-glibc2.31
Subprojects:
        dvc_data = 3.16.9
        dvc_objects = 5.1.0
        dvc_render = 1.0.2
        dvc_task = 0.40.2
        scmrepo = 3.3.10
Supports:
        azure (adlfs = 2024.12.0, knack = 0.12.0, azure-identity = 1.20.0),
        gdrive (pydrive2 = 1.21.3),
        gs (gcsfs = 2025.2.0),
        hdfs (fsspec = 2025.2.0, pyarrow = 19.0.0),
        http (aiohttp = 3.11.12, aiohttp-retry = 2.9.1),
        https (aiohttp = 3.11.12, aiohttp-retry = 2.9.1),
        oss (ossfs = 2023.12.0),
        s3 (s3fs = 2025.2.0, boto3 = 1.36.3),
        ssh (sshfs = 2025.2.0),
        webdav (webdav4 = 0.10.0),
        webdavs (webdav4 = 0.10.0),
        webhdfs (fsspec = 2025.2.0)
Config:
        Global: /home/aaron/.config/dvc
        System: /etc/dvc
Cache types: hardlink, symlink
Cache directory: zfs on zvol1/home/aaron
Caches: local
Remotes: s3, s3
Workspace directory: zfs on zvol1/home/aaron
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/f92fa966085d661846d8ea2e53107206

Additional Information (if any):

Here's the output from dvc push -r wasabi --verbose <filename>: output.txt

skshetry (Member) commented Feb 27, 2025

This is an issue with a newer version of botocore (see boto/boto3#4392) and s3fs, which is discussed here: fsspec/s3fs#931.

You can either use an older version of botocore (with dvc installed through pip), or check whether the environment variable mentioned in fsspec/s3fs#931 fixes the issue.
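
For the first option, a minimal sketch of a pip-based install with an older botocore might look like the following. The helper name `install_dvc_pinned` and the virtualenv path are hypothetical, and the `<1.36` bound is an assumption based on the boto3 1.36.3 reported by `dvc doctor` above; boto/boto3#4392 is the place to confirm which release changed the checksum default.

```shell
# Hypothetical helper: create a virtualenv and install DVC with S3 support
# while holding botocore below the release that changed the checksum default.
# ASSUMPTION: "<1.36" is inferred from the versions in this report, not
# confirmed by the maintainers; adjust it after checking boto/boto3#4392.
install_dvc_pinned() {
    venv_dir="$1"
    python3 -m venv "$venv_dir" || return 1
    "$venv_dir/bin/pip" install "dvc[s3]" "botocore<1.36"
}

# Example (path is illustrative):
# install_dvc_pinned "$HOME/.venvs/dvc" && "$HOME/.venvs/dvc/bin/dvc" doctor
```

After installing, `dvc doctor` from that environment should list the s3fs and boto3 versions actually in use, which makes it easy to confirm the pin took effect.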

If you can also notify Wasabi about this issue, that would be great, so that they can become compatible with this unexpected change.

skshetry added the bug, upstream, and fs: s3 labels on Feb 27, 2025
amarburg (Author) commented

Thanks for the prompt response. Noting that

export AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED   
dvc push ....

works for me. I will notify Wasabi.
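
If exporting the variable for the whole shell session is undesirable, one option is to scope it to a single command. The sketch below assumes a POSIX shell; the function name `with_required_checksums_only` is hypothetical, while the variable and value are the ones reported to work above.

```shell
# Hypothetical wrapper: apply the checksum workaround to one command only,
# leaving the rest of the shell session untouched.
with_required_checksums_only() {
    env AWS_REQUEST_CHECKSUM_CALCULATION=WHEN_REQUIRED "$@"
}

# Example (remote and file names taken from the reproduction steps):
# with_required_checksums_only dvc push -r wasabi big_file.txt
```

Using `env` keeps the setting out of the parent environment, so other tools that talk to AWS S3 itself keep the SDK's default checksum behavior.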
