Unexpected behaviour: S3 meta client copy when uploading parts of copy #4326

Closed as not planned
@jsaladich

Describe the bug

I have two buckets in two different regions. Both buckets exist and I have verified that I can access them. Below, orig is the origin bucket, where the file currently lives, and dest is the destination bucket, where the file must be copied.

s3_orig = boto3.resource('s3', endpoint_url = endpoint_url_orig, 
                               region_name = region_orig).Bucket(bucket_orig)
s3_dest = boto3.resource('s3', endpoint_url = endpoint_url_dest, 
                               region_name = region_dest).Bucket(bucket_dest)

I just need to make some simple copies between the buckets (without using versioning, although both buckets have versioning enabled):

keys_orig = s3_orig.meta.client.list_objects_v2(Bucket = bucket_orig)['Contents']
keys_cp = [k for k in keys_orig if glob_criteria in k['Key']]
for key_cp in keys_cp:
    # Source bucket and key of the object to copy
    dict_source_to_copy = {'Bucket': bucket_orig,
                           'Key': key_cp['Key']}
    s3_dest.meta.client.copy(
        CopySource = dict_source_to_copy,
        Bucket = bucket_dest,
        Key = key_cp['Key'],
        SourceClient = s3_orig.meta.client
    )

That triggers a multipart copy, since the files are larger than 10 GB. I added some print statements to CopyPartTask._main in s3transfer's copies.py to show its inputs:

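class CopyPartTask(Task):
    def _main(self, client, copy_source, bucket, key, upload_id,
              part_number, extra_args, callbacks, size, checksum_algorithm):
        # Debug prints added just before the client.upload_part_copy(...) call
        print(client.list_buckets(), "\n")
        print(key)
        print(bucket,  "\n")
        print(part_number, "\n")
        print(extra_args, "\n")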

we get:

{'ResponseMetadata': {'RequestId': 'tx71635851192b493c82c65-00672615f5', 'HostId': 'tx71635851192b493c82c65-00672615f5', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'tx71635851192b493c82c65-00672615f5', 'x-amz-request-id': 'tx71635851192b493c82c65-00672615f5', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

10 

{'CopySourceRange': 'bytes=75497472-83886079'} 

{'ResponseMetadata': {'RequestId': 'txc157cda148c24be1a8fcd-00672615f6', 'HostId': 'txc157cda148c24be1a8fcd-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'txc157cda148c24be1a8fcd-00672615f6', 'x-amz-request-id': 'txc157cda148c24be1a8fcd-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

7 

{'CopySourceRange': 'bytes=50331648-58720255'} 

{'ResponseMetadata': {'RequestId': 'txaa40d9de770744d3ac626-00672615f6', 'HostId': 'txaa40d9de770744d3ac626-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'txaa40d9de770744d3ac626-00672615f6', 'x-amz-request-id': 'txaa40d9de770744d3ac626-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

6 

{'CopySourceRange': 'bytes=41943040-50331647'} 

{'ResponseMetadata': {'RequestId': 'tx5a3b9f8edd384bf68f40f-00672615f6', 'HostId': 'tx5a3b9f8edd384bf68f40f-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'tx5a3b9f8edd384bf68f40f-00672615f6', 'x-amz-request-id': 'tx5a3b9f8edd384bf68f40f-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

8 

{'CopySourceRange': 'bytes=58720256-67108863'} 

{'ResponseMetadata': {'RequestId': 'tx00a19e31546c4c5eb8f5d-00672615f6', 'HostId': 'tx00a19e31546c4c5eb8f5d-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'tx00a19e31546c4c5eb8f5d-00672615f6', 'x-amz-request-id': 'tx00a19e31546c4c5eb8f5d-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 
stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

5 

{'CopySourceRange': 'bytes=33554432-41943039'} 

{'ResponseMetadata': {'RequestId': 'txdc2217893c014646b5ad8-00672615f6', 'HostId': 'txdc2217893c014646b5ad8-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'txdc2217893c014646b5ad8-00672615f6', 'x-amz-request-id': 'txdc2217893c014646b5ad8-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

2 

{'CopySourceRange': 'bytes=8388608-16777215'} 

{'ResponseMetadata': {'RequestId': 'txb2016c93e2c4456a90b82-00672615f6', 'HostId': 'txb2016c93e2c4456a90b82-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'txb2016c93e2c4456a90b82-00672615f6', 'x-amz-request-id': 'txb2016c93e2c4456a90b82-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 
stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

4 

{'CopySourceRange': 'bytes=25165824-33554431'} 

{'ResponseMetadata': {'RequestId': 'tx92b946df2b7c4b49938aa-00672615f6', 'HostId': 'tx92b946df2b7c4b49938aa-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'tx92b946df2b7c4b49938aa-00672615f6', 'x-amz-request-id': 'tx92b946df2b7c4b49938aa-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

1 

{'CopySourceRange': 'bytes=0-8388607'} 

{'ResponseMetadata': {'RequestId': 'tx9833651aee334c1aa376d-00672615f6', 'HostId': 'tx9833651aee334c1aa376d-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'tx9833651aee334c1aa376d-00672615f6', 'x-amz-request-id': 'tx9833651aee334c1aa376d-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}], 'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

3 

{'CopySourceRange': 'bytes=16777216-25165823'} 

{'ResponseMetadata': {'RequestId': 'txe6fd139f311644ca8c047-00672615f6', 'HostId': 'txe6fd139f311644ca8c047-00672615f6', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/xml', 'content-length': '371', 'x-amz-id-2': 'txe6fd139f311644ca8c047-00672615f6', 'x-amz-request-id': 'txe6fd139f311644ca8c047-00672615f6', 'date': 'Sat, 02 Nov 2024 12:07:18 GMT'}, 'RetryAttempts': 0}, 'Buckets': [{'Name': 's3-replication', 'CreationDate': datetime.datetime(2024, 10, 7, 17, 33, 22, tzinfo=tzlocal())}],'Owner': {'DisplayName': '...', 'ID': '...'}} 

stx/sfcWindmax/tas_CMIP6-std_ssp119_all-ens_LAT0LON0_day-to-1daily_1979-2080.nc
s3-replication 

9 

{'CopySourceRange': 'bytes=67108864-75497471'} 
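
The CopySourceRange values printed above are consistent with a fixed part size of 8388608 bytes (8 MiB), with part N covering the N-th 8 MiB slice of the source object. The following is a minimal sketch of that arithmetic, not the s3transfer implementation; the function name `copy_source_range` and the `total_size` parameter are hypothetical and only serve the illustration.

```python
# Part size inferred from the ranges printed above (an assumption, not read
# from the transfer configuration): 8 MiB = 8388608 bytes.
PART_SIZE = 8 * 1024 * 1024


def copy_source_range(part_number, part_size=PART_SIZE, total_size=None):
    """Return the CopySourceRange string for a 1-based part number."""
    start = (part_number - 1) * part_size
    end = start + part_size - 1
    if total_size is not None:
        # The last part may be shorter than part_size.
        end = min(end, total_size - 1)
    return f"bytes={start}-{end}"


# Reproduce a few of the ranges from the printed output.
for n in (1, 7, 10):
    print(n, copy_source_range(n))
```

Parts 1, 7 and 10 come out as bytes=0-8388607, bytes=50331648-58720255 and bytes=75497472-83886079, matching the output. The parts are printed out of order because the part tasks run concurrently.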

---------------------------------------------------------------------------
NoSuchBucket                              Traceback (most recent call last)
Cell In[1], line 94
     90         dict_source_to_copy = {'Bucket': bucket_orig,
     91                                'Key': key_cp['Key']}
     93     print(f"{str(np.datetime64('now', 's'))}: Replicating from {bucket_orig} to {bucket_dest}:\n{key_cp['Key']}")
---> 94     s3_dest.meta.client.copy(
     95                 CopySource = dict_source_to_copy, 
     96                 Bucket = bucket_dest,
     97                 Key = key_cp['Key'],
     98                 SourceClient = s3_orig.meta.client
     99             )
    100 #s3_orig.Bucket(bucket_orig).meta.client.head_object(Bucket = bucket_orig, Key = key_cp['Key'])

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/boto3/s3/inject.py:450, in copy(self, CopySource, Bucket, Key, ExtraArgs, Callback, SourceClient, Config)
    441 with create_transfer_manager(self, new_config) as manager:
    442     future = manager.copy(
    443         copy_source=CopySource,
    444         bucket=Bucket,
   (...)
    448         source_client=SourceClient,
    449     )
--> 450     return future.result()

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/s3transfer/futures.py:103, in TransferFuture.result(self)
     98 def result(self):
     99     try:
    100         # Usually the result() method blocks until the transfer is done,
    101         # however if a KeyboardInterrupt is raised we want want to exit
    102         # out of this and propagate the exception.
--> 103         return self._coordinator.result()
    104     except KeyboardInterrupt as e:
    105         self.cancel()

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/s3transfer/futures.py:266, in TransferCoordinator.result(self)
    263 # Once done waiting, raise an exception if present or return the
    264 # final result.
    265 if self._exception:
--> 266     raise self._exception
    267 return self._result

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/s3transfer/tasks.py:139, in Task.__call__(self)
    135     # If the task is not done (really only if some other related
    136     # task to the TransferFuture had failed) then execute the task's
    137     # main() method.
    138     if not self._transfer_coordinator.done():
--> 139         return self._execute_main(kwargs)
    140 except Exception as e:
    141     self._log_and_set_exception(e)

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/s3transfer/tasks.py:162, in Task._execute_main(self, kwargs)
    159 # Log what is about to be executed.
    160 logger.debug(f"Executing task {self} with kwargs {kwargs_to_display}")
--> 162 return_value = self._main(**kwargs)
    163 # If the task is the final task, then set the TransferFuture's
    164 # value to the return value from main().
    165 if self._is_final:

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/s3transfer/copies.py:375, in CopyPartTask._main(self, client, copy_source, bucket, key, upload_id, part_number, extra_args, callbacks, size, checksum_algorithm)
    373 print(part_number, "\n")
    374 print(extra_args, "\n")
--> 375 response = client.upload_part_copy(
    376     CopySource=copy_source,
    377     Bucket=bucket,
    378     Key=key,
    379     UploadId=upload_id,
    380     PartNumber=part_number,
    381     **extra_args,
    382 )
    383 for callback in callbacks:
    384     callback(bytes_transferred=size)

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/botocore/client.py:565, in ClientCreator._create_api_method.<locals>._api_call(self, *args, **kwargs)
    561     raise TypeError(
    562         f"{py_operation_name}() only accepts keyword arguments."
    563     )
    564 # The "self" in this scope is referring to the BaseClient.
--> 565 return self._make_api_call(operation_name, kwargs)

File /geoskop/miniforge-pypy3/envs/gsk/lib/python3.11/site-packages/botocore/client.py:1017, in BaseClient._make_api_call(self, operation_name, api_params)
   1013     error_code = error_info.get("QueryErrorCode") or error_info.get(
   1014         "Code"
   1015     )
   1016     error_class = self.exceptions.from_code(error_code)
-> 1017     raise error_class(parsed_response, operation_name)
   1018 else:
   1019     return parsed_response

NoSuchBucket: An error occurred (NoSuchBucket) when calling the UploadPartCopy operation: The specified bucket does not exist.

Why can't the multipart upload finish? The parts return a 200.
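
One possible reading, stated as an assumption rather than a confirmed diagnosis: the 200 responses printed above come from the added client.list_buckets() call, not from upload_part_copy, and the only bucket that call returns is s3-replication. UploadPartCopy is a server-side copy that the destination client sends to the destination endpoint, while SourceClient is used for operations on the source object, such as the head_object call that determines its size. If the destination endpoint cannot resolve the source bucket, NoSuchBucket would be a plausible answer. A minimal sketch of a workaround that avoids the server-side copy by streaming the object through the caller follows; the helper name `stream_copy` is hypothetical, and it uses only the existing get_object and upload_fileobj client methods.

```python
def stream_copy(source_client, dest_client, source_bucket, dest_bucket, key):
    """Copy one object by reading it from the source endpoint and
    uploading it to the destination endpoint (no server-side copy)."""
    # get_object returns a streaming, file-like object under "Body".
    body = source_client.get_object(Bucket=source_bucket, Key=key)["Body"]
    # upload_fileobj reads the stream in chunks and uses a multipart
    # upload for large objects.
    dest_client.upload_fileobj(Fileobj=body, Bucket=dest_bucket, Key=key)
```

With the names from this report, the call would be stream_copy(s3_orig.meta.client, s3_dest.meta.client, bucket_orig, bucket_dest, key_cp['Key']). The trade-off is that every byte passes through the machine running the script, so this is slower than a server-side copy between buckets that one endpoint can see.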

Expected Behavior

I would expect the multipart upload to finish, given that the parts can be uploaded.

Current Behavior

The UploadPartCopy operation raises a NoSuchBucket error saying the specified bucket does not exist.

Reproduction Steps

Described above

Possible Solution

No response

Additional Information/Context

No response

SDK version used

None

Environment details (OS name and version, etc.)

Ubuntu 23.04, Python 3.12, boto3 1.34.162

Labels

  • bug: This issue is a confirmed bug.
  • p2: This is a standard priority issue.
  • response-requested: Waiting on additional information or feedback.
  • s3
