Conversation

@aaronmaxlevy aaronmaxlevy commented Sep 4, 2025

Fixes #32571

Implementing workaround from aws/aws-sdk-go-v2#1816 (comment)
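For readers who have not followed the linked SDK issue: as far as I understand it, the workaround keeps the Accept-Encoding header out of the SigV4 signature, because GCS appears to alter that header before it validates the signature. Below is a minimal, standard-library-only sketch of that idea, not the code in this PR. The names `signWithout`, `signedHeaderNames`, and `demoSign` are hypothetical, and the toy signer only records which headers it would sign; the real change would hook into the SDK's middleware stack instead.

```go
package main

import (
	"fmt"
	"net/http"
	"sort"
	"strings"
)

// signWithout (hypothetical helper) removes the named headers, runs the
// signer, and then restores the headers so they are still sent on the wire.
func signWithout(req *http.Request, ignored []string, sign func(*http.Request) error) error {
	saved := make(map[string][]string, len(ignored))
	for _, name := range ignored {
		key := http.CanonicalHeaderKey(name)
		if vals, ok := req.Header[key]; ok {
			saved[key] = vals
			req.Header.Del(key)
		}
	}
	err := sign(req)
	// Restore the stripped headers whether or not signing succeeded.
	for key, vals := range saved {
		req.Header[key] = vals
	}
	return err
}

// signedHeaderNames stands in for a real signer: it returns the lowercase,
// sorted, semicolon-joined header names that would be covered by a signature.
func signedHeaderNames(req *http.Request) string {
	names := make([]string, 0, len(req.Header))
	for name := range req.Header {
		names = append(names, strings.ToLower(name))
	}
	sort.Strings(names)
	return strings.Join(names, ";")
}

// demoSign builds a HEAD request (bucket and key are placeholders), signs it
// without Accept-Encoding, and reports what was signed and what is still set.
func demoSign() (signed string, acceptEncoding string, err error) {
	req, err := http.NewRequest(http.MethodHead, "https://storage.googleapis.com/example-bucket/example-key", nil)
	if err != nil {
		return "", "", err
	}
	req.Header.Set("Accept-Encoding", "identity")
	req.Header.Set("X-Amz-Date", "20250904T000000Z")
	err = signWithout(req, []string{"Accept-Encoding"}, func(r *http.Request) error {
		signed = signedHeaderNames(r)
		return nil
	})
	return signed, req.Header.Get("Accept-Encoding"), err
}

func main() {
	signed, ae, err := demoSign()
	if err != nil {
		panic(err)
	}
	fmt.Println("signed headers:", signed)
	fmt.Println("Accept-Encoding still sent:", ae)
}
```

The point of restoring the header after signing is that the request still goes out unchanged; only the set of signed headers shrinks.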

@aaronmaxlevy aaronmaxlevy requested a review from a team as a code owner September 4, 2025 03:02
@aaronmaxlevy aaronmaxlevy force-pushed the aaron_fix_gcs_support branch 2 times, most recently from 0df9b1c to 2901ae4 Compare September 4, 2025 03:13
codecov bot commented Sep 4, 2025

Codecov Report

❌ Patch coverage is 11.76471% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.97%. Comparing base (2cb83a1) to head (6abde9d).
⚠️ Report is 45 commits behind head on main.

Files with missing lines Patch % Lines
server/datastore/s3/s3.go 6.25% 59 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #32573      +/-   ##
==========================================
- Coverage   64.02%   63.97%   -0.05%     
==========================================
  Files        1987     2002      +15     
  Lines      195630   198415    +2785     
  Branches     6550     6550              
==========================================
+ Hits       125256   126943    +1687     
- Misses      60573    61478     +905     
- Partials     9801     9994     +193     
Flag Coverage Δ
backend 65.17% <11.76%> (-0.08%) ⬇️


lucasmrod
lucasmrod previously approved these changes Sep 4, 2025
Member

@lucasmrod lucasmrod left a comment

LGTM!

Nit: Missing changes/ file, but don't worry, we can add it after merge.

@aaronmaxlevy
Author

@lucasmrod I just added a changes file :)

@lucasmrod lucasmrod self-assigned this Sep 4, 2025
lucasmrod
lucasmrod previously approved these changes Sep 4, 2025
@lucasmrod
Member

Having a hard time reproducing the issue and the fix:

Running Fleet the following way:

FLEET_S3_SOFTWARE_INSTALLERS_BUCKET=... \
FLEET_S3_SOFTWARE_INSTALLERS_ACCESS_KEY_ID=... \
FLEET_S3_SOFTWARE_INSTALLERS_SECRET_ACCESS_KEY=... \
FLEET_S3_SOFTWARE_INSTALLERS_ENDPOINT_URL=https://storage.googleapis.com \
FLEET_S3_SOFTWARE_INSTALLERS_FORCE_S3_PATH_STYLE=true \
./build/fleet serve

When attempting to upload software on main I get:
[Screenshot 2025-09-04 at 1 45 52 PM]
And with the changes in this PR I get:
[Screenshot 2025-09-04 at 1 41 41 PM]
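As background on the FLEET_S3_SOFTWARE_INSTALLERS_FORCE_S3_PATH_STYLE=true setting in the command above: path-style addressing puts the bucket in the URL path instead of in the hostname. The sketch below illustrates the two URL shapes with the standard library only. `objectURL` is a hypothetical helper, and the bucket and key are placeholders; none of this is Fleet's own code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// objectURL (hypothetical helper) builds the URL of an object on an
// S3-compatible endpoint, using either path-style or virtual-hosted-style
// addressing.
func objectURL(endpoint, bucket, key string, pathStyle bool) (string, error) {
	u, err := url.Parse(endpoint)
	if err != nil {
		return "", err
	}
	key = strings.TrimPrefix(key, "/")
	if pathStyle {
		// Path style: https://<endpoint>/<bucket>/<key>
		u.Path = "/" + bucket + "/" + key
	} else {
		// Virtual-hosted style: https://<bucket>.<endpoint>/<key>
		u.Host = bucket + "." + u.Host
		u.Path = "/" + key
	}
	return u.String(), nil
}

func main() {
	for _, pathStyle := range []bool{true, false} {
		s, err := objectURL("https://storage.googleapis.com", "example-bucket", "installers/app.pkg", pathStyle)
		if err != nil {
			panic(err)
		}
		fmt.Printf("pathStyle=%v -> %s\n", pathStyle, s)
	}
}
```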

@aaronmaxlevy
Author

@lucasmrod so you have successfully reproduced the issue: that "Forbidden" error reflects the HTTP 403 response that GCS returns to Fleet.

That said, while this has fixed part of the bug (e.g. HEAD requests to GCS are working now), some requests to GCS are still failing. I am able to reproduce what you are showing and am looking into it further.

Looks like it is specifically failing to upload the installer package to GCS:

level=error ts=2025-09-04T17:19:04.476364Z component=http user=REDACTED method=POST uri=/api/latest/fleet/software/fleet_maintained_apps took=5.4596635s err="upload maintained app installer to S3: storing installer: upload multipart failed, upload id: REDACTED, cause: operation error S3: UploadPart, https response error StatusCode: 403, RequestID: , HostID: , api error SignatureDoesNotMatch: Invalid argument."

@lucasmrod
Member

lucasmrod commented Sep 4, 2025

> That said, while this has fixed part of the bug (e.g. HEAD requests to GCS are working now), some requests to GCS are still failing. I am able to reproduce what you are showing and am looking into it further.

@aaronmaxlevy Thanks for looking into this!

@aaronmaxlevy
Author

@lucasmrod No problem! It looks like 41fd840 complicated this further because it switched to multipart uploads, which seem to have a compatibility issue with GCS. I am looking into it further and hope to have a workaround / solution soon.

@aaronmaxlevy
Author

@lucasmrod I have pushed a new commit that fixes multipart upload to GCS also — it's a bit janky, but there isn't really a better option right now AFAICT.

Probably the best long term solution would be to support GCS as a separate storage mechanism backed by Google's own SDK, but that would be a larger, breaking change (the GCS SDK uses different authentication), and IMO, this fixes things until something better is in place.

FWIW, the Accept-Encoding header issue is specific to GCS, but I can see the multipart upload issue with trailing checksums also affecting other "S3-compatible" providers that haven't implemented support for them yet.
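To make "trailing checksum" concrete: instead of sending the checksum in a request header, the SDK appends it after the body using an aws-chunked framing, so a provider that does not parse that framing sees a payload it cannot validate. The sketch below is a simplified, standard-library-only illustration. It assumes a CRC32 checksum and a single unsigned chunk, the framing is my understanding of the format and not copied from the SDK, and `crc32Trailer` and `chunkedBody` are hypothetical names.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"hash/crc32"
)

// crc32Trailer returns the base64 encoding of the big-endian CRC32 (IEEE)
// of the payload, the value carried by the x-amz-checksum-crc32 trailer.
func crc32Trailer(payload []byte) string {
	var sum [4]byte
	binary.BigEndian.PutUint32(sum[:], crc32.ChecksumIEEE(payload))
	return base64.StdEncoding.EncodeToString(sum[:])
}

// chunkedBody frames the payload as one chunk (hex length, data), then a
// zero-length terminating chunk, then the checksum trailer. Simplified:
// real uploads may use several chunks and other checksum algorithms.
func chunkedBody(payload []byte) string {
	return fmt.Sprintf("%x\r\n%s\r\n0\r\nx-amz-checksum-crc32:%s\r\n\r\n",
		len(payload), payload, crc32Trailer(payload))
}

func main() {
	payload := []byte("123456789")
	fmt.Println("trailer value:", crc32Trailer(payload))
	fmt.Printf("body on the wire: %q\n", chunkedBody(payload))
}
```

Note that the bytes on the wire are no longer just the object contents, which is why a server unaware of the framing cannot treat the body as the plain payload.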

@lucasmrod
Member

Hi @aaronmaxlevy!

I'm testing your latest changes and one upload worked (7.5 MB installer), but then uploads of other, larger installers started to fail or hang. I also tried a very small one (< 1 MB) and it failed with a signature error.

Maybe we can try not doing multipart upload when using GCS?

@aaronmaxlevy
Author

aaronmaxlevy commented Sep 5, 2025

@lucasmrod that is interesting — I have tried it multiple times now with installers of various larger sizes and it is working consistently with larger installers. The largest I tested with was 530 MB.

Oftentimes it would hang for a bit at 97% but then complete successfully. I think that just reflects the time it takes to transfer the data and the difficulty of accurately estimating the progress percentage.

I was able to reproduce the signing error you received with a small installer package, and I have pushed a fix for this. Currently, the S3 Upload Manager code being used will only do multipart upload for installers larger than 5 MB. For smaller installers, it will just do a normal "PutObject", which is the only other S3 API for which the AWS Go SDK v2 enables trailing checksums.

In other words, even if we didn't use multipart uploads at all for GCS, this would still be an issue. Anyhow, this should work fully / properly now. Let me know if there are any issues.
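The size-based choice described above can be sketched as follows. This is an illustration only: it assumes a 5 MiB part size (my reading of "5 MB" and of the SDK's default) and that a body no larger than one part goes through a single PutObject. `uploadPlan` is a hypothetical name, not a function from the SDK or from Fleet.

```go
package main

import "fmt"

// partSize is the assumed multipart part size (5 MiB).
const partSize int64 = 5 * 1024 * 1024

// uploadPlan (hypothetical helper) reports which S3 operation an upload of
// the given size would use and how many parts it would be split into.
func uploadPlan(size int64) (op string, parts int64) {
	if size <= partSize {
		// Small bodies are sent with one PutObject call.
		return "PutObject", 1
	}
	// Larger bodies are split into ceil(size / partSize) UploadPart calls.
	return "MultipartUpload", (size + partSize - 1) / partSize
}

func main() {
	sizes := []int64{
		1 * 1024 * 1024,   // a small installer
		7864320,           // 7.5 MiB
		530 * 1024 * 1024, // 530 MiB
	}
	for _, size := range sizes {
		op, parts := uploadPlan(size)
		fmt.Printf("%d bytes -> %s (%d part(s))\n", size, op, parts)
	}
}
```

Both branches end in an API call for which the SDK enables trailing checksums, which is why skipping multipart uploads alone would not have avoided the signature error.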

Development

Successfully merging this pull request may close these issues.

Google Cloud Storage (GCS) support is broken as of 4.71.0