Remove unreferenced arguments from S3 endpoint ruleset #3529

akx · 2025-07-23T11:07:11Z

None of these arguments are referenced by the ruleset, but the set of arguments is used for caching endpoint resolution results via

botocore/botocore/endpoint_provider.py

Lines 707 to 708 in 7ddc232

    
           @lru_cache_weakref(maxsize=CACHE_SIZE) 
        
           def resolve_endpoint(self, **input_parameters):

.

This means that the entire endpoint ruleset is evaluated (only to end up with the same result) for every S3-related request with a different Key, IOW for every file stored in S3. Since evaluating the rule tree is pretty slow, this causes significant slowdowns in mass operations, not to mention memory bloat for the (useless) cache.

The three removed parameters are not in the equivalent JS SDK code either.

Refs #3528.

None of these arguments are referenced by the ruleset, but set of arguments is used for caching endpoint resolution results. This means that the entire endpoint ruleset is evaluated (only to end up with the same result) for every S3-related request with a different Key. Refs boto#3528

codecov-commenter · 2025-07-23T11:14:28Z

⚠️ Please install the to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 93.13%. Comparing base (6300f6b) to head (0f6e3cb).
Report is 460 commits behind head on develop.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3529      +/-   ##
===========================================
+ Coverage    92.94%   93.13%   +0.18%     
===========================================
  Files           66       68       +2     
  Lines        14956    15324     +368     
===========================================
+ Hits         13901    14272     +371     
+ Misses        1055     1052       -3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

SamRemis · 2025-08-01T17:30:59Z

Hi @akx,

Thank you for the PR. Unfortunately those models are owned upstream by the service team and can be updated at any time. Even if we were to merge this today, if the service team makes any unrelated changes to their endpoints tomorrow, this file would just be overwritten and the keys would come back.

While these aren't used in ruleset evaluation for botocore, my current understanding is that they are used by other AWS products and cannot be removed from the upstream ruleset. I am still working on gathering more specifics.

Thank you for calling this out - it doesn't make sense for us to slow down evaluation for botocore users based on parameters we don't use. We should either come up with an alternate solution here that causes unused keys to be ignored for caching purposes or we should find a way to remove these upstream.

akx · 2025-08-04T06:36:29Z

@SamRemis Hi, thanks for looking into this!

Yeah, I figured this isn't the correct way to fix this given these lines had been added by an automated bot commit anyway, but I wanted to get the ball rolling.

We should either come up with an alternate solution here that causes unused keys to be ignored for caching purposes [...]

As mentioned above, the JavaScript SDK has a better set of cache keys (clients/client-s3/src/endpoint/endpointResolver.ts added in aws/aws-sdk-js-v3@4be028b, "chore: codegen sync").

It looks like the smithy-typescript-codegen codegen code checks for this exact eventuality and elides parameters that aren't used by the ruleset for purposes of cache keying. (As it happens, I wrote basically similar code earlier to prepare for this PR and based on it, it looks like that these 3 are apparently the only parameters unused across all of Botocore.)

One somewhat simple option could be to add some tribal knowledge to the resolver in botocore for this particular use: iff the ruleset is for S3, and the original set of parameters in the ruleset is this known one, then remove the three unused args before passing to caching/evaluation? Also probably a test that would fail if the S3 ruleset's parameters change, so maintainers would know to take action to adjust the code.

Another more labor-intensive option I can think of that would probably be for the best in the longer run is to also codegen ruleset evaluation into Python, instead of having some Python code evaluate the JSON at runtime.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Remove unreferenced arguments from S3 endpoint ruleset #3529

Remove unreferenced arguments from S3 endpoint ruleset #3529

Uh oh!

akx commented Jul 23, 2025

Uh oh!

codecov-commenter commented Jul 23, 2025 •

edited

Loading

Uh oh!

SamRemis commented Aug 1, 2025

Uh oh!

akx commented Aug 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	@lru_cache_weakref(maxsize=CACHE_SIZE)
	def resolve_endpoint(self, **input_parameters):

Remove unreferenced arguments from S3 endpoint ruleset #3529

Are you sure you want to change the base?

Remove unreferenced arguments from S3 endpoint ruleset #3529

Uh oh!

Conversation

akx commented Jul 23, 2025

Uh oh!

codecov-commenter commented Jul 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

SamRemis commented Aug 1, 2025

Uh oh!

akx commented Aug 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Jul 23, 2025 •

edited

Loading