Implement S3 beatmap storage #13

bdach · 2024-12-23T13:32:19Z

Depends on Optimise package-related operations #12

This one probably deserves a bit more scrutiny because there are a few details that stand out. I'll be adding a few self-review comments on those after I open the PR.

S3 is being a bit too slow for my liking without this. This also stops re-initialising the S3 client on every request; I am starting to suspect that re-initialising it is just wrong. It's what spectator server does, but doing so causes the 32-client pool to be recreated every time which, given .NET guidelines on `HttpClient` usage, does not seem optimal.

bdach · 2024-12-23T13:37:28Z

osu.Server.BeatmapSubmission/Services/S3BeatmapStorage.cs

+            client = new AmazonS3Client(
+                new BasicAWSCredentials(AppSettings.S3AccessKey, AppSettings.S3SecretKey),
+                new AmazonS3Config
+                {
+                    CacheHttpClient = true,
+                    HttpClientCacheSize = 32,
+                    RegionEndpoint = RegionEndpoint.USWest1,
+                    UseHttp = true,
+                    ForcePathStyle = true,
+                    RetryMode = RequestRetryMode.Legacy,
+                    MaxErrorRetry = 5,
+                    Timeout = TimeSpan.FromMinutes(1),
+                });


Note that unlike osu-server-spectator (and possibly other projects), the client is created in the constructor here. This service also has a singleton lifetime, which means that one client object is shared across all requests.

I suspect that osu-server-spectator is very wrong in instantiating `AmazonS3Client` on every call. `HttpClientCacheSize` is supposed to specify how many `HttpClient` instances the Amazon client can use for requests, which would mean that every instantiation of `AmazonS3Client` spawns 32 `HttpClient`s. The guidance on `HttpClient` usage is that instances are supposed to be shared, as in you're not supposed to create one on every call.

(All of that would be nullified if it turned out AmazonS3Client is doing some black magic hackery with static to ensure the http client cache is created once ever, but on a quick source inspection I do not believe that to be true.)

There's also an argument that, rather than doing its own `HttpClient` pooling, this should be feeding into the ASP.NET-provided `HttpClient` DI factory machinery. I would be OK with making it do that, but I'm lacking context as to why `HttpClientCacheSize` was set to what it was (it probably originates from 5 projects back and kept getting copy-pasted around as convenient).

peppy · 2024-12-24

Sounds very suspicious. I do recall that this should be a singleton by design, so I think we probably want to fix that in server-spectator as well.
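To put rough numbers on the concern in this thread, here is a minimal sketch that counts `HttpClient` constructions for a per-request client versus a shared one. `FakePooledClient` is a hypothetical stand-in that eagerly builds a pool in its constructor; it is not `AmazonS3Client`, and whether the real SDK allocates its pool this way is exactly the open question above.

```csharp
using System;
using System.Net.Http;
using System.Threading;

// Hypothetical stand-in for an SDK client that builds its own HttpClient pool
// in the constructor. This is NOT AmazonS3Client; it only models the cost.
public sealed class FakePooledClient : IDisposable
{
    // Total number of HttpClient instances constructed so far.
    public static int HttpClientsCreated;

    private readonly HttpClient[] pool;
    private int next = -1;

    public FakePooledClient(int poolSize)
    {
        pool = new HttpClient[poolSize];
        for (int i = 0; i < poolSize; i++)
        {
            pool[i] = new HttpClient();
            Interlocked.Increment(ref HttpClientsCreated);
        }
    }

    // Hands out pooled clients round-robin; no network traffic happens here.
    public HttpClient Rent()
    {
        uint index = (uint)Interlocked.Increment(ref next);
        return pool[index % (uint)pool.Length];
    }

    public void Dispose()
    {
        foreach (HttpClient client in pool)
            client.Dispose();
    }
}

public static class Program
{
    // One new pooled client per request: the pool is rebuilt every time.
    public static int ClientsCreatedPerCall(int requests, int poolSize)
    {
        FakePooledClient.HttpClientsCreated = 0;
        for (int i = 0; i < requests; i++)
        {
            using var client = new FakePooledClient(poolSize);
            client.Rent();
        }
        return FakePooledClient.HttpClientsCreated;
    }

    // One pooled client for the whole process (singleton-style): built once.
    public static int ClientsCreatedShared(int requests, int poolSize)
    {
        FakePooledClient.HttpClientsCreated = 0;
        using var shared = new FakePooledClient(poolSize);
        for (int i = 0; i < requests; i++)
            shared.Rent();
        return FakePooledClient.HttpClientsCreated;
    }

    public static void Main()
    {
        Console.WriteLine(ClientsCreatedPerCall(10, 32)); // prints 320
        Console.WriteLine(ClientsCreatedShared(10, 32));  // prints 32
    }
}
```

Under this model the per-request cost scales with the pool size, which is why holding the client in a singleton-lifetime service is attractive.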

bdach · 2024-12-23T13:39:43Z

osu.Server.BeatmapSubmission/Services/S3BeatmapStorage.cs

+            await Task.WhenAll(
+                uploadBeatmapPackage(beatmapSetId, beatmapPackage, stream),
+                uploadBeatmapFiles(files));


There is some parallelisation here because in testing S3 was being pretty slow for me.

Note that due to the structure of everything, every beatmap upload basically incurs a full S3 round trip, i.e. a download of the previous package and an upload of the new package, plus the upload of the individual files.

The file upload is a bit wasteful, since the files may already exist. However, eliminating that is annoying structurally; I'd have to pass in something to indicate which files to upload and which to skip, which gets a bit dicey for me. It can probably be done, though.

peppy · 2024-12-24

It's always good to have parallel uploads for S3, because yeah, the round trip time can fluctuate (especially depending on server region). I think this stems from the upload performing quite a few requests to get to the final state.
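The effect of running independent uploads concurrently can be sketched without touching S3 at all. In the example below `FakeUploadAsync` is a hypothetical stand-in that just waits for a fixed latency, and the key names are made up; it also fans out the individual files concurrently, which is an assumption of the sketch and not something the diff above shows.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelUploadSketch
{
    // Hypothetical stand-in for a single S3 PUT: waits, then reports the key.
    private static async Task<string> FakeUploadAsync(string key, TimeSpan latency)
    {
        await Task.Delay(latency);
        return key;
    }

    // Starts the package upload and all file uploads before awaiting any of
    // them, so the total wait is roughly one latency rather than the sum.
    public static async Task<string[]> UploadAllAsync(string packageKey, string[] fileKeys, TimeSpan latency)
    {
        Task<string> packageUpload = FakeUploadAsync(packageKey, latency);
        Task<string[]> fileUploads = Task.WhenAll(fileKeys.Select(k => FakeUploadAsync(k, latency)));

        await Task.WhenAll(packageUpload, fileUploads);

        // Both tasks have completed, so reading Result does not block.
        return new[] { packageUpload.Result }.Concat(fileUploads.Result).ToArray();
    }

    public static async Task Main()
    {
        var stopwatch = Stopwatch.StartNew();
        string[] uploaded = await UploadAllAsync(
            "package.osz",
            new[] { "easy.osu", "hard.osu", "audio.mp3" },
            TimeSpan.FromMilliseconds(200));

        Console.WriteLine($"{uploaded.Length} uploads finished in {stopwatch.ElapsedMilliseconds} ms");
    }
}
```

With real S3 calls the gain depends on network conditions and on any request limits, so the timing here is only illustrative.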

bdach added 2 commits December 23, 2024 13:38

Implement baseline S3 beatmap storage

c7f2c37

bdach self-assigned this Dec 23, 2024

pull-request-size bot added the size/L label Dec 23, 2024

bdach commented Dec 23, 2024

bdach mentioned this pull request Dec 23, 2024

Path to deployment to staging #2

Closed

Merge branch 'master' into s3

fa46867

peppy self-requested a review December 24, 2024 06:22

peppy reviewed Dec 24, 2024

peppy reviewed Dec 24, 2024

bdach added 3 commits December 24, 2024 10:40

Remove likely-outdated HttpClient configuration flags

c4674ba

- `CacheHttpClient` defaults to true on .NET (as in past .NET Core) anyway.
- The `HttpClientCacheSize` default of 1 is likely the better one these days, what with how `HttpClient` is designed to be used.

Skip one extra array copy

71efcb0

Also upload beatmap files to a separate directory

7097b6c

bdach mentioned this pull request Dec 24, 2024

Implement required interactions with beatmap mirrors #14

Merged

1 task

That separate beatmap files directory was supposed to be a separate bucket

fd39d35

peppy approved these changes Dec 26, 2024

peppy merged commit db04619 into ppy:master Dec 26, 2024
4 checks passed

bdach deleted the s3 branch December 26, 2024 07:14
