
Conversation

@quyykk
Contributor

@quyykk quyykk commented Apr 28, 2023

This fixes microsoft/vcpkg#31072 and microsoft/vcpkg#31132.

It changes the upload to the cache so that it happens in 450 MB chunks instead of all at once, because GitHub rejects uploads to its cache bigger than roughly 500 MB (an educated guess).

The implementation is a bit hacky, but I haven't found a better solution: it splits the file into multiple 450 MB chunk files on disk.

Update: it now reads 450 MB chunks from the file at a time instead of splitting it on disk.
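
Roughly, the chunked read/upload loop looks like this (a simplified sketch, not the exact code from this PR; upload_chunk is a hypothetical stand-in for the actual curl invocation):

    // Simplified sketch of the chunked upload loop; upload_chunk is a
    // hypothetical stand-in for the real curl invocation.
    const std::size_t chunk_size = 450ull * 1024 * 1024; // ~450 MB per request
    auto file_ptr = fs.open_for_read(file, VCPKG_LINE_INFO);
    std::vector<char> buffer(chunk_size);
    std::size_t bytes_read = 0;
    for (std::size_t offset = 0; offset < file_size; offset += bytes_read)
    {
        bytes_read = file_ptr.read(buffer.data(), 1, chunk_size);
        // Each chunk becomes its own request, so no single upload
        // exceeds GitHub's ~500 MB cache limit.
        upload_chunk(url, buffer.data(), bytes_read, offset);
    }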

@BillyONeal
Member

Sorry for the noise, trying to verify that the transition over to GitHub Actions for the PR bot is working...

@quyykk quyykk requested a review from BillyONeal May 23, 2023 19:59
@autoantwort
Contributor

I haven't found a better solution: it splits the file into multiple 450 MB chunk files on disk.

You could pass the data via stdin.
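
For example, curl can read the request body from standard input via -T -, so something along these lines would avoid the temporary chunk files (a rough sketch using POSIX popen rather than vcpkg's process API; the URL is a placeholder):

    // Rough sketch (POSIX popen, not vcpkg's process API): "curl -T -"
    // reads the upload body from stdin, so no chunk files hit the disk.
    #include <cstdio>
    #include <vector>

    void upload_chunk_via_stdin(const std::vector<char>& chunk)
    {
        // Placeholder URL; the real cache upload URL comes from the runner.
        FILE* curl = popen("curl -sS -T - https://cache.example/upload", "w");
        if (!curl) return;
        std::fwrite(chunk.data(), 1, chunk.size(), curl);
        pclose(curl);
    }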

@quyykk
Contributor Author

quyykk commented Jun 4, 2023

You could pass the data via stdin.

I could, but I honestly have no idea how.

@autoantwort
Contributor

I could, but I honestly have no idea how.

There is now #1134 :)

@quyykk
Contributor Author

quyykk commented Aug 30, 2023

I've switched it to use stdin instead, and it still works 😄. This PR is ready for review again.

@quyykk
Contributor Author

quyykk commented Sep 7, 2023

Seems like the formatting errors are caused by GitHub upgrading their clang-format, and not my fault. 😄

@DerZade

DerZade commented Jan 22, 2024

What is the status of this? 🤔

@quyykk
Contributor Author

quyykk commented Mar 11, 2024

I fixed the merge conflicts that accumulated. Just needs someone from the team to review 😄

base_cmd.string_arg(url);

auto file_ptr = fs.open_for_read(file, VCPKG_LINE_INFO);
std::vector<char> buffer(chunk_size);
Contributor

I see that the default size is 450 MB, with no limit.
Is there an alternative to reading it into a buffer first just to forward it to another command's stdin?
(Remembering all those Raspi people who struggle to build vcpkg due to low memory...)

Contributor Author

The original way was to split the file on disk, but that's pretty hacky, I think.

But I can decrease the buffer size. I'm not sure what you mean by a limit.

Are people really running a GitHub Runner server on a Raspi lmao 😄

Contributor

Are people really running a GitHub Runner server on a Raspi lmao

Well, this is only the tool uploading the artifacts. Caching large artifacts is more important when build machine power is low.

Contributor Author

Ah okay. What buffer size do you think I should use? I can't make it really small or else the upload will be way slower than it would otherwise be.

Contributor

I don't know. I see the trade-offs and barriers.

  • Can't make curl read chunks directly from (within) a large file.
  • Can't feed (the vcpkg function running) curl with piecewise input. (IO buffer smaller than network chunks.)

Changing curl (the tool) is out of scope here.
If the interface remains running curl instead of calling into libcurl, then it would be best to fix the second point.
If this is too intrusive, it might be helpful to give the user a way to change the buffer size, or at least to turn off the buffering in case of trouble.
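
For the second point, a sketch of what decoupling the two sizes could look like (assuming curl's stdin is available as a writable FILE*, e.g. from popen as in the earlier suggestion; io_buffer_size and curl_stdin are illustrative names):

    // Sketch: keep peak memory at io_buffer_size while each curl launch
    // still receives a full chunk_size worth of data on its stdin.
    const std::size_t io_buffer_size = 1024 * 1024; // 1 MiB; could be user-configurable
    std::vector<char> io_buffer(io_buffer_size);
    std::size_t sent = 0;
    while (sent < chunk_size)
    {
        const std::size_t want = std::min(io_buffer_size, chunk_size - sent);
        const std::size_t n = file_ptr.read(io_buffer.data(), 1, want);
        if (n == 0) break; // end of file
        std::fwrite(io_buffer.data(), 1, n, curl_stdin); // curl_stdin: the pipe from popen
        sent += n;
    }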

std::size_t bytes_read = 0;
for (std::size_t i = 0; i < file_size; i += bytes_read)
{
    bytes_read = file_ptr.read(buffer.data(), sizeof(decltype(buffer)::value_type), chunk_size);
Member

I think a whole curl process launch per chunk like this is kind of a problem. I don't see reasonable ways to achieve the effect this PR wants without linking with libcurl.

Member

I suppose it could still be done like this, but the chunk sizes would have to be bigger than makes sense to hold in a single contiguous memory buffer; there would need to be more than one read/write per curl launch, etc.

and that sounds like a lot more work than linking with libcurl.

@Neumann-A
Contributor

#1422 pulls in libcurl.

@Neumann-A Neumann-A mentioned this pull request Jun 10, 2024
@JavierMatosD JavierMatosD added the requires:vcpkg-team-review This PR or issue requires someone from the vcpkg team to take a further look. label Oct 9, 2024
@pedroraimundo-ao

@quyykk Is there any way to start using this in GitHub Actions at the moment? It is sorely needed to make CI builds that depend on qtbase bearable.

(maybe checking out a specific vcpkg branch in-tree? applying a patch and working dirty?)

@TheWillard

@pedroraimundo-ao Probably not the answer you were looking for, but I switched to Conan for my dependencies.

@rainman110

@BillyONeal What is the status of this PR? I think many struggle to cache dependencies (in particular Qt), which

  • take a very long time to build
  • and are very large.

Without caching these large dependencies, vcpkg is unusable in GitHub Actions. Conan thus might be a good alternative.

@talregev

talregev commented Apr 4, 2025

This should also fix these errors in the CI:
for vcpkg's CI, the maximum block size is 4000 MiB.

Check my issue as well:
microsoft/vcpkg#44060

@talregev

talregev commented Apr 6, 2025

@quyykk
I started looking at your code to see whether I can rebase your changes onto the latest vcpkg.
Are you still working on this PR?
I think that if you make your changes optional, the vcpkg team will consider them even if there is some problem with allocating the file in memory, because the user would have to activate it with a command-line option.
Later on, people can change it to a more correct approach with libcurl linked in.

Let me know what you think.

@talregev

talregev commented Apr 8, 2025

Hi all,
This is a high-demand feature.
I started a new PR at #1643.
The code is taken from here. I made a very initial version to see whether I am in the right direction, and set it up on the CI.
I want to do CI-driven development, meaning I want to test in the CI that it uploads to the cloud correctly in chunks.
Currently I lack the knowledge of how to do it; help will be appreciated.

Success for me means that this will be a feature that many people need and want.

@talregev

I took this code and changed it so that it can now upload a very large binary cache to the vcpkg CI in chunks. Currently I set the chunk size to 500 MiB, but it can vary up to 4000 MiB.

I tested it on the vcpkg CI and it is working.

It uses standard input to upload the file, which is not ideal.
I offered to add it as experimental behind an x- flag, meaning it would only work if the user asks for it, but it was not accepted by the vcpkg team.

With my experience now, I can also develop and try the same on GitHub Actions, but I am waiting for some progress from the vcpkg team on the standard-input approach.

You are welcome to test and review my PR.

Thank you all.

@vicroms
Member

vicroms commented Apr 26, 2025

Closing this PR as per #1662

@vicroms vicroms closed this Apr 26, 2025
