Performance opportunity for yarn 3 cache #325
Comments
It can be even more efficient by not including the OS and Node version in the cache key as well - #272 (comment) |
Hello @belgattitude. Thank you for your feature request. Could you please describe it in a little more detail? Do you want to add restore-keys to the action so it falls back to a previous cache, or do you also want to add |
Hello @belgattitude, just a gentle ping. |
Let me check something in a couple of days. I'll be back asap. |
@dmitry-shibanov first of all thanks for considering this issue. I've edited the P/R desc with latest comments from @merceyz
Yes, that's the idea: that way we take advantage of the built-in yarn 2+/3+ cache management. Yarn will prune old packages and only fetch new ones, keeping the packages zipped into
Note that if multiple actions run in parallel, only the first one will be able to reserve the cache; the others will print a warning after:
As far as I tested, it does not create problems and can be safely ignored.
Actually this seems to be what matters most. (AFAIK setup/node tries to cache only the
There's some kind of guarantee that the yarn cache is portable across OS and node versions, as @merceyz pointed out, while node_modules might not be (native binaries, postinstall tricks).
As the cache folder is configurable (through yarnrc.yml or
# Get the yarn cache path.
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  run: echo "::set-output name=dir::$(yarn config get cacheFolder)"
The restore key could be a constant:
- name: Restore yarn cache
  uses: actions/cache@v2
  id: yarn-cache # use this to check for `cache-hit` (`steps.yarn-cache.outputs.cache-hit != 'true'`)
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    key: yarn-cache-folder-${{ hashFiles('**/yarn.lock', '.yarnrc.yml') }}
    restore-keys: |
      yarn-cache-folder-
Interestingly, pnpm has a benchmark page with some insights about the different modes (cache + node_modules, cache only). It looks like having both node_modules and the yarn cache will be faster (excluding action/cache restore/generate...), I guess even more so because yarn probably won't have to re-run the link phase (generally slow with native binaries: esbuild, swc, sharp, prisma...). But with caching node_modules I'm not sure about portability (it also depends on setup/node). Note also that the node_modules folder is only used in
I would say a safe way would be to just save the 'yarn cache folder' and not node_modules. Let me know if this answers your questions. Have a great day.
PS: @merceyz thanks for your comment about portability, I was wondering if the newly introduced supportedArchitectures changes something? |
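A note for readers adapting the snippet from this comment today: GitHub has since deprecated the `::set-output` workflow command in favour of writing to `$GITHUB_OUTPUT`. Below is a minimal sketch of the same two steps under that assumption. The `actions/cache@v4` tag is an assumption on my side (the thread uses `v2`); the step ids and key prefix are carried over from the comment.

```yaml
# Sketch only: same idea as the comment above, using the newer output syntax.
- name: Get yarn cache directory path
  id: yarn-cache-dir-path
  shell: bash
  run: echo "dir=$(yarn config get cacheFolder)" >> "$GITHUB_OUTPUT"

- name: Restore yarn cache
  uses: actions/cache@v4   # assumption: any current major of actions/cache
  with:
    path: ${{ steps.yarn-cache-dir-path.outputs.dir }}
    # Exact hit when neither the lock file nor the yarn config changed.
    key: yarn-cache-folder-${{ hashFiles('**/yarn.lock', '.yarnrc.yml') }}
    # Constant prefix: on a miss, fall back to the most recent cache so that
    # yarn only has to fetch the packages that actually changed.
    restore-keys: |
      yarn-cache-folder-
```

The key deliberately omits the OS and Node version, which relies on the portability of the yarn cache discussed in this thread; it would not be safe for a node_modules cache.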
Do you recommend using |
Hi @devgioele, Caching Cheers |
Deleting yarn.lock and running yarn install helped, but in my case with yarn caching enabled |
I think this is a duplicate of #328, which is similar but not specific to yarn |
Just to share, here's my updated composite action for
Simplify dependency caching for Yarn 3+ with Zero Installs. Inspired by actions/setup-node#325 and https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a.
Would be good to have support for Yarn 3 caching |
This got a bit more complicated in Yarn v3.1.0 with the introduction of |
Is there any plan to add this anytime soon? |
Hello @nijikon, |
Hello @belgattitude, please confirm that the problem we want to solve is the whole cache being reset whenever the lock file changes. I mean that if only one (of 2) dependencies changed akv-demo/setup-node-test@44f68b4#diff-51e4f558fae534656963876761c95b83b6ef5da5103c4adef6768219ed76c2deL19 the whole cache is reset https://github.com/akv-demo/setup-node-test/actions/runs/4767849996/jobs/8476525793#step:3:82 and all (of 2) dependencies must be fetched https://github.com/akv-demo/setup-node-test/actions/runs/4767849996/jobs/8476525793#step:4:28 Is your suggestion to solve the described problem with |
Hey @dsame
Yes. But https://github.com/akv-demo/setup-node-test/actions/runs/4767849996/jobs/8476525793#step:4:28 does not seem to work ? Am I wrong ? |
BTW you can have a look to some ideas
That said, in my experience I couldn't find one solution for every use case. Comments in there: strapi/strapi#16581. TL;DR: ideally with yarn, the following steps/files/folders can be cached
For 2 and 3 it's optional and should be measured on a specific repo, because saving node_modules is heavy (see post restore). For a concrete example, see also a small monorepo with a lot of deps: https://github.com/belgattitude/nextjs-monorepo-example. It seems doable to get an install under 20s. There's more to say about supportedArchitectures (yarn 4), prisma postinstall tricks... |
Yes, it is |
@belgattitude after a closer look I started to have doubts about the feature. First of all, yarn has 2 modes, local and global cache, depending on
With the global cache, yarn does not purge obsolete dependencies, so we cannot restore the previous cache without it growing endlessly. Proof: you can see different versions of lodash
With the local cache, yarn does purge obsolete dependencies, BUT in this case you should not use actions cache at all - you should keep (https://yarnpkg.com/features/offline-cache). Even if we apply the feature only to local caches, we run the risk of clashing
Can you please provide more justification for using caching instead of adding
I also have a question about node_modules. Do we really need to cache it with yarn? Since PnP has been introduced |
From my phone, so writing is minimal. Override the global cache with YARN_ENABLE_GLOBAL_CACHE=false. That gives you a uniform way to do it for CI. The yarn cache will be stored in .yarn/cache by default, but you can get it from config as well. Just cache .yarn/cache, not .yarn. Better to avoid caching node_modules with this approach: it generally does not bring a speed-up because of the extra work in action/cache. That also solves the PnP support. |
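The setup described in this comment can be sketched as a workflow job. This is an illustration under stated assumptions, not the configuration the commenter actually ran: the job name, action version tags, Node version, and the `corepack enable` step (which assumes the Yarn version is pinned via the `packageManager` field in package.json) are all hypothetical choices.

```yaml
# Sketch only: force the project-local cache and cache just .yarn/cache.
jobs:
  install:                      # hypothetical job name
    runs-on: ubuntu-latest
    env:
      # Makes yarn store archives in the project (.yarn/cache by default)
      # regardless of the enableGlobalCache default of the Yarn version in use.
      YARN_ENABLE_GLOBAL_CACHE: 'false'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: corepack enable    # assumption: Yarn pinned via packageManager
      - uses: actions/cache@v4
        with:
          path: .yarn/cache     # only the archives, not the whole .yarn folder
          key: yarn-cache-folder-${{ hashFiles('yarn.lock', '.yarnrc.yml') }}
          restore-keys: |
            yarn-cache-folder-
      # node_modules is intentionally not cached; yarn re-links from the archives.
      - run: yarn install --immutable
```

If `cacheFolder` is overridden in .yarnrc.yml, the `path` above would need to match it; reading it with `yarn config get cacheFolder` avoids hard-coding.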
Hello @belgattitude - one more clarification: is there any reason not to add Having |
The initial offline mode idea probably isn't used much. Storing large binaries in git and mutating them is probably not a good idea. I guess after a few updates you'll end up with a git repo of tens of GB. That said, I can see why they explored this. Yarn's default is to not commit the cache (the doc is confusing). But it's true that the setup/node cache feature shouldn't be enabled by users who rely on full offline mode. |
Thank you @belgattitude for your opinion. I now agree it is worth continuing with the feature in order to avoid a huge repo. Can you please take a look at this proof-of-concept draft PR and tell me whether it makes sense to include the node version and step id in the cache key, keeping in mind that many huge caches will soon hit the storage limit?
@belgattitude can you please also confirm the following: we have to use the 3rd parameter of restoreCache if all 3 conditions are met: The 3rd parameter of
I have looked into your PR. stepId not necessary in my opinion. But there's something missing to ensure yarn 4. Not a lot of time the next days... I'll figure out a way to share info asap |
In the meantime I've tried to answer a few questions in
Might be worth a quick glance
The next iteration of the PoC PR showed that detecting all the conditions implies significant changes to the code base and makes it unreasonably complicated. @belgattitude can you please take a look at this issue: doesn't it seem that the goal of reusing the existing cache can be achieved by resolving that issue in a more straightforward and common way? I looked through the links you've sent,
Summary: can you please confirm or deny that |
Sorry, I couldn't read everything properly, but some quick thoughts:
I would agree that params help.
yarnrc.yml isn't something that is mutated often, so I wouldn't worry about it invalidating things often. Not accounting for it looks fragile to me, but I'm not 100% sure. I would give it the benefit of the doubt and include it in the hash (i.e. someone changes an advanced param...). What do you mean by overkill? Is the hash call slow?
For yarn 2+, I would not cache node_modules at all (only a few edge cases would benefit from it, and in those cases they might use a totally different action than the one in setup/node).
The general idea seems good, but as always deep testing is required. Do you expect me to do it? In that case I'll need some time.
Can you clarify the meaning of "unreasonable"? Does it mean that in your opinion you wouldn't include the feature, or are you still wondering? I would understand, of course. But as a small personal note, it might have a very nice impact on global CI time... Not sure how many projects use yarn 2+ and the setup/node cache, but that's an interesting feature imho |
Hello, @belgattitude
Not at all, I requested your opinion just to confirm we are on the same page and to get rid of some of my hesitations. Now there's a plan and we are on the road. Thanks for the cooperation. |
@marko-zivic-93 back in April #325 (comment), you wrote that you will be having a look at this in the upcoming quarter. Were you talking about Q2 or Q3 of 2023? |
Hello @nijikon the requested changes are merged into the main branch and are available since v3.7.0 release |
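For readers who want to try the built-in behaviour instead of a hand-written cache step, here is a minimal sketch of opting into the action's yarn caching. It assumes a `v3` tag that resolves to 3.7.0 or later, a Node version picked for illustration, and a Yarn 2+ project whose Yarn binary is already available on the runner (for example through `yarnPath` in .yarnrc.yml); none of these details come from the thread.

```yaml
# Sketch only: rely on setup-node's own cache handling.
steps:
  - uses: actions/checkout@v3
  - uses: actions/setup-node@v3   # assumption: resolves to v3.7.0 or later
    with:
      node-version: 18            # illustrative version
      cache: yarn                 # enables the action's yarn cache support
  - run: yarn install --immutable
```

Whether this is faster than a dedicated `actions/cache` step is worth measuring per repository, as several commenters in this thread point out.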
@dsame thanks. I will have a look if that helps my case. |
Hello @nijikon, I am going to close this issue because the PRs are merged and due to inactivity, but please feel free to reopen it or create a new issue if the problem still exists |
@dsame I think I'm good. My current problem is that it does not cache the |
Thanks for the new cache feature. Much easier.
After a few weeks, I realized that while it supports yarn, there are some improvements that can be made.
Yarn 3 (probably yarn 2+ too) manages downloaded archives pretty well (.yarn/cache/*.zip), and invalidating on yarn.lock changes does not take that into account, so I saw a lot of cache misses. As an example I converted back to action-cache to illustrate and test.
I'm wondering if a similar approach could be done with setup-node ?
Updated example with action cache
Setup action
Testing a cache hit after adding a dependency
PS: Key points
~Example with action cache~ (old version, before @merceyz improvements)
Note
Here's a gist with an optimized install example: https://gist.github.com/belgattitude/042f9caf10d029badbde6cf9d43e400a