request-cache is not working properly #119

Open
ericboucher opened this issue May 16, 2022 · 4 comments · Fixed by #123
Labels
bug Something isn't working good first issue Good for newcomers

Comments

@ericboucher
Contributor

ericboucher commented May 16, 2022

Tests keep hitting rate limits, which suggests that our requests-cache setup is not working as expected during CI runs. It works fine locally, which makes this hard to debug.

We should resolve this to make our CI faster and less frustrating.

@ericboucher ericboucher added the bug Something isn't working label May 16, 2022
@ericboucher ericboucher changed the title requets-cache is not working properly request-cache is not working properly May 17, 2022
@ericboucher ericboucher added the good first issue Good for newcomers label May 17, 2022
@ericboucher
Contributor Author

ericboucher commented May 17, 2022

Idea: store the cache file somewhere else? In S3, or at least somewhere that gives us a bit more control?

https://github.com/marketplace/actions/s3-cache-for-github-actions or similar?

@laurentS
Contributor

One year later, I'm reopening this, as the cache is clearly not used.
This screenshot from https://github.com/MeltanoLabs/tap-github/actions/caches confirms why:
[Screenshot of the repository's Actions caches page]

The cache is Python-version specific, which defeats the purpose of our caching attempts.

A local run of the test suite consumes about 500 REST requests and 25 GraphQL points. Since we run 5 versions of Python, that is about 2,500 REST requests per CI run, which is still only 50% of the REST quota, and much less on GraphQL. So there is probably some other consumer of quota outside of CI.
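For reference, the arithmetic behind that estimate can be sketched as below. The 5,000-per-hour quotas are an assumption (the comment only says "50% of quota"), and `ci_quota_usage` is a hypothetical helper, not part of the tap.

```python
# Assumed hourly quotas; check the GitHub API docs for the limits
# that actually apply to the token used in CI.
REST_QUOTA_PER_HOUR = 5000
GRAPHQL_POINTS_PER_HOUR = 5000


def ci_quota_usage(per_run: int, python_versions: int, quota: int) -> float:
    """Fraction of the quota consumed if every CI job misses the cache."""
    return per_run * python_versions / quota


# Figures from the comment above: ~500 REST requests and ~25 GraphQL
# points per test run, repeated across 5 Python versions.
rest_usage = ci_quota_usage(per_run=500, python_versions=5, quota=REST_QUOTA_PER_HOUR)
graphql_usage = ci_quota_usage(per_run=25, python_versions=5, quota=GRAPHQL_POINTS_PER_HOUR)

print(f"REST: {rest_usage:.0%}, GraphQL: {graphql_usage:.1%}")
```

Even with no cache hits at all, a single CI run stays well under the assumed limits, which is why another consumer of quota looks likely.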

Storing the cache outside of GitHub seems like the only option at this point. I will follow up on the suggestion above.

@laurentS
Contributor

Actually, I think I misunderstood what is happening. The cache key is api-cache-v4, which is Python-version agnostic.

But the cached file in the screenshot above is 3 weeks old, so our code ignores it (it expires the cache after 24h) and hits the API again.
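A minimal model of that expiry check, to make the failure mode concrete. This is not the requests-cache implementation, only a sketch of the 24h rule described above; `is_expired` and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

CACHE_TTL = timedelta(hours=24)  # the 24h expiry mentioned above


def is_expired(created_at: datetime, now: datetime, ttl: timedelta = CACHE_TTL) -> bool:
    """Return True when a cached entry is older than the TTL and must be refetched."""
    return now - created_at > ttl


# Hypothetical timestamps, for illustration only.
now = datetime(2023, 6, 1, tzinfo=timezone.utc)
three_weeks_old = now - timedelta(weeks=3)
one_hour_old = now - timedelta(hours=1)

print(is_expired(three_weeks_old, now))  # stale: every request goes to the API
print(is_expired(one_hour_old, now))     # fresh: served from the cache
```

A restored cache file that is weeks old fails this check for every entry, so restoring it saves no API calls.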

My guess at this point is that the cache works the first day we use it, then becomes stale and is ignored by requests-cache. Two options:

  • change the code to expire the cache after much longer than 24h. The problem is that tests use a start_date of more or less "today", so beyond the first day the requests are likely to change and miss the cache anyway
  • use a cache key based on the current date, which should make the cache file renew every 24h, in line with the cache expiry and API requests.
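The second option could look roughly like the sketch below. In practice the key would be computed in the CI workflow rather than in Python; `daily_cache_key` is a hypothetical helper, and appending the date to the existing `api-cache-v4` prefix is an assumption about how the key would be built.

```python
from datetime import date, timedelta


def daily_cache_key(day: date, prefix: str = "api-cache-v4") -> str:
    """Build a cache key that changes once per calendar day."""
    return f"{prefix}-{day.isoformat()}"


# Hypothetical dates, for illustration only.
monday = date(2023, 6, 5)
tuesday = monday + timedelta(days=1)

print(daily_cache_key(monday))
print(daily_cache_key(tuesday))
```

All jobs that run on the same day share one key, so the first job populates the cache and the others reuse it. The next day a new key forces a fresh cache, which matches the 24h expiry. One detail to settle is which clock defines "the day" (for example UTC on the runner), so that jobs near midnight agree on the key.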

@laurentS
Contributor

To expand on the above, this all happens because GitHub Actions caches are read-only once saved: the cache file stored under a given key is never updated after it is first created.
See https://github.com/actions/cache/blob/main/tips-and-workarounds.md#update-a-cache for details.
