memcache write errors in the LMS ("object too large for cache") #877

Open
timmc-edx opened this issue Dec 18, 2024 · 1 comment

Comments

@timmc-edx

The vast majority of a certain class of memcache calls to set a key are failing with the error "object too large for cache".

These can be identified with the following Datadog query: @error.message:"b'object too large for cache'" error.type:pymemcache.exceptions.MemcacheServerError

Notes

  • The failing spans are all operation_name:memcached.command resource_name:set, and they come from the memcache library integration. The failing writes do not propagate their errors upwards, which is for the best but complicates querying: to learn which memcache operation was attempted, you need to look at the parent spans, which are operation_name:django.cache. This requires an a => b trace search.
  • At the django.cache level, the resource names all seem to follow the pattern django.core.cache.backends.memcached.OPERATION KEY_PREFIX (note the space). There are three key prefixes in effect: default, course_structure, and (uncommonly) general.
  • The vast majority of these errors are coming from set on course_structure. Here's a status breakdown for those resources. A few of the errors come from default.
  • Slicing a different way, almost all course_structure sets are failing, while almost all default sets are succeeding. The two prefixes see roughly equal numbers of set calls.
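
As a rough illustration of the slicing described in the notes above, the sketch below splits a django.cache resource name into its operation and key prefix and tallies failure rates per pair. It assumes the spans have been exported as (resource_name, is_error) pairs. That export format and the function names are hypothetical and not part of Datadog or edx-platform.

```python
from collections import Counter

# Prefix shared by all django.cache resource names described above.
BACKEND_PREFIX = "django.core.cache.backends.memcached."


def split_resource_name(resource_name):
    """Split '<backend>.OPERATION KEY_PREFIX' into (operation, key_prefix)."""
    if not resource_name.startswith(BACKEND_PREFIX):
        raise ValueError(f"unexpected resource name: {resource_name!r}")
    remainder = resource_name[len(BACKEND_PREFIX):]
    # The operation and key prefix are separated by a single space.
    operation, _, key_prefix = remainder.partition(" ")
    return operation, key_prefix


def failure_rates(spans):
    """Return {(operation, key_prefix): error fraction}.

    `spans` is an iterable of (resource_name, is_error) pairs
    (a hypothetical export format, assumed for illustration).
    """
    totals = Counter()
    errors = Counter()
    for resource_name, is_error in spans:
        key = split_resource_name(resource_name)
        totals[key] += 1
        if is_error:
            errors[key] += 1
    return {key: errors[key] / totals[key] for key in totals}
```

Grouping by the (operation, key_prefix) pair rather than by the raw resource string makes it easy to compare set calls across default, course_structure, and general.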
@timmc-edx timmc-edx converted this from a draft issue Dec 18, 2024
@github-project-automation github-project-automation bot moved this to Todo in Arbi-BOM Jan 6, 2025
@jristau1984

@UsamaSadiq @iamsobanjaved please consider this a discovery ticket to try and find a root cause for this, instead of simply bumping up the max threshold. Thanks!
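
One possible starting point for that discovery is to measure how large the cached values are once serialized. The sketch below is only an approximation. It assumes values are pickled before being sent, and it compares against memcached's default 1 MiB item size limit (configurable with the -I option), which may not match the deployed setting. The helper names are hypothetical.

```python
import pickle

# memcached's default maximum item size is 1 MiB unless changed with -I.
# The deployed limit may differ; treat this as an assumption.
DEFAULT_ITEM_LIMIT = 1024 * 1024


def pickled_size(value):
    """Return the number of bytes `value` occupies once pickled."""
    return len(pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL))


def exceeds_limit(value, limit=DEFAULT_ITEM_LIMIT):
    """Return True if the pickled value alone is larger than `limit`.

    The server also counts the key and per-item overhead, so values just
    under the limit may still be rejected.
    """
    return pickled_size(value) > limit
```

Logging pickled_size for values written under the course_structure prefix would show whether a few outlier courses or nearly all of them exceed the limit.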

@jristau1984 jristau1984 moved this to Backlog in Arch-BOM Jan 6, 2025