Description
Lately I've noticed that many of our Cloud Functions modules are hitting their memory limits. We need to review each of them and increase the memory limits where necessary.
Most of the work will be figuring out which modules actually need more memory. The main metric to look at is "Memory utilization". It's measured in MB per call and plotted by percentile (50th, 95th, 99th -- i.e., the 99th-percentile line means 99% of calls used less than n MB). There are also logs. I don't know how the memory is actually managed (we're certainly sharing a much larger pool of memory with other customers), so I don't know what actually happens when a call exceeds its limit or what threshold we should aim for. Is it OK if 1% of the calls exceed the limit? 5%? How much does exceeding the limit affect the stability of our pipeline?
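For reference, here's a minimal sketch of how we could pull those percentiles programmatically from the Cloud Monitoring API instead of reading them off the dashboard, so the review can cover all modules at once. The project and function names are placeholders, and the specific metric (`cloudfunctions.googleapis.com/function/user_memory_bytes`), hourly buckets, and 7-day window are my assumptions, not anything we've agreed on:

```python
import time

from google.cloud import monitoring_v3

# Placeholder project/function names -- substitute the real ones.
PROJECT_ID = "our-project"
FUNCTION_NAME = "our-function"

client = monitoring_v3.MetricServiceClient()
now = int(time.time())

# Look at the last 7 days of data (arbitrary window, adjust as needed).
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 7 * 24 * 3600}}
)

# Align each series to hourly buckets, taking the 99th percentile of the
# per-call memory distribution in each bucket.
aggregation = monitoring_v3.Aggregation(
    {
        "alignment_period": {"seconds": 3600},
        "per_series_aligner": monitoring_v3.Aggregation.Aligner.ALIGN_PERCENTILE_99,
    }
)

results = client.list_time_series(
    request={
        "name": f"projects/{PROJECT_ID}",
        "filter": (
            'metric.type = "cloudfunctions.googleapis.com/function/user_memory_bytes" '
            f'AND resource.labels.function_name = "{FUNCTION_NAME}"'
        ),
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        "aggregation": aggregation,
    }
)

for series in results:
    for point in series.points:
        mb = point.value.double_value / (1024 * 1024)
        print(f"{point.interval.end_time}: p99 = {mb:.1f} MB")
```

Comparing that p99 against each module's configured limit should tell us quickly which ones are flirting with it.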
Potential consequences to look for include increased function execution time, function crashes, and container instances cycling (stop/start) more often. Increasing a module's memory limit will obviously cost more, but it could save money overall if it means functions run faster and crash less.
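To see how often we're already being hit, one option is to break down execution counts by their status label and look at the share of "out of memory" results per function. Another sketch under the same assumptions (placeholder project name, 7-day window); I believe "out of memory" is one of the reported status values for `cloudfunctions.googleapis.com/function/execution_count`, but that's worth confirming against our actual data:

```python
import time

from google.cloud import monitoring_v3

PROJECT_ID = "our-project"  # placeholder

client = monitoring_v3.MetricServiceClient()
now = int(time.time())
interval = monitoring_v3.TimeInterval(
    {"end_time": {"seconds": now}, "start_time": {"seconds": now - 7 * 24 * 3600}}
)

# Sum execution counts over the whole window, grouped by function and status,
# so we can compare e.g. "out of memory" against "ok" for each function.
aggregation = monitoring_v3.Aggregation(
    {
        "alignment_period": {"seconds": 7 * 24 * 3600},
        "per_series_aligner": monitoring_v3.Aggregation.Aligner.ALIGN_SUM,
        "cross_series_reducer": monitoring_v3.Aggregation.Reducer.REDUCE_SUM,
        "group_by_fields": [
            "resource.labels.function_name",
            "metric.labels.status",
        ],
    }
)

results = client.list_time_series(
    request={
        "name": f"projects/{PROJECT_ID}",
        "filter": 'metric.type = "cloudfunctions.googleapis.com/function/execution_count"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        "aggregation": aggregation,
    }
)

for series in results:
    function = series.resource.labels.get("function_name", "?")
    status = series.metric.labels.get("status", "?")
    total = sum(p.value.int64_value for p in series.points)
    print(f"{function}: {status} = {total}")
```

That would give us a rough answer to the "is 1% acceptable?" question per module before we touch any limits.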
And if the answer is different for Cloud Run vs Cloud Functions, that should factor into our decisions.