Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

take prewarmed container's memory as used memory #4911

Merged

Conversation

ningyougang
Copy link
Contributor

@ningyougang ningyougang commented May 25, 2020

Description

Due to this pr (Adjust prewarm container dynamically): #4871 is merged
I think it is a good chance to add this feature.

for master branch, if invoker has 16GB physical memory, currently, if we configure 12GB user memory and 5GB prewarm pool, invokers will try to create containers up to 17GB memory and it may will end up with OOM.

After apply this feature, user just configure invoker memory to a precise value: 16GB , when reaches 16GB, it will not create container any more, this can avoid OOM

This has another benefit that make user must calculate the prewarmed container's memory carefully, and configure the invoker memory more accurately

"CONFIG_whisk_containerPool_userMemory": "{{ hostvars[groups['invokers'][invoker_index | int]].user_memory | default(invoker.userMemory) }}"

Related issue and scope

  • I opened an issue to propose and discuss this change (#????)

My changes affect the following components

  • API
  • Controller
  • Message Bus (e.g., Kafka)
  • Loadbalancer
  • Invoker
  • Intrinsic actions (e.g., sequences, conductors)
  • Data stores (e.g., CouchDB)
  • Tests
  • Deployment
  • CLI
  • General tooling
  • Documentation

Types of changes

  • Bug fix (generally a non-breaking change which closes an issue).
  • Enhancement or new feature (adds new functionality).
  • Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

  • I signed an Apache CLA.
  • I reviewed the style guides and followed the recommendations (Travis CI will check :).
  • I added tests to cover my changes.
  • My changes require further changes to the documentation.
  • I updated the documentation where necessary.

@codecov-commenter
Copy link

codecov-commenter commented May 25, 2020

Codecov Report

Merging #4911 into master will decrease coverage by 6.09%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4911      +/-   ##
==========================================
- Coverage   83.38%   77.28%   -6.10%     
==========================================
  Files         201      201              
  Lines        9441     9446       +5     
  Branches      396      398       +2     
==========================================
- Hits         7872     7300     -572     
- Misses       1569     2146     +577     
Impacted Files Coverage Δ
...e/openwhisk/core/containerpool/ContainerPool.scala 97.92% <100.00%> (+0.04%) ⬆️
...core/database/cosmosdb/RxObservableImplicits.scala 0.00% <0.00%> (-100.00%) ⬇️
...ore/database/cosmosdb/cache/CacheInvalidator.scala 0.00% <0.00%> (-100.00%) ⬇️
...e/database/cosmosdb/cache/ChangeFeedConsumer.scala 0.00% <0.00%> (-100.00%) ⬇️
...core/database/cosmosdb/CosmosDBArtifactStore.scala 0.00% <0.00%> (-96.23%) ⬇️
...sk/core/database/cosmosdb/CosmosDBViewMapper.scala 0.00% <0.00%> (-93.90%) ⬇️
...tabase/cosmosdb/cache/CacheInvalidatorConfig.scala 0.00% <0.00%> (-92.31%) ⬇️
...enwhisk/connector/kafka/KamonMetricsReporter.scala 0.00% <0.00%> (-83.34%) ⬇️
...e/database/cosmosdb/cache/KafkaEventProducer.scala 0.00% <0.00%> (-77.78%) ⬇️
...whisk/core/database/cosmosdb/CosmosDBSupport.scala 0.00% <0.00%> (-74.08%) ⬇️
... and 11 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1274fdf...370ddef. Read the comment docs.

@ningyougang
Copy link
Contributor Author

ningyougang commented Jun 9, 2020

Below prs has relation with current pr

pull/4917 is adjust userMemory and pull/4790 is adjust prewarmed container

@ningyougang
Copy link
Contributor Author

upsteam guy: Tyson replied on dev mail list for this pr as below

I agree this is a good change, but may require operators to update their configured userMemory and/or their prewarm configs
 
to avoid unexpected problems. It might be good to add some log statements to indicate userMemory available AFTER initial 

prewarm config is applied, to give some hints that userMemory seen at invoker is now lower than before with the same configs?

Regarding but may require operators to update their configured userMemory and/or their prewarm configs to avoid unexpected problems.
Currently, below prs has been implemented with update their configured userMemory and/or their prewarm configs

Regarding It might be good to add some log statements to indicate userMemory available AFTER initial prewarm config is applied, to give some hints that userMemory seen at invoker is now lower than before with the same configs?
Agree, i think after above two prs(pull/4917 and pull/4790) merged, i can add some logs in this pr after rebased codes.


val prewarmedPoolForOtherKind = prewarmedPool.filter { info =>
info match {
case (_, PreWarmedData(_, `kind`, `memory`, _, _)) => false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does it require filtering prewarm pool for other kinds?

Copy link
Contributor Author

@ningyougang ningyougang Sep 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm.. i forgot the reason why i want to get the other kind/memory prewarmpool :(,
but then, i feel use prewarmedPool directly is more nature, already updated.

@ningyougang ningyougang force-pushed the take_prewarm_into_used_invoker_memory branch from 370ddef to 4187f34 Compare September 9, 2020 01:01
@ningyougang ningyougang force-pushed the take_prewarm_into_used_invoker_memory branch from 4187f34 to 31741d1 Compare October 15, 2020 02:56
val createdContainer =
// Is there enough space on the invoker for this action to be executed.
if (hasPoolSpaceFor(busyPool, r.action.limits.memory.megabytes.MB)) {
if (hasPoolSpaceFor(busyPool ++ prewarmedPool, memory)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this needs proper documentation.

Copy link
Member

@style95 style95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a minor nit.

@ningyougang ningyougang force-pushed the take_prewarm_into_used_invoker_memory branch from 31741d1 to f933685 Compare November 20, 2020 07:14
@ningyougang ningyougang reopened this Nov 23, 2020
@ningyougang ningyougang force-pushed the take_prewarm_into_used_invoker_memory branch from f933685 to 22db3cb Compare January 18, 2021 01:56
@codecov-io
Copy link

codecov-io commented Jan 18, 2021

Codecov Report

Merging #4911 (22db3cb) into master (2d0c8a7) will increase coverage by 47.76%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #4911       +/-   ##
===========================================
+ Coverage   29.09%   76.85%   +47.76%     
===========================================
  Files         195      202        +7     
  Lines        9553     9820      +267     
  Branches      413      416        +3     
===========================================
+ Hits         2779     7547     +4768     
+ Misses       6774     2273     -4501     
Impacted Files Coverage Δ
...e/openwhisk/core/containerpool/ContainerPool.scala 97.20% <100.00%> (+5.66%) ⬆️
...nwhisk/core/monitoring/metrics/EventConsumer.scala 89.23% <0.00%> (ø)
...penwhisk/core/monitoring/metrics/MetricNames.scala 100.00% <0.00%> (ø)
...k/core/monitoring/metrics/PrometheusRecorder.scala 86.00% <0.00%> (ø)
.../core/monitoring/metrics/PrometheusEventsApi.scala 90.90% <0.00%> (ø)
...pache/openwhisk/core/monitoring/metrics/Main.scala 0.00% <0.00%> (ø)
...nwhisk/core/monitoring/metrics/KamonRecorder.scala 82.45% <0.00%> (ø)
...hisk/core/monitoring/metrics/OpenWhiskEvents.scala 93.75% <0.00%> (ø)
...openwhisk/common/tracing/OpenTracingProvider.scala 21.15% <0.00%> (+1.92%) ⬆️
...re/database/MultipleReadersSingleWriterCache.scala 98.00% <0.00%> (+2.00%) ⬆️
... and 143 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2d0c8a7...22db3cb. Read the comment docs.

@style95
Copy link
Member

style95 commented Jan 18, 2021

Merged as it has been around 2 months.

@style95 style95 merged commit b0baa7b into apache:master Jan 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants