Change prewarm container #4225

ningyougang · 2019-01-21T03:08:12Z

Add prewarm container
Delete prewarm container

Sometimes the operator wants to add or delete prewarm containers on an invoker dynamically.
For example, under some conditions the operator can increase (or reduce) the number of prewarm containers for a given runtime in advance.

Description

Related issue and scope

I opened an issue to propose and discuss this change (#????)

My changes affect the following components

Types of changes

Bug fix (generally a non-breaking change which closes an issue).
Enhancement or new feature (adds new functionality).
Breaking change (a bug fix or enhancement which changes existing behavior).

Checklist:

I signed an Apache CLA.
I reviewed the style guides and followed the recommendations (Travis CI will check :).
[] I added tests to cover my changes.
[] My changes require further changes to the documentation.
I updated the documentation where necessary.

markusthoemmes · 2019-01-21T09:58:29Z

Thanks for the contribution. This looks like the inception of an active admin interface into the invoker, correct?

If so, I think we need a broader discussion on how to achieve that and for this case specifically: Do we want different invokers with different prewarming settings? Manipulating a single invoker in this case results in a divergence to the global configuration. If that invoker restarts, what happens?

My gut-feeling in this case is that you'd want a propagation mechanism to roll out a new config to all invokers on the fly.

I think this warrants a discussion on the dev-list.

codecov-io · 2019-01-21T10:06:00Z

Codecov Report

Merging #4225 into master will decrease coverage by 4.43%.
The diff coverage is 25%.

@@            Coverage Diff            @@
##           master   #4225      +/-   ##
=========================================
- Coverage   84.64%   80.2%   -4.44%     
=========================================
  Files         156     157       +1     
  Lines        7475    7588     +113     
  Branches      489     499      +10     
=========================================
- Hits         6327    6086     -241     
- Misses       1148    1502     +354

Impacted Files	Coverage Δ
...che/openwhisk/core/loadBalancer/LoadBalancer.scala	`50% <ø> (ø)`	⬆️
.../org/apache/openwhisk/core/connector/Message.scala	`62.33% <0%> (-2.53%)`	⬇️
...rg/apache/openwhisk/core/entity/ExecManifest.scala	`93.87% <0%> (-4%)`	⬇️
.../apache/openwhisk/core/invoker/InvokerServer.scala	`0% <0%> (ø)`
...e/openwhisk/core/containerpool/ContainerPool.scala	`79.57% <0%> (-11.41%)`	⬇️
...e/loadBalancer/ShardingContainerPoolBalancer.scala	`84.9% <0%> (-3.08%)`	⬇️
...la/org/apache/openwhisk/core/invoker/Invoker.scala	`71.66% <100%> (+0.48%)`	⬆️
.../apache/openwhisk/core/controller/Controller.scala	`74.13% <37.5%> (-9.74%)`	⬇️
...pache/openwhisk/core/invoker/InvokerReactive.scala	`71.2% <50.98%> (+6.71%)`	⬆️
...core/database/cosmosdb/RxObservableImplicits.scala	`0% <0%> (-100%)`	⬇️
... and 33 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b3a3ef4...36ceff8. Read the comment docs.

rabbah · 2019-01-21T22:18:02Z

@ningyougang thank you for this contribution --- having an "admin" interface to the controller and invoker has been something we've been missing. I do think this warrants a bit of discussion.

It's possible, as Markus notes, that you can get divergent invokers - I'd imagine one would roll this out by running some script or playbook to update multiple invokers. A restart doesn't affect correctness, just performance... note however that it is possible to have different runtime manifests, which could lead to different functional behaviors (if you exclude a runtime entirely from the manifest, then actions allocated to that invoker will fail to run).

I can think of different ways to approach this as well - say, via the existing protocols for communication between the load balancer and the invokers. In particular, if admin changes were front-doored by the controller, you could keep the manifest consistent across the system. Although you might be deliberately thinking of having a heterogeneous mix of stem cells.

This is great to see prototyped, and it gives us something concrete to evaluate and discuss.

ningyougang · 2019-01-22T02:01:17Z

@markusthoemmes @rabbah

Background

I think OpenWhisk needs some operational tooling (a page or an API) for administration, for example:

Real-time status of actions, controllers, invokers, etc.
Changing the prewarm runtime containers for invokers

Here, we discuss only changing the prewarm containers for invokers.

Requirements

The admin can add/delete extra prewarm containers via the controller.
The flow may look like this: the admin sends an HTTP request to the controller, the controller forwards the request info to all invokers through Kafka,
and each invoker receives the request info, parses the prewarm runtime info, and adds/deletes prewarm containers accordingly.

Note: the image in the prewarm runtime request must be listed in ${OPENWHISK_HOME}/ansible/files/runtimes.json;
if an unknown prewarm runtime image is sent to an invoker, the invoker should reject the request.
Do we need to support different prewarm runtimes for different invokers?
In the future, invokers may support a group concept: some invokers with plenty of CPU and memory could execute heavy workloads,
while other invokers execute light workloads.
In that situation we may want to support a heterogeneous invoker cluster, so invokers would also need to support different prewarm runtimes.
To that end, the admin can also send the HTTP request directly to an invoker to add/delete prewarm containers.
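
The flow in the first requirement could be sketched roughly as follows. This is a minimal Python illustration, not the Scala code in this PR: `PrewarmRequest`, `apply_prewarm_request`, and `broadcast` are hypothetical names, the manifest is a simplified kind-to-image mapping rather than the real runtimes.json schema, and the Kafka hop is replaced by a plain loop over in-memory invoker pools.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PrewarmRequest:
    """Hypothetical request payload; field names are illustrative only."""
    kind: str   # runtime kind, e.g. "nodejs:10"
    image: str  # container image the admin asks to prewarm
    delta: int  # positive adds prewarm containers, negative deletes them


def apply_prewarm_request(pool: Dict[str, int],
                          manifest: Dict[str, str],
                          request: PrewarmRequest) -> int:
    """Validate the request against the manifest, then adjust one invoker's pool.

    `manifest` is an assumed, simplified stand-in for runtimes.json
    (kind -> image name). Unknown kinds or images are rejected.
    """
    if manifest.get(request.kind) != request.image:
        raise ValueError(f"unknown prewarm runtime image: {request.image}")
    # Never let the prewarm count drop below zero.
    pool[request.kind] = max(0, pool.get(request.kind, 0) + request.delta)
    return pool[request.kind]


def broadcast(invoker_pools: Dict[str, Dict[str, int]],
              manifest: Dict[str, str],
              request: PrewarmRequest) -> Dict[str, int]:
    """Controller side: forward the same request to every managed invoker.

    A real deployment would publish to a message bus; here we just loop.
    """
    return {name: apply_prewarm_request(pool, manifest, request)
            for name, pool in invoker_pools.items()}
```

Because the pools here live only in memory, a restarted invoker would start again from whatever the global manifest specifies, which is the divergence concern raised earlier in the thread.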

Question?

 Manipulating a single invoker in this case results in a divergence to the global configuration. If that invoker restarts, what happens?

If an invoker restarts, it reads the global runtimes.json again to create its prewarm containers (the extra prewarm containers that were added will be gone, because the invoker can't remember its previous state).

style95 · 2019-01-23T02:59:17Z

IMHO, with regard to the admin API, users have asked me for a way to see the real-time status of their invocations.
For example, one of my users invokes an action with 200 ~ 300 concurrent containers for about 4 ~ 5 mins. In this case he cannot use a blocking call, so he invokes them asynchronously.
At some point, he wants to make sure all of his concurrent invocations have finished.
Currently, he has to poll the activations indirectly and count the number of finished invocations.
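
The polling workaround described above might look something like this minimal Python sketch. The `is_finished` callback is an assumption standing in for whatever lookup the user performs per activation; `wait_for_all` and its parameters are hypothetical names, not part of any OpenWhisk client.

```python
import time
from typing import Callable, Iterable


def wait_for_all(activation_ids: Iterable[str],
                 is_finished: Callable[[str], bool],
                 poll_interval: float = 1.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll until every activation is finished; return how many finished.

    `is_finished` is a caller-supplied check (hypothetical); it is asked
    once per pending activation on every polling round.
    """
    pending = set(activation_ids)
    total = len(pending)
    deadline = time.monotonic() + timeout
    while pending:
        # Keep only the activations that are still running.
        pending = {a for a in pending if not is_finished(a)}
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{len(pending)} of {total} activations still pending")
        sleep(poll_interval)
    return total
```

With hundreds of concurrent activations, every round issues one lookup per pending activation, which is the overhead a real-time status API would avoid.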

I think we will need to track the invocation status for each user in the future. (Though there could be some differences, because invocation durations can be short.)
This can be achieved with a controller API as well, but an invoker-side API would give us more fine-grained control, as the containers actually run on the invoker side.

One more point is what @ningyougang said.
In the future, there will be a need for different kinds of container pools.
For example, users might need GPU-enabled hosts, or might expect more powerful hosts or dedicated resources at higher prices. Or an OW operator (public cloud) might want to deploy invokers on many different types of host machines to maximize the utilization of computing resources in the cloud. A heterogeneous invoker cluster requires a heterogeneous prewarm configuration for efficiency and flexibility.
This is why I think the API should exist in each invoker, and the same goes for controllers.

style95 · 2019-01-23T03:09:08Z

And one of my colleagues, who is a public cloud operator, said he noticed that the invocation pattern of each runtime changes over time. For example, Python actions are mostly invoked during the morning, PHP actions during the evening, and so on. Invocation patterns also change with social events such as big conferences, the World Cup, or big news. He is thinking of dynamically changing the prewarm configuration based on request pattern prediction with machine learning, to minimize cold starts. I think this feature can be provided outside of OW, and a per-invoker or per-controller API would be a good building block for it.
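
As a rough illustration of such an external component, the sketch below splits a fixed prewarm budget across runtimes in proportion to recent invocation counts. This is only a stand-in for the machine-learning prediction mentioned above: `plan_prewarm`, the budget parameter, and the proportional rule are all assumptions made for the example.

```python
from typing import Dict


def plan_prewarm(recent_counts: Dict[str, int], budget: int) -> Dict[str, int]:
    """Split `budget` prewarm containers across runtimes by recent demand.

    Uses largest-remainder rounding so the plan sums exactly to the budget.
    Ties are broken by runtime name to keep the result deterministic.
    """
    total = sum(recent_counts.values())
    if total == 0 or budget <= 0:
        return {kind: 0 for kind in recent_counts}
    # Integer share for each runtime, rounded down.
    plan = {kind: count * budget // total for kind, count in recent_counts.items()}
    leftover = budget - sum(plan.values())
    # Hand the remaining containers to the runtimes with the largest remainders.
    by_remainder = sorted(
        recent_counts,
        key=lambda kind: (-(recent_counts[kind] * budget % total), kind),
    )
    for kind in by_remainder[:leftover]:
        plan[kind] += 1
    return plan
```

The resulting plan could then be pushed through a per-invoker or per-controller API of the kind discussed in this thread.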

ningyougang · 2019-01-31T06:04:10Z

I have already added the logic to add/delete prewarm containers on all managed invokers via the controller.

ningyougang · 2020-03-31T05:30:50Z

This is superseded by another PR: #4790.
So I am closing this PR.

ningyougang force-pushed the change-prewarm-container branch 5 times, most recently from fbbaaf3 to 56486a0 Compare January 21, 2019 07:43

ningyougang closed this Jan 21, 2019

ningyougang reopened this Jan 21, 2019

rabbah added enhancement invoker discussion admin labels Jan 21, 2019

ningyougang force-pushed the change-prewarm-container branch from 56486a0 to 9ee745b Compare January 31, 2019 01:53

Change prewarm container

36ceff8

- Add prewarm container - Delete prewarm container - Add test case

ningyougang force-pushed the change-prewarm-container branch from 9ee745b to 36ceff8 Compare January 31, 2019 05:27

style95 mentioned this pull request Nov 15, 2019

Reactive prewarm pool #4725

Open

ningyougang closed this Mar 31, 2020
