-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HostManager overhaul #353
HostManager overhaul #353
Conversation
The agent-class column will show if there are discrepancies between what a hostmanager reports and the local scf that ocsbow sees. Also tolerates broader range of mismatch.
This just factors out each docker inspect call so failures can be trapped more easily. Behavior should be altered.
Instead, the 'prot' elements for docker-managed things are long-lived and will be immediately updated when we learn new service info.
The logic handling the agent list and reconciling it against docker inspection results is rewritten (and probably improved).
Docstring in site_config explains them all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks for all the updates/bug fixes!
I tested several of the bug fixes, particularly those I reported. I, however, didn't figure out how to reproduce #348, maybe we'll just see how it performs on site.
I found one documentation/behavior inconsistency, noted below.
I still find the need for two separate host managers, one for Docker agents and one for Host agents, a bit awkward. More so now that manage options include "docker/up", "docker/down", "host/up", and "host/down", since one could potentially misconfigure a system by saying "host/up" on a Docker host manager, and vice versa.
I did test mixing these states a bit, and it puts things into a strange enough state (InfluxDBAgent[d?] ?/up
, for instance) that one would notice and fix the issue. So I'd say this is alright, but maybe take the opportunity to bring up the idea of merging this so that a single host manager is needed to control both host and docker agents. Not in this PR of course, but down the road.
Ah, let me take another look at this ... not sure I tested network / crossbar dropouts carefully.
One doesn't need two separate HMs, right? It's just a bit more config constraint / work to get them all under single management? For host-based and docker-based agents to run with a single HM, you need the instance descriptions to all be in a single host block (call it host1). Supposing all the defaults are set up to make life easy for host-based agents, the docker ones need:
This might be a touch easier if we'd also added a SITE_HOST envvar ... then you could use yaml anchors and state all the needed overrides (as environment) once at the top of the docker-compose.yaml. |
See #354, which should make host+docker-based on single HM quite manageable.
Ok, so HostManager crashes, just like everybody else, when crossbar/network goes down. I propose we revisit that after #337. |
Great, I was just writing a reply to your first response -- I hadn't tried to run with just a single HM. The mode we're in now at site is 2 HM's, one host, one docker (though after moving the ACU over, we don't really need the host one at the moment.) I fully support moving to a single HM though, so this is great.
Alright, I'm going to remove #348 from the resolved list for now then. |
Description
Updates to HostManager to address a number of recent concerns.
Biggest changes:
One interface change is the meaning of "manage: yes", and "manage: docker". Previously those would include the instance for HostManager management, but with a default target state of "down". I think it makes more sense for the default target state to be "up". This only matters if people are running HostManager, and if they are running HostManager they probably want to take advantage of the auto-startup. (Most installations probably have
--initial-state=up
added to the HM args -- that should be discouraged, now, as it would defeat the use of "host/down" and "docker/down" options.)Motivation and Context
This is intended to:
This does not include the features proposed in #320. I can rebase on that work if it converges soon, or integrate those functions directly in this branch if that is easier for everyone.
How Has This Been Tested?
Manually.
Types of changes
Checklist: