Description
Is your feature request related to a problem? Please describe.
HAPipeline mode allows multiple nodes to run a pipeline at once. The issue is that only one node can participate in the process, while all the other nodes sit idle.
Describe the solution you'd like
In a distributed mode, the pipeline would need to run on each node. Each pipeline would run all of its steps concurrently, and a message brokering service would hand DataContext objects between nodes.
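To make the idea concrete, here is a rough sketch of how contexts might flow between nodes through a broker. It is not Watergrid code: `InMemoryBroker`, `run_node`, and `run_distributed` are hypothetical names, threads stand in for nodes, plain dicts stand in for DataContext objects, and an in-process queue stands in for the real broker. For simplicity each node polls every step in a loop rather than running the steps in separate concurrent workers.

```python
import json
import queue
import threading


class InMemoryBroker:
    """Hypothetical stand-in for an external message broker (one queue per step)."""

    def __init__(self, step_count):
        self._queues = [queue.Queue() for _ in range(step_count)]

    def publish(self, step_index, payload):
        self._queues[step_index].put(payload)

    def get(self, step_index, timeout=0.01):
        # Returns None instead of raising when no message is waiting.
        try:
            return self._queues[step_index].get(timeout=timeout)
        except queue.Empty:
            return None


def run_node(broker, steps, results, stop_event):
    """Simulates one node: serves every step and forwards each output."""
    while not stop_event.is_set():
        for index, step in enumerate(steps):
            payload = broker.get(index)
            if payload is None:
                continue
            context = step(json.loads(payload))
            if index + 1 < len(steps):
                # Hand the context to whichever node picks up the next step.
                broker.publish(index + 1, json.dumps(context))
            else:
                results.put(context)


def run_distributed(steps, contexts, node_count=3):
    """Feeds contexts into step 0 and collects what leaves the last step."""
    broker = InMemoryBroker(len(steps))
    results = queue.Queue()
    stop_event = threading.Event()
    nodes = [
        threading.Thread(target=run_node, args=(broker, steps, results, stop_event))
        for _ in range(node_count)
    ]
    for node in nodes:
        node.start()
    for context in contexts:
        broker.publish(0, json.dumps(context))
    collected = [results.get(timeout=5) for _ in contexts]
    stop_event.set()
    for node in nodes:
        node.join()
    return collected
```

Because any node can pick up any message, output order is not guaranteed in this sketch; a real implementation would have to decide whether ordering matters.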
Describe alternatives you've considered
I think a message broker service would be the best way forward, because running our own service discovery and direct node-to-node communication doesn't really fit the architectural design of Watergrid. The idea is to keep the number of service dependencies outside of the library itself, which hosts the pipeline and context protection, to a minimum.
Additional context
This issue will probably need to be an epic, as it will require several smaller steps to fully implement.
Implementation
- Pipeline hash function representing unique step count/order #72
- PipelineLock function to serialize/deserialize context objects #73
- PipelineLock pub/sub methods (check for messages, get X messages, publish) #74
- Worker pool with job queue #75
- Verify pipeline hash in DP and HA mode before starting #76
- Build DP mode implementation of Pipeline class
- Document DP mode setup and code
- Update changelog
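As a starting point for the first two items and the hash check, something along these lines might work. This is only a sketch under assumptions: the function names are hypothetical, the hash is derived from an ordered list of step names (the real function may use other step attributes), and contexts are assumed to be JSON-serializable dicts rather than actual DataContext objects.

```python
import hashlib
import json


def pipeline_hash(step_names):
    """Digest that changes whenever the step count or step order changes."""
    encoded = json.dumps(list(step_names)).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def verify_pipeline_hash(step_names, shared_hash):
    """Refuses to start if this node's pipeline differs from the shared one."""
    local_hash = pipeline_hash(step_names)
    if local_hash != shared_hash:
        raise ValueError(
            f"pipeline mismatch: local {local_hash[:8]} != shared {shared_hash[:8]}"
        )


def serialize_context(context):
    """Turns a dict-like context into a string that can cross the broker."""
    return json.dumps(context, sort_keys=True)


def deserialize_context(message):
    """Inverse of serialize_context."""
    return json.loads(message)
```

A node could call `verify_pipeline_hash` against a hash stored alongside the lock before it starts consuming messages, so that nodes running different pipeline versions never exchange contexts.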