-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add ForceMove option on vacate parameters #489
base: master
Are you sure you want to change the base?
Add ForceMove option on vacate parameters #489
Conversation
dd1c07b
to
799ce06
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding this. I have two questions.
@@ -391,6 +393,7 @@ func newMoveClusterParams(params *VacateClusterParams) (*platform_infrastructure | |||
platform_infrastructure.NewMoveClustersParams(). | |||
WithAllocatorDown(params.AllocatorDown). | |||
WithMoveOnly(params.MoveOnly). | |||
WithForceMove(params.ForceMove). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this needed here? I would remove it because I think it alters the soteria behaviour i.e. currently we:
- Issue a request to validate the move request
validate_only=true
- Use the params returned to issue the actual vacate
Now, if we set the force_move flag in the validation API, the admin console will respect what u asked.
So in case of soteria, what would be the behaviour? which value will the force_move
param have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove this one, This is the result ( see logs https://platform-logging.kb.eu-west-1.aws.qa.cld.elstc.co/app/r/s/1ElD8 )
Testing with Force_move: true
in Fleetctl code:
Our of 3 log entries,
The first one will have the correct value forceMove=[true]
:
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[None], maybeClusterType=[None], forceUpdate=[false], allocatorDown=[None], validateOnly=[true], moveOnly=[true], forceMove=[true])
The second and the third one will have the wrong value forceMove=[false]
:
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[Some(MoveClustersRequest(Some(List(MoveElasticsearchClusterConfiguration(List(44e610693f534648a7ccacedcb16c65b),None))),None,None,None,None))], maybeClusterType=[None], forceUpdate=[false], allocatorDown=[Some(false)], validateOnly=[true], moveOnly=[true], forceMove=[false])
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[Some(MoveClustersRequest(Some(List(MoveElasticsearchClusterConfiguration(List(44e610693f534648a7ccacedcb16c65b),Some(TransientElasticsearchPlanConfiguration(Some(PlanStrategy(None,Some(GrowShrinkStrategyConfig()),None,None)),Some(EsPlanControlConfiguration(Some(32768),Some(5),None,Some(List(AllocatorMoveRequest(gi-5926589073801201158,None,Some(false)))),Some(true),None,None,Some(false),None,None,Some(false),Some(3),Some(300),Some(false),None,Some(false),Some(false),Some(false),Some(false),Some(false),None)),None,None,None))))),None,None,None,None))], maybeClusterType=[Some(ElasticsearchClusterType)], forceUpdate=[false], allocatorDown=[Some(false)], validateOnly=[false], moveOnly=[true], forceMove=[false])
Testing with Force_move: true
in Fleetctl code:
all the logs entries will have forceMove=[false]
see https://platform-logging.kb.eu-west-1.aws.qa.cld.elstc.co/app/r/s/NtaT0
@@ -127,6 +127,7 @@ func moveNodes(id string, params *VacateParams, p *pool.Pool) ([]pool.Validator, | |||
WithAllocatorID(id). | |||
WithMoveOnly(params.MoveOnly). | |||
WithContext(api.WithRegion(context.Background(), params.Region)). | |||
WithForceMove(params.ForceMove). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, won't this alter the behaviour of the soteria flow? for example to always set force_move
to true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remove this one, This is the result ( see logs https://platform-logging.kb.eu-west-1.aws.qa.cld.elstc.co/app/r/s/g99SA )
- Testing with
Force_move: true
in Fleetctl code:
Our of 3 log entries,
The first one will have the wong value forceMove=[false]
:
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[None], maybeClusterType=[None], forceUpdate=[false], allocatorDown=[None], validateOnly=[true], moveOnly=[true], forceMove=[false])
The second and the third one will have the correct value forceMove=[true]
:
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[Some(MoveClustersRequest(Some(List(MoveElasticsearchClusterConfiguration(List(44e610693f534648a7ccacedcb16c65b),None))),None,None,None,None))], maybeClusterType=[None], forceUpdate=[false], allocatorDown=[Some(false)], validateOnly=[true], moveOnly=[true], forceMove=[true])
Request to move clusters: (allocatorId=[gi-5926589073801201158], authz=[Some(1)], moveRequest=[Some(MoveClustersRequest(Some(List(MoveElasticsearchClusterConfiguration(List(44e610693f534648a7ccacedcb16c65b),Some(TransientElasticsearchPlanConfiguration(Some(PlanStrategy(None,Some(GrowShrinkStrategyConfig()),None,None)),Some(EsPlanControlConfiguration(Some(32768),Some(5),None,Some(List(AllocatorMoveRequest(gi-5926589073801201158,None,Some(false)))),Some(true),None,None,Some(false),None,None,Some(false),Some(3),Some(300),Some(false),None,Some(false),Some(false),Some(false),Some(false),Some(false),None)),None,None,None))))),None,None,None,None))], maybeClusterType=[Some(ElasticsearchClusterType)], forceUpdate=[false], allocatorDown=[Some(false)], validateOnly=[false], moveOnly=[true], forceMove=[false])
- Testing with
Force_move: true
in Fleetctl code:
all the logs entries will have forceMove=[false]
see see logs https://platform-logging.kb.eu-west-1.aws.qa.cld.elstc.co/app/r/s/se28H
As a last attempt added ForceMove as a Plan Override in the last commit and tested but also was unsuccessful |
Also tested directly with curl, but the plans in adminconsole still had
|
Moving this back as a Draft and will create a Bug Jira as commented with Gheorge Phucea |
Description
Adds ForceMove parameter to vacate parametes, in order to choose primitive vacate or standard vacate
Related Issues
Rel: https://elasticco.atlassian.net/browse/INFRA-3057
Rel: https://elasticco.atlassian.net/browse/INFRA-2823
Motivation and Context
When a vacate plan fails due to Primitive vacate failure ( with error
InfrastructureFailure:CopyDataFailure
), we want to be able to retry with Standad vacate instead of Primitive vacate from automation tools like Fleetctl in order to try to reduce the number of SDH issues created.How Has This Been Tested?
This has been tested locally from Fleetctl https://github.com/elastic/fleetctl/pull/195, testing the cloud-sdk-go changes with this line in go.mod
replace github.com/elastic/cloud-sdk-go => ../cloud-sdk-go
Types of Changes
Readiness Checklist