Description
Describe the bug
Stack was running nicely, had scaled up to four instances.
There's only enough work for three instances, so the ASG gets told to reduce capacity.
As a result, in-progress jobs get interrupted partway through. In this case, the job was midway through pushing out a production hotfix, which was a great addition to a morning of incident response :)
Expected behavior
An agent that isn't currently performing work is selected for termination.
Actual behavior
An instance performing useful work is often killed.
Stack parameters:
- AWS Region: us-east-2
- Version 6.27.0
Context
Changing the size of an ASG is a very blunt instrument.
Consider instead the detach-instances call, which removes an instance from the ASG and, when called with --should-decrement-desired-capacity, also decrements the DesiredCapacity.
Once the detach-instances call completes, you could then terminate the instance from the lambda; this lets you pick which instance gets killed.
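
A rough sketch of what that could look like in Python, not a description of how the stack's scaler is actually written. It assumes boto3 `autoscaling` and `ec2` clients are passed in; `scale_in_idle_instance` and the `is_idle` callback are hypothetical names, and how to tell whether an agent is idle is left to the caller. It also assumes a detached instance no longer shows up in `describe_auto_scaling_instances`, which is used here as the "detach completed" signal.

```python
import time
from typing import Callable, Iterable, Optional


def scale_in_idle_instance(
    autoscaling,                      # boto3 "autoscaling" client (or compatible)
    ec2,                              # boto3 "ec2" client (or compatible)
    asg_name: str,
    instance_ids: Iterable[str],
    is_idle: Callable[[str], bool],   # hypothetical: True if the agent has no job
    poll_seconds: float = 5.0,
    max_polls: int = 24,
) -> Optional[str]:
    """Detach one idle instance from the ASG, then terminate it.

    Returns the terminated instance ID, or None if no instance was idle.
    """
    # Choose the victim ourselves instead of letting the ASG choose.
    victim = next((i for i in instance_ids if is_idle(i)), None)
    if victim is None:
        return None

    # Remove the instance from the group and lower DesiredCapacity by one,
    # so the ASG does not launch a replacement.
    autoscaling.detach_instances(
        InstanceIds=[victim],
        AutoScalingGroupName=asg_name,
        ShouldDecrementDesiredCapacity=True,
    )

    # Detaching is asynchronous; wait until the ASG no longer lists the instance.
    for _ in range(max_polls):
        resp = autoscaling.describe_auto_scaling_instances(InstanceIds=[victim])
        if not resp["AutoScalingInstances"]:
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"{victim} was still attached to {asg_name}")

    # The instance is now outside the ASG, so terminating it is our job.
    ec2.terminate_instances(InstanceIds=[victim])
    return victim
```

From a lambda this would be called with `boto3.client("autoscaling")` and `boto3.client("ec2")`. Note that an agent could still pick up a job between the idle check and the detach, so the agent would probably need to stop accepting work first.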