Shrinking autoscale group kills in-progress builds #1399

@DanielHeath

Describe the bug

The stack was running nicely and had scaled up to four instances.

There's only enough work for three instances, so the ASG gets told to reduce capacity.

As a result, jobs which are in progress get interrupted partway through (in this case, the job was midway through pushing out a production hotfix, which was a great addition to a morning of incident response :))

Expected behavior

An agent which isn't currently performing work gets selected for termination

Actual behavior

An instance performing useful work is often killed

Stack parameters:

  • AWS Region: us-east-2
  • Version 6.27.0

Context

Changing the size of an ASG is a very blunt instrument.

Consider instead the detach-instances call, which removes a specific instance from the ASG and can decrement the DesiredCapacity at the same time (--should-decrement-desired-capacity).

Once the detach completes, you could then terminate the instance from the lambda; this lets you pick which instance gets killed.
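
To make the suggestion concrete, here is a rough sketch of what the scale-in step could look like. It is not the stack's actual lambda code. The `autoscaling` and `ec2` arguments are assumed to be boto3 clients (`boto3.client("autoscaling")` and `boto3.client("ec2")`); `is_idle` is a hypothetical callback, since how the scaler would know whether an agent is running a job is not something this issue specifies. Polling `describe_auto_scaling_instances` to wait for the detach to finish is also just one possible approach.

```python
import time
from typing import Callable, Iterable, Optional


def pick_idle_instance(
    instance_ids: Iterable[str], is_idle: Callable[[str], bool]
) -> Optional[str]:
    """Return the first instance the (hypothetical) is_idle check accepts."""
    for instance_id in instance_ids:
        if is_idle(instance_id):
            return instance_id
    return None


def scale_in_one(
    autoscaling,
    ec2,
    group_name: str,
    instance_ids: Iterable[str],
    is_idle: Callable[[str], bool],
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> Optional[str]:
    """Detach one idle instance from the ASG, then terminate it.

    Returns the terminated instance ID, or None if every instance is busy.
    """
    instance_id = pick_idle_instance(instance_ids, is_idle)
    if instance_id is None:
        return None  # nothing idle right now; the caller can retry later

    # Remove the chosen instance and lower DesiredCapacity in one call,
    # so the ASG does not launch a replacement.
    autoscaling.detach_instances(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ShouldDecrementDesiredCapacity=True,
    )

    # Assumption: the detach is done once the ASG no longer lists the instance.
    for _ in range(max_polls):
        response = autoscaling.describe_auto_scaling_instances(
            InstanceIds=[instance_id]
        )
        if not response["AutoScalingInstances"]:
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"{instance_id} was still attached after polling")

    # The instance is now outside the ASG, so terminating it is our job.
    ec2.terminate_instances(InstanceIds=[instance_id])
    return instance_id
```

Two caveats with this sketch: as far as I know the detach call is rejected if it would take the group below its MinSize, and there is still a window between the idle check and the detach in which the agent could pick up a new job, so the agent would probably need to stop accepting work first.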
