Description
Operator Version, Kind and Kubernetes Version
- Operator version: 2.9.2
- Kind: AgentPool
- Kubernetes version: v1.32.3-eks
YAML Manifest File
apiVersion: app.terraform.io/v1alpha2
kind: AgentPool
metadata:
  name: agent-pool
  namespace: terraform
spec:
  organization: org-name
  token:
    secretKeyRef:
      name: tfc-owners-team-token
      key: token
  name: agent-pool
  agentTokens:
    - name: agent-token
  agentDeployment:
    annotations:
      karpenter.sh/do-not-disrupt: "true"
    spec:
      serviceAccountName: service-account-name
      containers:
        - name: tfc-agent
          image: hashicorp/tfc-agent
          resources:
            requests:
              cpu: 1
              memory: 2Gi
            limits:
              memory: 3Gi
  autoscaling:
    minReplicas: 0
    maxReplicas: 20
    cooldownPeriod:
      scaleUpSeconds: 30
      scaleDownSeconds: 30
Output Log
https://gist.github.com/jrindy-iterable/f3055164b2f4a752215764cac51a8c0f
Output of relevant kubectl commands
N/A
Steps To Reproduce
- Apply the AgentPool manifest with autoscaling options:
kubectl apply -f agentpool.yaml
- Leave one or more workspaces in a state that waits for user confirmation, approval, or other interaction.
Expected Behavior
What should have happened?
- Agents should scale down when they aren't doing anything instead of sitting idle.
- This should include workspaces that are waiting for user confirmation. There is no set timeframe in which a user will respond, so there is no reason to keep an agent idle for those workspaces.
Actual Behavior
What actually happened?
- Since the v2.9.2 fix, pendingWorkspaceRuns uses the StatusGroup: "non_final" parameter, so it also returns workspaces that are waiting for user confirmation, and those workspaces keep agents from scaling down. See https://developer.hashicorp.com/terraform/cloud-docs/api-docs/run#run-status-groups
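To make the expected behavior concrete, here is a minimal sketch of the counting logic I have in mind. It is written in Python for brevity, not in the operator's Go, and is not taken from the operator source. The set of "awaiting user" statuses is my assumption from reading the run states documentation, and the helper names `runs_needing_agent` and `desired_replicas` are hypothetical.

```python
# Assumption: run statuses in which a run is paused until a user acts.
# This list is a guess from the run states docs and should be verified.
AWAITING_USER_STATUSES = frozenset({
    "planned",              # plan finished, waiting for confirmation
    "cost_estimated",       # cost estimate done, waiting for confirmation
    "policy_checked",       # policy check passed, waiting for confirmation
    "policy_override",      # soft-failed policy, waiting for an override
    "post_plan_completed",  # post-plan tasks done, waiting for confirmation
})


def runs_needing_agent(statuses):
    """Count non-final runs that are not simply waiting on a user."""
    return sum(1 for s in statuses if s not in AWAITING_USER_STATUSES)


def desired_replicas(statuses, min_replicas, max_replicas):
    """Clamp the number of runs that need an agent to the autoscaling bounds."""
    busy = runs_needing_agent(statuses)
    return max(min_replicas, min(busy, max_replicas))


if __name__ == "__main__":
    # Two runs wait on a user, one is actively planning.
    non_final = ["planned", "policy_override", "planning"]
    print(desired_replicas(non_final, min_replicas=0, max_replicas=20))
```

With minReplicas: 0 as in the manifest above, a pool whose only non-final runs are waiting for confirmation would scale to zero under this sketch, which is the behavior I expected.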
Additional Context
- This started happening when I bumped our helm chart from 2.9.0 to 2.9.2.
- I am now running 2.9.1 without issue, so the regression was introduced in v2.9.2.
References
- 🐛 Fix the early agent termination issue #610 - I believe that this is the PR that introduced the issue
Community Note
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.