Listener pod failing after scale-set upgrade #3726

Open

albertollamaso opened this issue Aug 28, 2024 · 2 comments
Labels: bug (Something isn't working), gha-runner-scale-set (Related to the gha-runner-scale-set mode), needs triage (Requires review from the maintainers)

Comments

@albertollamaso

### Controller Version

2.318.0

### Deployment Method

Helm

### Checks

  • This isn't a question or user support case (For Q&A and community support, go to Discussions).
  • I've read the Changelog before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes

### To Reproduce

1. Upgrade `gha-runner-scale-set` from any version to another, for example 2.317.0 -> 2.318.0 (a sketch of the upgrade command is shown after the logs below).
2. Check the logs of the listener pod, for example:

```
kubectl logs -f self-hosted-hide-7ff847bf-listener
```

Logs:

```
2024-08-28T09:43:33Z	INFO	listener-app.listener	Current runner scale set statistics.	{"statistics": "{\"totalAvailableJobs\":0,\"totalAcquiredJobs\":1,\"totalAssignedJobs\":1,\"totalRunningJobs\":0,\"totalRegisteredRunners\":0,\"totalBusyRunners\":0,\"totalIdleRunners\":0}"}
2024-08-28T09:43:33Z	INFO	listener-app.worker.kubernetesworker	Calculated target runner count	{"assigned job": 1, "decision": 1, "min": 0, "max": 5, "currentRunnerCount": 1, "jobsCompleted": 0}
2024-08-28T09:43:33Z	INFO	listener-app.worker.kubernetesworker	Compare	{"original": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":-1,\"patchID\":-1,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}", "patch": "{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"replicas\":1,\"patchID\":0,\"ephemeralRunnerSpec\":{\"metadata\":{\"creationTimestamp\":null},\"spec\":{\"containers\":null}}},\"status\":{\"currentReplicas\":0,\"pendingEphemeralRunners\":0,\"runningEphemeralRunners\":0,\"failedEphemeralRunners\":0}}"}
2024-08-28T09:43:33Z	INFO	listener-app.worker.kubernetesworker	Preparing EphemeralRunnerSet update	{"json": "{\"spec\":{\"patchID\":0,\"replicas\":1}}"}
2024-08-28T09:43:33Z	INFO	listener-app.listener	Deleting message session
2024/08/28 09:43:34 Application returned an error: handling initial message failed: could not patch ephemeral runner set , patch JSON: {"spec":{"patchID":0,"replicas":1}}, error: ephemeralrunnersets.actions.github.com "self-hosted-hide-rhtjx" not found
```
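For step 1, the upgrade itself is just a Helm chart upgrade. A minimal sketch, assuming the chart is installed from the official OCI registry; the release name, namespace, chart version, and values file below are inferred from the labels later in this report and are placeholders rather than the exact command used:

```bash
# Sketch of the scale-set upgrade in step 1 (not the reporter's exact command).
# Release name, namespace, chart version, and values file are assumptions.
helm upgrade self-hosted-hide \
  oci://ghcr.io/actions/actions-runner-controller-charts/gha-runner-scale-set \
  --version 0.9.3 \
  --namespace github-self-hosted-scale-set \
  -f values.yaml
```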


### Describe the bug

It looks like the listener is looking for an `ephemeralrunnersets` resource that does not exist. Checking the `autoscalinglisteners` custom resource, I could confirm that it is still tied to `ephemeralrunnersets.actions.github.com "self-hosted-hide-rhtjx"`:

```
kubectl describe autoscalinglisteners self-hosted-hide-7ff847bf-listener -n github-self-hosted-runners

Name:      self-hosted-hide-7ff847bf-listener
Namespace: github-self-hosted-runners
Labels:    actions.github.com/organization=hidehide
           actions.github.com/scale-set-name=self-hosted-hide
           actions.github.com/scale-set-namespace=github-self-hosted-scale-set
           app.kubernetes.io/component=runner-scale-set-listener
           app.kubernetes.io/instance=self-hosted-hide
           app.kubernetes.io/managed-by=Helm
           app.kubernetes.io/name=self-hosted-hide
           app.kubernetes.io/part-of=gha-runner-scale-set
           app.kubernetes.io/version=0.9.3
           helm.sh/chart=gha-rs-0.9.3
...

Ephemeral Runner Set Name: self-hosted-hide-rhtjx
```
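To confirm the mismatch, one option is to list the `EphemeralRunnerSet` resources that actually exist and compare them with the name the listener still references. A minimal sketch, assuming the namespaces from the labels above and that the `Ephemeral Runner Set Name` field shown in the Describe output lives under the listener's `spec`:

```bash
# The EphemeralRunnerSet referenced by the listener lives in the scale-set
# namespace; list what actually exists there.
kubectl get ephemeralrunnersets.actions.github.com -n github-self-hosted-scale-set

# Show which EphemeralRunnerSet the listener still points to
# (field path assumed from the Describe output above).
kubectl get autoscalinglisteners self-hosted-hide-7ff847bf-listener \
  -n github-self-hosted-runners \
  -o jsonpath='{.spec.ephemeralRunnerSetName}'
```

If the name returned by the listener (here `self-hosted-hide-rhtjx`) no longer appears in the first listing, the listener is patching a stale resource, which matches the `not found` error in the logs.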


Currently, to work around the issue, I have to delete the `autoscalinglisteners` resource every time I upgrade to a new version:

```
kubectl delete autoscalinglisteners self-hosted-appsupport-7ff847bf-listener
```
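After the stale listener resource is deleted, the controller is expected to recreate it pointing at the current `EphemeralRunnerSet`. A quick way to verify, assuming the names and namespace used earlier in this report:

```bash
# Confirm a fresh listener resource and pod were recreated by the controller.
kubectl get autoscalinglisteners -n github-self-hosted-runners
kubectl get pods -n github-self-hosted-runners
```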




### Describe the expected behavior

The listener should not fail after a version upgrade of the scale-set.

### Additional Context

```yaml
n/a
```

### Controller Logs

```
2024-08-28T09:14:16Z	INFO	AutoscalingListener	Listener pod is terminated	{"version": "0.9.3", "autoscalinglistener": {"name":"self-hosted-hide-7ff847bf-listener","namespace":"github-self-hosted-runners"}, "namespace": "github-self-hosted-runners", "name": "self-hosted-hide-7ff847bf-listener", "reason": "Error", "message": ""}
2024-08-28T09:14:17Z	INFO	AutoscalingListener	Listener pod is terminated	{"version": "0.9.3", "autoscalinglistener": {"name":"self-hosted-hide-7ff847bf-listener","namespace":"github-self-hosted-runners"}, "namespace": "github-self-hosted-runners", "name": "self-hosted-hide-7ff847bf-listener", "reason": "Error", "message": ""}
2024-08-28T09:14:18Z	INFO	AutoscalingListener	Listener pod is terminated	{"version": "0.9.3", "autoscalinglistener": {"name":"self-hosted-hide-7ff847bf-listener","namespace":"github-self-hosted-runners"}, "namespace": "github-self-hosted-runners", "name": "self-hosted-hide-7ff847bf-listener", "reason": "Error", "message": ""}

```

### Runner Pod Logs

No runner pods are started, because the listener keeps crashing.
albertollamaso added the bug, gha-runner-scale-set, and needs triage labels on Aug 28, 2024
Contributor

Hello! Thank you for filing an issue.

The maintainers will triage your issue shortly.

In the meantime, please take a look at the troubleshooting guide for bug reports.

If this is a feature request, please review our contribution guidelines.
