Description
What happened?
When updating any field under the spec of a SparkConnect resource (e.g., `executor.instances`, resource requests, etc.), the CR's manifest updates correctly, but the changes are not reflected in the running pods. The operator status and pod count remain stale, and the operator does not reconcile or restart pods automatically. Changes only take effect after manually deleting the pods, which forces the operator to recreate them.
Reproduction Code
### Steps to Reproduce:
1. Deploy a SparkConnect resource, for example:
```yaml
apiVersion: sparkoperator.k8s.io/v1alpha1
kind: SparkConnect
metadata:
  name: spark-connect
  namespace: spark
spec:
  sparkVersion: 4.0.0
  server:
    template:
      metadata:
        labels:
          key1: value1
          key2: value2
        annotations:
          key3: value3
          key4: value4
      spec:
        containers:
        - name: spark-kubernetes-driver
          image: spark:4.0.0
          imagePullPolicy: Always
          resources:
            requests:
              cpu: 1
              memory: 1Gi
            limits:
              cpu: 1
              memory: 1Gi
          securityContext:
            capabilities:
              drop:
              - ALL
            runAsGroup: 185
            runAsUser: 185
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            seccompProfile:
              type: RuntimeDefault
        serviceAccount: spark-operator-spark
  executor:
    instances: 1
    cores: 1
    memory: 512m
    template:
      metadata:
        labels:
          key1: value1
          key2: value2
        annotations:
          key3: value3
          key4: value4
      spec:
        containers:
        - name: spark-kubernetes-executor
          image: spark:4.0.0
          imagePullPolicy: Always
          securityContext:
            capabilities:
              drop:
              - ALL
            runAsGroup: 185
            runAsUser: 185
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            seccompProfile:
              type: RuntimeDefault
```
2. Update a field under `spec` (a sketch of the commands used is shown after this list), for example:
   - Increase `executor.instances` from 1 to 3
   - Change `server.template.spec.containers[0].resources.requests.cpu`
3. Observe the resource with `kubectl get sparkconnect spark-connect -n spark -o yaml`. The updated spec is visible, but the status still reflects the old state (e.g., the number of running executors does not change).
4. Observe that the pods do not scale or update accordingly.
5. Only by manually deleting the pods (e.g., `kubectl delete pod ...`) do the changes get applied when the pods are recreated.
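A minimal sketch of how the update in step 2 was applied and observed; the patch payload mirrors the manifest above, while the exact pod labels in the cluster are an assumption:

```shell
# Scale executors from 1 to 3 by patching the SparkConnect spec (merge patch).
kubectl patch sparkconnect spark-connect -n spark --type merge \
  -p '{"spec":{"executor":{"instances":3}}}'

# Confirm the CR spec was updated.
kubectl get sparkconnect spark-connect -n spark -o jsonpath='{.spec.executor.instances}'

# Watch the pods; in the failure described here, no new executor pods appear.
# (Pod names/labels are assumptions and depend on how the operator labels them.)
kubectl get pods -n spark -w
```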
Expected behavior
Any spec update in the SparkConnect resource should trigger the operator to:
1. Reconcile the change automatically, scaling or updating pods as necessary, without manual pod deletion.
2. Update the status to reflect the current state matching the desired spec.
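For example, after increasing `executor.instances` from 1 to 3, a check like the following would be expected to show three executor pods and an updated status without any manual intervention (a sketch; the exact status fields exposed by the operator are not spelled out here):

```shell
# Expect three running executor pods once the operator reconciles the change.
kubectl get pods -n spark | grep executor

# Expect the SparkConnect status to reflect the new desired state.
kubectl get sparkconnect spark-connect -n spark -o jsonpath='{.status}'
```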
Actual behavior
1. Spec updates are accepted and reflected in the CR manifest.
2. Operator does not reconcile or restart pods based on spec changes.
3. Pod state and operator status remain stale.
4. Manual pod deletion is required to trigger reconciliation.
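As a workaround, the pods currently have to be deleted by hand so the operator recreates them from the updated spec; a sketch of that step, where the placeholder pod names are hypothetical:

```shell
# Current workaround: delete the server/executor pods manually so the operator
# recreates them with the updated spec.
kubectl get pods -n spark
kubectl delete pod <spark-connect-server-pod> <spark-connect-executor-pod> -n spark
```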
Environment & Versions
- Kubernetes Version: v1.29.0
- Spark Operator Version: 2.3.0
- Apache Spark Version: 4.0.0
Additional context
No response
Impacted by this bug?
Give it a 👍. We prioritize the issues with the most 👍.