Conversation

@rahul810050

feat(scheduledsparkapplication): add configurable timestampPrecision for run name generation

This PR introduces a new field .spec.timestampPrecision that allows users to control the precision of the timestamp suffix added to SparkApplication names generated by ScheduledSparkApplication. Supported values are:

nanos | micros | millis | seconds | minutes

The default remains nanos for full backward compatibility.

Summary

Previously, scheduled runs always used time.UnixNano() to generate the timestamp suffix, producing a 19-digit value. When combined with long application names, this frequently caused the run name to exceed Kubernetes’ 63-character limit.

This PR makes the timestamp precision configurable so users can choose a shorter suffix if needed. A new minutes option is also added to match Kubernetes CronJob controller behavior, which only schedules in minute granularity.

Key Changes

  • API / CRD

    • Added new optional field .spec.timestampPrecision to ScheduledSparkApplicationSpec
    • Enum validation: nanos, micros, millis, seconds, minutes
    • Default: "nanos" (preserves current behavior)
    • Regenerated CRDs using make generate + make manifests
  • Controller

    • Added a formatTimestamp() helper to format timestamps according to the selected precision (a minimal sketch follows this list)
    • Updated run-name generation to use this helper
    • minutes mode computes Unix()/60 to stay consistent with CronJob naming semantics
  • Tests

    • Added format_timestamp_test.go to validate timestamp length for all supported precisions
    • Updated envtest setup helper script for contributors
  • Helm Chart

    • Added optional value controller.scheduledSparkApplication.timestampPrecision
    • Defaults to "nanos"
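
For illustration, here is a minimal sketch of what the formatTimestamp() helper could look like, based on the description above; the actual code in the PR may differ in naming and structure:

```go
package scheduledsparkapplication

import (
	"strconv"
	"time"
)

// formatTimestamp renders t as a decimal Unix timestamp at the requested
// precision. An unknown or empty precision falls back to nanoseconds,
// preserving the operator's original behavior.
func formatTimestamp(precision string, t time.Time) string {
	switch precision {
	case "minutes":
		// Unix()/60 matches the minute granularity of CronJob naming.
		return strconv.FormatInt(t.Unix()/60, 10)
	case "seconds":
		return strconv.FormatInt(t.Unix(), 10)
	case "millis":
		return strconv.FormatInt(t.UnixMilli(), 10)
	case "micros":
		return strconv.FormatInt(t.UnixMicro(), 10)
	default: // "nanos"
		return strconv.FormatInt(t.UnixNano(), 10)
	}
}
```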

Why This Is Needed (Fixes #2602)

Users with long application names often hit Kubernetes' 63-character name limit because the operator always appends a 19-digit nanosecond timestamp; with the separating hyphen, that leaves only about 43 characters for the base name.

Allowing precision selection reduces the suffix size:

Precision | Digits | Example Use Case
--------- | ------ | ----------------
minutes   | ~8     | Matches CronJob granularity; shortest valid suffix
seconds   | 10     | Common for hourly/daily jobs
millis    | 13     | High-rate jobs
micros    | 16     | Advanced workloads
nanos     | 19     | Current behavior

This gives users flexibility while keeping backward compatibility.

How I Tested

gofmt -s -w .
make generate
make manifests

# setup envtest
bash scripts/setup-envtest-binaries.sh
export KUBEBUILDER_ASSETS="$(pwd)/bin/k8s/v1.32.0-linux-amd64"

# run unit tests
go test ./internal/controller/scheduledsparkapplication -v

  • Verified CRD generated correctly (enum + default)
  • Ensured all timestamp precisions produce expected digit lengths
  • Confirmed controller creates SparkApplication names with correct suffixes

Checklist:

  • Self-review completed
  • CRD + controller changes implemented
  • Unit tests added and passing
  • Helm values updated
  • Backward compatibility preserved

@ChenYi015
Member

Hi @rahul810050, you do not have to create a new PR for every update. You can run the git push command with an extra --force flag.

@rahul810050
Author

> Hi @rahul810050, you do not have to create a new PR for every update. You can run the git push command with an extra --force flag.

Sure @ChenYi015, I will keep it in mind... thanks!

@rahul810050 rahul810050 force-pushed the feat/timestamp-precision branch from aa58c67 to 17c21a9 Compare November 19, 2025 19:34
Member

@ChenYi015 ChenYi015 left a comment

@rahul810050 Thanks for your contribution! I have left some comments.

// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

// ScheduledSparkApplicationSpec defines the desired state of ScheduledSparkApplication.
Member


Duplicated.

Comment on lines 42 to 51
// TimestampPrecision controls the precision of the timestamp appended to generated
// SparkApplication names for scheduled runs.
//
// Allowed values: "nanos", "micros", "millis", "seconds", "minutes"
// +kubebuilder:validation:Enum=nanos;micros;millis;seconds;minutes
// +optional
// +kubebuilder:default:=nanos
// Defaults to "nanos" to preserve current behavior.
TimestampPrecision string `json:"timestampPrecision,omitempty"`

Member


This field should be removed, as we have discussed.

Comment on lines 45 to 47
"os"
"strconv"

Member


It would be better to group the imports in the standard order, i.e. stdlib, third-party, and self imports.
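
For example, a grouped import block might look like this (the specific third-party and project paths are illustrative, not taken from the PR):

```go
import (
	// Standard library
	"os"
	"strconv"

	// Third-party
	"github.com/robfig/cron/v3"

	// Project (self) imports
	"github.com/kubeflow/spark-operator/api/v1beta2"
)
```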

@@ -0,0 +1,102 @@
#!/usr/bin/env bash
Member


Is this file needed? We can run unit tests and e2e tests directly without setting up envtest binaries.

Author


Actually, I am on Arch Linux, so I needed this for my machine.

Member


Is there any error when running make unit-test?

Author


@ChenYi015 thanks for asking!

Yes — when I run make unit-test on my Arch Linux setup, it fails because envtest cannot find the Kubernetes API server binaries (kube-apiserver / etcd). Arch Linux does not ship compatible versions by default, and setup-envtest also fails due to version mismatch.

That’s why I added the helper script — it downloads the exact binaries that controller-runtime expects and places them in the correct directory, allowing the unit tests and e2e tests to run successfully on my machine.

If the project prefers not to include this script, I’m totally fine removing it and relying only on the documented workflow. I mainly added it to help cross-platform contributors like myself.

Please let me know what you prefer and I’ll update the PR accordingly!

Comment on lines +60 to +63
# -- default precision for ScheduledSparkApplication timestamp suffix
scheduledSparkApplication:
  timestampPrecision: "nanos"

Member


Suggested change:

-# -- default precision for ScheduledSparkApplication timestamp suffix
-scheduledSparkApplication:
-  timestampPrecision: "nanos"
+# ScheduledSparkApplication controller configurations.
+scheduledSparkApplication:
+  # -- Default timestamp precision for the scheduled SparkApplication name suffix.
+  # Can be one of `nanos`, `micros`, `millis`, `seconds` and `minutes`.
+  timestampPrecision: nanos

It would be better to move this value after controller.batchScheduler, as it is a less commonly used configuration. Then run make helm-docs to update the Helm chart README.

Comment on lines 242 to 245

// Determine timestamp precision with precedence:
// 1) controller-wide env var SCHEDULED_SA_TIMESTAMP_PRECISION (if set and non-empty)
// 2) per-app scheduledApp.Spec.TimestampPrecision (if set)
// 3) default "nanos" for backward compatibility
precision := "nanos" // fallback default

if envPrecision, ok := os.LookupEnv("SCHEDULED_SA_TIMESTAMP_PRECISION"); ok && strings.TrimSpace(envPrecision) != "" {
	precision = strings.TrimSpace(envPrecision)
} else if strings.TrimSpace(scheduledApp.Spec.TimestampPrecision) != "" {
	precision = strings.TrimSpace(scheduledApp.Spec.TimestampPrecision)
}

suffix := formatTimestamp(precision, t)

Member


We can add a new controller flag to be consistent with the current implementation, like --timestamp-precision, instead of adding a new environment variable.
See: https://github.com/kubeflow/spark-operator/blob/f53373e7e94ec6eb69d2f6829e61e9c9497e71d6/cmd/operator/controller/start.go
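
A minimal sketch of registering such a flag, following the pattern of the existing controller flags (the flag and variable names here are suggestions, not final API):

```go
package controller

import (
	"github.com/spf13/cobra"
)

var timestampPrecision string

// addTimestampPrecisionFlag registers the suggested --timestamp-precision flag
// on the controller start command.
func addTimestampPrecisionFlag(cmd *cobra.Command) {
	cmd.Flags().StringVar(&timestampPrecision, "timestamp-precision", "nanos",
		"Precision of the timestamp suffix appended to scheduled SparkApplication names "+
			"(one of: nanos, micros, millis, seconds, minutes).")
}
```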

@google-oss-prow
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from chenyi015. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rahul810050 rahul810050 force-pushed the feat/timestamp-precision branch from 9f27777 to 8e68ded Compare November 20, 2025 11:51
@ChenYi015
Member

Some changes still need to be made:

  • We do not have to remove controller.scheduledSparkApplication from values.yaml.
  • Add a new flag parameter named ScheduledSATimestampPrecision, like:
    // Batch scheduler
    enableBatchScheduler  bool
    kubeSchedulerNames    []string
    defaultBatchScheduler string
    // Spark web UI service and ingress
    enableUIService    bool
    ingressClassName   string
    ingressURLFormat   string
    ingressTLS         []networkingv1.IngressTLS
    ingressAnnotations map[string]string
  • Then pass the value to scheduledsparkapplication.Options (a combined sketch follows this list), like:
    func newScheduledSparkApplicationReconcilerOptions() scheduledsparkapplication.Options {
        options := scheduledsparkapplication.Options{
            Namespaces: namespaces,
        }
        return options
    }
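
A combined sketch of how this might look once wired up; the TimestampPrecision field on scheduledsparkapplication.Options and the scheduledSATimestampPrecision variable are assumptions based on the suggestions above, not existing API:

```go
// Assumes scheduledsparkapplication.Options gains a TimestampPrecision field
// and scheduledSATimestampPrecision is bound to the --timestamp-precision flag.
func newScheduledSparkApplicationReconcilerOptions() scheduledsparkapplication.Options {
	options := scheduledsparkapplication.Options{
		Namespaces:         namespaces,
		TimestampPrecision: scheduledSATimestampPrecision,
	}
	return options
}
```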


// EDIT THIS FILE! THIS IS SCAFFOLDING FOR YOU TO OWN!
// NOTE: json tags are required. Any new fields you add must have json tags for the fields to be serialized.

Member


Changes to this file should be removed.

Comment on lines 138 to 143
-{{- with .Values.controller.env }}
+# --- Controller environment variables (preserve user configured controller.env if any).
+{{- if .Values.controller.env }}
 env:
-{{- toYaml . | nindent 8 }}
+{{ toYaml .Values.controller.env | nindent 8 }}
 {{- end }}

Member


Let us revert this to avoid unnecessary changes.

@@ -0,0 +1,26 @@
package scheduledsparkapplication

Member


Since the formatTimestampLengths util function is defined in controller.go, it is suggested to put the related unit tests in controller_test.go.
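
For instance, a table-driven test along these lines could live in controller_test.go; the helper name and expected digit counts follow the PR description, and the exact names may differ from the final code:

```go
package scheduledsparkapplication

import (
	"testing"
	"time"
)

func TestFormatTimestampLengths(t *testing.T) {
	// Fixed instant so the digit counts are deterministic.
	now := time.Unix(1732300000, 123456789)
	cases := map[string]int{
		"minutes": 8,  // Unix()/60
		"seconds": 10, // Unix()
		"millis":  13, // UnixMilli()
		"micros":  16, // UnixMicro()
		"nanos":   19, // UnixNano()
	}
	for precision, wantLen := range cases {
		if got := len(formatTimestamp(precision, now)); got != wantLen {
			t.Errorf("precision %q: got %d digits, want %d", precision, got, wantLen)
		}
	}
}
```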

@ChenYi015 ChenYi015 mentioned this pull request Nov 21, 2025
…os|micros|millis|seconds|minutes)

Add optional spec.timestampPrecision to configure the precision of the timestamp suffix
appended to generated SparkApplication names for scheduled runs. Default remains 'nanos'
for backward compatibility. Adds 'minutes' option to match CronJob granularity and keep
generated names short.

Includes helper function, unit tests and optional chart value.

Fixes: kubeflow#2602

Signed-off-by: rahul810050 <[email protected]>
@rahul810050 rahul810050 force-pushed the feat/timestamp-precision branch from 752715e to 7e76e64 Compare November 22, 2025 11:36
Successfully merging this pull request may close this issue:

Ability to configure the precision of the timestamp attached to the application name when running scheduled spark applications (#2602)