feat(spark): Populate time variables for log links (#6328) #6411
base: master
Conversation
Code Review Agent Run #adc73a. Actionable Suggestions: 0.
Not yet tested. I just want to make sure my implementation and my approach are correct, as this is my first contribution to Flyte.
Changelist by Bito: This pull request implements the following key changes (see the changelist below).
Codecov Report: All modified and coverable lines are covered by tests ✅

Coverage Diff:
##            master    #6411      +/-   ##
===========================================
+ Coverage    58.49%   59.46%   +0.97%
===========================================
  Files          940      553     -387
  Lines        71555    37862   -33693
===========================================
- Hits         41855    22515   -19340
+ Misses       26519    13690   -12829
+ Partials      3181     1657    -1524
Flags with carried forward coverage won't be shown.
We will review and get back soon.
This pull request modifies the getEventInfoForSpark function within flyteplugins/go/tasks/plugins/k8s/spark/spark.go.
Extract Timestamps: Retrieves the SubmissionTime, CompletionTime, and TerminationTime from the sparkOp.SparkApplicationStatus (sj.Status).
Format Timestamps: The retrieved metav1.Time values are formatted into RFC3339 strings and Unix timestamps (int64). SubmissionTime is used as the start time, and CompletionTime (with TerminationTime as a fallback) as the end time. Checks handle cases where these timestamps are zero (e.g., the job hasn't started or finished yet); a rough sketch of this logic follows the changelist below.
Populate tasklog.Input: The formatted timestamps are used to populate the PodRFC3339StartTime, PodRFC3339FinishTime, PodUnixStartTime, and PodUnixFinishTime fields within the tasklog.Input struct.
Targeted Application: This population is specifically done for the calls to p.GetTaskLogs that fetch logs associated with the Spark driver pod (sj.Status.DriverInfo.PodName), namely the "Driver Logs" (using the "Mixed" log config) and "User Logs" (using the "User" log config). Calls for "System" and "AllUser" logs remain unchanged as they use the application name rather than a specific pod name.
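The timestamp handling described in this changelist can be illustrated with a minimal, hypothetical sketch. The helper name formatSparkTimes and the standalone main function below are illustrative only and are not the PR's code; the sketch assumes the spark-operator status exposes SubmissionTime, CompletionTime, and TerminationTime as metav1.Time values, as described above.

```go
package main

import (
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// formatSparkTimes is a hypothetical helper (not the PR's actual code) that mirrors
// the described behavior: SubmissionTime becomes the start time, CompletionTime
// (falling back to TerminationTime) becomes the end time, and zero timestamps are
// skipped so a job that hasn't started or finished yields empty/zero values.
func formatSparkTimes(submission, completion, termination metav1.Time) (startRFC3339, finishRFC3339 string, startUnix, finishUnix int64) {
	if !submission.IsZero() {
		startRFC3339 = submission.Time.Format(time.RFC3339)
		startUnix = submission.Time.Unix()
	}

	finish := completion
	if finish.IsZero() {
		// Fall back to the termination time when no completion time was recorded.
		finish = termination
	}
	if !finish.IsZero() {
		finishRFC3339 = finish.Time.Format(time.RFC3339)
		finishUnix = finish.Time.Unix()
	}
	return startRFC3339, finishRFC3339, startUnix, finishUnix
}

func main() {
	submitted := metav1.NewTime(time.Date(2025, 4, 1, 10, 0, 0, 0, time.UTC))
	completed := metav1.NewTime(submitted.Add(15 * time.Minute))

	// Prints the RFC3339 and Unix representations of the start and finish times.
	startRFC, finishRFC, startUnix, finishUnix := formatSparkTimes(submitted, completed, metav1.Time{})
	fmt.Println(startRFC, finishRFC, startUnix, finishUnix)
}
```

In the plugin itself, values like these would populate the PodRFC3339StartTime, PodRFC3339FinishTime, PodUnixStartTime, and PodUnixFinishTime fields of tasklog.Input before the driver-pod log-link calls to p.GetTaskLogs.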
Summary by Bito
This PR enhances Spark logging by extracting time metrics from SparkApplicationStatus and computing start/finish timestamps from the submission and completion times. The timestamps are formatted in RFC3339 and Unix formats and used to populate the tasklog.Input structure for both driver and user logs, enabling more accurate log tracking and troubleshooting.
Unit tests added: False
Estimated effort to review (1-5, lower is better): 1
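Since the author notes the change is not yet tested, a unit-test sketch for this kind of timestamp handling could look roughly as follows. It targets the hypothetical formatSparkTimes helper from the sketch above rather than the PR's actual getEventInfoForSpark implementation.

```go
package main

import (
	"testing"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func TestFormatSparkTimes_FallbackAndZeroHandling(t *testing.T) {
	submitted := metav1.NewTime(time.Date(2025, 4, 1, 10, 0, 0, 0, time.UTC))
	terminated := metav1.NewTime(submitted.Add(20 * time.Minute))

	// CompletionTime is zero, so the finish time should fall back to TerminationTime.
	startRFC, finishRFC, startUnix, finishUnix := formatSparkTimes(submitted, metav1.Time{}, terminated)

	if startRFC != "2025-04-01T10:00:00Z" {
		t.Errorf("unexpected start time: %s", startRFC)
	}
	if finishRFC != "2025-04-01T10:20:00Z" {
		t.Errorf("unexpected finish time: %s", finishRFC)
	}
	if startUnix == 0 || finishUnix == 0 {
		t.Errorf("expected non-zero unix timestamps, got %d and %d", startUnix, finishUnix)
	}

	// A job that has not started or finished should produce empty/zero values.
	startRFC, finishRFC, startUnix, finishUnix = formatSparkTimes(metav1.Time{}, metav1.Time{}, metav1.Time{})
	if startRFC != "" || finishRFC != "" || startUnix != 0 || finishUnix != 0 {
		t.Errorf("expected zero values for an unstarted job")
	}
}
```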