[SPARK-51175][CORE] Make Master show elapsed time when removing drivers

### What changes were proposed in this pull request?

This PR aims to make `Spark Master` show `Elapsed time` when removing drivers.

**BEFORE**
```
25/02/11 22:08:28 INFO Master: Removing driver: driver-20250211220723-0000 (FINISHED)
25/02/11 22:13:00 INFO Master: Removing driver: driver-20250211221217-0001 (KILLED)
```

**AFTER**
```
25/02/11 22:08:28 INFO Master: Removing driver: driver-20250211220723-0000 (FINISHED, Elapsed time: 64629 ms)
25/02/11 22:13:00 INFO Master: Removing driver: driver-20250211221217-0001 (KILLED, Elapsed time: 43128 ms)
```
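With the new format, per-driver durations can be extracted from the `Master` log mechanically. The following is a minimal sketch and not part of this PR: `MasterLogSketch` and `parseRemoval` are hypothetical names, and the regular expression assumes exactly the message layout shown in the AFTER lines above.

```scala
// Hypothetical helper for reading the new log format; not part of Spark.
object MasterLogSketch {
  // Matches e.g. "... Removing driver: <id> (<STATE>, Elapsed time: <n> ms)".
  // Scala regex extractors require a full match, hence the leading ".*".
  private val Removal =
    """.*Removing driver: (\S+) \((\w+), Elapsed time: (\d+) ms\)""".r

  /** Returns (driverId, finalState, elapsedMs) if the line is a removal message. */
  def parseRemoval(line: String): Option[(String, String, Long)] = line match {
    case Removal(id, state, ms) => Some((id, state, ms.toLong))
    case _                      => None
  }

  def main(args: Array[String]): Unit = {
    val line = "25/02/11 22:08:28 INFO Master: Removing driver: " +
      "driver-20250211220723-0000 (FINISHED, Elapsed time: 64629 ms)"
    println(parseRemoval(line))
  }
}
```

Lines in the BEFORE format do not match and yield `None`, so the same helper can be run over logs that mix both formats.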

### Why are the changes needed?

When multiple jobs are submitted, it's difficult to tell how long each one took.

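Please note that a `Spark Driver` can be stuck waiting when the cluster has insufficient resources. The reported interval includes that waiting time, so the log says `Elapsed time` instead of `Uptime (or Runtime)`. In the example below, the driver never obtains resources, yet about 43 seconds elapse between submission and removal.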
```
25/02/11 22:12:17 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
25/02/11 22:12:17 WARN Master: Driver driver-20250211221217-0001 requires more resource than any of Workers could have.
25/02/11 22:13:00 INFO Master: Asked to kill driver driver-20250211221217-0001
25/02/11 22:13:00 INFO Master: Kill request for driver-20250211221217-0001 submitted
25/02/11 22:13:00 INFO Master: Removing driver: driver-20250211221217-0001 (KILLED, Elapsed time: 43128 ms)
```
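The distinction can be illustrated with a small standalone sketch. It is not the `Master` implementation: `DriverRecord`, `submittedAtMs`, and `ElapsedTimeSketch` are hypothetical names, and the sketch assumes the start timestamp is recorded when the driver is submitted, as the log above suggests.

```scala
// Hypothetical stand-in for the Master's driver bookkeeping; not Spark's own class.
final case class DriverRecord(id: String, submittedAtMs: Long)

object ElapsedTimeSketch {
  // Counts from submission, so time spent waiting for resources is included.
  def elapsedMs(driver: DriverRecord, nowMs: Long): Long =
    nowMs - driver.submittedAtMs

  // Builds a plain-string message in the same shape as the new log line.
  def removalMessage(driver: DriverRecord, finalState: String, nowMs: Long): String =
    s"Removing driver: ${driver.id} ($finalState, " +
      s"Elapsed time: ${elapsedMs(driver, nowMs)} ms)"

  def main(args: Array[String]): Unit = {
    val submitted = 1000000L // arbitrary illustrative timestamp in milliseconds
    val driver = DriverRecord("driver-20250211221217-0001", submitted)
    // The driver is killed 43128 ms after submission without ever running.
    println(removalMessage(driver, "KILLED", submitted + 43128L))
  }
}
```

Because the subtraction uses the submission timestamp, a driver that never ran still reports a non-zero value, which is why `Uptime` would be a misleading label.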

### Does this PR introduce _any_ user-facing change?

No. There is no behavior change; the log message only includes additional information.

### How was this patch tested?

Manual tests.

1. Start `Master`.
```
$ SPARK_NO_DAEMONIZE=1 sbin/start-master.sh
```

2. Start `Worker`.
```
$ sbin/start-worker.sh spark://$(hostname):7077
```

3. Submit a job.
```
$ ./examples/src/main/scripts/submit-pi.sh
```

4. Check the log of `Master`.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #49903 from dongjoon-hyun/SPARK-51175.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
dongjoon-hyun committed Feb 12, 2025
1 parent 8bacf99 commit 42ecabf
Showing 1 changed file with 2 additions and 1 deletion.
@@ -1375,7 +1375,8 @@ private[deploy] class Master(
     drivers.find(d => d.id == driverId) match {
       case Some(driver) =>
         logInfo(log"Removing driver: ${MDC(LogKeys.DRIVER_ID, driverId)}" +
-          log" (${MDC(LogKeys.DRIVER_STATE, finalState)})")
+          log" (${MDC(LogKeys.DRIVER_STATE, finalState)}, Elapsed time:" +
+          log" ${MDC(LogKeys.TOTAL_TIME, System.currentTimeMillis() - driver.startTime)} ms)")
         drivers -= driver
         if (completedDrivers.size >= retainedDrivers) {
           val toRemove = math.max(retainedDrivers / 10, 1)