[FLINK-37410][runtime/metrics] Split level Watermark metrics #26276

Draft
wants to merge 1 commit into
base: master


@Efrat19 Efrat19 commented Mar 8, 2025

What is the purpose of the change

This pull request adds split-level watermark metrics, covering watermark progress and per-split state gauges (active, idle, and paused).
The change is described in detail in FLIP-513: Split-level Watermark Metrics.
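
To illustrate what the per-split state gauges measure, here is a minimal sketch of accumulating the time a split spends in each state. It is not the implementation in this PR: the class SplitStateTimer, its State enum, and its methods are hypothetical names made up for this example, and timestamps are passed in explicitly rather than read from a clock.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink): tracks time spent by one split in each state. */
final class SplitStateTimer {
    enum State { ACTIVE, IDLE, PAUSED }

    // Time already accounted for in each state, in milliseconds.
    private final Map<State, Long> accumulatedMs = new EnumMap<>(State.class);
    private State current = State.ACTIVE; // assumption: a split starts out active
    private long lastTransitionMs;

    SplitStateTimer(long registeredAtMs) {
        this.lastTransitionMs = registeredAtMs;
        for (State s : State.values()) {
            accumulatedMs.put(s, 0L);
        }
    }

    /** Closes the interval spent in the current state and switches to the next one. */
    void transitionTo(State next, long nowMs) {
        accumulatedMs.merge(current, nowMs - lastTransitionMs, Long::sum);
        current = next;
        lastTransitionMs = nowMs;
    }

    /** Accumulated time in the given state since registration, including the open interval. */
    long accumulatedMs(State state, long nowMs) {
        long total = accumulatedMs.get(state);
        if (state == current) {
            total += nowMs - lastTransitionMs;
        }
        return total;
    }

    /** Average milliseconds per second between two samples of an accumulated value. */
    static long msPerSecond(long earlierMs, long laterMs, long elapsedMs) {
        return (laterMs - earlierMs) * 1000L / elapsedMs;
    }
}
```

For a split registered at time 0 that is paused by alignment from 400 ms to 700 ms and then resumes, querying at 1000 ms gives 700 ms accumulated active time, 300 ms paused, and 0 ms idle. A per-second gauge can then be derived by sampling the accumulated value at two points in time.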

Brief change log

Verifying this change

This change added tests and can be verified as follows:

  • The new metric group and the reporting of state transitions under alignment and idleness are unit tested.
  • The change was manually verified by running a Flink job reading from 2 sources, one fast and one slow, and checking that the split-level metric reports matched the watermark, paused, and idle status of each split.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes (metrics)
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs

@Efrat19 Efrat19 changed the title [FLINK-37410] Split level Watermark metrics [FLINK-37410][runtime/metrics] Split level Watermark metrics Mar 8, 2025
@flinkbot
Collaborator

flinkbot commented Mar 8, 2025

CI report:

Bot commands
The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@Efrat19
Author

Efrat19 commented Mar 10, 2025

@flinkbot run azure

<tr>
<td>watermark.activeTimeMsPerSecond</td>
<td>
The time (in milliseconds) this split is active (i.e. not paused due to watermark alignment or idle due to idleness detection) per second.
Contributor

nit: this split is active -> this split has been active

<tr>
<td>watermark.accumulatedActiveTimeMs</td>
<td>
Accumulated time (in milliseconds) this split was active since registered
Contributor

@davidradl davidradl Mar 10, 2025

nit: since registered -> since it was registered.
Is this updated every second? If so, we should say that.

Same for the next 2: add "it was" and a full stop.
