Add metrics related to virtual threads #9619

Merged
merged 2 commits into from
Jan 3, 2025

Conversation

tjquinno
Member

@tjquinno tjquinno commented Dec 23, 2024

Description

Resolves #9533

New meters related to virtual thread usage

  • current number of active virtual threads (disabled by default)
  • total number of virtual thread starts (disabled by default)
  • total number of pinned threads
  • timer (with histogram) of pinned threads' durations
  • number of failed attempts to submit virtual threads to platform threads

The virtual thread count meters are disabled by default for performance reasons. Enable them by setting metrics.virtual-threads.count.enabled=true in configuration, but be aware doing so can degrade the server's performance.

New meter related to platform threads

Helidon also now exposes the new system meter thread.starts which displays the total number of platform thread starts performed in the JVM since server start-up.


The PR adds a new MetersProvider implementation to the helidon-metrics-system-meters component. The new implementation registers for selected Java Flight Recorder events to track data related to virtual threads.

The events behind the three virtual thread meters that are enabled by default should be rare, so monitoring the JFR events for them adds very little overhead. In contrast, to maintain the current number of active virtual threads and the total number of virtual thread starts, the added code must register for and respond to virtual thread start and end events, which can be costly. That's why those two meters, and the registration of listeners for those events from JFR, are disabled by default.
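
For readers unfamiliar with JFR event streaming, the listener registration described above might look roughly like the following minimal sketch. It uses the JDK's jdk.jfr.consumer.RecordingStream directly. The class name PinnedThreadTracker and its methods are hypothetical and are not taken from the PR's code, which feeds Helidon meters rather than plain counters.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;
import jdk.jfr.consumer.RecordingStream;

/** Hypothetical tracker for illustration; not the PR's MetersProvider implementation. */
public class PinnedThreadTracker {
    private final AtomicLong pinnedCount = new AtomicLong();
    private final AtomicLong pinnedNanos = new AtomicLong();

    /** Records one pinning occurrence and how long it lasted. */
    public void record(Duration pinned) {
        pinnedCount.incrementAndGet();
        pinnedNanos.addAndGet(pinned.toNanos());
    }

    /** Subscribes to the pinned event; the caller starts and closes the stream. */
    public void attach(RecordingStream stream) {
        stream.enable("jdk.VirtualThreadPinned");
        stream.onEvent("jdk.VirtualThreadPinned", event -> record(event.getDuration()));
    }

    public long count() {
        return pinnedCount.get();
    }

    public Duration total() {
        return Duration.ofNanos(pinnedNanos.get());
    }

    public static void main(String[] args) throws InterruptedException {
        PinnedThreadTracker tracker = new PinnedThreadTracker();
        try (RecordingStream stream = new RecordingStream()) {
            tracker.attach(stream);
            stream.startAsync();   // events are delivered on a background thread
            Thread.sleep(1000);    // stand-in for the application's real work
        }
        System.out.println("pinned events seen: " + tracker.count());
    }
}
```

Only the rare pinned event is subscribed to here. Tracking the active-thread count would additionally require handlers for the much more frequent virtual thread start and end events, which is the cost the description refers to.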

Documentation

Small additions to the SE and MP metrics guide and doc pages.

@tjquinno tjquinno self-assigned this Dec 23, 2024
@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Dec 23, 2024
@spericas
Member

spericas commented Jan 2, 2025

Some general questions @tjquinno ...

  • current number of active virtual threads (disabled by default)
  • total number of virtual thread starts (disabled by default)
  • total number of pinned threads

Is there also a counter for platform threads?

  • distribution summary (history) of pinned threads
  • number of failed attempts to submit virtual threads to platform threads

Under what conditions would this counter be incremented?

@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2023 Oracle and/or its affiliates.
+ * Copyright (c) 2023, 2024 Oracle and/or its affiliates.
  *
Member

Just a copyright update?

Member Author

This is a remnant from an earlier iteration of the change I was contemplating. Not needed.

Member Author

Fixed. File no longer in the changed list.

@tjquinno
Member Author

tjquinno commented Jan 2, 2025

Some general questions @tjquinno ...

  • current number of active virtual threads (disabled by default)
  • total number of virtual thread starts (disabled by default)
  • total number of pinned threads

Is there also a counter for platform threads?

The existing Helidon built-in meters created in SystemMetersProvider.java include some platform-thread-related ones:

  • thread count
  • daemon thread count
  • maximum thread count

This JMX MBean makes a number of other thread-related values available:
https://docs.oracle.com/en/java/javase/21/docs/api/java.management/java/lang/management/ThreadMXBean.html

Probably the most useful available value that we don't currently expose is the total started thread count. We could add that.
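
For reference, here is a minimal sketch of reading those values from the ThreadMXBean linked above. The class name PlatformThreadStats and its helper method are hypothetical; only the ThreadMXBean getters come from the JDK.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper for illustration; not part of Helidon. */
public class PlatformThreadStats {

    /** Starts one platform thread and returns how far the started-thread total advanced. */
    public static long startOneThreadAndMeasure() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long before = threads.getTotalStartedThreadCount();
        Thread worker = new Thread(() -> { });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threads.getTotalStartedThreadCount() - before;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("thread count:        " + threads.getThreadCount());
        System.out.println("daemon thread count: " + threads.getDaemonThreadCount());
        System.out.println("peak thread count:   " + threads.getPeakThreadCount());
        System.out.println("total started:       " + threads.getTotalStartedThreadCount());
        System.out.println("delta after 1 start: " + startOneThreadAndMeasure());
    }
}
```

Unlike the live thread count, the total started count only ever increases.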

  • distribution summary (history) of pinned threads
    (I edited the description to change "history" to "histogram" which I originally intended!)

Do you have an example?

I don't have an interesting example of the value.

Here is an example of the output, taken from the updated doc that's part of this PR (obviously with no interesting output):

  "vthreads.recentPinned": {
      "count": 0,
      "max": 0.0,
      "mean": 0.0,
      "total": 0.0,
      "p0.5": 0.0,
      "p0.75": 0.0,
      "p0.95": 0.0,
      "p0.98": 0.0,
      "p0.99": 0.0,
      "p0.999": 0.0
    },
  • number of failed attempts to submit virtual threads to platform threads

Under what conditions would this counter be incremented?

Here's what the Java Flight Recorder doc says about the value it reports about pinned virtual threads.

  • distribution summary (history) of pinned threads
  • number of failed attempts to submit virtual threads to platform threads

Under what conditions would this counter be incremented?

See the previous link.

@tjquinno tjquinno force-pushed the 4.x-jfr-vthread-metrics branch from 67800cc to 8040b24 Compare January 2, 2025 22:06
@tjquinno
Member Author

tjquinno commented Jan 2, 2025

The latest push accomplished:

  • Removed an unchanged file (except for copyright date) from the changed file list.
  • Added the new non-virtual thread system meter thread.starts to the built-in system meters
  • Changed the meter type for vthreads.recentPinned from distribution summary to timer (the two are basically equivalent except that the timer type displays an elapsed time instead of the more general data value).

…ecause the virtual threads recentPinned meter is now a timer
@tjquinno tjquinno requested a review from spericas January 2, 2025 23:11
@tjquinno tjquinno merged commit 01145f1 into helidon-io:main Jan 3, 2025
55 checks passed
@tjquinno tjquinno deleted the 4.x-jfr-vthread-metrics branch January 3, 2025 15:22
@vasanth-bhat

vasanth-bhat commented Jan 11, 2025

A few comments:

  1. The implementation creates the RecordingStream without passing any settings. Is this intentional? Both Recording and RecordingStream have alternate constructors that accept settings as input (Recording takes a settings map or a Configuration; RecordingStream takes a Configuration).

In JFR, and therefore in a JFR stream, which events are captured, the threshold used for each event, whether stack traces are captured, and so on depend on the settings used to create the Recording (RecordingStream creates a Recording internally). The JDK ships with two settings files, "default" and "profile".

Below are the settings for the "jdk.VirtualThreadPinned" event in the JDK-bundled "default" and "profile" settings. This means that when these settings are used, a pinned JFR event is recorded only when the carrier thread was pinned for 20 ms or longer.

 <event name="jdk.VirtualThreadPinned">
      <setting name="enabled">true</setting>
      <setting name="stackTrace">true</setting>
      <setting name="threshold">20 ms</setting>
 </event>

However, in many cases the system also runs with custom settings supplied via a custom JFC file, which can customise settings for both the built-in JDK-provided events and custom-defined JFR events.

  2. "jdk.VirtualThreadPinned" JFR events are not generated for all pinning cases. For example, in Java 21 a carrier thread also gets pinned when the virtual thread mounted on it executes Object.wait(), but this does not generate a "jdk.VirtualThreadPinned" event. The same is true for pinning due to a blocking operation in a class initialiser (for example, a static block) and for pinning due to certain blocking operations in native code. So, primarily, we get pinning events only for blocking operations inside synchronized blocks. I guess this is mostly for information/documentation purposes, so that one doesn't assume all pinning events are reported by this metric.
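
To make the first point concrete, the bundled settings can be inspected programmatically. The sketch below is illustrative only: the class name PinnedEventSettings is hypothetical, and the "eventName#settingName" key format is how jdk.jfr.Configuration exposes its settings map. On a JDK whose bundled file does not define the event, the helper returns a placeholder instead of a threshold.

```java
import java.io.IOException;
import java.text.ParseException;
import java.util.Map;
import jdk.jfr.Configuration;

/** Hypothetical helper for illustration; not part of Helidon. */
public class PinnedEventSettings {

    /** Returns the pinned-event threshold from a predefined configuration such as "default". */
    public static String threshold(String configName) {
        try {
            Map<String, String> settings =
                    Configuration.getConfiguration(configName).getSettings();
            // Keys take the form "<event name>#<setting name>".
            return settings.getOrDefault("jdk.VirtualThreadPinned#threshold", "(not set)");
        } catch (IOException | ParseException e) {
            throw new IllegalStateException("Cannot read JFR configuration " + configName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default: " + threshold("default"));
        System.out.println("profile: " + threshold("profile"));
    }
}
```

A stream that should also see shorter pinning periods could lower the threshold explicitly, for example with stream.enable("jdk.VirtualThreadPinned").withThreshold(...), at the cost of more events.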

@tjquinno
Member Author

We looked into this earlier. To get initial support for meters related to virtual threads in place quickly and with the greatest ease of use, we implemented the feature as written.

Some examples of the complexities:

  • The settings value in the -XX:StartFlightRecording:settings command-line qualifier can take multiple values. That means a Helidon user would need a way to choose which of the settings (among the custom ones specified on the command line or the predefined ones) to use as the basis for the Helidon RecordingStream.
  • The JFR-provided Configuration class allows looking up by name any of the predefined profiles, that is, those whose .jfc files are in JAVA_HOME/lib/jfr. But Configuration.getConfiguration(String) and Configuration.getConfigurations() do not support config files located elsewhere. The Helidon user would need to provide Helidon configuration to specify which built-in name or custom file to use. The Helidon code would need to use that setting to select either a predefined configuration by name or a custom configuration by path.

All doable, but for the first implementation we chose to simplify things.
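
As a rough illustration of the second bullet, selecting a configuration by name or by path could be sketched as follows. This is not a proposed design: the class JfrConfigResolver, its resolve method, and the convention that a value ending in ".jfc" is treated as a file path are all assumptions made for the example. Configuration.getConfiguration(String), Configuration.create(Path), and the RecordingStream(Configuration) constructor are existing JDK APIs.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.text.ParseException;
import jdk.jfr.Configuration;
import jdk.jfr.consumer.RecordingStream;

/** Hypothetical resolver for illustration; not part of Helidon. */
public class JfrConfigResolver {

    /**
     * Resolves a user-supplied value to a JFR configuration.
     * Assumption: values ending in ".jfc" are file paths; anything else is a predefined name.
     */
    public static Configuration resolve(String nameOrPath) {
        try {
            if (nameOrPath.endsWith(".jfc")) {
                return Configuration.create(Path.of(nameOrPath));
            }
            return Configuration.getConfiguration(nameOrPath);
        } catch (IOException | ParseException e) {
            throw new IllegalArgumentException("Cannot load JFR settings: " + nameOrPath, e);
        }
    }

    public static void main(String[] args) {
        String choice = args.length > 0 ? args[0] : "default";
        Configuration config = resolve(choice);
        // The stream inherits every event setting from the chosen configuration.
        try (RecordingStream stream = new RecordingStream(config)) {
            System.out.println("stream created from " + config.getName()
                    + " with " + config.getSettings().size() + " settings");
        }
    }
}
```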

I've had this sort of enhancement in mind but finally recorded it as an issue: #9652 (generally more likely to get visibility than comments on a previously-merged PR). Add any further comments you might have there.

Labels
OCA Verified All contributors have signed the Oracle Contributor Agreement.
Development

Successfully merging this pull request may close these issues.

Evaluate adding metrics for virtual threads
3 participants