
@mpirvu
Contributor

@mpirvu mpirvu commented Nov 25, 2025

This PR is mainly for the benefit of JITServer.
Experiments have shown that the addKnownObjectConstraints() function used by GlobalVP generates many server-client messages
when calling fej9()->getObjectClassInfoFromObjectReferenceLocation() to fetch class information about a particular known object.
This commit implements a caching mechanism for that class information, as described below:

An entry in the knownObjectTable is extended to include the following
fields:

      uintptr_t *_jniReference; // This existed before this commit
      TR_OpaqueClassBlock *_clazz;
      TR_OpaqueClassBlock *_jlClass;
      bool _isFixedJavaLangClass;
      bool _isString;

When a new entry is added to the knownObjectTable, only the "old" field _jniReference is populated; the others are set to default values (NULL or false).
If GlobalVP asks about the other fields and they are not yet populated, the server sends a message to the client, obtains the values of those fields, and stores them in the knownObjectTable entry. If the fields are already populated when GlobalVP asks about them, they are read directly from the knownObjectTable entry, saving a message.
Note that the same caching mechanism will also be used at the client, but the benefit there is expected to be low.
Experimental results have shown an 84% reduction in the messages sent by the server during addKnownObjectConstraints().
In the future we may consider populating all fields of a knownObjectTable entry when that entry is added to the table.
This approach could eliminate all messages sent during addKnownObjectConstraints(), but it may add overhead by filling in fields that the server never uses.
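
The lazy population described above can be sketched in isolation as follows. This is a minimal illustration, not the code in this PR: `OpaqueClass`, `ClassInfo`, `KnownObjectTableSketch`, and the `Fetcher` callback (which models one server-to-client message) are hypothetical names, and treating a null `_clazz` as "not yet populated" is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for TR_OpaqueClassBlock.
struct OpaqueClass {};

// Class information returned by one simulated server-to-client message.
struct ClassInfo
   {
   OpaqueClass *clazz;
   OpaqueClass *jlClass;
   bool isFixedJavaLangClass;
   bool isString;
   };

// Entry layout from the description; all fields except _jniReference start at their defaults.
struct KnownObjectEntry
   {
   std::uintptr_t *_jniReference = nullptr;
   OpaqueClass *_clazz = nullptr;
   OpaqueClass *_jlClass = nullptr;
   bool _isFixedJavaLangClass = false;
   bool _isString = false;
   };

class KnownObjectTableSketch
   {
public:
   typedef std::function<ClassInfo(std::uintptr_t *)> Fetcher;

   explicit KnownObjectTableSketch(Fetcher fetch) : _fetch(std::move(fetch)), _messagesSent(0) {}

   // Adding an entry records only the JNI reference.
   std::size_t add(std::uintptr_t *jniReference)
      {
      KnownObjectEntry entry;
      entry._jniReference = jniReference;
      _entries.push_back(entry);
      return _entries.size() - 1;
      }

   // Returns a copy of the entry, fetching the class fields on first use only.
   KnownObjectEntry getClassInfo(std::size_t index)
      {
      assert(index < _entries.size());
      KnownObjectEntry &entry = _entries[index];
      if (entry._clazz == nullptr) // assumption: a null _clazz marks "not yet populated"
         {
         ClassInfo info = _fetch(entry._jniReference); // the only point where a message is sent
         ++_messagesSent;
         entry._clazz = info.clazz;
         entry._jlClass = info.jlClass;
         entry._isFixedJavaLangClass = info.isFixedJavaLangClass;
         entry._isString = info.isString;
         }
      return entry;
      }

   int messagesSent() const { return _messagesSent; }

private:
   Fetcher _fetch;
   std::vector<KnownObjectEntry> _entries;
   int _messagesSent;
   };
```

In this sketch, repeated queries for the same entry cost one message in total, which is where a reduction in message count would come from.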

There are two options that control this feature:
-Xjit:disableKnownObjectTableCaching: disables the caching mechanism, so the server always asks the client for the desired information.
-Xjit:enableKnownObjectTableCachingVerification: when caching is enabled, this option always sends a message to the client, fetches the desired information, and compares it against the cached information. If the two don't match, a fatal assert is triggered.
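
How the two options might interact with a cached entry can be sketched as below. All names here (`CachingOptions`, `CachedInfo`, `lookup`, `askClient`) are hypothetical, the exception stands in for the fatal assert, and leaving the cache untouched when caching is disabled is an assumption rather than something the description states.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Simplified class information; the real entry carries more fields.
struct ClassInfo
   {
   const void *clazz;
   bool isString;
   };

bool sameInfo(const ClassInfo &a, const ClassInfo &b)
   {
   return a.clazz == b.clazz && a.isString == b.isString;
   }

// Cached copy of the class information for one known object.
struct CachedInfo
   {
   CachedInfo() : populated(false), info() {}
   bool populated;
   ClassInfo info;
   };

struct CachingOptions
   {
   bool disableCaching = false; // models -Xjit:disableKnownObjectTableCaching
   bool verifyCaching = false;  // models -Xjit:enableKnownObjectTableCachingVerification
   };

// askClient models one server-to-client message.
ClassInfo lookup(CachedInfo &cache, const CachingOptions &opts,
                 const std::function<ClassInfo()> &askClient)
   {
   assert(static_cast<bool>(askClient));
   if (opts.disableCaching)
      return askClient(); // assumption: the cache is neither read nor written

   if (!cache.populated)
      {
      cache.info = askClient(); // first request populates the cache
      cache.populated = true;
      return cache.info;
      }

   if (opts.verifyCaching)
      {
      ClassInfo fresh = askClient(); // verification always pays for a message
      if (!sameInfo(fresh, cache.info))
         throw std::logic_error("known object cache mismatch"); // stand-in for the fatal assert
      }
   return cache.info;
   }
```

With verification on, every lookup still costs a message, so in this sketch the option only makes sense as a debugging aid and not as a performance setting.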

A follow-on change to addKnownObjectConstraints() in OMR will actually make use of this infrastructure.

Depends on: eclipse-omr/omr#8056

@mpirvu mpirvu requested a review from dsouzai as a code owner November 25, 2025 17:15
@mpirvu mpirvu added comp:jit comp:jitserver Artifacts related to JIT-as-a-Service project labels Nov 25, 2025
@github-project-automation github-project-automation bot moved this to In progress in JIT as a Service Nov 25, 2025
@mpirvu mpirvu added the depends:omr Pull request is dependent on a corresponding change in OMR label Nov 25, 2025
@mpirvu
Contributor Author

mpirvu commented Nov 25, 2025

@jdmpapin Could you please review this PR? There will be a follow-up OMR PR to actually use all this infrastructure. Thanks

@mpirvu
Contributor Author

mpirvu commented Nov 25, 2025

FYI @vijaysun-omr.

@dsouzai dsouzai self-assigned this Nov 27, 2025
@jdmpapin jdmpapin self-requested a review November 27, 2025 20:55
Contributor

@dsouzai dsouzai left a comment

Overall LGTM, minor deduplication requested.

retrievedObjInfo._jniReference = existingObjInfo._jniReference;
retrievedObjInfo._clazz = getObjectClass(objectReference);
retrievedObjInfo._isString = isString(retrievedObjInfo._clazz);
retrievedObjInfo._jlClass = getClassClassPointer(retrievedObjInfo._clazz);
Contributor

I don't think we should really need to store _jlClass for every known object. It doesn't really meaningfully depend on the _clazz here. getClassClassPointer() just needs any old class so that it can find an instance of java/lang/Class. I understand though that VP needs to know whether we had it at this point. (It's pretty ridiculous that we can't just always guarantee that the JIT knows what it is... Maybe that could change, but let's set that aside.)

Reading on a bit, the values we determine here and the "FixedClass constraint" comment seem unnecessarily VP-specific.

It seems like all we should really need is _jniReference, _clazz, and the result of getClassFromJavaLangClass() (if appropriate), which we could call e.g. _reflectiveClass. The other logic could move (back) to VP:

  • VP can call getClassClassPointer() separately. The pointer is cached on the server.
  • VP can call isString() separately. The java/lang/String class pointer is also cached on the server.
  • The _isFixedJavaLangClass value is just the result of a pointer comparison.

Then VP could use the value from either _clazz or _reflectiveClass as needed. This would also prevent VP-isms from sneaking into potential uses of this API elsewhere, e.g. finding out which class a known object represents in vector API expansion. We would have to contend though with getClassClassPointer() starting to succeed at different times 😞 For example, maybe it fails here, so then we don't set _reflectiveClass, but then it succeeds later in VP.
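
To make the proposal above concrete, here is a rough sketch of a reduced entry with the VP-side values derived by pointer comparison. The struct and function names are illustrative only (`_reflectiveClass` is the name suggested above), `OpaqueClass` stands in for TR_OpaqueClassBlock, and `WellKnownClasses` is a hypothetical holder for the class pointers that the server already caches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for TR_OpaqueClassBlock.
struct OpaqueClass {};

// Reduced entry: no VP-specific flags are stored.
struct ReducedEntry
   {
   std::uintptr_t *_jniReference;
   OpaqueClass *_clazz;           // class of the known object
   OpaqueClass *_reflectiveClass; // class represented by the object if it is a java/lang/Class, else null
   };

// Class pointers assumed to be cached on the server already.
struct WellKnownClasses
   {
   OpaqueClass *javaLangClass;  // null until getClassClassPointer() succeeds
   OpaqueClass *javaLangString;
   };

// VP-side derivations: plain pointer comparisons, no extra message needed.
bool isString(const ReducedEntry &e, const WellKnownClasses &wk)
   {
   return wk.javaLangString != nullptr && e._clazz == wk.javaLangString;
   }

bool isFixedJavaLangClass(const ReducedEntry &e, const WellKnownClasses &wk)
   {
   return wk.javaLangClass != nullptr && e._clazz == wk.javaLangClass;
   }

// Picks the class VP would reason about: the represented class for a
// java/lang/Class instance, otherwise the object's own class.
OpaqueClass *classOfInterest(const ReducedEntry &e, const WellKnownClasses &wk)
   {
   return isFixedJavaLangClass(e, wk) ? e._reflectiveClass : e._clazz;
   }
```

The timing hazard mentioned above shows up here as `classOfInterest()` returning null when `_reflectiveClass` was left unset at population time but `javaLangClass` became available later.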

What do you think?

Contributor Author

It makes sense to reduce the number of cached fields as much as possible. Since this will require more testing, my plan is to merge this PR ASAP, backport it to the upcoming release v0.57.0 (deadline on 2025/12/09), and immediately open another PR with the proposed solution, delivered only to the master (dev) branch.

This commit is mainly for the benefit of JITServer.
Experiments have shown that the addKnownObjectConstraints() function used
by GlobalVP generates many server-client messages when calling:
fej9()->getObjectClassInfoFromObjectReferenceLocation() to fetch
some class information about a particular known object.
This commit implements a caching mechanism for that class information as described below:

An entry in the knownObjectTable is extended to include the following fields:

      uintptr_t *_jniReference; // This existed before this commit
      TR_OpaqueClassBlock *_clazz;
      TR_OpaqueClassBlock *_jlClass;
      bool _isFixedJavaLangClass;
      bool _isString;

When a new entry is added into the knownObjectTable, only the
"old" field _jniReference will be populated; the others will
be set to default values, NULL or false.
If GlobalVP is asking about the other fields and they are not
yet populated, the server will send a message, find the value
of these other fields and store them into the knownObjectTable.
If these fields are already populated when GlobalVP asks about
them, they will be immediately retrieved from the
knownObjectTable entry, saving a message.
Note that the same caching mechanism is going to be used at
the client, but the benefit is expected to be low.
Experimental results have shown an 84% reduction in the messages
sent by the server during addKnownObjectConstraints().

There are two options that control this feature:
-Xjit:disableKnownObjectTableCaching disables the caching
mechanism and always asks the client for the desired information.
-Xjit:enableKnownObjectTableCachingVerification when caching is
enabled this option always sends a message to the client,
fetches the desired information and then compares it against
the cached information. If the two don't match, a fatal assert is
triggered.

Depends on: eclipse-omr/omr#8056

Signed-off-by: Marius Pirvu <[email protected]>
@mpirvu
Contributor Author

mpirvu commented Nov 28, 2025

I have addressed most of the code review suggestions in the force push 1981e13.
The suggested restructuring of a knot entry will be done in a separate PR.

@mpirvu
Contributor Author

mpirvu commented Nov 28, 2025

jenkins test sanity all jdk21

@mpirvu
Contributor Author

mpirvu commented Nov 29, 2025

On ppcle there is a timeout for one of the CRIU tests:

ppcle: cmdLineTester_criu_nonPortableRestore_5_FAILED

15:25:52  Testing: Create CRIU checkpoint image and restore three times - testMillisDelayAfterCheckpointDone
15:25:52  Test start time: 2025/11/28 20:25:51 Coordinated Universal Time
15:25:52  Running command: bash /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu/criuScript.sh /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_ppc64le_linux_Personal_testList_0/aqa-tests/TKG/../../jvmtest/functional/cmdLineTests/criu /home/jenkins/workspace/Test_openjdk21_j9_sanity.functional_ppc64le_linux_Personal_testList_0/jdkbinary/j2sdk-image/bin/java " -Xgcpolicy:optthruput  -Xtrace:print={j9vm.684-696,j9vm.699,j9vm.717-747}" org.openj9.criu.TimeChangeTest testMillisDelayAfterCheckpointDone 3 false true
15:25:52  Time spent starting: 2 milliseconds
15:30:58  ***[TEST INFO 2025/11/28 20:30:51] ProcessKiller detected a timeout after 300000 milliseconds!***

On Windows there are exceptions during jdk_vector_double128_j9_0:

18:23:25  java.lang.Exception: failures: 9
18:23:25  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:104)
18:23:25  	at com.sun.javatest.regtest.agent.TestNGRunner.main(TestNGRunner.java:58)
18:23:25  	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
18:23:25  	at java.base/java.lang.reflect.Method.invoke(Method.java:586)
18:23:25  	at com.sun.javatest.regtest.agent.MainWrapper$MainTask.run(MainWrapper.java:138)
18:23:25  	at java.base/java.lang.Thread.run(Thread.java:1595)
18:23:25  
18:23:25  JavaTest Message: Test threw exception: java.lang.Exception: failures: 9
18:23:25  JavaTest Message: shutting down test
TEST RESULT: Failed. Execution failed: `main' threw exception: java.lang.Exception: failures: 9

@mpirvu
Contributor Author

mpirvu commented Nov 30, 2025

A small grinder for the plinux criu problem passed.
I should also mention that the caching mechanism is not yet exploited. This will happen after a future omr change in VP.

Labels

comp:jit comp:jitserver Artifacts related to JIT-as-a-Service project depends:omr Pull request is dependent on a corresponding change in OMR

Projects

Status: In progress

3 participants