Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate String.indexOf(char) in JDK17+ #18819

Merged
merged 1 commit into from
Feb 4, 2024

Conversation

Akira1Saitoh
Copy link
Contributor

@Akira1Saitoh Akira1Saitoh commented Jan 25, 2024

Intrinsic methods for indexOf(char) in JITHelpers class is not used in JDK17+ because OpenJDK's java/lang/String class is used.
OpenJDK class library has intrinsic methods in java/lang/StringLatin1 and java/lang/StringUTF16 classes. The logic of those intrinsic methods are identical to those of JITHelpers class except that they are defined as static methods. This commit adds intrinsic methods of StringLatin1 and StringUTF16 to the recognized method table and update x/p/z/aarch64 codegen to generate the specialized instruction sequence which already exists for JITHelpers methods.

Related issue: #6404


Callgraph of String.indexOf(char) in JDK8 and JDK11

String indexOf_char_JDK8_11_callgraph

All intrinsic methods exist in JITHelpers class. All platforms offer specialized implementation for JITHelpers.intrinsicIndexOfLatin1(Object array, byte ch, int offset, int length) and JITHelpers.intrinsicIndexOfUTF16(Object array, char ch, int offset, int length).


Callgraph of String.indexOf(char) in JDK17

String indexOf_char_JDK17_callgraph

Note that red nodes are methods with @IntrinsicCandidate annotation.

Intrinsic methods of StringLatin1 and StringUTF16 classes are used in this version. We do not recognize intrinsic methods of these classes on any platform.
There are 2 methods that are annotated as @IntrinsicCandidate

  • StringLatin1.indexOfChar(byte[] value, int ch, int fromIndex, int max)
  • StringUTF16.indexOfChar(byte[] value, int ch, int fromIndex, int max)

StringUTF16.indexOfChar(byte[] value, int ch, int fromIndex, int max) calls checkBoundsBeginEnd and the actual search logic resides in StringUTF16.indexOfCharUnsafe(byte[] value, int ch, int fromIndex, int max). Thus, this change recognizes StringUTF16.indexOfCharUnsafe method rather than StringUTF16.indexOfChar.

The logic of StringLatin1.indexOfChar(byte[] value, int ch, int fromIndex, int max) and StringUTF16.indexOfCharUnsafe(byte[] value, int ch, int fromIndex, int max) are same as those of JITHelper intrinsic methods. This PR simply maps JITHelper intrinsics to those StringLatin1 and StringUTF16 methods.


Callgraph of String.indexOf(char) in JDK21

String indexOf_char_JDK21_callgraph

Almost same as Java17 except for the newly added API String.indexOf(int ch, int beginIndex, int endIndex). No changes for intrinsic methods.

@Akira1Saitoh
Copy link
Contributor Author

FYI @0xdaryl @knn-k

@knn-k knn-k added the comp:jit label Jan 25, 2024
@vijaysun-omr
Copy link
Contributor

Have you measured performance of doing this acceleration ? If so, could you share in brief (maybe a sentence or two for each platform that you worked on) ?

@Akira1Saitoh
Copy link
Contributor Author

I have measured performance using a simple String.indexOf() JMH benchmark and pingperf on x86 and aarch64.
The JMH benchmark shows a significant performance gain on both platforms but I do not see any gain with pingperf.

I used an x86 fyre machine for measuring x86 performance, and a Ubuntu VM on M1 Mac for measuring aarch64 performance.
I used following personal builds:


JMH performance

x86

Comparing scores with default option and -Xjit:disableFastStringIndexOf.

String length relative performance
2 197%
4 208%
8 257%
16 416%
32 443%
64 534%
128 621%
256 872%

AArch64

String length ratio
2 186%
4 229%
8 245%
16 371%
32 466%
64 569%
128 656%
256 665%

PingPerf performance

x86

  • 95% CI of the score with acceleration (20 runs)
    (8346.387825427491, 8556.94217457251)

  • 95% CI of the score without acceleration (20 runs)
    (8442.85178109, 8688.22321891)

AArch64

  • 95% CI of the score with acceleration (20 runs)
    (20122.713609905408, 20632.05839009459)

  • 95% CI of the score without acceleration (20 runs)
    (20303.53168033, 20768.44831967)

@knn-k
Copy link
Contributor

knn-k commented Jan 26, 2024

Jenkins test sanity.functional all jdk17

@knn-k
Copy link
Contributor

knn-k commented Jan 26, 2024

Build of functional tests failed with the following error:

[2024-01-26T07:45:03.760Z]     [javac] /home/jenkins/workspace/Test_openjdk17_j9_sanity.functional_aarch64_linux_Personal_testList_0/aqa-tests/functional/cmdLineTests/criu/src/org/openj9/criu/TestJDKCRAC.java:36: error: package jdk.crac does not exist
[2024-01-26T07:45:03.760Z]     [javac] 		jdk.crac.Core.checkpointRestore();

It looks the same as Issue #18822.

@vijaysun-omr
Copy link
Contributor

vijaysun-omr commented Jan 26, 2024

Are the different overloads (signatures) of String.indexOf all optimized (or possible to optimize)? There are also related methods like lastIndexOf or startsWith or endsWith that may be possible to optimize too, but it is perfectly fine to do that via a separate PR if you prefer.

@Akira1Saitoh
Copy link
Contributor Author

Akira1Saitoh commented Jan 29, 2024

All overloads with a character argument are optimized, but overloads with a String argument are not. In JDK11+, String overloads of String.indexOf call intrinsic methods in StringLatin1 and StringUTF16 classes, which do not utilize indexOfChar methods like JITHelpers do. (For example, JITHelpers.intrinsicIndexOfStringLatin1 calls JITHelpers.instrinsicIndexOfLatin1.) To optimize String.indexOf(String), we need the special implementation for intrinsic methods used in String search in StringLatin1 and StringUTF16 classes, which is only available in z codegen at the moment. I think adding them on x/p/aarch64 should be done on separate PRs because it would require a significant amount of effort.

Also, I think it would be appropriate to work on other methods like lastIndexOf via a separate PR. I will do some research on them, and post my findings to a separate issue.

EDIT: Findings about lastIndexOf(char) is posted to #18861.

@knn-k
Copy link
Contributor

knn-k commented Jan 29, 2024

Jenkins test sanity.functional alinux64,zlinux jdk17

Issue #18822 was fixed. Running tests again.

@Akira1Saitoh Akira1Saitoh marked this pull request as ready for review January 29, 2024 07:14
@Akira1Saitoh Akira1Saitoh requested a review from dsouzai as a code owner January 29, 2024 07:14
@vijaysun-omr
Copy link
Contributor

Thanks, I am fine with delivering related changes via separate PRs.

@vijaysun-omr
Copy link
Contributor

fyi @zl-wang @r30shah @dchopra001

@r30shah
Copy link
Contributor

r30shah commented Jan 29, 2024

I agree with the approach and change looks good to me, though would like @dchopra001 to comment as well and if possible help with verifying the performance on Z as well.

@zl-wang
Copy link
Contributor

zl-wang commented Jan 29, 2024

on Power, OpenJ9 had pretty optimal implementations of indexOf() already for both UTF16 and Latin1, specific at least for pre-POWER10 and POWER10 hardwares. Possible follow-up (to this item) is to see how to teach the various overloads going to indexOf(ch) implementations. @ymanton

@knn-k
Copy link
Contributor

knn-k commented Jan 30, 2024

The test failures (There are no deserialized methods at the client.) look the same as #18691.

Intrinsic methods for indexOf(char) in JITHelpers class is not used in JDK17+
because OpenJDK's java/lang/String class is used.
OpenJDK class library has intrinsic methods in java/lang/StringLatin1 and
java/lang/StringUTF16 classes. The logic of those intrinsic methods are
identical to those of JITHelpers class except that they are defined as
static methods. This commit adds intrinsic methods of StringLatin1 and
StringUTF16 to the recognized method table and update x/p/z/aarch64 codegen
to generate the specialized instruction sequence which already exists
for JITHelpers methods.

Signed-off-by: Akira Saitoh <[email protected]>
@Akira1Saitoh Akira1Saitoh force-pushed the StringIndexOfCharJDK17 branch from 29c6109 to 107ff02 Compare January 30, 2024 04:15
@Akira1Saitoh
Copy link
Contributor Author

Rebased the commit.

@knn-k
Copy link
Contributor

knn-k commented Jan 30, 2024

Jenkins test sanity.functional all jdk17

@knn-k
Copy link
Contributor

knn-k commented Jan 30, 2024

Test job on x86-64 windows is failing also in PR #18847 with Error creating temporary file.

Copy link
Contributor

@dchopra001 dchopra001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! From a z/codegen perspective the change LGTM.

@dchopra001
Copy link
Contributor

I downloaded the LoZ PR build here. Doing a quick run using the benchmark posted in #18819 (comment) to make sure Z is seeing performance improvements consistent with x86 and Aarch.

@dchopra001
Copy link
Contributor

Performance numbers (baseline = -Xjit:disableFastStringIndexOf).

Screenshot 2024-01-30 at 5 15 19 PM

So overall there's an improvement. Looking at the data the accelerated code is mostly faster and more consistent, reflecting SIMD acceleration.

The slowdown for length=1, length=2 can be explained by the overhead that comes with preparing for vector acceleration. I think this is a reasonable tradeoff and the performance impact likely won't be observed outside of a microbenchmark scenario.

@knn-k
Copy link
Contributor

knn-k commented Jan 31, 2024

Jenkins test sanity.functional win jdk17

Copy link
Contributor

@knn-k knn-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AArch64 changes look good to me.

@knn-k
Copy link
Contributor

knn-k commented Feb 2, 2024

Final call for additional comments. I am going to merge this PR on Monday if there is no objection.

@knn-k knn-k self-assigned this Feb 4, 2024
@knn-k knn-k merged commit 335e720 into eclipse-openj9:master Feb 4, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants