[Native Image] G1 GC application crashes with segfault after a second heap dump using kill -SIGUSR1 <pid> #9894

Closed
1 of 2 tasks
ThoSap opened this issue Oct 16, 2024 · 3 comments

ThoSap commented Oct 16, 2024

Describe the Issue

When I follow the steps linked below to create a heap dump at run time using kill -SIGUSR1 <pid>, my application running with the G1 GC always crashes with a segfault on the second heap dump, provided some time (for example, 30 minutes) has passed since the first one.
https://www.graalvm.org/jdk21/reference-manual/native-image/guides/create-heap-dump/#:~:text=Create%20Heap%20Dumps%20with%20SIGUSR1%20(Linux/macOS%20only)

After this issue occurred for the first time, I added -XX:+PrintGC -XX:+VerboseGC so that the log shows the normal GC runs at run time that precede the full GC and the heap dump crash.

Using the latest version of GraalVM can resolve many issues.

Latest JDK 21.0.5 version

GraalVM Version

Built using container image container-registry.oracle.com/graalvm/native-image:21.0.5
-> container-registry.oracle.com/graalvm/native-image:21.0.5-ol9-20241015
-> Oracle Container Registry Image ID 1e7548c6ff98
-> Repo Digest container-registry.oracle.com/graalvm/native-image@sha256:c10d7f10da5bfed22ad02887e96095b1c08d42c8e6bf03d774cded332fa444a9
-> Image ID (output from docker images) sha256:f8a4dcaa07bfa79d6e8f4dced149d28a86e802fcdfe20586b69dd93570c2e82d

java version "21.0.5" 2024-10-15 LTS
Java(TM) SE Runtime Environment Oracle GraalVM 21.0.5+9.1 (build 21.0.5+9-LTS-jvmci-23.1-b48)
Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 21.0.5+9.1 (build 21.0.5+9-LTS-jvmci-23.1-b48, mixed mode, sharing)

Operating System and Version

Linux mycontainername 5.15.0-209.161.7.2.el8uek.x86_64 #2 SMP Tue Aug 20 10:44:07 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux

Diagnostic Flag Confirmation

  • I tried the -H:ThrowMissingRegistrationErrors= flag.

Run Command

./application -Dquarkus.http.host=0.0.0.0 -Djava.util.logging.manager=org.jboss.logmanager.LogManager -XX:+PrintGC -XX:+VerboseGC

Expected Behavior

I should be able to create multiple heap dumps at run time using kill -SIGUSR1 <pid> without a segfault.

Actual Behavior

Executing kill -SIGUSR1 <pid> twice (after some time in between, like 30 minutes) crashes the application with a segfault.
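
The signal sequence described here can be scripted so that the interval between the two dumps is easy to vary. The following is only a sketch under assumptions: the function name send_two_dumps and the settle time are hypothetical, the 1800-second default mirrors the 30 minutes mentioned in this report, and kill -s USR1 is the POSIX spelling of kill -SIGUSR1. It reports whether the process still exists after the second request; it does not inspect the dump files.

```shell
# Sketch: request two heap dumps from a running process, pausing in between,
# then report whether the process is still alive.
# Usage: send_two_dumps <pid> [delay_seconds] [settle_seconds]
send_two_dumps() {
  std_pid="$1"
  std_delay="${2:-1800}"   # 30 minutes, the interval mentioned in the report
  std_settle="${3:-10}"    # assumed grace period for the second dump

  kill -s USR1 "$std_pid" || return 1   # first heap dump request
  sleep "$std_delay"
  kill -s USR1 "$std_pid" || return 1   # second heap dump request
  sleep "$std_settle"

  # kill -0 delivers no signal; it only checks that the process exists.
  if kill -0 "$std_pid" 2>/dev/null; then
    echo "alive after second dump request"
  else
    echo "process gone after second dump request"
    return 2
  fi
}
```

For example, send_two_dumps "$(pgrep -f ./application)" 1800 would follow the timing from this report, while a shorter delay such as 60 corresponds to the case where both dumps are taken close together.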

Steps to Reproduce

My application was built with Quarkus 3.15.1 and container-registry.oracle.com/graalvm/native-image:21.0.5-ol9-20241015, using the native-image build command below.
I enabled heap dump support with --enable-monitoring=heapdump,jfr,jvmstat and created the heap dumps as described in https://www.graalvm.org/jdk21/reference-manual/native-image/guides/create-heap-dump/#:~:text=Create%20Heap%20Dumps%20with%20SIGUSR1%20(Linux/macOS%20only)

/usr/lib64/graalvm/graalvm-java21/bin/native-image \
-J-Dsun.nio.ch.maxUpdateArraySize=100 \
-J-Dlogging.initial-configurator.min-level=500 \
-J-DCoordinatorEnvironmentBean.transactionStatusManagerEnable=false \
-J-Dio.quarkus.caffeine.graalvm.recordStats=true \
-J-Djava.util.logging.manager=org.jboss.logmanager.LogManager \
-J-Dvertx.logger-delegate-factory-class-name=io.quarkus.vertx.core.runtime.VertxLogDelegateFactory \
-J-Dvertx.disableDnsResolver=true \
-J-Dio.netty.leakDetection.level=DISABLED \
-J-Dio.netty.allocator.maxOrder=3 \
-J-Duser.language=en \
-J-Duser.country=US \
-J-Dfile.encoding=UTF-8 \
--features=io.quarkus.runner.Feature,io.quarkus.runtime.graal.DisableLoggingFeature,oracle.jdbc.nativeimage.NativeImageFeature,io.quarkus.caffeine.runtime.graal.CacheConstructorsFeature,io.quarkus.jdbc.postgresql.runtime.graal.SQLXMLFeature \
-J--add-exports=java.security.jgss/sun.security.krb5=ALL-UNNAMED \
-J--add-exports=java.security.jgss/sun.security.jgss=ALL-UNNAMED \
-J--add-opens=java.base/java.text=ALL-UNNAMED \
-J--add-opens=java.base/java.io=ALL-UNNAMED \
-J--add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
-J--add-opens=java.base/java.util=ALL-UNNAMED \
-H:+UnlockExperimentalVMOptions \
-H:BuildOutputJSONFile=myapi-4.20.1-runner-build-output-stats.json \
-H:-UnlockExperimentalVMOptions \
-H:+UnlockExperimentalVMOptions \
-H:+GenerateBuildArtifactsFile \
-H:-UnlockExperimentalVMOptions \
--strict-image-heap \
--color=always \
-march=native \
--enable-sbom \
--gc=G1 \
--initialize-at-run-time=io.trino.jdbc.TrinoDriver \
-H:+UnlockExperimentalVMOptions \
-H:+AllowFoldMethods \
-H:-UnlockExperimentalVMOptions \
-J-Djava.awt.headless=true \
--no-fallback \
-H:+UnlockExperimentalVMOptions \
-H:+ReportExceptionStackTraces \
-H:-UnlockExperimentalVMOptions \
-J-Xmx18g \
-H:+AddAllCharsets \
--enable-url-protocols=http,https \
-H:NativeLinkerOption=-no-pie \
--enable-monitoring=heapdump,jfr,jvmstat \
-H:+UnlockExperimentalVMOptions \
-H:-UseServiceLoaderFeature \
-H:-UnlockExperimentalVMOptions \
-J--add-exports=org.graalvm.nativeimage/org.graalvm.nativeimage.impl=ALL-UNNAMED \
--exclude-config \
com\.oracle\.database\.jdbc \
/META-INF/native-image/native-image\.properties \
--exclude-config \
com\.oracle\.database\.jdbc \
/META-INF/native-image/reflect-config\.json \
--exclude-config \
io\.netty\.netty-codec \
/META-INF/native-image/io\.netty/netty-codec/generated/handlers/reflect-config\.json \
--exclude-config \
io\.netty\.netty-handler \
/META-INF/native-image/io\.netty/netty-handler/generated/handlers/reflect-config\.json \
myapi-4.20.1-runner \
-jar \
myapi-4.20.1-runner.jar

The native executable runs in the base container image registry.access.redhat.com/ubi9/ubi-minimal:9.4.
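
For a smaller reproducer, the build above could be cut down to the two options that relate directly to this report, --gc=G1 and --enable-monitoring=heapdump. This is a sketch under assumptions: reproducer.jar is a hypothetical placeholder for a minimal application, and it is untested whether the crash still occurs with this reduced flag set.

```shell
# Only these two options are taken from the full build command above.
REDUCED_ARGS="--gc=G1 --enable-monitoring=heapdump"

# reproducer.jar is a hypothetical minimal application, not part of this report.
# The build is skipped when native-image or the jar is not available.
if command -v native-image >/dev/null 2>&1 && [ -f reproducer.jar ]; then
  # Intentionally unquoted so the two options are passed as separate arguments.
  native-image $REDUCED_ARGS -jar reproducer.jar
fi
```

If the segfault no longer appears with this build, the remaining options from the full command can be added back in groups to narrow down which one is involved.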

Additional Context

No response

Run-Time Log Output and Error Messages

https://gist.github.com/ThoSap/e6b01fe3677c2e89a3716a1ac66e85eb

ThoSap changed the title from "[Native Image] Application crashes after the second heap dump using kill -SIGUSR1 <pid>" to "[Native Image] Application crashes with segfault after the second heap dump using kill -SIGUSR1 <pid>" on Oct 16, 2024

ThoSap commented Oct 17, 2024

To clarify, if I make the second heap dump immediately after the first one, or with only a few minutes in between, the application does not crash.
It only crashes if I make the second heap dump 20 to 30 minutes after the first.

ThoSap changed the title from "[Native Image] Application crashes with segfault after the second heap dump using kill -SIGUSR1 <pid>" to "[Native Image] G1 GC application crashes with segfault after a second heap dump using kill -SIGUSR1 <pid>" on Oct 17, 2024
selhagani self-assigned this on Oct 17, 2024
selhagani (Member) commented

Hi @ThoSap,

Thank you for reaching out to us!
Could you please provide a concise reproducer for your issue that I can test locally, along with the steps needed to reproduce it?

selhagani (Member) commented

As I haven't heard back from you in over three weeks, I will be closing this issue for now. If you need further assistance or have any updates, please feel free to reach out at any time.
Thank you for your understanding.
