Skip to content

Close and Reopen JIT log files across a checkpoint/restore#20935

Merged
mpirvu merged 1 commit intoeclipse-openj9:masterfrom
dsouzai:criuLogFiles
Jan 28, 2025
Merged

Close and Reopen JIT log files across a checkpoint/restore#20935
mpirvu merged 1 commit intoeclipse-openj9:masterfrom
dsouzai:criuLogFiles

Conversation

@dsouzai
Copy link
Contributor

@dsouzai dsouzai commented Jan 14, 2025

A common best practice in CRIU mode, at least in OpenJ9, is to close all open files before a checkpoint, and reopen them on restore. The JIT would keep log files such as the vlog and rtLog open across a checkpoint/restore boundary. This PR closes these files on checkpoint and reopens them on restore.

@dsouzai dsouzai added comp:jit criu Used to track CRIU snapshot related work labels Jan 14, 2025
@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 14, 2025

A consequence of this change is that when disableSuffixLogs is not specified, a different file is opened. However, this is consistent with what happens for example in the GC.

@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 17, 2025

@mpirvu could you please review?

@mpirvu mpirvu self-assigned this Jan 17, 2025
Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is openNewRTLog(), openNewVlog() and openLogFilesIfNeeded() going to be used anymore?
Also, what about the compilation logs?

@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 17, 2025

Also, what about the compilation logs?

I believe these get opened at the start of a compilation and are closed at the end of one.

@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 17, 2025

One thing I"m not sure about actually is what happens right now if we specify disableSuffixLogs; I think as the code stands, it'll overwrite the existing file, which is not what we want. So I may need to look into that.

@dsouzai dsouzai marked this pull request as draft January 17, 2025 18:03
@mpirvu
Copy link
Contributor

mpirvu commented Jan 17, 2025

Do you to handle disableSuffixLogs in this PR or separately?

@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 17, 2025

Do you to handle disableSuffixLogs in this PR or separately?

I figure I should handle it here; I think otherwise, it'll break some of those cmdLineTester tests.

A common best practice in CRIU mode, at least in OpenJ9, is to close all
open files before a checkpoint, and reopen them on restore. The JIT
would keep log files such as the vlog and rtLog open across a
checkpoint/restore boundary. This commit closes these files on
checkpoint and reopens them on restore.

Signed-off-by: Irwin D'Souza <[email protected]>
@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 17, 2025

Also, what about the compilation logs?

I believe these get opened at the start of a compilation and are closed at the end of one.

I thought this was the case, but it may not be the case... I think we hold on to the file handles so that they can be used by different threads. I'll have to look deeper into this.

@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 21, 2025

Also, what about the compilation logs?

I believe these get opened at the start of a compilation and are closed at the end of one.

I thought this was the case, but it may not be the case... I think we hold on to the file handles so that they can be used by different threads. I'll have to look deeper into this.

It is very involved to close and re-open compilation logs across a checkpoint/restore because of the fact that they can be in option sets, shared by different option sets, and shared by different compilation threads. So, I think that's something that should be done in another PR. It may also involve OMR changes.

@dsouzai dsouzai marked this pull request as ready for review January 21, 2025 20:58
@dsouzai
Copy link
Contributor Author

dsouzai commented Jan 24, 2025

@mpirvu this should be good for review again.

Copy link
Contributor

@mpirvu mpirvu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mpirvu
Copy link
Contributor

mpirvu commented Jan 27, 2025

jenkins test sanity all jdk21

@mpirvu
Copy link
Contributor

mpirvu commented Jan 27, 2025

jenkins test sanity all jdk17

@mpirvu
Copy link
Contributor

mpirvu commented Jan 28, 2025

Most tests have passed.
Windows builds never finish due to in infra issue.
On AIX there was a failure for jdk_security2_0 due to infra: "Cannot contact p8-java1-ibm07: java.lang.InterruptedException". This is known.

This PR is ready to be merged.

@mpirvu mpirvu merged commit 615be6a into eclipse-openj9:master Jan 28, 2025
@dsouzai dsouzai mentioned this pull request Jun 17, 2025
30 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:jit criu Used to track CRIU snapshot related work

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants