[help] hibernation support #82

Open
julie-de-ville opened this issue Oct 14, 2024 · 18 comments

@julie-de-ville

julie-de-ville commented Oct 14, 2024

Is hibernation resume support included automatically? I don't see a module for it. My system supports hibernation, but it does not resume after suspending to disk. I am using Gentoo with Linux 6.6.52.

@desultory
Owner

It does not currently have hibernation support, but this shouldn't be very hard to add.

Have you tried using the "resume=" kernel command line arg? I have not tested this, but I assumed it could be used to resume from unencrypted swap.

@julie-de-ville
Author

That would be great, I found it much better than dracut. I have the resume parameter in grub.

@desultory
Owner

Do you have encrypted swap? I think it shouldn't be too hard to add support for simple resuming; I'm just not sure why the built-in kernel parameter doesn't work on its own. Maybe it ignores that option if an initrd is used?

@desultory
Owner

https://wiki.gentoo.org/wiki/Custom_Initramfs/Hibernation

I've hesitated to add support in the initrd as there are many things which can go wrong. I think passing the supplied info to /sys/power/resume could be enough.
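
A minimal sketch of that idea in Python, assuming the kernel accepts a `major:minor` device number written to /sys/power/resume (the form described in the kernel's sysfs documentation). The function names are hypothetical and are not part of ugrd.

```python
import os
import stat


def devnum_string(rdev: int) -> str:
    """Format a raw device number as a 'major:minor' string."""
    return f"{os.major(rdev)}:{os.minor(rdev)}"


def request_resume(device_path: str, sysfs_path: str = "/sys/power/resume") -> str:
    """Ask the kernel to resume from device_path (hypothetical helper).

    If a valid hibernation image is present, the kernel resumes and the
    write does not return to the caller.
    """
    st = os.stat(device_path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{device_path} is not a block device")
    target = devnum_string(st.st_rdev)
    with open(sysfs_path, "w") as sysfs:
        sysfs.write(target)
    return target
```

Refusing anything that is not a block device keeps a mistyped path from reaching the kernel interface.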

@desultory
Owner

0175133
I'm not sure about this, but I think it may be a reasonable start. resume= expects a device path; I'm not sure whether it makes sense to resolve a UUID.
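
If tag resolution were wanted, one possible approach is to hand `PARTUUID=` or `UUID=` values to `blkid -t ... -o device` and pass plain paths through unchanged. The sketch below is in Python for illustration only; `is_tag` and `resolve_resume_device` are hypothetical names, and the code assumes `blkid` is available in the environment.

```python
import re
import subprocess
from typing import Optional

# Only values that start with a tag are sent to blkid.
TAG_PATTERN = re.compile(r"^(PARTUUID|UUID)=")


def is_tag(resume_value: str) -> bool:
    """Return True if the resume value is a PARTUUID=/UUID= tag."""
    return bool(TAG_PATTERN.match(resume_value))


def resolve_resume_device(resume_value: str) -> Optional[str]:
    """Resolve a tag to a device path; return plain paths unchanged."""
    if not is_tag(resume_value):
        return resume_value
    result = subprocess.run(
        ["blkid", "-t", resume_value, "-o", "device"],
        capture_output=True,
        text=True,
        check=False,
    )
    lines = result.stdout.split()
    # None signals that no matching device was found.
    return lines[0] if lines else None
```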

@desultory
Owner

Right now, it enters a fail state if a resume partition is passed and it fails to resume, which means it won't boot normally. I'm not sure how much weight to give those warnings about data loss. If the system hibernated and you reboot without considering the saved state, there could be serious data loss, similar to a hard shutdown. I think most systems just attempt to resume from swap if possible, but continue if not. I'm going to check and test a bit more.

@desultory
Owner

> That would be great, I found it much better than dracut. I have the resume parameter in grub.

Did you manually add the parameter? I think for the sake of safety, I will be forcing resume attempts if resume= is set. It's potentially very dangerous to start a system fresh if it expects to return from a hibernation state.

@julie-de-ville
Author

Yes, I manually added it to grub. I booted twice in that manner, but I haven't attempted resuming again since I have narrowed it down to initramfs. That is a good idea.

@desultory
Owner

desultory commented Oct 14, 2024

> Yes, I manually added it to grub. I booted twice in that manner, but I haven't attempted resuming again since I have narrowed it down to initramfs. That is a good idea.

Yeah, it's probably not safe to resume right now. You could have a bit of data loss each time, as it expects to later resume from the current RAM state.

As far as I know, there is no way to know if a system should resume at boot time, other than the passed kernel cmdline parameters. It's safest to prevent booting if that was passed but cannot be performed.


I've gotten some help looking into this, and I think it is probably safe to boot if the resume device is found but resuming fails. If the resume source device cannot be found, something is wrong and booting will stop (in the resume module's current form).

@julie-de-ville
Author

julie-de-ville commented Oct 14, 2024

As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S5 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

@desultory
Owner

desultory commented Oct 14, 2024

> As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S5 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

resume should be set to your swap partition. The support I just added only handles plain swap; I may add support for encrypted swap too.

As it is, it will boot normally if it can't resume using the provided resume path (using a PARTUUID is best). If it cannot find the source device, it will enter a fail state.
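
The policy described above can be sketched as a small decision function. This is an illustration in Python, not ugrd's implementation (which generates shell); `ResumeAction` and `decide_resume` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class ResumeAction(Enum):
    BOOT_NORMALLY = "boot normally"
    ATTEMPT_RESUME = "attempt resume, then boot normally if it returns"
    FAIL = "enter fail state"


def decide_resume(resume: Optional[str], noresume: bool, device_found: bool) -> ResumeAction:
    """Pick an action from the cmdline values and whether the device exists."""
    if noresume or not resume:
        # Nothing was requested, or resuming was explicitly disabled.
        return ResumeAction.BOOT_NORMALLY
    if not device_found:
        # A resume source was named but is missing: something is wrong.
        return ResumeAction.FAIL
    return ResumeAction.ATTEMPT_RESUME
```

Treating a missing device as fatal while tolerating a failed attempt matches the reasoning in this thread: a missing device suggests a configuration or hardware problem, while a failed attempt on a present device usually just means there is no image to restore.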

@julie-de-ville
Author

julie-de-ville commented Oct 14, 2024

Oh I see, my swap is encrypted so I don't think I will be able to test it safely, though I would really like to get it working, so if there is anything I can do to help lmk. I tried to get it working with dracut by adding the crypt and resume modules, and including /etc/crypttab, but it hung on a black screen with a spinning wheel.

@desultory
Owner

desultory commented Oct 15, 2024

> Oh I see, my swap is encrypted so I don't think I will be able to test it safely, though I would really like to get it working, so if there is anything I can do to help lmk. I tried to get it working with dracut by adding the crypt and resume modules, and including /etc/crypttab, but it hung on a black screen with a spinning wheel.

Using encrypted swap is somewhat complex. I'd have to add a new method specifically for opening it, which can run first. The really tricky part is that it would likely need to attempt to run on every boot. When you're booting fresh, that will just be a waste of time because it will not be able to resume.

You mentioned /dev/mapper/gentoo-root as being your resume device. Is that your root partition? Do you have a separate partition that is LUKS encrypted, or are you using LVM? Resuming from swap files is especially difficult because the file offset on the disk must be set.

> As for path/uuid, I have my resume parameter set to the mapped, decrypted luks partition, at /dev/mapper/gentoo-root. Btw, I am using btrfs on an encrypted luks partition, and for S5 I suspend to a swapfile on the root subvolume, if that helps. Also, I had to set the resume_offset as a parameter as well, since I am using a swapfile.

Are you sure the resume_offset you found is correct? That is my only guess as to why dracut may fail, unless it doesn't properly support hibernation from LUKS devices. Did it ever ask for the key for your root device? I really hesitate to even open LUKS devices because I'm not sure whether opening them has any chance of writing anything. If you touch storage devices at all between hibernation and resuming, that may be harmful.
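
For reference, a swapfile setup needs both `resume=` (the device holding the file) and `resume_offset=` (per the kernel documentation, an offset in page-size units) on the command line. The Python sketch below shows one way to pull both values out of a cmdline string for checking. The helper names and the offset value in the usage example are made up for illustration, and the parser splits on whitespace only, so it does not handle quoted arguments.

```python
from typing import Dict, Optional, Tuple, Union


def parse_cmdline(cmdline: str) -> Dict[str, Union[str, bool]]:
    """Split a kernel command line into key/value pairs; bare flags map to True."""
    params: Dict[str, Union[str, bool]] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def swapfile_resume_args(cmdline: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (resume, resume_offset) or None if resume= is absent or empty."""
    params = parse_cmdline(cmdline)
    resume = params.get("resume")
    if not isinstance(resume, str) or not resume:
        return None
    offset = params.get("resume_offset")
    if isinstance(offset, str) and offset.isdigit():
        return resume, int(offset)
    return resume, None
```

Usage, with a hypothetical offset:

```python
args = swapfile_resume_args(
    "root=/dev/mapper/gentoo-root resume=/dev/mapper/gentoo-root resume_offset=533760 quiet"
)
print(args)
```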

@desultory
Owner

This isn't really the best UX, but maybe you could take advantage of the fact that it fails, and then choose whether to manually run "crypt_init", which runs the cryptsetup unlock procedure, or tell it to ignore resuming. Then you can exit the recovery shell, and on the second pass it will see the resume source and use that.

@julie-de-ville
Author

Thank you, how would I go about doing that? If you could point me in the right direction, that'd be great. The resume offset is correct; if the offset is not calculated correctly, it will not hibernate at all, and now it hibernates but doesn't resume.

@desultory
Owner

> Thank you, how would I go about doing that? If you could point me in the right direction, that'd be great. The resume offset is correct; if the offset is not calculated correctly, it will not hibernate at all, and now it hibernates but doesn't resume.

If you add the recovery kernel cmdline arg, it should open a bash shell when it fails. From there you can try to do things manually and see what works. crypt_init is a function in ugrd which should run the LUKS procedures for your device; if you can do that, it may work. handle_resume should do the whole procedure to resume based on your kernel command line, but you could also try to manually echo something to /sys/power/resume.

@julie-de-ville
Author

Hi, I didn't see your last reply. Would it be possible to get an initramfs to resume from encrypted hibernation using those cmdline args, or would I have to do it manually each time?

@desultory
Owner

> Hi, I didn't see your last reply. Would it be possible to get an initramfs to resume from encrypted hibernation using those cmdline args, or would I have to do it manually each time?

A user has made a patch which seems to make it work. I'm considering adding it but am still unsure about safety. See: https://www.kernel.org/doc/html/latest/power/swsusp.html

I don't think it should be an issue since the mount is made after the resume stage, but I'm not sure how this will work with other storage which is mounted, such as a device for keyfiles or headers.

diff --git a/src/ugrd/fs/resume.py b/src/ugrd/fs/resume.py
index 38f5ddd..9255d9c 100644
--- a/src/ugrd/fs/resume.py
+++ b/src/ugrd/fs/resume.py
@@ -1,7 +1,8 @@
 __version__ = "0.4.0"
 
+from zenlib.util import contains, unset
 
-def handle_resume(self) -> None:
+def _resume(self) -> None:
     """Returns a bash script handling resume from hibernation.
     Checks that /sys/power/resume is writable, resume= is set, and noresume is not set, if so,
     checks if PARTUUID= is in the resume var, and tries to use blkid to find the resume device.
@@ -17,7 +18,7 @@ def handle_resume(self) -> None:
     return [
         "resumeval=$(readvar resume)",  # read the cmdline resume var
         'if ! check_var noresume && [ -n "$resumeval" ] && [ -w /sys/power/resume ]; then',
-        '    if echo "$resumeval" | grep -q "PARTUUID="; then',  # resolve partuuid to device
+        '    if echo "$resumeval" | grep -q -E "^PARTUUID=|^UUID="; then',  # resolve partuuid or uuid to device
         '        resume=$(blkid -t "$resumeval" -o device)',
         "    else",
         "        resume=$resumeval",
@@ -35,3 +36,11 @@ def handle_resume(self) -> None:
         "    fi",
         "fi",
     ]
+
+@unset('late_resume')
+def handle_resume(self) -> None:
+    return _resume(self)
+
+@contains('late_resume', "Using late resume.")
+def late_resume(self) -> None:
+    return _resume(self)
\ No newline at end of file
diff --git a/src/ugrd/fs/resume.toml b/src/ugrd/fs/resume.toml
index c4732dd..fa74744 100644
--- a/src/ugrd/fs/resume.toml
+++ b/src/ugrd/fs/resume.toml
@@ -2,3 +2,9 @@ cmdline_strings = [ "resume" ]
 
 [imports.init_early] 
 "ugrd.fs.resume" = [ "handle_resume" ]
+
+[imports.init_premount]
+"ugrd.fs.resume" = [ "late_resume" ]
+
+[custom_parameters]
+late_resume = "bool"
\ No newline at end of file
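
Reading the patch, the `late_resume` option appears to move the same resume script from the `init_early` stage to `init_premount`. The Python sketch below illustrates that selection with two stand-in decorators, `run_if_unset` and `run_if_set`. These are hypothetical and only approximate what `unset` and `contains` from zenlib seem to do in the diff; zenlib's actual behavior is not shown in this thread.

```python
import logging
from functools import wraps

logger = logging.getLogger(__name__)


def run_if_unset(key):
    """Stand-in decorator: skip the function when config[key] is set."""
    def decorator(func):
        @wraps(func)
        def wrapper(config):
            return None if config.get(key) else func(config)
        return wrapper
    return decorator


def run_if_set(key, message):
    """Stand-in decorator: run the function only when config[key] is set."""
    def decorator(func):
        @wraps(func)
        def wrapper(config):
            if not config.get(key):
                return None
            logger.info(message)
            return func(config)
        return wrapper
    return decorator


def _resume(config):
    # Placeholder for the generated shell lines shown in the patch.
    return ["resumeval=$(readvar resume)"]


@run_if_unset("late_resume")
def handle_resume(config):
    return _resume(config)


@run_if_set("late_resume", "Using late resume.")
def late_resume(config):
    return _resume(config)


def build_stages(config):
    """Show which stage receives the resume script for a given config."""
    return {
        "init_early": handle_resume(config),
        "init_premount": late_resume(config),
    }
```

Under these assumptions, exactly one of the two stages receives the script, so the resume attempt never runs twice in one boot.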
