@reith
Contributor

@reith reith commented Jul 11, 2025

My image throws a kernel panic:

* Running fsck on: /dev/mapper/luksvg-gentoo
/dev/mapper/luksvg-gentoo: clean, 635733/20721152 files, 6987471/114859008 blocks
/etc/profile: line 420: /sys/block/dm-0/uevent: Success
[   10.201899] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
[   10.203771] CPU: 0 PID: 1 Comm: init Not tainted 6.12.34-gentoo-dist #1

The problem is sourcing the uevent file. I'm not sure exactly how that crashes init, but it doesn't seem that we have to read from there.

Also, I could disable fake_dm_udev by commenting out its line in the init file, and my systemd system booted just fine. I'm not sure whether #181 was necessary.

My setup:

Ext4 rootfs partition on LVM and LVM on LUKS.
Systemd 257.7

I haven't changed the default config, and ugrd was able to detect LUKS and LVM automatically. Great job :)

@reith reith mentioned this pull request Jul 11, 2025
@desultory
Owner

desultory commented Jul 11, 2025

What version of ugrd did you replicate the crash with? In older versions, it didn't check for the file before sourcing, and it used the source command instead of ., which could cause issues.

That looks reasonable, but I'm not sure why dev may be needed over uevent.

Using dev may be more compatible, but the uevent file seems designed to be easily sourced.
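For comparison, here is a minimal sketch of reading the numbers from a dev attribute instead of sourcing uevent. It assumes the file holds a single major:minor line (as /sys/block/dm-0/dev does); read_dev_numbers is a hypothetical helper name, not something taken from ugrd or the patch.

```shell
# Hypothetical helper: print "MAJOR MINOR" from a sysfs-style dev file
# that contains a single "major:minor" line. Nothing is sourced or eval-ed.
read_dev_numbers() {
    major= minor=
    # read returns non-zero on a missing trailing newline, so also accept
    # the case where it still managed to fill in $major.
    IFS=':' read -r major minor < "$1" || [ -n "$major" ] || return 1
    # Both fields must be present for the result to be usable.
    [ -n "$major" ] && [ -n "$minor" ] || return 1
    printf '%s %s\n' "$major" "$minor"
}
```

Calling read_dev_numbers /sys/block/dm-0/dev on a system with that device should print the two numbers separated by a space.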

If you're using the stable release on gentoo, these bugs may still be present.

Interestingly, the fakeudev module was created because systemd systems would fail, saying they could not find the root volume. Maybe something has changed so this is no longer necessary?

@reith
Contributor Author

reith commented Jul 11, 2025

> What version of ugrd did you replicate the crash with? In older versions, it didn't check for the file before sourcing, and it used the source command instead of ., which could cause issues.

ugrd 2.0.1. This machine has a fresh Gentoo install. ugrd generated a bootable image the first time. Then I ran a full world update, which updated ~170 packages, including systemd, and the crash started to appear. I believe both initramfs images were generated by ugrd 2.0.1 (qlop -mv ugrd doesn't show any other version).

> That looks reasonable, but I'm not sure why dev may be needed over uevent.

I can read from uevent; the crash happens only when the script sources it. eval "$(cat ${dm}/uevent)" works fine too, which suggests the file's content is shell-friendly. I have no idea why sourcing crashes init; maybe some kind of trap interprets the sourcing as the main script being killed? I don't know.
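A third option besides sourcing and eval is to parse the KEY=value lines directly, so nothing in the file is ever executed. The sketch below is only an illustration: uevent_get is a hypothetical helper, and it assumes the file contains plain KEY=value lines like the uevent file discussed here.

```shell
# Hypothetical helper: print the value of KEY from a uevent-style file
# (one KEY=value pair per line) without sourcing or eval-ing the file.
uevent_get() {
    file=$1 key=$2
    # "|| [ -n "$name" ]" keeps the last line even if it has no newline.
    while IFS='=' read -r name value || [ -n "$name" ]; do
        if [ "$name" = "$key" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$file"
    # Key not present in the file.
    return 1
}
```

For example, uevent_get /sys/block/dm-0/uevent MAJOR would print the MAJOR value if the file exists and has that key.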

> If you're using the stable release on gentoo, these bugs may still be present.

It's ~amd64.

@desultory
Owner

desultory commented Jul 11, 2025

Thanks for the info. Can you share the contents of that file at runtime? You can use the debug module (ugrd.base.debug) and check it manually that way.

It's probably better to just parse the file like your patch does anyway; I'm just interested in what may cause this file to not exist or be unsuitable for sourcing.

@reith
Contributor Author

reith commented Jul 11, 2025

> Thanks for the info. Can you share the contents of that file at runtime?

MAJOR=253
MINOR=0
DEVNAME=dm-0
DEVTYPE=disk
DISKSEQ=2

@desultory
Owner

This looks fine to me. I can also source it just fine with multiple shells, so I wonder what the problem is. Did you try manually sourcing it in the debug shell?

@reith
Contributor Author

reith commented Jul 11, 2025

The ugrd.base.debug module gives me a shell before the LUKS device is opened. At that point, these block devices are not mapped and the files do not exist.

I edited etc/profile to spawn shells in fake_dm_udev, where I could source the files with no problems, but it seems that sourcing them in another shell is not exactly the same as doing it directly in etc/profile.

Also the "Success" in the logs:

/etc/profile: line 420: /sys/block/dm-0/uevent: Success

may suggest that the sourcing itself succeeded but had a side effect that caused init to be killed (or that something interpreted it as having been killed).

@desultory
Owner

I'm not sure why the behavior would differ. Are you using a shell other than bash?

Your method seems more compatible anyway, but I'd like to make a test case for this: one where your method works but mine does not.

@reith
Contributor Author

reith commented Jul 12, 2025

I discovered the crash only happens with bash 5.3. I downgraded bash to 5.2 (Gentoo's 5.2_p37-r3) and the image boots fine. I recreated the image with a fresh build of bash 5.3 and it crashed again.

@desultory
Owner

desultory commented Jul 12, 2025

Thanks for checking the shell version.

I asked in #bash on Libera, and emanuele6 says there is a new check that compares the file size reported for the file with the contents actually read, and uevent may report a size larger than its actual contents. It seems this may be a minor bug introduced by the added checks.
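One way to look at the mismatch described above is to compare the size that stat reports with the number of bytes that can actually be read. This is only a sketch: size_vs_content is a hypothetical helper, and it assumes a stat that supports -c %s (GNU coreutils or BusyBox). On sysfs attribute files the reported size is typically a fixed, page-sized value rather than the length of the content.

```shell
# Hypothetical helper: show the size stat reports for a file next to the
# number of bytes that reading the file actually yields.
size_vs_content() {
    # Assumes GNU or BusyBox stat; BSD stat uses different flags.
    reported=$(stat -c %s "$1") || return 1
    # Pipe through cat so the byte count comes from a real read,
    # not from the file's metadata.
    actual=$(cat "$1" | wc -c) || return 1
    printf 'reported=%s actual=%s\n' "$reported" "$((actual))"
}
```

Running size_vs_content /sys/block/dm-0/uevent on an affected system would show whether the two numbers differ there.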

In any case, your method looks more reliable and should get rid of the risks associated with sourcing stuff.


Another user hit the same issue and your patch fixed it :D

@desultory
Owner

The only minor thing I'd adjust: maybe the check for the device mapper name could be used only to decide whether to log it.

Maybe this is a decent check, because it could prevent adding entries where they are not necessary? I'm really not sure. If it's skipping because a name is missing, maybe it could log a warning for that case?

I think it would be fine if it simply used the name info for additional logging, and didn't fail to function if that info was missing. On that note, maybe it could log the found major/minor at a debug level?

Something like:
log a warning if the name cannot be found,
log info if a name is found (with no extra info)
log debug info about the exact major/minor used, possibly even the exact file touched.
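The three logging levels listed above could look roughly like the sketch below. Everything here is an assumption for illustration: log_warn, log_info, log_debug, the DEBUG variable, and report_dm_device are hypothetical names, and ugrd's real logging functions may differ. The only thing taken from sysfs itself is that a device-mapper device exposes its name in dm/name under its block directory.

```shell
# Hypothetical logging helpers; not ugrd's actual logging functions.
log_warn()  { printf 'WARN: %s\n' "$*" >&2; }
log_info()  { printf 'INFO: %s\n' "$*"; }
log_debug() { [ "${DEBUG:-0}" = 1 ] && printf 'DEBUG: %s\n' "$*"; return 0; }

# Hypothetical: log what is known about one device-mapper device.
# $1 = sysfs block directory, $2 = major, $3 = minor.
report_dm_device() {
    dm_dir=$1 major=$2 minor=$3
    name=
    if [ -r "$dm_dir/dm/name" ]; then
        read -r name < "$dm_dir/dm/name" || true
    fi
    if [ -n "$name" ]; then
        log_info "Found device-mapper name: $name"
    else
        # A missing name is only worth a warning; it is not an error.
        log_warn "No device-mapper name found for $dm_dir"
    fi
    log_debug "Using major:minor ${major}:${minor} for $dm_dir"
}
```

The point of the design is that the name only affects what gets logged: the function carries on with the major/minor it was given whether or not a name was found.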

@desultory desultory merged commit 27d14af into desultory:main Jul 13, 2025
5 checks passed