fs/ntfs3: Fix memory corruption when page_size changes #6218


Merged
merged 1 commit into rpi-6.6.y from ntfsfix on Jun 21, 2024

Conversation

popcornmix
Collaborator

The rework in fs/ntfs3: Reduce stack usage
changes log->page_size but doesn't change the associated log->page_mask and log->page_bits.

That results in the bytes value in read_log_page
going negative; since it is used as an unsigned size, it
wraps to a huge value when passed to memcpy.
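
For illustration, here is a minimal userspace sketch of the failing arithmetic. The names mirror read_log_page() in fs/ntfs3/fslog.c, but the concrete values are hypothetical, chosen to mimic a 16K-page kernel reading a $LogFile formatted with 4K log pages:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t page_size = 4096; /* reassigned by the rework */
	uint32_t page_bits = 14;   /* stale: still log2(16384) */

	uint32_t vbo = 8192;       /* byte offset into the log */
	uint32_t page_idx = vbo >> page_bits;                /* 0 */
	uint64_t page_off = (uint64_t)page_idx << page_bits; /* 0 */

	/* Intended: bytes remaining in the current log page.
	 * With stale page_bits this is 4096 + 0 - 8192, which
	 * underflows; as an unsigned size it would ask memcpy
	 * to copy almost 4 GiB. */
	uint32_t bytes = (uint32_t)(page_size + page_off - vbo);
	printf("bytes = %u\n", bytes); /* prints 4294963200 */
	return 0;
}
```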

The kernel panic can be observed when connecting an NTFS-formatted drive that has previously been connected to a Windows machine to a Raspberry Pi 5, which by default uses a 16K kernel page size.

Fixes: 865e7a7 ("fs/ntfs3: Reduce stack usage")
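
The fix itself is small. A sketch of the idea (not necessarily the verbatim diff): wherever log->page_size is reassigned in fs/ntfs3/fslog.c, recompute the two derived fields with the kernel's blksize_bits() helper so the shift/mask arithmetic in read_log_page() stays consistent:

```c
/* After log->page_size is reassigned in log_replay(),
 * keep the derived fields in sync with the new size. */
log->page_mask = log->page_size - 1;
log->page_bits = blksize_bits(log->page_size);
```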

@6by9
Contributor

6by9 commented Jun 11, 2024

It'd be worth throwing that patch upstream for review.

@PatrickGlatt

PatrickGlatt commented Jun 12, 2024

I’m affected by this issue with my 35+ external NTFS HDDs (my user there is Patrick_G): https://forums.raspberrypi.com/viewtopic.php?p=2218370#p2218370

I could test it with my Pi 5. Please let me know if there is a build I could install, as the user “dom” suggested. It seems the checks on the current pull request are failing? Should I try running “sudo rpi-update pulls/6218”?

@pelwell
Contributor

pelwell commented Jun 12, 2024

Yes - the checkpatch failure is only advisory.

@PatrickGlatt

PatrickGlatt commented Jun 13, 2024

So, this took me some time. I attempted the update three times this evening, and the first attempt failed completely. If your Raspberry Pi has already updated to 6.6 and some disk issues persist, it might be better to downgrade to 6.1.

Here’s the method that worked to get the Pi running:
1.) Shut down the Raspberry Pi 5 and disconnect all disks.
2.) Run sudo rpi-update pulls/6218 and restart the Pi.
3.) Clean the media folder from dead disk folders using sudo rm -rf /media/[username]/*.
4.) Connect all drives to the Pi one by one.
5.) Restart the Pi. (Note: mounting the disks now takes some time; for me it’s over 2 minutes with 30 disks.)

However, one issue still persists: the disks mount very, very slowly with this 6.6 kernel. It takes over 2 minutes to mount my 30 USB HDDs. With kernel 6.1, it was about 30 seconds. But it seems this updated kernel 6.6 doesn’t freeze anymore.

I will now run the Pi with this 6.6 kernel, but something is still broken with mounting a lot of NTFS drives, especially after a reboot.
My kernel version is now: Linux raspberrypi 6.6.32-v8-16k+ #1 SMP PREEMPT Tue Jun 11 16:25:32 UTC 2024 aarch64 GNU/Linux

@PatrickGlatt

PatrickGlatt commented Jun 16, 2024

Update 2: After a restart (crash or power loss, I don't know) 3 HDDs are missing again. It seems this fix didn't help :(
I moved back to kernel 6.1 and everything works fine again. The HDDs needed to be connected manually.

@popcornmix
Collaborator Author

Post the dmesg output from the 6.6 boot with missing drives.

@popcornmix
Collaborator Author

@6by9 Submitted, but no response so far:
https://lore.kernel.org/ntfs3/CAEi1pCSQePMo4X_RvMfYmpxYwmuamhd8=1OXgNCU-N2BBdTXPg@mail.gmail.com/T/#t

@popcornmix
Collaborator Author

There are two reports here of this fixing the issue.

@PatrickGlatt

Are the logs still available, or only for a few hours or days?
Any idea why mounting NTFS drives is so slow on 6.6 but not on 6.1?

@popcornmix
Collaborator Author

Are the logs still available, or only for a few hours or days?

dmesg logs are available until they are overwritten.
On a normally working system they should still be there, but you may lose the start if there are many messages.

But you can always reboot and capture a fresh log (I assume the failure is repeatable?)

Run dmesg | pastebinit and post the URL.

Any idea why mounting NTFS drives is so slow on 6.6 but not on 6.1?

No, and that's not going to be related to this issue. You should create a new one.

The rework in fs/ntfs3: Reduce stack usage
changes log->page_size but doesn't change the associated
log->page_mask and log->page_bits.

That results in the bytes value in read_log_page
going negative; since it is used as an unsigned size, it
wraps to a huge value when passed to memcpy.

The kernel panic can be observed when connecting an
NTFS-formatted drive that has previously been connected
to a Windows machine to a Raspberry Pi 5, which by default
uses a 16K kernel page size.

Fixes: 865e7a7 ("fs/ntfs3: Reduce stack usage")
Signed-off-by: Dom Cobley <[email protected]>
@popcornmix popcornmix marked this pull request as ready for review June 21, 2024 10:42
@popcornmix
Collaborator Author

No response a week after submitting upstream.

I suggest we just merge this, as it has reports of fixing a clear error.

@pelwell pelwell merged commit 7af85d5 into raspberrypi:rpi-6.6.y Jun 21, 2024
11 of 12 checks passed
@popcornmix popcornmix deleted the ntfsfix branch June 21, 2024 11:36
popcornmix added a commit to raspberrypi/firmware that referenced this pull request Jun 21, 2024
popcornmix added a commit to raspberrypi/rpi-firmware that referenced this pull request Jun 21, 2024
@lategoodbye
Contributor

@6by9 Submitted, but no response so far: https://lore.kernel.org/ntfs3/CAEi1pCSQePMo4X_RvMfYmpxYwmuamhd8=1OXgNCU-N2BBdTXPg@mail.gmail.com/T/#t

Thanks for reporting this upstream. Maybe the maintainer is not comfortable with bug reports from a vendor tree plus a non-recent kernel version (6.9.x or 6.10-rcX is more welcome). At least the regression is on the tracker list, so if the maintainer doesn't react it will be escalated.

@popcornmix
Collaborator Author

I can confirm the bug is present in our rpi-6.10.y tree, and the fix still works.
