Fix infinite copy attempts when tmp and final dir are the same, but different #256
Perhaps this is a dumb question, but wouldn't it be a lot simpler and more robust to just rename the file and, if that fails, assume the files are on different volumes and fall back to copy? I don't quite understand why we check:
and only attempt the rename in that case. To cover the case where the final path points to the same file, I suppose we would technically have to
I initially thought that I'm not sure why the original authors went with this implementation, but now I see no reason not to go with the "try rename then copy" route. I'll check it out, test & update the request afterwards.
Alright @arvidn, I've tested and pushed the simpler solution. From what I can tell, rename is more or less equivalent to hardlink + remove, and neither works reliably across mount points. I've updated the request post accordingly. Nevertheless, changing the "copy & rename" fallback strategy to simply "copy" ought to be enough to achieve "correctness" in all cases, although there may be performance (or temporary space) penalties in the fallback cases. We might still want to check e.g. if the directory paths given are equivalent via
This PR has been flagged as stale due to no activity for over 60
See Chia-Network/chia-blockchain#5863
To elaborate on the title: when plotting with final and tmp2 dirs that are canonically the same but whose paths are given differently (e.g. one explicitly, the other as '.'), the code enters an infinite loop trying to copy
"./plot-k32-[...].plot.2.tmp" to "$PWD/plot-k32-[...].plot.2.tmp"
with an obvious error that the file exists.

In the issue above (in chia-blockchain) I suggested that files should simply be renamed when both paths point to the same filesystem, but apparently `std::filesystem` doesn't expose the `st_dev` field explicitly, so instead the code attempts to create a hard link and reverts to copying when that fails (e.g. when the files reside on different filesystems, or the filesystem doesn't support hard links).

To summarize the changes:
- Before the final rename, the `.plot.2.tmp` file is first hardlinked from final dir to tmp2 dir. If the hard linking fails, the previous copy routine is used. Final rename requires either a successful copy or a hard link.
- `fs::rename` doesn't recognize that the underlying filesystems for temp 2 and final directories are the same.

The code may still fail if the tmp2 dir and final dir are essentially the same but given through different canonical paths. An example of this is mounting the same block device under different mount points, e.g. mounting /dev/sda1 as `/mnt/plots` and `/mnt/tmp2`, and passing the mount points accordingly. The potential fixes for this are:

- Always attempt to hardlink final plot to tmp2 plot before going the copy route; this still requires care in case the hard link fails but the directories are essentially the same;
- Use `std::filesystem::equivalent` instead of comparing the parent paths;
- Use the pure `stat` syscall without the `std::filesystem` wrapper to compare `st_dev` of tmp2 and final dirs; if they match, rename directly from tmp2 to final plot file, otherwise safely go the copy & rename route.

The code should now properly handle cases where the same physical directory is mounted in different logical locations, although platform-dependent caveats may make this handling suboptimal. It seems the original idea of recognizing both `/mnt/plots` and `/mnt/tmp2` as the same filesystem and using hard links/rename in those cases doesn't work in many cases due to deliberate (?) Linux limitations (cf. https://unix.stackexchange.com/a/380033). That's why I ended up with direct copy instead of copy + rename.
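
To illustrate two of the checks mentioned in this description, here is a minimal POSIX-only sketch. The helper names `same_location` and `same_device` are hypothetical and are not part of this PR. As noted above, a matching `st_dev` is only a hint, since rename can still fail across mount points.

```cpp
#include <filesystem>
#include <system_error>
#include <sys/stat.h>  // POSIX only

namespace fs = std::filesystem;

// Hypothetical helper: true when both paths resolve to the same filesystem
// object, even if spelled differently (e.g. "." vs. the absolute cwd).
bool same_location(const fs::path& a, const fs::path& b)
{
    std::error_code ec;
    const bool same = fs::equivalent(a, b, ec);
    return !ec && same;
}

// Hypothetical helper: true when both paths report the same device ID.
// This is only a hint; it does not guarantee that rename will succeed.
bool same_device(const fs::path& a, const fs::path& b)
{
    struct stat sa{}, sb{};
    if (::stat(a.c_str(), &sa) != 0 || ::stat(b.c_str(), &sb) != 0)
        return false;  // treat any stat failure as "not the same device"
    return sa.st_dev == sb.st_dev;
}
```

With a check like `same_location(tmp2_dir, final_dir)`, the case from the title (one path given explicitly, the other as '.') would be detected before any copy is attempted.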