-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement snapshot/incremental reload functionality #293
Conversation
#include "libelastio-snap.h" | ||
|
||
#define RELOAD_SCRIPT_PATH "/etc/elastio/dla" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does dla
stand for here? Also I read this as the path to the reload script is /etc/elastio/dla
, but looking at how this is used, it looks more like that's a directory that contains reload scripts.
I think this would benefit from an explanatory comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, to me, it's also unclear what exactly dla
means, but this directory name was used initially in dattobd
. I'll add an explanatory comment.
lib/libelastio-snap.c
Outdated
return 0; | ||
} | ||
|
||
return 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know nothing about the established conventions in this code, so I won't second guess them here. But the returning 0 for success and non-zero on error is a bit shocking to my hasnt-written-C-code-in-ages sensibilities. The purpose of this comment is just to make sure this is the right convention for the code here. Elsewhere in this change set I see -1 used as an error signal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think -1
makes indeed more sense here. Replaced this.
lib/libelastio-snap.c
Outdated
return -1; | ||
} | ||
|
||
ret = sscanf(buf, "/usr/bin/elioctl reload-%s -c %u %s %s %u %s", mode, &rp->cache_size, rp->bdev, rp->cow, &rp->minor, ignore_errors); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to make this assumption about the path to elioctl?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we see that installation scripts use exactly this path:
dist/initramfs/dracut/module-setup.sh: inst /usr/bin/elioctl
dist/initramfs/initramfs-tools/hooks/elastio-snap:copy_exec /usr/bin/elioctl
dist/initramfs/initrd/boot-elastio-snap.sh:#%programs: /usr/bin/elioctl
But I think it would be prettier defining this path above near RELOAD_SCRIPT_PATH
@e-kov what do you think about this PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@skypodolsky I'm glad this laconic change makes incrementals/snapshots functionality working after reboot.
However there are few concerns:
- Logging. I'd like to have it, but not from the library or not to stdout. See corresponding comment.
- Docs. It would be great to update some docs (doc/elioctl.8 or README.md) with an info about reload scripts in the
dla
dir which are managed by the library. There is almost nothing about it even now, when users of this driver can manage some scripts there manually. Maybe this info can be added to the description of thereload-*
commands. - Possibility to generate incorrect reload scripts after deletion of the existing ones. Here we can just stop to update them or generate them correctly using another sources. See appropriate comments.
Feel free to ignore my comments or use your judgement to handle some. The PR is already approved by respected experienced developers in the field of drivers and backups 😉
lib/libelastio-snap.c
Outdated
struct reload_script_params rp; | ||
if (elastio_snap_get_reload_params(minor, &rp)) | ||
printf("get script params failed\n"); | ||
|
||
strcpy(rp.cow, cow); | ||
|
||
if (elastio_snap_set_reload_params(minor, true, &rp)) | ||
printf("update script params failed\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you should not call elastio_snap_set_reload_params
in case if elastio_snap_get_reload_params
returns an error. I've removed /etc/elastio/dla/reload_0.sh
after few snapshot and it's appeared again after transition-to-snapshot
. As result, new reload_0.sh
contains incorrect params:
/usr/bin/elioctl reload-snapshot -c 0 ^A /cow3 0
^@
I'd suggest to use info about a snap from /proc/elastio-snap-info
or by an info IOCTL and rewrite reload script each time without parsing the existing one (or maybe not???).
[nit] Also I don't know what ^@
does mean. Does it something legitimate as EOL or not. It always appears in the reload_[M].sh
. However scripts are working fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @e-kov, thank you for such a detailed review. Indeed, I think that these symbols you saw were caused by incorrect parsing of a script file. I have reconsidered this place: now, if an error took place while parsing the script file, we don't allow any transitions.
lib/libelastio-snap.c
Outdated
struct reload_script_params rp; | ||
if (elastio_snap_get_reload_params(minor, &rp)) | ||
printf("get script params failed\n"); | ||
|
||
if (elastio_snap_set_reload_params(minor, false, &rp)) | ||
printf("update script params failed\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably you should not call elastio_snap_set_reload_params
in case if elastio_snap_get_reload_params
returns an error. It's same as in the comment for elastio_snap_transition_snapshot
.
lib/libelastio-snap.c
Outdated
struct reload_script_params rp; | ||
if (elastio_snap_get_reload_params(minor, &rp)) | ||
printf("get script params failed\n"); | ||
|
||
rp.cache_size = cache_size; | ||
|
||
if (elastio_snap_set_reload_params(minor, rp.snapshot, &rp)) | ||
printf("update script params failed\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same about elastio_snap_set_reload_params
call after failed elastio_snap_get_reload_params
call.
lib/libelastio-snap.c
Outdated
strcpy(rp.cow, cow); | ||
|
||
if (elastio_snap_set_reload_params(minor, true, &rp)) | ||
printf("update script params failed\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logging. It was not present here before. Logically small wrapper around IOCTLs has nothing to log. And it looks fine when this library is used from elioctl
... But it is a big problem for elastio
CLI. Uncontrolled by elastio
writes to STDOUT will break JSON outputs. Hence all "job's" output is screwed up and not suitable then for the support folks.
As an option, STDERR, is not a good option too. In this case we'll break progress reporting of elastio
. It's not so dramatic, but still a bug.
I'd suggest to write errors into dmesg. Another option is to return errors w/o logging here and then log appropriate messages from elioctl
or whatever uses this library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that clarification, I didn't know such a thing about elastio
. I have replaced all logging with an error that is returned if script-related stuff didn't work properly.
47b2692
to
79a3458
Compare
The main concept is to create a reload script when the snapshot is created or changes its state, allowing the kernel to call it before the root file system is mounted. This functionality has been implemented in libelastio-snap as it's used in both elastio and elioctl.
Works everywhere but Fedora 38, Ubuntu 20.04 and Ubuntu 22.04. The relevant issue is here: #292,
Closes #289