-
Notifications
You must be signed in to change notification settings - Fork 196
Class VFS inherits FilesystemBase. #5584
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
4203cf0
to
27c1cd0
Compare
27c1cd0
to
8429142
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for un-statusing VFS
!
Left some small suggestions, as well as some larger-scoped items to consider in the future.
vfs.remove_dir(uris[i]); | ||
if (vfs.is_file(commit_uris_to_delete[i])) { | ||
vfs.remove_file(commit_uris_to_delete[i]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the commit should be deleted first. remove_dir
is not atomic, and if it fails in the middle of deleting the fragment directory, the array will be left in a corrupted state.
// Create array directory | ||
throw_if_not_ok(resources.vfs().create_dir(array_uri)); | ||
resources.vfs().create_dir(array_uri); | ||
|
||
// Create array schema directory | ||
URI array_schema_dir_uri = | ||
array_uri.join_path(constants::array_schema_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_schema_dir_uri)); | ||
resources.vfs().create_dir(array_schema_dir_uri); | ||
|
||
// Create the enumerations directory inside the array schema directory | ||
URI array_enumerations_uri = | ||
array_schema_dir_uri.join_path(constants::array_enumerations_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_enumerations_uri)); | ||
resources.vfs().create_dir(array_enumerations_uri); | ||
|
||
// Create commit directory | ||
URI array_commit_uri = array_uri.join_path(constants::array_commits_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_commit_uri)); | ||
resources.vfs().create_dir(array_commit_uri); | ||
|
||
// Create fragments directory | ||
URI array_fragments_uri = | ||
array_uri.join_path(constants::array_fragments_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_fragments_uri)); | ||
resources.vfs().create_dir(array_fragments_uri); | ||
|
||
// Create array metadata directory | ||
URI array_metadata_uri = | ||
array_uri.join_path(constants::array_metadata_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_metadata_uri)); | ||
resources.vfs().create_dir(array_metadata_uri); | ||
|
||
// Create fragment metadata directory | ||
URI array_fragment_metadata_uri = | ||
array_uri.join_path(constants::array_fragment_meta_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_fragment_metadata_uri)); | ||
resources.vfs().create_dir(array_fragment_metadata_uri); | ||
|
||
// Create dimension label directory | ||
URI array_dimension_labels_uri = | ||
array_uri.join_path(constants::array_dimension_labels_dir_name); | ||
throw_if_not_ok(resources.vfs().create_dir(array_dimension_labels_uri)); | ||
resources.vfs().create_dir(array_dimension_labels_uri); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was actually thinking that we should remove this. It's always been a no-op on object storage and on file storage it's not necessary since #5480 and just creates potentially unnecessary directories. Same with all calls to create_dir
in library code.
Or not, memfs does not automatically create parent directories.
if (vfs.is_file(commit_uri)) { | ||
vfs.remove_file(commit_uri); | ||
} | ||
|
||
bool is_dir = false; | ||
throw_if_not_ok(vfs.is_dir(fragment_uris_to_vacuum[i], &is_dir)); | ||
if (is_dir) { | ||
throw_if_not_ok(vfs.remove_dir(fragment_uris_to_vacuum[i])); | ||
if (vfs.is_dir(fragment_uris_to_vacuum[i])) { | ||
vfs.remove_dir(fragment_uris_to_vacuum[i]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several instances of "if is file/dir, then remove file/dir" that add unnecessary I/O and latency. We should just do the removal and ignore errors if the file/directory did not exist1. .NET's File.Remove
for example does not throw and returns a bool
to communicate whether the file existed.
Will open a story about this.
Footnotes
-
This is what will actually happen with
remove_dir
on object storage; it will make a list and remove everything, and if there's nothing to remove it won't fail. We should make sure the behavior in all backends is consistent. ↩
if (!vfs_->is_file(uri_)) { | ||
vfs_->touch(uri_); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!vfs_->is_file(uri_)) { | |
vfs_->touch(uri_); | |
} | |
vfs_->touch(uri_); |
VFS::touch
itself makes the check for the file's existence. For the clouds where we do conditional requests, there might be a concern that we are always going to do a more expensive write operation, but I believe the reduced roundtrips are worth it, and this code path runs only in the user-facing VFS C API.
// If new_uri exists, delete it or raise an error based on `force` | ||
bool is_file; | ||
RETURN_NOT_OK(this->is_file(new_uri, &is_file)); | ||
if (is_file) | ||
RETURN_NOT_OK(remove_file(new_uri)); | ||
if (this->is_file(new_uri)) { | ||
remove_file(new_uri); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to Posix
? IIRC the object storage backends will overwrite the file themselves, and on Windows the destination file will be removed only for Win::copy_file
to throw because it is not implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will address in followup.
// If new_uri exists, delete it or raise an error based on `force` | ||
bool is_file; | ||
RETURN_NOT_OK(this->is_file(new_uri, &is_file)); | ||
if (is_file) | ||
RETURN_NOT_OK(remove_file(new_uri)); | ||
if (this->is_file(new_uri)) { | ||
remove_file(new_uri); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with copy_file
. BTW we should strive towards removing as much logic from the VFS
class as possible and pushing it down to the backends, or shared helpers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will address in followup.
CHECK(!vfs.is_file(largefile)); | ||
CHECK(!vfs.is_file(smallfile)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catch2 discourages writing CHECK(!foo)
, providing CHECK_FALSE(foo)
in its place.
There are too many instances in our tests that you don't have to do anything now; will open a story later.
Class
VFS
inherits base classFilesystemBase
.Resolves CORE-294.
TYPE: IMPROVEMENT
DESC: Class
VFS
inherits base classFilesystemBase
.