-
Notifications
You must be signed in to change notification settings - Fork 261
Fix NeuralynxRawIO
reading in one-dir mode with nested directories
#1777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NeuralynxRawIO
reading in one-dir mode with nested directories
#1777
Conversation
NeuralynxRawIO
with nested dictsNeuralynxRawIO
reading in one-dir mode with nested directories
if filename is not None: | ||
include_filenames = [filename] | ||
warnings.warn("`filename` is deprecated and will be removed. Please use `include_filenames` instead") | ||
warnings.warn("`filename` is deprecated and will be removed in v1.0. Please use `include_filenames` instead") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From here:
#1440 (comment)
@PeterNSteinmetz, we are trying to keep you in the loop. We are planning on making some adjustments in gap detection across our RawIOs (see #1773), which means we are working on various IOs in anticipation of our more general BaseRawIO level changes. If you have time we would love you to check out this PR and #1779. If you don't have time we can move forward but we appreciate all your help so if you have discussion points as we make these changes feel free to let us know. |
I think this change, #1777, will cause it to throw a ValueError is a directory name is provided in include_filenames. It will not look for files in sub-directories. |
I don't understand, the current behavior on master is that to throw an error in both "one-dir" and "multiple-files" when there is a nested folder: python-neo/neo/rawio/neuralynxrawio/neuralynxrawio.py Lines 217 to 232 in 72ec76a
|
Yes, I guess it throws an error with sub-directories and one-dir mode. Sorry I thought you were quoting the proposed fix above. I see in the commit list the changes. Those appear that they should work. Is there a test added for this? |
Ah, that makes sense. Regarding the test, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a couple very cosmetic things.
_, ext = os.path.splitext(filename) | ||
ext = ext[1:] # remove dot | ||
ext = ext.lower() # make lower case for comparisons | ||
if ext not in self.extensions: | ||
continue | ||
ext = ext[1:].lower() # remove dot and make lower case |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You were converting the other part to Pathlib. Why not here? Not that it is required all in one PR. Just curious.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not want the PR to grow too big.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am happy to do a full change to pathlib in another PR if you think is worth it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not necessary for me so I would say let's just put that on the back burner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
Co-authored-by: Zach McKenzie <[email protected]>
Co-authored-by: Zach McKenzie <[email protected]>
import pathlib | ||
from pathlib import Path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the double import? if we need pathlib couldn't you just use pathlib.Path below? Although I vaguely remember you telling me that imports from the standard library are basically free.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did not see the other and reverted to my habit of importing like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
else: # one-dir mode | ||
# For one-dir mode, get all files from directory | ||
dir_path = Path(self.dirname) | ||
file_paths = [item for item in sorted(dir_path.iterdir()) if item.is_file()] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just for my own learning is there any risk of doing sorted
here? If different os's lead to different sortings could that lead to some non-deterministic behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was the previous behavior. I think that it will be more uniform if we sort rather that if we just collect directly. That is, is more deterministic with sorting not less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only bring this up because I think I tried something like this on Mac vs Windows and found that for some file type they had a different behavior. So per OS it is deterministic, but if we make assumption about order across OS it might not be true. But I don't remember and I don't have a citation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, that's surprising to me. The order is just based on the file name so I guess if the are full paths the slashes might change the thing?
Anyway, I think we should just preserve the old behavior and order by file name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Now it should be exactly as it was before.
Users sometimes place extra directories and files under the Neuralynx root. Today, we exclude by extension, but in “one-dir” mode the reader asserts that every entry must be a file, which raises an error:
python-neo/neo/rawio/neuralynxrawio/neuralynxrawio.py
Lines 217 to 231 in 72ec76a
We should fix this to (1) make the workflow more forgiving for users and (2) align behavior with
include_filenames
, which already allows nested folders.Proposed change: restructure file discovery into three steps.
Gather candidates by mode:
Path.is_file()
(silently ignore subdirectories).Apply the excluded-filenames filter.
Keep only paths with valid Neuralynx extensions.
This ensures directories are ignored in “one-dir” mode while preserving strict validation in “multiple-files” mode, and it removes redundant extension checks from the main processing loop.
cc @weiglszonja