h-mayorquin
Contributor

Users sometimes place extra directories and files under the Neuralynx root. Today, we exclude by extension, but in “one-dir” mode the reader asserts that every entry must be a file, which raises an error:

if self.rawmode == "one-dir":
    filenames = sorted(os.listdir(self.dirname))
else:
    filenames = self.include_filenames
filenames = [f for f in filenames if f not in self.exclude_filenames]
full_filenames = [os.path.join(self.dirname, f) for f in filenames]
for filename in full_filenames:
    if not os.path.isfile(filename):
        raise ValueError(
            f"Provided Filename is not a file: "
            f"{filename}. If you want to provide a "
            f"directory use the `dirname` keyword"
        )

We should fix this to (1) make the workflow more forgiving for users and (2) align behavior with include_filenames, which already allows nested folders.

Proposed change: restructure file discovery into three steps.

  1. Gather candidates by mode:

    • multiple-files: validate existence only for the explicitly provided filenames.
    • one-dir: list directory entries and keep only regular files via Path.is_file() (silently ignore subdirectories).
  2. Apply the excluded-filenames filter.

  3. Keep only paths with valid Neuralynx extensions.

This ensures directories are ignored in “one-dir” mode while preserving strict validation in “multiple-files” mode, and it removes redundant extension checks from the main processing loop.
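
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the merged implementation: `discover_files`, its parameters, and the `EXTENSIONS` set are hypothetical names chosen for the example, and the extension list is only an assumed subset.

```python
import os
import tempfile
from pathlib import Path

# Assumption: an illustrative subset of extensions, not the reader's actual list.
EXTENSIONS = {"ncs", "nse", "ntt", "nev"}


def discover_files(dirname, include_filenames=None, exclude_filenames=(), extensions=EXTENSIONS):
    """Hypothetical helper that mirrors the three proposed steps."""
    dir_path = Path(dirname)

    # Step 1: gather candidates by mode.
    if include_filenames is not None:
        # multiple-files: each explicitly provided name must be an existing file.
        names = list(include_filenames)
        for name in names:
            if not (dir_path / name).is_file():
                raise ValueError(f"Provided Filename is not a file: {dir_path / name}")
    else:
        # one-dir: keep regular files and silently skip subdirectories.
        names = [n for n in sorted(os.listdir(dir_path)) if (dir_path / n).is_file()]

    # Step 2: apply the excluded-filenames filter.
    excluded = set(exclude_filenames)
    names = [n for n in names if n not in excluded]

    # Step 3: keep only names with a valid extension (dot removed, lower-cased).
    return [dir_path / n for n in names if Path(n).suffix[1:].lower() in extensions]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("CSC1.ncs", "Events.nev", "notes.txt"):
        (root / name).touch()
    (root / "nested").mkdir()  # a subdirectory like this tripped the old one-dir check

    found = [p.name for p in discover_files(root)]
    print(found)  # ['CSC1.ncs', 'Events.nev']
```

In this sketch the nested directory and the `.txt` file are both dropped without an error in one-dir mode, while passing a directory name through `include_filenames` still raises `ValueError`.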

cc @weiglszonja

@h-mayorquin h-mayorquin changed the title from "Fix one-dir mode in NeuralynxRawIO with nested dicts" to "Fix NeuralynxRawIO reading in one-dir mode with nested directories" Sep 18, 2025

 if filename is not None:
     include_filenames = [filename]
-    warnings.warn("`filename` is deprecated and will be removed. Please use `include_filenames` instead")
+    warnings.warn("`filename` is deprecated and will be removed in v1.0. Please use `include_filenames` instead")

Contributor Author

From here:
#1440 (comment)
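
For readers unfamiliar with the pattern, the deprecation path touched here can be sketched in isolation. `resolve_include_filenames` is a hypothetical stand-in for the relevant part of the reader's constructor; only the warning message is taken from the diff.

```python
import warnings


def resolve_include_filenames(filename=None, include_filenames=None):
    """Hypothetical helper: map the deprecated `filename` onto `include_filenames`."""
    if filename is not None:
        include_filenames = [filename]
        warnings.warn("`filename` is deprecated and will be removed in v1.0. Please use `include_filenames` instead")
    return include_filenames


# Record warnings instead of printing them so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    resolved = resolve_include_filenames(filename="CSC1.ncs")

print(resolved)                      # ['CSC1.ncs']
print(caught[0].category.__name__)   # UserWarning
```

Because no category is passed, `warnings.warn` emits a `UserWarning`, which Python shows by default; a `DeprecationWarning` would be hidden by default unless triggered from `__main__`.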

@h-mayorquin h-mayorquin marked this pull request as ready for review September 19, 2025 14:48
@zm711
Contributor

zm711 commented Sep 19, 2025

@PeterNSteinmetz, we are trying to keep you in the loop. We are planning on making some adjustments in gap detection across our RawIOs (see #1773), which means we are working on various IOs in anticipation of our more general BaseRawIO level changes. If you have time we would love you to check out this PR and #1779. If you don't have time we can move forward but we appreciate all your help so if you have discussion points as we make these changes feel free to let us know.

@PeterNSteinmetz
Contributor

I think this change, #1777, will cause it to throw a ValueError if a directory name is provided in include_filenames. It will not look for files in sub-directories.

@h-mayorquin
Contributor Author

I think this change, #1777, will cause it to throw a ValueError if a directory name is provided in include_filenames. It will not look for files in sub-directories.

I don't understand. The current behavior on master is to throw an error in both "one-dir" and "multiple-files" modes when there is a nested folder:

if self.rawmode == "one-dir":
    filenames = sorted(os.listdir(self.dirname))
else:
    filenames = self.include_filenames
filenames = [f for f in filenames if f not in self.exclude_filenames]
full_filenames = [os.path.join(self.dirname, f) for f in filenames]
for filename in full_filenames:
    if not os.path.isfile(filename):
        raise ValueError(
            f"Provided Filename is not a file: "
            f"{filename}. If you want to provide a "
            f"directory use the `dirname` keyword"
        )

@PeterNSteinmetz
Contributor

Yes, I guess it throws an error with sub-directories in one-dir mode. Sorry, I thought you were quoting the proposed fix above. I see the changes in the commit list. They look like they should work. Is there a test added for this?

@h-mayorquin
Contributor Author

Yes, I guess it throws an error with sub-directories in one-dir mode. Sorry, I thought you were quoting the proposed fix above. I see the changes in the commit list. They look like they should work. Is there a test added for this?

Ah, that makes sense.

Regarding the test: yes, I added one to ensure that arbitrary nested directories do not throw the reader off.

Contributor

@zm711 zm711 left a comment

A couple of very cosmetic things.

Comment on lines 236 to +253
 _, ext = os.path.splitext(filename)
-ext = ext[1:]  # remove dot
-ext = ext.lower()  # make lower case for comparisons
-if ext not in self.extensions:
-    continue
+ext = ext[1:].lower()  # remove dot and make lower case

Contributor

You were converting the other part to pathlib. Why not here? Not that it all needs to happen in one PR. Just curious.

Contributor Author

I did not want the PR to grow too big.

Contributor Author

I am happy to do a full change to pathlib in another PR if you think it is worth it.

Contributor

It's not necessary for me so I would say let's just put that on the back burner.

Contributor Author

Ok.
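
For reference, a pathlib version of the extension handling discussed in this thread might look like the sketch below. It was deferred and is not part of this PR; `ext_with_os_path` and `ext_with_pathlib` are hypothetical helper names, and the file names are made up for the comparison.

```python
import os
from pathlib import Path


def ext_with_os_path(filename):
    """Extension as computed in the PR: split, drop the dot, lower-case."""
    _, ext = os.path.splitext(filename)
    return ext[1:].lower()


def ext_with_pathlib(filename):
    """Assumed pathlib equivalent using Path.suffix."""
    return Path(filename).suffix[1:].lower()


samples = ["CSC1.ncs", "Events.NEV", "data/sub/TT1.Ntt", "README"]
extensions = {name: ext_with_pathlib(name) for name in samples}

# Both approaches agree on these inputs, including a name with no extension.
for name in samples:
    assert ext_with_os_path(name) == ext_with_pathlib(name)

print(extensions)
```

On these inputs the two helpers return the same strings, so a later switch to pathlib would not need to change the comparison against the list of valid extensions.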

Comment on lines 58 to 59
import pathlib
from pathlib import Path
Contributor

Why the double import? If we need pathlib, couldn't you just use pathlib.Path below? Although I vaguely remember you telling me that imports from the standard library are basically free.

Contributor Author

I just did not see the other import and fell back on my habit of importing like that.

Contributor Author

Changed.

else:  # one-dir mode
    # For one-dir mode, get all files from directory
    dir_path = Path(self.dirname)
    file_paths = [item for item in sorted(dir_path.iterdir()) if item.is_file()]

Contributor

And just for my own learning, is there any risk in using sorted here? If different OSes sort differently, could that lead to non-deterministic behavior?

Contributor Author

This was the previous behavior. I think it will be more uniform if we sort than if we just collect directly. That is, it is more deterministic with sorting, not less.

Contributor

I only bring this up because I think I tried something like this on Mac vs Windows and found that they behaved differently for some file types. So per OS it is deterministic, but if we make assumptions about order across OSes they might not hold. But I don't remember the details and I don't have a citation.

Contributor Author

I see, that's surprising to me. The order is just based on the file name, so I guess if they are full paths the slashes might change things?

Anyway, I think we should just preserve the old behavior and order by file name.
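
One documented way the order can differ across platforms, which may or may not be what was observed above: `Path` objects on Windows compare case-insensitively, while plain strings and POSIX paths compare by code point. The sketch below uses the pure path classes, which can be instantiated on any OS, with made-up file names.

```python
from pathlib import PurePosixPath, PureWindowsPath

names = ["csc2.ncs", "CSC1.ncs", "Events.nev"]

# Plain strings sort by code point, so upper case comes before lower case.
by_string = sorted(names)

# POSIX paths compare case-sensitively, matching the string order.
by_posix_path = [p.name for p in sorted(PurePosixPath(n) for n in names)]

# Windows paths compare case-insensitively, which changes the order.
by_windows_path = [p.name for p in sorted(PureWindowsPath(n) for n in names)]

# Sorting on the name string gives the same order regardless of path flavour.
by_name_key = [p.name for p in sorted((PureWindowsPath(n) for n in names), key=lambda p: p.name)]

print(by_string)        # ['CSC1.ncs', 'Events.nev', 'csc2.ncs']
print(by_posix_path)    # ['CSC1.ncs', 'Events.nev', 'csc2.ncs']
print(by_windows_path)  # ['CSC1.ncs', 'csc2.ncs', 'Events.nev']
print(by_name_key)      # ['CSC1.ncs', 'Events.nev', 'csc2.ncs']
```

Under that assumption, sorting the name strings (as `sorted(os.listdir(...))` does) keeps the order identical on every platform, whereas sorting `Path` objects directly can differ when names mix upper and lower case.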

Contributor Author

@h-mayorquin h-mayorquin Sep 23, 2025

Fixed. Now it should be exactly as it was before.

@zm711 zm711 merged commit ab3d0f5 into NeuralEnsemble:master Sep 24, 2025
3 checks passed
@h-mayorquin h-mayorquin deleted the neuralynx_fix_one_dir_nested_directory branch September 24, 2025 17:30
@zm711 zm711 added this to the 0.14.3 milestone Oct 4, 2025