Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex fix in io.py #357

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion src/yt_idefix/io.py
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ def _read_fluid_selection(self, chunks, selector, fields, size):
2 4.998045e+00 3.400969e-03 1458 single_file little rho vx1 vx2 vx3 prs tr1 tr2 tr3 Temp ndens PbykB mach
3 7.497932e+00 3.386245e-03 2186 single_file little rho vx1 vx2 vx3 prs tr1 tr2 tr3 Temp ndens PbykB mach
"""
if (match := re.search(r"\d{4}", self.ds.filename)) is not None:
if (match := re.search(r"\d{4}(?!.*\d{4})", self.ds.filename)) is not None:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to find documentation for ?! (negative look-ahead), but I never used this. I'll ask a couple naive questions to make sure I understand how this works.
Surrounding () makes it a non-capturing group, is that right ?
So, the content of match is unchanged when the search does succeed (good), but we now reject file names that have more than one substring matching \d{4} ?

If the goal is to only match the file name (excluding the directory), a simpler fix would be

Suggested change
if (match := re.search(r"\d{4}(?!.*\d{4})", self.ds.filename)) is not None:
if (match := re.search(r"\d{4}", self.ds.basename)) is not None:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@neutrinoceros You are right. I wanted to just make this take the last match with 4 digits as it contains the filename. But I think what you are suggesting is also pretty good and will work.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, since this is a pretty different approach I'll open a separate PR then.

entry = int(match.group())
else:
raise RuntimeError(f"Failed to parse output number from {self.ds.filename}")
Expand Down
Loading