-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added the ability to read xdmf dumps from Idefix
#336
base: main
Are you sure you want to change the base?
Conversation
[DOC] Minor correction in documentation
This is awesome, thanks a lot for doing this ! In you test script, note that import yt_idefix can be omitted if you have Otherwise, regarding the PR itself, keep in mind we'll need to add at least one formal test for it, but as it'll require using |
quick update: I have a lot going on at the moment and I don't think I'll be able to prioritize this review before next week, but I'm not forgetting about it. Thanks for your patience ! |
No hurries at all. |
I just pushed a couple changes to the class hierarchy and xdmf tests so that parametrized tests still pass with your dataset included (but I didn't commit the data itself yet). |
702e6ec
to
991580c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry this took me so long to get back to you. I pushed a couple simplifications myself and added some additional comments and suggestions. I think we need to try and deduplicate as much as possible before I can merge.
src/yt_idefix/data_structures.py
Outdated
key_entry = "" | ||
for key in base_groups: | ||
if "Timestep_" in key: | ||
key_entry = key | ||
break | ||
if key_entry == "": | ||
fileh.close() | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible that key == ""
? if not, we should simplify the logic here as
key_entry = "" | |
for key in base_groups: | |
if "Timestep_" in key: | |
key_entry = key | |
break | |
if key_entry == "": | |
fileh.close() | |
return False | |
for key in base_groups: | |
if "Timestep_" in key: | |
break | |
else: | |
fileh.close() | |
return False |
(in case you to read more about this uncommon pattern: https://docs.python.org/3/tutorial/controlflow.html#else-clauses-on-loops)
src/yt_idefix/data_structures.py
Outdated
attributes = list(fileh[f"/{key_entry}"].attrs.keys()) | ||
if "version" not in attributes: | ||
fileh.close() | ||
return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can probably make this slightly more efficient by reading keys eagerly instead of greedily
attributes = list(fileh[f"/{key_entry}"].attrs.keys()) | |
if "version" not in attributes: | |
fileh.close() | |
return False | |
for key in fileh[f"/{key_entry}"].attrs.keys(): | |
if key == "version": | |
break | |
else: | |
fileh.close() | |
return False |
src/yt_idefix/data_structures.py
Outdated
|
||
# parse the grid | ||
coords = h5_io.read_grid_coordinates( | ||
self.filename, geometry=self.attributes["geometry"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the only difference (seems to me) with PlutoXdmfDataset._parse_parameter_file
is where the geometry is read from, though they can be unified by using the already set self.geometry
attribute instead. I'll push a commit to this effect.
base_groups = list(h5f.keys()) | ||
key_entry = "" | ||
for key in base_groups: | ||
if "Timestep_" in key: | ||
key_entry = key | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
base_groups = list(h5f.keys()) | |
key_entry = "" | |
for key in base_groups: | |
if "Timestep_" in key: | |
key_entry = key | |
break | |
for key in h5f.keys(): | |
if "Timestep_" in key: | |
key_entry = key | |
break | |
else: | |
key_entry = "" |
(though this should be equivalent to what you have, I am not sure about the else
clause here. I think we should probably raise a RuntimeError directly)
self.current_time = float(h5f[f"/{key_entry}"].attrs["time"]) | ||
version_line = h5f[f"/{key_entry}"].attrs["version"][0].decode() | ||
attributes = list(h5f[f"/{key_entry}"].attrs.keys()) | ||
self.attributes = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we shouldn't need to add arbitrary attributes to Dataset
instances: that's already what self.parameters
is for. Is it somehow not appropriate here ?
match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line) | ||
if match is None: | ||
warnings.warn( | ||
"Could not determine code version from file HDF5 file attribute", | ||
stacklevel=2, | ||
) | ||
return "unknown" | ||
|
||
return match.group() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to avoid warnings here because it's very hard to know what stacklevel value is relevant (and it may depend on the situation)
match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line) | |
if match is None: | |
warnings.warn( | |
"Could not determine code version from file HDF5 file attribute", | |
stacklevel=2, | |
) | |
return "unknown" | |
return match.group() | |
if (res := re.match(r"\d+\.\d+\.?\d*[-\w+]*", version_line)) is not None: | |
return res.group() | |
else: | |
return "" |
self.geometry = Geometry(self.parameters["definitions"]["geometry"]) | ||
|
||
@override | ||
def _set_code_unit_attributes(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this function, as well as invalid_unit_combinations
and default_units
, seem to be 100% identical to what's in StaticPlutoDataset
. Is this so ? It's not obvious to me how to avoid duplication here, but I think we should give it a try.
self.parameters["definitions"][attribute] = self.attributes[attribute] | ||
|
||
|
||
class PlutoVtkDataset(VtkMixin, StaticPlutoDataset): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this file is becoming a mess, I get it, but reorganizing the code makes git blame
less useful so I tend to avoid moving classes around just for the sake of organizing. It should also be avoided in a PR that doesn't litteraly anything else because it inflates the diff and makes reviewing harder than it needs to be.
Yes sure. I agree with you. Let me go through this and get back to you. |
@neutrinoceros A quick update. I need a few more days to get through everything you suggested and address them as I'm traveling right now. Sorry for the delay. |
Thanks, but no rush ! |
@neutrinoceros As discussed a few days ago, here is the first implementation of a working reader to parse Idefix xdmf dumps.
While developing the HDF5 IO module on
Idefix
, I have taken a special attention to make the files contain additional attributes that make them self-contained and hence don't needdefinitions.hpp
oridefix.ini
to process the data. I am aware that my implementation is far from being the cleanest, and simply does the job for the time being. But I believe that it is a good head start and can be trimmed and matured to a polished version. I'm open to take in comments to improve this.To test, I'm using the following code with a KHI data dump from the following PR. I'm also sharing the data used in this test here in this link.
which gives the following output:
I compared this with the following ParaView output:
which seems to agree well with each other.
The
yt
code output is the following