Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added the ability to read xdmf dumps from Idefix #336

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

dutta-alankar
Copy link
Contributor

@dutta-alankar dutta-alankar commented Oct 3, 2024

@neutrinoceros As discussed a few days ago, here is the first implementation of a working reader to parse Idefix xdmf dumps.

While developing the HDF5 IO module on Idefix, I have taken a special attention to make the files contain additional attributes that make them self-contained and hence don't need definitions.hpp or idefix.ini to process the data. I am aware that my implementation is far from being the cleanest, and simply does the job for the time being. But I believe that it is a good head start and can be trimmed and matured to a polished version. I'm open to take in comments to improve this.

To test, I'm using the following code with a KHI data dump from the following PR. I'm also sharing the data used in this test here in this link.

import yt
import yt_idefix

num = 1
ds = yt.load(f"data.{num:04d}.flt.h5")

# Create gas density slices in all three axes.
p = yt.SlicePlot(ds, "z", ("gas", "density"))
p.set_cmap(field=("gas", "density"), cmap="viridis")
p.save()

which gives the following output:

data 0001 dbl h5_Slice_z_density

I compared this with the following ParaView output:

data 0001 dbl h5_Slice_z_density_Paraview

which seems to agree well with each other.

The yt code output is the following

yt : [INFO     ] 2024-10-04 11:42:12,973 Parameters: current_time              = 0.01003431438520486
yt : [INFO     ] 2024-10-04 11:42:12,974 Parameters: domain_dimensions         = [1024  256    1]
yt : [INFO     ] 2024-10-04 11:42:12,974 Parameters: domain_left_edge          = [0. 0. 0.]
yt : [INFO     ] 2024-10-04 11:42:12,975 Parameters: domain_right_edge         = [4. 1. 1.]
yt : [INFO     ] 2024-10-04 11:42:12,975 Parameters: cosmological_simulation   = 0
yt : [INFO     ] 2024-10-04 11:42:13,137 xlim = 0.000000 4.000000
yt : [INFO     ] 2024-10-04 11:42:13,138 ylim = 0.000000 1.000000
yt : [INFO     ] 2024-10-04 11:42:13,141 xlim = 0.000000 4.000000
yt : [INFO     ] 2024-10-04 11:42:13,141 ylim = 0.000000 1.000000
yt : [INFO     ] 2024-10-04 11:42:13,143 Making a fixed resolution buffer of (('gas', 'density')) 800 by 800
yt : [INFO     ] 2024-10-04 11:42:13,586 Saving plot data.0001.flt.h5_Slice_z_density.png

@neutrinoceros
Copy link
Owner

This is awesome, thanks a lot for doing this !
I will make sure to review in detail soon, but just a couple very quick comments right off the bat:

In you test script, note that

import yt_idefix

can be omitted if you have yt>=4.2.0 (which I think yt_idefix requires anyway).
In the upcoming yt 4.4.0, you'll also be able to install as pip install 'yt[idefix]' so users don't even need to know that yt_idefix is a separate package.

Otherwise, regarding the PR itself, keep in mind we'll need to add at least one formal test for it, but as it'll require using git-lfs (which I'm allergic to), we can do that part later / separately.

@neutrinoceros neutrinoceros linked an issue Oct 4, 2024 that may be closed by this pull request
@neutrinoceros
Copy link
Owner

quick update: I have a lot going on at the moment and I don't think I'll be able to prioritize this review before next week, but I'm not forgetting about it. Thanks for your patience !

@dutta-alankar
Copy link
Contributor Author

No hurries at all.

@neutrinoceros
Copy link
Owner

I just pushed a couple changes to the class hierarchy and xdmf tests so that parametrized tests still pass with your dataset included (but I didn't commit the data itself yet).

Copy link
Owner

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this took me so long to get back to you. I pushed a couple simplifications myself and added some additional comments and suggestions. I think we need to try and deduplicate as much as possible before I can merge.

Comment on lines 869 to 876
key_entry = ""
for key in base_groups:
if "Timestep_" in key:
key_entry = key
break
if key_entry == "":
fileh.close()
return False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that key == "" ? if not, we should simplify the logic here as

Suggested change
key_entry = ""
for key in base_groups:
if "Timestep_" in key:
key_entry = key
break
if key_entry == "":
fileh.close()
return False
for key in base_groups:
if "Timestep_" in key:
break
else:
fileh.close()
return False

(in case you to read more about this uncommon pattern: https://docs.python.org/3/tutorial/controlflow.html#else-clauses-on-loops)

Comment on lines 877 to 880
attributes = list(fileh[f"/{key_entry}"].attrs.keys())
if "version" not in attributes:
fileh.close()
return False
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can probably make this slightly more efficient by reading keys eagerly instead of greedily

Suggested change
attributes = list(fileh[f"/{key_entry}"].attrs.keys())
if "version" not in attributes:
fileh.close()
return False
for key in fileh[f"/{key_entry}"].attrs.keys():
if key == "version":
break
else:
fileh.close()
return False


# parse the grid
coords = h5_io.read_grid_coordinates(
self.filename, geometry=self.attributes["geometry"]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the only difference (seems to me) with PlutoXdmfDataset._parse_parameter_file is where the geometry is read from, though they can be unified by using the already set self.geometry attribute instead. I'll push a commit to this effect.

Comment on lines +928 to +933
base_groups = list(h5f.keys())
key_entry = ""
for key in base_groups:
if "Timestep_" in key:
key_entry = key
break
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
base_groups = list(h5f.keys())
key_entry = ""
for key in base_groups:
if "Timestep_" in key:
key_entry = key
break
for key in h5f.keys():
if "Timestep_" in key:
key_entry = key
break
else:
key_entry = ""

(though this should be equivalent to what you have, I am not sure about the else clause here. I think we should probably raise a RuntimeError directly)

self.current_time = float(h5f[f"/{key_entry}"].attrs["time"])
version_line = h5f[f"/{key_entry}"].attrs["version"][0].decode()
attributes = list(h5f[f"/{key_entry}"].attrs.keys())
self.attributes = {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, we shouldn't need to add arbitrary attributes to Dataset instances: that's already what self.parameters is for. Is it somehow not appropriate here ?

Comment on lines +947 to +955
match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line)
if match is None:
warnings.warn(
"Could not determine code version from file HDF5 file attribute",
stacklevel=2,
)
return "unknown"

return match.group()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid warnings here because it's very hard to know what stacklevel value is relevant (and it may depend on the situation)

Suggested change
match = re.search(r"\d+\.\d+\.?\d*[-\w+]*", version_line)
if match is None:
warnings.warn(
"Could not determine code version from file HDF5 file attribute",
stacklevel=2,
)
return "unknown"
return match.group()
if (res := re.match(r"\d+\.\d+\.?\d*[-\w+]*", version_line)) is not None:
return res.group()
else:
return ""

self.geometry = Geometry(self.parameters["definitions"]["geometry"])

@override
def _set_code_unit_attributes(self):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function, as well as invalid_unit_combinations and default_units, seem to be 100% identical to what's in StaticPlutoDataset. Is this so ? It's not obvious to me how to avoid duplication here, but I think we should give it a try.

self.parameters["definitions"][attribute] = self.attributes[attribute]


class PlutoVtkDataset(VtkMixin, StaticPlutoDataset):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file is becoming a mess, I get it, but reorganizing the code makes git blame less useful so I tend to avoid moving classes around just for the sake of organizing. It should also be avoided in a PR that doesn't litteraly anything else because it inflates the diff and makes reviewing harder than it needs to be.

@dutta-alankar
Copy link
Contributor Author

Sorry this took me so long to get back to you. I pushed a couple simplifications myself and added some additional comments and suggestions. I think we need to try and deduplicate as much as possible before I can merge.

Yes sure. I agree with you. Let me go through this and get back to you.

@dutta-alankar
Copy link
Contributor Author

@neutrinoceros A quick update. I need a few more days to get through everything you suggested and address them as I'm traveling right now. Sorry for the delay.

@neutrinoceros
Copy link
Owner

Thanks, but no rush !

@neutrinoceros neutrinoceros added enhancement New feature or request code: Idefix Specific to Idefix io: hdf5 Specific to xdmf/hdf5 files labels Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code: Idefix Specific to Idefix enhancement New feature or request io: hdf5 Specific to xdmf/hdf5 files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: add support for idefix-xdfm
2 participants