Add open_files function to read GPM files from list of file paths #81
Hi @kmuehlbauer! I had this in mind the entire week, so I spent 2 hours this morning implementing it. Can you check whether it works well for your use case and report possible improvements? To avoid netCDF locking, I typically run it by initializing a dask client as follows:

FYI: The PR tests fail because of a minor problem related to an update of the polars library, but this affects a specific functionality of the software that should not concern you. I will fix it as soon as I have time.
Codecov Report❌ Patch coverage is
Additional details and impacted files

Coverage Diff

| Metric | main | #81 | +/- |
|---|---|---|---|
| Coverage | 91.18% | 90.83% | -0.36% |
| Files | 135 | 135 | |
| Lines | 17214 | 17537 | +323 |
| Hits | 15696 | 15929 | +233 |
| Misses | 1518 | 1608 | +90 |
Pull request overview
Adds a new gpm.open_files entry point intended to open arbitrary GPM files from explicit file paths (rather than relying on filename-based parsing), addressing Issue #80 by inferring the product from file metadata when possible.
Changes:
- Add `open_files` API to open one/many filepaths and infer product attributes for decoding.
- Make decoding/coordinate logic more tolerant of unknown `product` and add `gpm_api_product` attributes during finalization.
- Misc maintenance updates across tests, CI, tooling, docs, and some visualization/utilities.
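
The first two changes can be pictured with a small, self-contained sketch. It is not the code from this PR: the lookup table, the attribute names, and the helper functions `infer_product` and `decoding_steps` are all hypothetical, chosen only to show the idea of matching file metadata against a table of known products and skipping product-specific logic when nothing matches.

```python
from typing import Optional

# Hypothetical lookup table. In the PR, the real mapping lives in
# gpm/etc/products_attributes.yaml; the keys and values below are invented.
PRODUCTS_ATTRIBUTES = {
    "2A-DPR": {"AlgorithmID": "2ADPR", "InstrumentName": "DPR"},
    "1C-GMI": {"AlgorithmID": "1CGMI", "InstrumentName": "GMI"},
}


def infer_product(file_attrs: dict) -> Optional[str]:
    """Return the product whose expected attributes all match, else None."""
    for product, expected in PRODUCTS_ATTRIBUTES.items():
        if all(file_attrs.get(key) == value for key, value in expected.items()):
            return product
    return None


def decoding_steps(file_attrs: dict) -> list:
    """List the decoding steps that would run for a file (illustrative only)."""
    product = infer_product(file_attrs)
    steps = ["generic"]
    # Product-specific logic is skipped when the product is unknown (None).
    if product is not None:
        steps.append(f"product-specific:{product}")
    return steps
```

With this sketch, a file whose metadata matches an entry gets both generic and product-specific steps, while an unrecognized file falls back to the generic path instead of raising an error.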
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| tutorials/tutorial_03_SR_GR_Matching.ipynb | Minor text/metadata updates in tutorial notebook. |
| tutorials/tutorial_03_SR_GR_Calibration.ipynb | Minor text/metadata updates in tutorial notebook. |
| pyproject.toml | Packaging metadata + Ruff ignore list adjustments. |
| gpm/visualization/eda.py | Add Ruff suppression on plt.subplots() assignment. |
| gpm/visualization/cross_section.py | Add Ruff suppression on variable reassignment. |
| gpm/utils/pyresample.py | Remove unused local variables in remap routine. |
| gpm/utils/collocation.py | Adjust xr.concat options for compatibility. |
| gpm/tests/test_io/test_download.py | Tighten warning assertions to per-call scope. |
| gpm/tests/test_dataset/test_granule_files.py | Add basic test coverage for open_files. |
| gpm/tests/test_bucket/test_routines.py | Add Ruff suppression comment on regex match. |
| gpm/retrievals/retrieval_2a_radar.py | Add Ruff suppression on assignment. |
| gpm/retrievals/retrieval_1b_c_pmw.py | Add MWCC-H hail probability retrieval. |
| gpm/io/products.py | Add cached loader for products_attributes.yaml. |
| gpm/io/download.py | Add Ruff suppression on tuple unpacking. |
| gpm/io/checks.py | Make check_product accept optional product_type. |
| gpm/gv/routines.py | Add Ruff suppression on assignments. |
| gpm/etc/products_attributes.yaml | New metadata mapping for product inference. |
| gpm/dataset/granule.py | Allow scan_modes=None and attempt scan-mode autodetection. |
| gpm/dataset/decoding/dataarray_attrs.py | Remove per-variable product tagging from attr standardization. |
| gpm/dataset/decoding/coordinates.py | Skip product-specific coordinate logic when product is None. |
| gpm/dataset/dataset.py | Add open_files + product inference and scan-mode handling changes. |
| gpm/dataset/coords.py | Support 1D geolocation arrays (along-track only). |
| gpm/dataset/conventions.py | Add add_gpm_api_product and adjust finalization ordering/guards. |
| gpm/bucket/dataframe.py | Adjust Polars casting behavior in pl_cut. |
| gpm/__init__.py | Export open_files and set a global xarray option. |
| docs/source/07_maintainers_guidelines.rst | Remove CodeBeat mention. |
| README.md | Badge table and stated supported Python versions updated. |
| CONTRIBUTING.rst | Remove CodeBeat mention. |
| .pre-commit-config.yaml | Update hook versions and tweak enabled hooks. |
| .github/workflows/tests_windows.yaml | Update Windows CI matrix/schedule and actions versions. |
| .github/workflows/tests.yaml | Update CI matrix/schedule and actions versions. |
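
The "cached loader" listed for gpm/io/products.py is a common pattern that can be sketched as follows. This is an assumption-laden illustration rather than the PR's code: the function name `load_products_attributes` is hypothetical, and JSON is swapped in for YAML so the example needs only the standard library.

```python
import json
import tempfile
from functools import lru_cache
from pathlib import Path


@lru_cache(maxsize=None)
def load_products_attributes(path: str) -> dict:
    """Parse the mapping file once; later calls with the same path hit the cache."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Demo with a temporary file standing in for the real metadata mapping.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "products_attributes.json"
    demo_path.write_text(json.dumps({"2A-DPR": {"InstrumentName": "DPR"}}), encoding="utf-8")
    first = load_products_attributes(str(demo_path))
    second = load_products_attributes(str(demo_path))

# Both calls return the very same dictionary object, so the file is read only once.
same_object = first is second
```

One design caveat of this pattern: the cached dictionary is shared between callers, so it should be treated as read-only.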
Prework
What kind of change does this PR introduce? (check at least one)
Does this PR introduce a breaking change? (check one)
If yes, please describe the impact and communicate accordingly:
The PR fulfills these requirements:
- `bugfix-<some_key>-<word>`
- `doc-<some_key>-<word>`
- `tutorial-<some_key>-<word>`
- `feature-<some_key>-<word>`
- `refactor-<some_key>-<word>`
- `optimize-<some_key>-<word>`

(`fix #xxx[,#xxx]`, where "xxx" is the issue number)

Summary
This PR adds the gpm.open_files function, which reads a list of GPM files from explicitly specified file paths.
This PR addresses #80.