Feature/open heterogeneous #362

Open
wants to merge 16 commits into
base: master

@steph-ben steph-ben commented Dec 4, 2023

When working with heterogeneous GRIB files, it is often difficult to get a single, unified xr.Dataset with all data available, the way you can when opening NetCDF files.

This PR introduces a filter_heterogeneous argument, which:

  • Loops over all possible combinations of paramId/typeOfLevel/stepType
  • Bumps coordinate names when needed to prevent collisions
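
The coordinate-name bumping can be pictured with a small pure-Python sketch. This is not the code from the PR: the function `assign_coord_names`, the tuple layout of `fields`, and the rule "reuse a coordinate when its level values match, otherwise append the next integer suffix" are all assumptions made for illustration.

```python
from itertools import count


def assign_coord_names(fields):
    """Hypothetical helper: pick a variable name and a coordinate name per field.

    `fields` is an iterable of (shortName, typeOfLevel, stepType, levels) tuples,
    where `levels` is a tuple of level values. This layout is an assumption.
    """
    coords = {}        # coordinate name -> level values already registered
    var_to_coord = {}  # variable name -> coordinate name it uses
    for short_name, type_of_level, step_type, levels in fields:
        var_name = f"{short_name}__{type_of_level}__{step_type}"
        for suffix in count():
            # First try the bare typeOfLevel, then typeOfLevel1, typeOfLevel2, ...
            coord_name = type_of_level if suffix == 0 else f"{type_of_level}{suffix}"
            if coord_name not in coords:
                coords[coord_name] = levels  # free name: register it
                break
            if coords[coord_name] == levels:
                break  # same level values: reuse the existing coordinate
        var_to_coord[var_name] = coord_name
    return var_to_coord, coords


fields = [
    ("gh", "isobaricInhPa", "instant", (1000.0, 950.0, 900.0)),
    ("t", "isobaricInhPa", "instant", (1000.0, 950.0, 900.0)),
    ("absv", "isobaricInhPa", "instant", (1000.0, 850.0)),  # different levels
    ("tp", "surface", "accum", (0.0,)),
]
var_to_coord, coords = assign_coord_names(fields)
```

In this sketch `gh` and `t` share `isobaricInhPa`, while `absv` is moved to `isobaricInhPa1` because its levels differ.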

Compared to cfgrib.open_datasets, it makes it possible to:

  • Get a single xr.Dataset from a GRIB file
  • Use xr.open_mfdataset() to open many files
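
A rough sketch of the multi-file case, assuming a cfgrib build that includes this PR's filter_heterogeneous option. The helper names `mfdataset_kwargs` and `open_many`, and the choice to concatenate along `step`, are illustrative assumptions and have not been checked against the PR branch.

```python
def mfdataset_kwargs(concat_dim="step"):
    """Hypothetical helper: keyword arguments for xr.open_mfdataset().

    `filter_heterogeneous` only exists on this PR's branch of cfgrib.
    """
    return {
        "engine": "cfgrib",
        "backend_kwargs": {"filter_heterogeneous": True},
        "combine": "nested",       # concatenate files in the order given
        "concat_dim": concat_dim,  # assumed: one forecast step per file
    }


def open_many(paths, concat_dim="step"):
    """Open several GRIB files as one dataset (requires xarray and cfgrib)."""
    import xarray as xr  # imported lazily so the helper above has no dependencies

    return xr.open_mfdataset(paths, **mfdataset_kwargs(concat_dim))
```

`open_many` would be called with a list of GRIB paths, for example the files of successive forecast steps from one model run.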

I have tried to summarize our expectations and proposed solutions here: https://gist.github.com/steph-ben/2cd80f033d8f00c365bb170142206bae

Example usage, as per README.rst:

>>> import xarray as xr
>>> xr.open_dataset(
...     'nam.t00z.awp21100.tm00.grib2',
...     engine='cfgrib',
...     backend_kwargs={'filter_heterogeneous': True},
... )
<xarray.Dataset>
Dimensions:                                  (y: 65, x: 93, isobaricInhPa: 19,
                                              pressureFromGroundLayer: 5,
                                              isobaricInhPa1: 5,
                                              pressureFromGroundLayer1: 2,
                                              pressureFromGroundLayer2: 2,
                                              heightAboveGroundLayer: 2)
Coordinates:
    time                      datetime64[ns] ...
    step                      timedelta64[ns] ...
    meanSea                   float64 ...
    latitude                  (y, x) float64 ...
    longitude                 (y, x) float64 ...
    valid_time                datetime64[ns] ...
    surface                   float64 ...
  * isobaricInhPa             (isobaricInhPa) float64 1e+03 950.0 ... 100.0
    cloudBase                 float64 ...
    cloudTop                  float64 ...
    maxWind                   float64 ...
    isothermZero              float64 ...
    tropopause                float64 ...
  * pressureFromGroundLayer   (pressureFromGroundLayer) float64 3e+03 ... 1.5...
  * isobaricInhPa1            (isobaricInhPa1) float64 1e+03 850.0 ... 250.0
    heightAboveGround         float64 ...
    heightAboveGround1        float64 ...
    heightAboveGround2        float64 ...
  * pressureFromGroundLayer1  (pressureFromGroundLayer1) float64 9e+03 1.8e+04
  * pressureFromGroundLayer2  (pressureFromGroundLayer2) float64 9e+03 1.8e+04
    atmosphereSingleLayer     float64 ...
  * heightAboveGroundLayer    (heightAboveGroundLayer) float64 1e+03 3e+03
    pressureFromGroundLayer3  float64 ...
    pressureFromGroundLayer4  float64 ...
Dimensions without coordinates: y, x
Data variables:
    prmsl__meanSea__instant                  (y, x) float32 ...
    gust__surface__instant                   (y, x) float32 ...
    gh__isobaricInhPa__instant               (isobaricInhPa, y, x) float32 ...
    gh__cloudBase__instant                   (y, x) float32 ...
    gh__cloudTop__instant                    (y, x) float32 ...
    gh__maxWind__instant                     (y, x) float32 ...
    gh__isothermZero__instant                (y, x) float32 ...
    t__isobaricInhPa__instant                (isobaricInhPa, y, x) float32 ...
    t__cloudTop__instant                     (y, x) float32 ...
    t__tropopause__instant                   (y, x) float32 ...
    t__pressureFromGroundLayer__instant      (pressureFromGroundLayer, y, x) float32 ...
    r__isobaricInhPa__instant                (isobaricInhPa, y, x) float32 ...
    r__isothermZero__instant                 (y, x) float32 ...
    r__pressureFromGroundLayer__instant      (pressureFromGroundLayer, y, x) float32 ...
    w__isobaricInhPa__instant                (isobaricInhPa, y, x) float32 ...
    u__isobaricInhPa__instant                (isobaricInhPa, y, x) float32 ...
    u__tropopause__instant                   (y, x) float32 ...
    u__maxWind__instant                      (y, x) float32 ...
    u__pressureFromGroundLayer__instant      (pressureFromGroundLayer, y, x) float32 ...
    v__isobaricInhPa__instant                (isobaricInhPa, y, x) float32 ...
    v__tropopause__instant                   (y, x) float32 ...
    v__maxWind__instant                      (y, x) float32 ...
    v__pressureFromGroundLayer__instant      (pressureFromGroundLayer, y, x) float32 ...
    absv__isobaricInhPa__instant             (isobaricInhPa1, y, x) float32 ...
    mslet__meanSea__instant                  (y, x) float32 ...
    sp__surface__instant                     (y, x) float32 ...
    orog__surface__instant                   (y, x) float32 ...
    t2m__heightAboveGround__instant          (y, x) float32 ...
    r2__heightAboveGround__instant           (y, x) float32 ...
    u10__heightAboveGround__instant          (y, x) float32 ...
    v10__heightAboveGround__instant          (y, x) float32 ...
    tp__surface__accum                       (y, x) float32 ...
    acpcp__surface__accum                    (y, x) float32 ...
    csnow__surface__instant                  (y, x) float32 ...
    cicep__surface__instant                  (y, x) float32 ...
    cfrzr__surface__instant                  (y, x) float32 ...
    crain__surface__instant                  (y, x) float32 ...
    cape__surface__instant                   (y, x) float32 ...
    cape__pressureFromGroundLayer__instant   (pressureFromGroundLayer1, y, x) float32 ...
    cin__surface__instant                    (y, x) float32 ...
    cin__pressureFromGroundLayer__instant    (pressureFromGroundLayer2, y, x) float32 ...
    pwat__atmosphereSingleLayer__instant     (y, x) float32 ...
    pres__cloudBase__instant                 (y, x) float32 ...
    pres__cloudTop__instant                  (y, x) float32 ...
    pres__maxWind__instant                   (y, x) float32 ...
    hlcy__heightAboveGroundLayer__instant    (heightAboveGroundLayer, y, x) float32 ...
    trpp__tropopause__instant                (y, x) float32 ...
    pli__pressureFromGroundLayer__instant    (y, x) float32 ...
    4lftx__pressureFromGroundLayer__instant  (y, x) float32 ...
    unknown__surface__instant                (y, x) float32 ...
Attributes:
    GRIB_edition:            2
    GRIB_centre:             kwbc
    GRIB_centreDescription:  US National Weather Service - NCEP
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             US National Weather Service - NCEP
    history:                 2023-12-04T16:56 GRIB to CDM+CF via cfgrib-0.9.1...
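
The data variable names in the output above follow a `shortName__typeOfLevel__stepType` pattern. A minimal sketch of how such names could be split and grouped is shown below. The helpers `split_name` and `group_by_level` are hypothetical and not part of cfgrib or this PR.

```python
from collections import defaultdict


def split_name(name):
    """Split 'shortName__typeOfLevel__stepType' into its three parts."""
    short_name, type_of_level, step_type = name.split("__")
    return short_name, type_of_level, step_type


def group_by_level(names):
    """Group variable names by their typeOfLevel component."""
    groups = defaultdict(list)
    for name in names:
        _, type_of_level, _ = split_name(name)
        groups[type_of_level].append(name)
    return dict(groups)


# Names copied from the example output above.
names = [
    "t__isobaricInhPa__instant",
    "u__isobaricInhPa__instant",
    "tp__surface__accum",
    "t2m__heightAboveGround__instant",
]
groups = group_by_level(names)
```

With a dataset `ds` opened as in the example, `ds[groups["isobaricInhPa"]]` would then select only the pressure-level variables.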

FussyDuck commented Dec 4, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ steph-ben
❌ benchimols

@steph-ben steph-ben changed the title from "[Draft] Feature/open heterogeneous" to "Feature/open heterogeneous" Dec 5, 2023
4 participants