ESMDataSource._open_dataset fails if coordinate variable precision changes #724

@charles-turner-1

Description

When trying to open a multi-file dataset where the coordinate precision changes halfway through, this error is encountered:

...

esm_ds.to_dask(
    xarray_open_kwargs = {
        "decode_timedelta" : False,
    },
)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File [/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/intake_esm/source.py:287](https://are.nci.org.au/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/intake_esm/source.py#line=286), in ESMDataSource._open_dataset(self)
    286         else:
--> 287             raise exc
    289 self._ds.attrs[OPTIONS['dataset_key']] = self.key

File [/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/intake_esm/source.py:272](https://are.nci.org.au/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/intake_esm/source.py#line=271), in ESMDataSource._open_dataset(self)
    271 try:
--> 272     self._ds = xr.combine_by_coords(
    273         datasets, **self.xarray_combine_by_coords_kwargs
    274     )
    275 except ValueError as exc:

File [/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py:983](https://are.nci.org.au/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py#line=982), in combine_by_coords(data_objects, compat, data_vars, coords, fill_value, join, combine_attrs)
    981     # Perform the multidimensional combine on each group of data variables
    982     # before merging back together
--> 983     concatenated_grouped_by_data_vars = tuple(
    984         _combine_single_variable_hypercube(
    985             tuple(datasets_with_same_vars),
    986             fill_value=fill_value,
    987             data_vars=data_vars,
    988             coords=coords,
    989             compat=compat,
    990             join=join,
    991             combine_attrs=combine_attrs,
    992         )
    993         for vars, datasets_with_same_vars in grouped_by_vars
    994     )
    996 return merge(
    997     concatenated_grouped_by_data_vars,
    998     compat=compat,
   (...)
   1001     combine_attrs=combine_attrs,
   1002 )

File [/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py:984](https://are.nci.org.au/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py#line=983), in <genexpr>(.0)
    981     # Perform the multidimensional combine on each group of data variables
    982     # before merging back together
    983     concatenated_grouped_by_data_vars = tuple(
--> 984         _combine_single_variable_hypercube(
    985             tuple(datasets_with_same_vars),
    986             fill_value=fill_value,
    987             data_vars=data_vars,
    988             coords=coords,
    989             compat=compat,
    990             join=join,
    991             combine_attrs=combine_attrs,
    992         )
    993         for vars, datasets_with_same_vars in grouped_by_vars
    994     )
    996 return merge(
    997     concatenated_grouped_by_data_vars,
    998     compat=compat,
   (...)
   1001     combine_attrs=combine_attrs,
   1002 )

File [/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py:671](https://are.nci.org.au/g/data/xp65/public/apps/med_conda/envs/analysis3-25.06/lib/python3.11/site-packages/xarray/structure/combine.py#line=670), in _combine_single_variable_hypercube(datasets, fill_value, data_vars, coords, compat, join, combine_attrs)
    670     if not (indexes.is_monotonic_increasing or indexes.is_monotonic_decreasing):
--> 671         raise ValueError(
    672             "Resulting object does not have monotonic"
    673             f" global indexes along dimension {dim}"
    674         )
    676 return concatenated

ValueError: Resulting object does not have monotonic global indexes along dimension yh

which then percolates through to an ESMDataSourceError.

This is the offending line:

self._ds = xr.combine_by_coords(
    datasets, **self.xarray_combine_by_coords_kwargs
)
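
To illustrate why a change in coordinate precision can trip the monotonicity check, here is a minimal sketch using only numpy and pandas. It assumes the precision change behaves like a float32 round trip, which has not been confirmed for these files, and the three `yh` values are made up for illustration. It does not call intake-esm or `xr.combine_by_coords`; it only mimics the index comparisons that appear in the traceback.

```python
import numpy as np
import pandas as pd

# Hypothetical latitude values; the real yh grid is much longer.
yh_full = np.array([-80.94, -80.87, -80.80], dtype=np.float64)

# Assumption: later files stored yh at lower precision (float32 round trip).
yh_reduced = yh_full.astype(np.float32).astype(np.float64)

idx_full = pd.Index(yh_full)
idx_reduced = pd.Index(yh_reduced)

# The values agree to within float32 precision but are not bit-identical,
# so the two indexes do not compare as equal.
indexes_equal = idx_full.equals(idx_reduced)

# Stacking the indexes end to end, as a concatenation along yh would,
# gives a sequence that restarts partway through.
stacked = idx_full.append(idx_reduced)
stacked_is_monotonic = (
    stacked.is_monotonic_increasing or stacked.is_monotonic_decreasing
)

# A sorted union keeps both sets of near-duplicate values.
union = idx_full.union(idx_reduced)

print(indexes_equal, stacked_is_monotonic, len(union))
```

If the real files behave like this, the unequal indexes would lead `yh` to be treated as a dimension to concatenate along, and the stacked result would fail the `is_monotonic_increasing` / `is_monotonic_decreasing` test shown at line 670 of the traceback.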

Crucially, the paths from the datastore can be opened directly with xr.open_mfdataset, so I see no reason why we can't fix this in principle:

import xarray as xr

pathlist = esm_ds.df['path'].tolist()
ds = xr.open_mfdataset(pathlist, decode_timedelta=False, parallel=True, chunks={'time': -1})
print(ds)
<xarray.Dataset> Size: 14GB
Dimensions:     (time: 1068, yh: 2204, xh: 1440, nv: 2)
Coordinates:
  * xh          (xh) float64 12kB -279.9 -279.6 -279.4 ... 79.38 79.62 79.88
  * yh          (yh) float64 18kB -80.94 -80.94 -80.87 ... 89.84 89.95 89.95
  * nv          (nv) float64 16B 1.0 2.0
  * time        (time) object 9kB 1900-01-16 12:00:00 ... 1988-12-16 12:00:00
Data variables:
    wfo         (time, yh, xh) float32 14GB dask.array<chunksize=(12, 142, 240), meta=np.ndarray>
    average_T1  (yh, time) datetime64[ns] 19MB dask.array<chunksize=(1142, 12), meta=np.ndarray>
    average_T2  (yh, time) datetime64[ns] 19MB dask.array<chunksize=(1142, 12), meta=np.ndarray>
    average_DT  (yh, time) float64 19MB dask.array<chunksize=(1142, 12), meta=np.ndarray>
    time_bnds   (yh, time, nv) object 38MB dask.array<chunksize=(1142, 12, 2), meta=np.ndarray>
Attributes:
    NumFilesInSet:     1
    title:             ACCESS-OM3
    associated_files:  areacello: access-om3.mom6.static.nc
    grid_type:         regular
    grid_tile:         N/A

Version information: output of intake_esm.show_versions()

I've reproduced this with intake_esm versions 2025.2.3 and 2024.2.6:

  • 2025.2.3
import intake_esm
intake_esm.show_versions()
INSTALLED VERSIONS
------------------

cftime: 1.6.4
dask: 2025.5.1
fastprogress: 1.0.3
fsspec: 2025.5.1
gcsfs: 2025.5.1
intake: 2.0.8
intake_esm: 2025.2.3
netCDF4: 1.7.2
pandas: 2.2.3
requests: 2.32.3
s3fs: 2025.5.1
xarray: 2025.4.0
zarr: 2.18.7
  • 2024.2.6
import intake_esm; intake_esm.show_versions()

INSTALLED VERSIONS
------------------

cftime: 1.6.4
dask: 2024.11.2
fastprogress: 1.0.3
fsspec: 2024.10.0
gcsfs: 2024.10.0
intake: 0.7.0
intake_esm: 2024.2.6
netCDF4: 1.6.5
pandas: 2.2.3
requests: 2.32.3
s3fs: 2024.10.0
xarray: 2024.2.0
zarr: 2.18.3
