problem with parallel=True, no mask in input grid but error related to mask. #405

Open
axelschweiger opened this issue Nov 25, 2024 · 5 comments

@axelschweiger

I'm experimenting with parallel=True to regrid some larger datasets. I found the documentation saying that the output grid has to have a data variable and followed the instructions to create one. But I'm running into a strange problem with the input grid. My input grid contains only lat and lon, each a 1D array. I also tried specifying a mask explicitly, but that gives the same result. I'm using bilinear regridding, so no bounds should be necessary. It works with parallel=False, albeit very slowly. Thanks for any help on this.

image

I am getting an error that mask.shape isn't the same as lon.shape, even though I don't have a mask and the input lon is only one-dimensional.

image
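
A possible reason a 1D lon can show up with a 2D shape in such an error: rectilinear 1D coordinates are commonly expanded to a 2D mesh before a grid object is built. The sketch below uses only NumPy and small illustrative sizes; whether xESMF does exactly this internally is an assumption, not something confirmed in this thread.

```python
import numpy as np

# Small stand-ins for the 17280 x 2880 input grid (sizes are illustrative).
nlon, nlat = 8, 4
lon_1d = np.linspace(-180.0, 180.0, nlon, endpoint=False)
lat_1d = np.linspace(-90.0, 90.0, nlat)

# Default "xy" indexing gives 2D arrays shaped (nlat, nlon).
lon_2d, lat_2d = np.meshgrid(lon_1d, lat_1d)

# A mask laid out as (lon, lat), like the one tried in this issue.
mask_lon_lat = np.ones((nlon, nlat), dtype=bool)

# Compare the mask shape with the expanded lon shape, before and after transposing.
shapes_match_as_given = mask_lon_lat.shape == lon_2d.shape
shapes_match_transposed = mask_lon_lat.T.shape == lon_2d.shape
```

If the library compares mask.shape against the expanded 2D lon, a mask in (lon, lat) order would fail that check while its transpose would pass.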

@axelschweiger
Author

Correction (I messed up). If I don't specify a mask on the input, I get an error about not having lat/lon values or not being CF-compliant (see below). If I do, I get the above error about the wrong shape. I tried transposing the mask, but that doesn't work either.

image

@aulemahal
Collaborator

aulemahal commented Nov 25, 2024

Hum, these errors seem to come before anything in the "parallel regridder" code is touched.
I see you have no attributes on lat and lon. Even though it should work anyway with those names, I would suggest adding units='degrees_north' to lat and units='degrees_east' to lon.

Also, can you send a printout of your dataset with the mask added?

Finally, as said in the other PR, I don't think xESMF's parallel option is actually helping you here. As both grids fit in your RAM and your source grid is much bigger than the destination, it won't be faster than parallel=False. It might even be slower. The best performance should come from the ESMF command-line tools with MPI.

@axelschweiger
Author

Thanks for the reply. I tried setting the units attributes; no difference. See below for input grid details. I guess the "parallel" option isn't going to get me anywhere faster. I had tried the ESMF_RegridWeightGen script as described here:
#405 but I get error messages that overflow my disk (I guess there is no way to turn off the logging? I was wondering if there is an environment variable, but I can't find anything).

20241121 134756.230 ERROR PET0 ESMCI_DistGrid.C:5101 ESMCI::DistGrid::getSequenceInde Invalid argument - SeqIndex type mismatch detected

image

image

Here is the code snippet that creates the error:

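import dask.array as da
import xarray as xr
import xesmf as xe

# imUtils and ib are the poster's own helper module and input dataset (not shown here)
sc,vc,zc = imUtils.getAncilData(gridName="864X640") # reads output grid coordinates
stype='vector'

# input mask, all True, laid out as (lon, lat)
imask = da.ones((17280,2880),
               dtype=bool, chunks=(100, 100))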
grid_in= xr.Dataset(
    data_vars=dict(
        mask=(["lon","lat"], imask)),
    coords=dict(
        lon=(["lon"], ib.lon.data),
        lat=(["lat"], ib.lat.data),
    )
)

grid_in['lat'].attrs['units']='degrees_north'
grid_in['lon'].attrs['units']='degrees_east'

grid_out={'lon':vc.lon.chunk({"y_sn":100,"x_ew":100}),
          'lat':vc.lat.chunk({"y_sn":100,"x_ew":100})}
grid_out=xr.Dataset(grid_out)
#grid_in['lat'].attrs['units']='degrees_north'
#grid_in['lon'].attrs['units']='degrees_east'

reuse_flag=False

omask = da.ones((grid_out.dims['y_sn'],
                 grid_out.dims['x_ew']),
                 dtype=bool, chunks=(100, 100))

grid_out["mask"] = (grid_out.dims, omask)

regridder = xe.Regridder(grid_in, grid_out, 'bilinear',reuse_weights=reuse_flag, 
                         filename='gebco_weights864X640_5x5.'+stype+'.bl.nc',
                         unmapped_to_nan=True, parallel=True)



@aulemahal
Collaborator

I know it sounds stupid, but just for testing, could you try transposing the mask?

grid_in= xr.Dataset(
    data_vars=dict(
        mask=(["lat","lon"], imask.T)),
    coords=dict(
        lon=(["lon"], ib.lon.data),
        lat=(["lat"], ib.lat.data),
    )
)

I think there's some hardcoding of the dimension order going on, which shouldn't be the case with xarray-based stuff like this...

@axelschweiger
Author

I had tried this before. I also tried reversing the order of lat and lon in xarray, which just changes the error so that lat becomes the offending variable. See error below.


ValueError Traceback (most recent call last)
Cell In[26], line 9
4 #grid_in={'lon':ib.lon.data,'lat':ib.lat.data}
5 # 'lon_b':ib.lon_b,'lat_b':ib.lat_b}
6 #grid_in=xr.Dataset(grid_in)
7 imask = da.ones((17280,2880),
8 dtype=bool, chunks=(100, 100))
----> 9 grid_in= xr.Dataset(
10 data_vars=dict(
11 mask=(["lon","lat"], imask.T)),
12 coords=dict(
13 lon=(["lon"], ib.lon.data),
14 lat=(["lat"], ib.lat.data),
15 )
16 )
18 grid_in['lat'].attrs['units']='degrees_north'
19 grid_in['lon'].attrs['units']='degrees_east'

File ~/anaconda3/envs/pangeo310/lib/python3.10/site-packages/xarray/core/dataset.py:605, in Dataset.__init__(self, data_vars, coords, attrs)
602 if isinstance(coords, Dataset):
603 coords = coords.variables
--> 605 variables, coord_names, dims, indexes, _ = merge_data_and_coords(
606 data_vars, coords, compat="broadcast_equals"
607 )
609 self._attrs = dict(attrs) if attrs is not None else None
610 self._close = None

File ~/anaconda3/envs/pangeo310/lib/python3.10/site-packages/xarray/core/merge.py:575, in merge_data_and_coords(data_vars, coords, compat, join)
573 objects = [data_vars, coords]
574 explicit_coords = coords.keys()
--> 575 return merge_core(
576 objects,
577 compat,
578 join,
579 explicit_coords=explicit_coords,
580 indexes=Indexes(indexes, coords),
581 )

File ~/anaconda3/envs/pangeo310/lib/python3.10/site-packages/xarray/core/merge.py:761, in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
756 prioritized = _get_priority_vars_and_indexes(aligned, priority_arg, compat=compat)
757 variables, out_indexes = merge_collected(
758 collected, prioritized, compat=compat, combine_attrs=combine_attrs
759 )
--> 761 dims = calculate_dimensions(variables)
763 coord_names, noncoord_names = determine_coords(coerced)
764 if explicit_coords is not None:

File ~/anaconda3/envs/pangeo310/lib/python3.10/site-packages/xarray/core/variable.py:3208, in calculate_dimensions(variables)
3206 last_used[dim] = k
3207 elif dims[dim] != size:
-> 3208 raise ValueError(
3209 f"conflicting sizes for dimension {dim!r}: "
3210 f"length {size} on {k!r} and length {dims[dim]} on {last_used!r}"
3211 )
3212 return dims

ValueError: conflicting sizes for dimension 'lon': length 17280 on 'lon' and length 2880 on {'lon': 'mask', 'lat': 'mask'}
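
Reading the traceback above, the mask data was transposed (imask.T) while its dimension labels stayed ["lon","lat"], and that mismatch appears to be what xarray rejects, before xESMF is involved at all. The sketch below reproduces the pattern with small illustrative sizes and NumPy arrays instead of dask; build_grid is a hypothetical helper introduced only for this example.

```python
import numpy as np
import xarray as xr

# Small stand-ins for the 17280 x 2880 grid in the thread (sizes are illustrative).
nlon, nlat = 8, 4
lon = np.linspace(-180.0, 180.0, nlon, endpoint=False)
lat = np.linspace(-90.0, 90.0, nlat)
imask = np.ones((nlon, nlat), dtype=bool)  # laid out as (lon, lat)


def build_grid(mask_dims, mask_data):
    """Hypothetical helper: build an input grid with the given mask layout."""
    return xr.Dataset(
        data_vars=dict(mask=(mask_dims, mask_data)),
        coords=dict(lon=(["lon"], lon), lat=(["lat"], lat)),
    )


# Transposed data with unchanged labels: the first axis now has length nlat
# but is still called "lon", so xarray reports conflicting sizes.
try:
    build_grid(["lon", "lat"], imask.T)
    conflict = None
except ValueError as err:
    conflict = str(err)

# Transposed data with swapped labels, as in the collaborator's suggestion.
grid_ok = build_grid(["lat", "lon"], imask.T)
```

This only shows that the dataset can be constructed when the labels follow the data. Whether xESMF then accepts the (lat, lon) mask with parallel=True is not confirmed in this thread.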
