Hi! I stumbled across this intake driver and it looks like what I need, so thank you so much for your work!
I have already made an example for myself where I have a bunch of file locations on a THREDDS server and read them in with xr.open_mfdataset(), like so:
import xarray as xr

# filelocs: list of file URLs on the THREDDS server
ds = xr.open_mfdataset(filelocs, drop_variables=['siglay', 'siglev', 'Itime2'],
                       parallel=True, compat='override', combine='by_coords',
                       data_vars='minimal', coords='minimal')
This took about 2 minutes for 127 files.
However, I need this combined dataset represented in an intake catalog, which is where intake-thredds comes in. I think I have properly mapped the keywords I used in xr.open_mfdataset to the intake-thredds API like this:
import intake

cat_url = 'https://opendap.co-ops.nos.noaa.gov/thredds/catalog/NOAA/LEOFS/MODELS/catalog.xml'
# date: a datetime-like object for the day to load
source = intake.open_thredds_merged(
    cat_url,
    path=[date.strftime('%Y'),
          date.strftime('%m'),
          date.strftime('%d'),
          date.strftime('nos.leofs.fields.????.%Y%m%d.t12z.nc')],
    concat_kwargs={'dim': 'time',
                   'data_vars': 'minimal',
                   'coords': 'minimal',
                   'compat': 'override',
                   'combine_attrs': 'override'},
    xarray_kwargs=dict(
        drop_variables=['siglay', 'siglev', 'Itime2'],
    ),
)
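To make the `path` argument concrete, here is a small sketch of what the four entries expand to for a single day. The date below is made up for illustration and is not taken from the catalog; the `????` part passes through `strftime` unchanged and stays a wildcard pattern.

```python
from datetime import datetime

# Hypothetical date, chosen only for illustration.
date = datetime(2021, 6, 1)

# One entry per catalog level: year, month, day, then the file-name pattern.
path = [
    date.strftime('%Y'),
    date.strftime('%m'),
    date.strftime('%d'),
    date.strftime('nos.leofs.fields.????.%Y%m%d.t12z.nc'),
]

print(path)
# ['2021', '06', '01', 'nos.leofs.fields.????.20210601.t12z.nc']
```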
However, when I then try to look at the resulting combined, lazily loaded Dataset with source.to_dask(), it takes a very long time and fails with "Bad Gateway" before it can finish. The only difference I can see is that I don't know how to pass parallel=True to intake.open_thredds_merged, which I used when calling xr.open_mfdataset.
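For context, xarray documents `parallel=True` as performing the open and preprocess steps with `dask.delayed`, so the per-file opens can overlap instead of running one after another. The sketch below illustrates that general idea using only the standard library. It is not how xarray or intake-thredds implement it: `open_all` and `fake_open` are hypothetical names, and a real run would pass something like `xr.open_dataset` as the opener.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def open_all(paths, opener, max_workers=8):
    """Call opener on every path concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as `paths`.
        return list(pool.map(opener, paths))


def fake_open(path):
    """Hypothetical stand-in for a slow remote open (e.g. xr.open_dataset)."""
    time.sleep(0.05)  # simulate network latency
    return f"opened {path}"


paths = [f"file_{i}.nc" for i in range(16)]
start = time.perf_counter()
results = open_all(paths, fake_open)
elapsed = time.perf_counter() - start
print(f"{len(results)} files in {elapsed:.2f}s")
```

With a serial loop, the 16 simulated opens would take at least 0.8 s in total; with overlapping opens the wall time is closer to the latency of a couple of requests, which is the kind of gap seen between the two approaches above.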
Is there a way to use parallel=True? Thank you for your help!