FastHerbie doesn't always download multiple files at once #439

@williamhobbs

I've noticed that FastHerbie doesn't seem to download multiple GRIB files at once when I call FH.xarray('search_string') directly.

In contrast, it does download the files simultaneously, and runs much faster overall, if I first download the files with FH.download('search_string') and then call FH.xarray('search_string').

Is this expected? I don't think it is, but maybe I'm doing something wrong or misunderstanding.

If I run this:

import pandas as pd
from herbie import FastHerbie

# define initialization date and fxx range
init_date = pd.to_datetime('2024-04-12 12:00')
fxx_range = range(24, 34, 3)

# get Herbie object
FH = FastHerbie(DATES=[init_date], model='ifs', product='enfo', fxx=fxx_range)

# search for ":ssrd:sfc:" and NOT ":ssrd:sfc:g"
# (the "g" is right after sfc if there is no member number)
# regex based on https://superuser.com/a/1335688
search_str_ssrd = '^(?=.*:ssrd:sfc:)(?:(?!:ssrd:sfc:g).)*$'

# straight to xarray
ds = FH.xarray(search_str_ssrd)

it takes about 4 minutes, and if I watch the appropriate directory, it looks like .grib2 files are only downloading one at a time.
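
To rule out the search string itself as the cause, here is a minimal sketch that checks the regex with Python's `re` module, independent of Herbie. The three sample lines are hypothetical and made up for illustration; they are not copied from a real index file.

```python
import re

# Same pattern as above: contains ":ssrd:sfc:" but never ":ssrd:sfc:g"
search_str_ssrd = r'^(?=.*:ssrd:sfc:)(?:(?!:ssrd:sfc:g).)*$'

# Hypothetical index lines (assumed format, for illustration only)
sample_lines = [
    ":ssrd:sfc:7:",  # member number right after sfc -> expected to match
    ":ssrd:sfc:g:",  # "g" right after sfc -> expected to be rejected
    ":2t:sfc:7:",    # different variable -> expected to be rejected
]

pattern = re.compile(search_str_ssrd)
matches = [line for line in sample_lines if pattern.search(line)]
print(matches)
```

Only the first sample line should survive the filter, so the pattern selects the same subset whichever method it is passed to.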

If I run this:

# new init_date, same fxx_range
init_date = pd.to_datetime('2024-04-13 12:00')

# get Herbie object
FH = FastHerbie(DATES=[init_date], model='ifs', product='enfo', fxx=fxx_range)

# same search string, download grib files first
FH.download(search_str_ssrd)

# *then* go to xarray dataset
ds = FH.xarray(search_str_ssrd)

it downloads all the files at once and takes about 1 min 30 sec to complete.
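
The timings above were taken by hand. A small helper like the following could make the comparison repeatable. This is only a sketch: `timed` is a hypothetical helper name, not part of Herbie, and the `time.sleep` call is a stand-in so the block runs without network access.

```python
import time

def timed(label, func, *args, **kwargs):
    """Call func(*args, **kwargs), print the wall-clock time, return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f} s")
    return result, elapsed

# Stand-in workload so the sketch is runnable offline
_, sleep_seconds = timed("sleep demo", time.sleep, 0.1)
```

With the objects defined above, the two approaches could then be compared as `timed("xarray only", FH.xarray, search_str_ssrd)` versus `timed("download", FH.download, search_str_ssrd)` followed by `timed("xarray", FH.xarray, search_str_ssrd)`. A different init_date should be used for each run, as in the examples above, so that files already on disk don't skew the result.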
