concurrent reads with dask and xarray #2

Open
@scottyhq

I've never been clear on whether or not xarray+dask computations using threads are hampered by file locks. It's complicated because GDAL-->rasterio-->xarray-->dask are all involved.

Rasterio recently updated their example of concurrent processing. This example specifically reads files from local disk rather than over a network.
https://rasterio.readthedocs.io/en/latest/topics/concurrency.html

There is some good discussion about multiple threads reading the same file concurrently here:
pangeo-data/pangeo-example-notebooks#21 (comment)

And finally the rasterio mailing list has some relevant discussion
https://rasterio.groups.io/g/main/topic/72528118#468

A simple example that shows timing improvement would go a long way in helping to clarify this for people. This might require a PR to xarray to deal with locks...
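
As a starting point, here is a minimal sketch of the kind of timing comparison I have in mind. It does not use rasterio, xarray, or dask at all: `fake_read` is a hypothetical stand-in that sleeps to mimic an I/O-bound window read, and the optional `threading.Lock` mimics a shared file lock. The window count, thread count, and delay are made-up values. It only illustrates how a shared lock would serialize threaded reads; whether the real stack behaves this way is exactly the open question.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def fake_read(window, lock=None, delay=0.05):
    """Hypothetical stand-in for reading one raster window.

    time.sleep mimics I/O that releases the GIL; this is NOT a rasterio call.
    """
    if lock is not None:
        # A shared lock forces the reads to happen one at a time.
        with lock:
            time.sleep(delay)
    else:
        time.sleep(delay)
    return window


def timed_reads(n_windows=8, n_threads=4, lock=None):
    """Read all windows with a thread pool; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(lambda w: fake_read(w, lock=lock), range(n_windows)))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    _, t_locked = timed_reads(lock=threading.Lock())
    _, t_free = timed_reads(lock=None)
    print(f"shared lock: {t_locked:.2f} s")
    print(f"no lock:     {t_free:.2f} s")
```

With a shared lock, the threads should take roughly as long as a serial loop; without it, the simulated reads overlap. A real version would swap `fake_read` for an actual windowed read and compare the same two cases.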
