ERA5 daily dataset #75

@lopsided


Hi,

I am excited to try out the shiny new ERA5 anemoi dataset, but for my purposes I need it at daily resolution.

I could do this manually with zarr/dask etc., but I imagine it would be somewhat painful, and I don't want to break any of the nice optimisations/encoding/chunking set up on the 6-hourly dataset. I would like to write the daily data to a new zarr dataset rather than aggregate on the fly, as I think that would be a big bottleneck for ML training.

I saw this: https://anemoi.readthedocs.io/projects/datasets/en/latest/howtos/using/01-interpolate-step-dataset-combination.html#interpolate-step which looks like it could be useful, but it's not clear how values are generated at the lower temporal resolution. E.g., precipitation is a 6-hour accumulation, so we would want to sum the 4 values at hours 6/12/18/24, but wind speeds are instantaneous, so we would want to take the mean of the 4 values at hours 0/6/12/18. How is the aggregation done here?
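To make the distinction concrete, a minimal NumPy sketch of the two aggregations might look like this. It is not how anemoi does it (that is the open question here). It assumes the series starts at 00 UTC, that time is the first axis, and that each accumulated value covers the 6 hours ending at its timestamp. The function names and the toy values are purely illustrative.

```python
import numpy as np

STEPS_PER_DAY = 4  # 6-hourly data: 00, 06, 12, 18 UTC


def daily_mean_instantaneous(values):
    """Mean of the 00/06/12/18 values of each day (first sample is at 00 UTC)."""
    values = np.asarray(values, dtype=float)
    n_days = values.shape[0] // STEPS_PER_DAY
    trimmed = values[: n_days * STEPS_PER_DAY]  # drop any incomplete last day
    return trimmed.reshape(n_days, STEPS_PER_DAY, *values.shape[1:]).mean(axis=1)


def daily_sum_accumulated(values):
    """Sum of the accumulations stamped 06/12/18/24 of each day.

    Assumption: each value covers the 6 hours ending at its timestamp, so the
    sample at 00 UTC belongs to the previous day and the first sample is skipped.
    """
    values = np.asarray(values, dtype=float)
    shifted = values[1:]  # series now starts at 06 UTC of the first day
    n_days = shifted.shape[0] // STEPS_PER_DAY
    trimmed = shifted[: n_days * STEPS_PER_DAY]
    return trimmed.reshape(n_days, STEPS_PER_DAY, *values.shape[1:]).sum(axis=1)


# Toy series: 9 six-hourly samples starting at 00 UTC
# (two full days plus the closing 00 UTC sample of the third day).
wind = np.array([2.0, 4.0, 6.0, 8.0, 1.0, 1.0, 1.0, 1.0, 5.0])
precip = np.array([9.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0])

daily_wind = daily_mean_instantaneous(wind)    # one mean per day
daily_precip = daily_sum_accumulated(precip)   # one total per day
```

The one-step shift for accumulated fields is the part that is easy to get wrong: the value stamped 00 UTC closes the previous day's total, so the accumulated series needs one more time step than the instantaneous one to complete the last day.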

Would you suggest using anemoi to generate the daily dataset? Are there any tips or suggestions for doing this? Or would I be better off just writing a script to do it all manually?

Many thanks
