
Predicate option for BatchGenerator #162

Open
@cmdupuis3

Description

Is your feature request related to a problem?

When you create a batch generator, what happens if the data contain NaNs? For example, in an ocean data set such as a map of sea surface temperature, you may iterate through regions where the stencil is fully valid, partially valid, or entirely NaN. Because xbatcher cannot filter for these cases, any filtering has to be applied inside the batch loop, which leads to load imbalance.

Describe the solution you'd like

I would like to see an option in BatchGenerator for a selection predicate. You would pass BatchGenerator a function that takes a slice as input and returns either True or False. BatchGenerator would then yield only the slices for which the function returned True, thereby restoring load balance.
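To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is plain Python and does not use xbatcher at all: `windows`, `select_batches`, and `all_valid` are hypothetical names made up for illustration, and a `predicate` argument is not an existing BatchGenerator parameter as far as I know.

```python
import math
from typing import Callable, Iterable, Iterator, List, Sequence

Batch = List[float]


def windows(values: Sequence[float], size: int) -> Iterator[Batch]:
    """Stand-in for a batch generator: yield non-overlapping windows of length `size`."""
    for start in range(0, len(values) - size + 1, size):
        yield list(values[start:start + size])


def select_batches(
    batches: Iterable[Batch], predicate: Callable[[Batch], bool]
) -> Iterator[Batch]:
    """Hypothetical predicate option: yield only the batches where predicate(batch) is True."""
    for batch in batches:
        if predicate(batch):
            yield batch


def all_valid(batch: Batch) -> bool:
    """Example predicate: accept a batch only if it contains no NaN."""
    return not any(math.isnan(v) for v in batch)


nan = float("nan")
# Toy 1D "sea surface temperature" series with land/missing points as NaN.
sst = [14.1, 14.3, nan, nan, 15.0, 15.2, nan, 15.9]

selected = list(select_batches(windows(sst, 2), all_valid))
print(selected)  # [[14.1, 14.3], [15.0, 15.2]]
```

The training loop would then only ever see the two fully valid windows. A different predicate (for example, one that tolerates a fraction of NaNs) could accept partially valid stencils instead.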

Describe alternatives you've considered

No response

Additional context

I think this is similar to #158

Metadata


Labels

duplicate, feature
