Missing data allowed when training, but not with tree interpreters #983

Open
@fburgaud

Description

This issue is self-explanatory.
Objects like DML have an allow_missing parameter that permits null values in the input.
This can work fine in two scenarios:

  • when the underlying models can deal with missing values
  • when a feature preprocessor is used to deal with nulls before the data is fed to the underlying models

In both scenarios, DML can be trained. The tree interpreters, however, use X directly, both to fit the tree and to obtain the DML predictions. They work from the raw X matrix and do not apply the feature preprocessor used by the DML object, so the null values reach the interpreter unhandled.

Ideally, the tree interpreters would offer their own way to deal with null values, so that they remain usable when DML is trained on data containing nulls.
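Until something like that exists, one possible workaround is to clean X before handing it to the interpreter. The sketch below is only an illustration of that idea. It assumes the interpreter exposes an `interpret(estimator, X)` method, and the names `median_fill` and `interpret_with_preprocessing` are hypothetical helpers, not part of any library. The median fill is a stand-in; in practice you would probably pass the same preprocessor that the DML object uses.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def median_fill(rows):
    """Return a copy of rows with missing entries replaced by the column median."""
    medians = []
    for col in zip(*rows):
        observed = [v for v in col if not is_missing(v)]
        # Fall back to 0.0 when a column has no observed values at all.
        medians.append(median(observed) if observed else 0.0)
    return [
        [medians[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]


def interpret_with_preprocessing(interpreter, estimator, X, preprocess=median_fill):
    """Hypothetical helper: clean X, then pass the cleaned copy to the interpreter.

    Assumes only that the interpreter has an interpret(estimator, X) method.
    The original X is left untouched.
    """
    X_clean = preprocess(X)
    interpreter.interpret(estimator, X_clean)
    return X_clean
```

Note the caveat: the cleaned X is also what the interpreter uses for the DML predictions. If the underlying models handle nulls natively, predictions on the filled X may differ from predictions on the raw X, so this is a stopgap rather than a fix.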
