First of all, thank you for the great work on the evaluate library — it's been incredibly useful for benchmarking and analyzing model performance!
I’d like to suggest a feature idea that could enhance the interpretability and debugging process when evaluating models. Specifically, it would be great if evaluate could provide a way to retrieve the list of mispredicted examples, including both the model's prediction and the corresponding ground truth.
This could be especially useful for creating visualizations or reports of incorrect predictions.
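For illustration, here is a rough sketch of the kind of workflow I have in mind, using the existing accuracy metric and collecting the mismatches by hand today (the `texts` variable and the `mispredicted` structure are just placeholders for whatever the library might expose directly):

```python
import evaluate

# Current workaround: compute the metric as usual...
accuracy = evaluate.load("accuracy")

predictions = [0, 1, 1, 0]
references = [0, 1, 0, 0]
texts = ["ex A", "ex B", "ex C", "ex D"]  # hypothetical inputs, kept for context

results = accuracy.compute(predictions=predictions, references=references)

# ...then gather the mispredicted examples manually. The feature request is for
# evaluate to return something like this alongside (or in addition to) the score.
mispredicted = [
    {"input": t, "prediction": p, "reference": r}
    for t, p, r in zip(texts, predictions, references)
    if p != r
]

print(results)       # {'accuracy': 0.75}
print(mispredicted)  # [{'input': 'ex C', 'prediction': 1, 'reference': 0}]
```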
Thanks again 🙌