
iter_confusion_matrices a bottleneck #4


Description

@yasirs

I am experimenting with very large datasets (~1e6 to 1e7 points). Storing the data as (threshold, label) tuples and then computing the measures and confusion matrices in pure Python is much, much slower than keeping the data in numpy arrays (where available) and doing vectorized operations on them. I don't know if there is interest in something like this.

I might attempt to implement something like that to be able to handle the large datasets.
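For reference, here is a minimal sketch of what I mean by the vectorized approach. The function name and signature are hypothetical, not the library's current API; it just shows how sorting plus cumulative sums can replace the per-threshold Python loop:

```python
import numpy as np

def confusion_matrices_vectorized(scores, labels):
    """Return TP, FP, TN, FN counts at every distinct score threshold.

    scores : 1-D array of real-valued scores (higher = more positive)
    labels : 1-D boolean / 0-1 array of ground-truth labels
    """
    scores = np.asarray(scores)
    labels = np.asarray(labels, dtype=bool)

    # Sort by descending score so that the threshold at position i
    # classifies the first i+1 items as positive.
    order = np.argsort(-scores, kind="mergesort")
    sorted_labels = labels[order]

    # Cumulative counts give TP/FP at every threshold in O(n) after the sort.
    tp = np.cumsum(sorted_labels)
    fp = np.cumsum(~sorted_labels)
    fn = sorted_labels.sum() - tp
    tn = (~sorted_labels).sum() - fp
    return tp, fp, tn, fn

# Hypothetical usage on synthetic data:
rng = np.random.default_rng(0)
scores = rng.random(1_000_000)
labels = rng.random(1_000_000) < scores  # labels correlated with scores
tp, fp, tn, fn = confusion_matrices_vectorized(scores, labels)
```

The cost is one O(n log n) sort plus O(n) cumulative sums, instead of rebuilding a confusion matrix per threshold in Python, so it should scale comfortably to the 1e6 to 1e7 point range.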
