Re-implement scikit-learn's search trees with numba #9

Open
@benbovy

This could be done at a later stage, if we choose to go down this route.

The implementation approach used in scikit-learn is interesting in several respects:

  • the kd-tree and the ball tree are built as thin layers on top of a common binary tree implementation

  • all tree data is pre-allocated, which could make a re-implementation with numba easier and might also make it easier to experiment with those structures and dask.
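To make the pre-allocation point concrete, here is a minimal sketch of a kd-tree-like build that only writes into flat arrays allocated up front. It is loosely inspired by the idea described above and is not scikit-learn's actual layout or code: the names `build_tree`, `make_tree`, `idx_array`, `node_start`, `node_end` and `node_is_leaf` are assumptions made for illustration, as is the fallback to plain Python when numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: run as plain Python if numba is missing
    def njit(func):
        return func


@njit
def build_tree(data, idx_array, node_start, node_end, node_is_leaf):
    """Fill pre-allocated node arrays; node i has children 2i+1 and 2i+2."""
    n_nodes = node_start.shape[0]
    n_features = data.shape[1]
    node_start[0] = 0
    node_end[0] = data.shape[0]
    # parents always come before their children in this layout,
    # so a single forward loop replaces recursion
    for i in range(n_nodes):
        start = node_start[i]
        end = node_end[i]
        if 2 * i + 1 >= n_nodes:
            node_is_leaf[i] = True
            continue
        # pick the dimension with the largest spread among this node's points
        best_dim = 0
        best_spread = -1.0
        for d in range(n_features):
            lo = data[idx_array[start], d]
            hi = lo
            for k in range(start + 1, end):
                v = data[idx_array[k], d]
                if v < lo:
                    lo = v
                if v > hi:
                    hi = v
            if hi - lo > best_spread:
                best_spread = hi - lo
                best_dim = d
        # reorder this node's slice of idx_array along that dimension
        n = end - start
        vals = np.empty(n)
        for k in range(n):
            vals[k] = data[idx_array[start + k], best_dim]
        order = np.argsort(vals)
        sub = idx_array[start:end].copy()
        for k in range(n):
            idx_array[start + k] = sub[order[k]]
        # split at the median position and hand each half to a child
        mid = start + n // 2
        node_start[2 * i + 1] = start
        node_end[2 * i + 1] = mid
        node_start[2 * i + 2] = mid
        node_end[2 * i + 2] = end


def make_tree(data, leaf_size=10):
    """Allocate every array once, then let build_tree fill them in."""
    data = np.ascontiguousarray(data, dtype=np.float64)
    n_samples = data.shape[0]
    # smallest number of levels such that leaves hold at most leaf_size points
    n_levels = 1
    while n_samples > leaf_size * 2 ** (n_levels - 1):
        n_levels += 1
    n_nodes = 2 ** n_levels - 1
    idx_array = np.arange(n_samples, dtype=np.int64)
    node_start = np.zeros(n_nodes, dtype=np.int64)
    node_end = np.zeros(n_nodes, dtype=np.int64)
    node_is_leaf = np.zeros(n_nodes, dtype=np.bool_)
    build_tree(data, idx_array, node_start, node_end, node_is_leaf)
    return idx_array, node_start, node_end, node_is_leaf


rng = np.random.default_rng(0)
points = rng.random((100, 2))
idx_array, node_start, node_end, node_is_leaf = make_tree(points, leaf_size=10)
```

Because the tree is just a handful of NumPy arrays, a ball tree could in principle reuse the same node arrays and differ only in the per-node bounds it stores, which is the "thin layer" idea from the first bullet.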

I think numba is now mature enough, and supported in enough distributions, that we can use it as a dependency. I'm not sure whether numba's jitted classes are mature enough yet, though, or whether we could avoid using them here.

The biggest advantage of using numba is just-in-time compilation, which allows very flexible metric functions.
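As a rough illustration of what flexible metric functions could look like, the sketch below passes a jitted metric as an argument to a jitted search routine. A brute-force scan stands in for the tree query to keep it short. The function names (`euclidean`, `manhattan`, `nearest_neighbor`) are hypothetical, and the fallback to plain Python when numba is not installed is an assumption made so the example runs either way.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # assumption: run as plain Python if numba is missing
    def njit(func):
        return func


@njit
def euclidean(a, b):
    total = 0.0
    for k in range(a.shape[0]):
        diff = a[k] - b[k]
        total += diff * diff
    return np.sqrt(total)


@njit
def manhattan(a, b):
    total = 0.0
    for k in range(a.shape[0]):
        total += abs(a[k] - b[k])
    return total


@njit
def nearest_neighbor(points, query, metric):
    """Brute-force scan; `metric` is any jitted function of two 1-d arrays."""
    best_idx = -1
    best_dist = np.inf
    for i in range(points.shape[0]):
        d = metric(points[i], query)
        if d < best_dist:
            best_dist = d
            best_idx = i
    return best_idx, best_dist


points = np.array([[3.0, 0.0], [2.0, 2.0]])
query = np.array([0.0, 0.0])

idx_e, dist_e = nearest_neighbor(points, query, euclidean)
idx_m, dist_m = nearest_neighbor(points, query, manhattan)
```

With these two points the nearest neighbor depends on the metric: the Euclidean metric picks `[2, 2]` while the Manhattan metric picks `[3, 0]`. A user-defined metric would only need to be decorated the same way to be usable in the search.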
