This could be done at a later stage, if we choose to go down this route.
The implementation approach used in scikit-learn is interesting in several respects:
- kd-tree and ball tree are built as thin layers on top of a common binary tree implementation
- all tree data is pre-allocated, which could make a re-implementation with numba easier and might facilitate experimenting with those structures and dask.
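
To make the pre-allocation point concrete, here is a rough sketch of a balanced kd-tree whose nodes all live in flat arrays sized before the build starts. It is only loosely modeled on the idea described above and is not scikit-learn's actual code: the function name `build_kdtree`, the array names, and the implicit child layout (`2 * node + 1` and `2 * node + 2`) are all assumptions made for illustration.

```python
import numpy as np


def build_kdtree(points, leaf_size=2):
    """Build a balanced kd-tree stored entirely in pre-allocated arrays.

    Hypothetical helper: assumes `points` is a non-empty (n_samples,
    n_features) array and `leaf_size >= 1`.
    """
    n_samples, n_features = points.shape

    # The tree is balanced, so the node count is known up front.
    n_levels = int(np.log2(max(1, (n_samples - 1) / leaf_size))) + 1
    n_nodes = 2 ** n_levels - 1

    # All tree data is allocated once, before any node is built.
    idx = np.arange(n_samples)                        # permutation of point indices
    node_start = np.zeros(n_nodes, dtype=np.intp)     # slice of `idx` owned by node
    node_end = np.zeros(n_nodes, dtype=np.intp)
    node_is_leaf = np.zeros(n_nodes, dtype=np.bool_)
    node_lower = np.empty((n_nodes, n_features))      # bounding box per node
    node_upper = np.empty((n_nodes, n_features))

    node_start[0], node_end[0] = 0, n_samples
    # Parents always have a smaller index than their children, so a plain
    # loop over node indices visits each parent before its children.
    for node in range(n_nodes):
        start, end = node_start[node], node_end[node]
        pts = points[idx[start:end]]
        node_lower[node] = pts.min(axis=0)
        node_upper[node] = pts.max(axis=0)

        left = 2 * node + 1
        if left >= n_nodes:
            node_is_leaf[node] = True
            continue

        # Split at the median along the dimension with the largest spread.
        dim = np.argmax(node_upper[node] - node_lower[node])
        order = np.argsort(pts[:, dim], kind="stable")
        idx[start:end] = idx[start:end][order]
        mid = (start + end) // 2
        node_start[left], node_end[left] = start, mid
        node_start[left + 1], node_end[left + 1] = mid, end

    return idx, node_start, node_end, node_is_leaf, node_lower, node_upper
```

Because the result is just a handful of plain arrays with no Python objects or pointers between nodes, the same layout could in principle be filled in by a jitted function or shipped between workers, which is the property of interest here.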
I think numba is now mature enough, and well enough supported across distributions, that we can use it as a dependency. I'm not sure how mature numba's jitted classes are, though, or whether we could avoid using them here.
The biggest advantage of using numba is just-in-time compilation that allows very flexible metric functions.
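
As an illustration of what flexible metric functions could look like, here is a minimal sketch in which the metric is an ordinary jitted function passed into a jitted search routine. The brute-force `nearest` search stands in for a real tree query, and the names `euclidean`, `manhattan` and `nearest` are made up for this example. The `try`/`except` fallback is only there so the snippet still runs (slowly) where numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run as plain Python without numba.
    def njit(func):
        return func


@njit
def euclidean(a, b):
    total = 0.0
    for i in range(a.shape[0]):
        d = a[i] - b[i]
        total += d * d
    return np.sqrt(total)


@njit
def manhattan(a, b):
    total = 0.0
    for i in range(a.shape[0]):
        total += abs(a[i] - b[i])
    return total


@njit
def nearest(points, query, metric):
    """Brute-force nearest neighbor of `query` under a user-supplied metric."""
    best_idx = -1
    best_dist = np.inf
    for i in range(points.shape[0]):
        d = metric(points[i], query)
        if d < best_dist:
            best_dist = d
            best_idx = i
    return best_idx, best_dist


points = np.array([[0.0, 3.0], [2.0, 2.0]])
query = np.array([0.0, 0.0])
print(nearest(points, query, euclidean))  # nearest point is index 1
print(nearest(points, query, manhattan))  # nearest point is index 0
```

With numba installed, `nearest` is compiled separately for each metric it is called with, so a user-defined metric should run at compiled speed without touching the search code.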