Port Python ArviZ's KDE implementation #6

Open
sethaxen opened this issue Aug 7, 2023 · 0 comments
Labels
enhancement New feature or request

sethaxen commented Aug 7, 2023

The main differences between KernelDensity.jl and Python arviz's KDE implementation are:

  • arviz uses an "experimental" bandwidth defined as the average of Silverman's bandwidth and the Improved Sheather-Jones (ISJ) bandwidth (described in https://doi.org/10.1214/10-AOS799). This default is based on a simulation study by @tomicapretto (a version can be found at https://github.com/tomicapretto/density_estimation). Silverman's rule oversmooths and performs poorly for multimodal distributions, while ISJ handles multimodal distributions well but undersmooths. The average of the two is a useful compromise that is not much more expensive to compute.
  • KernelDensity.jl does not automatically pad by default, so generally the density either extends well beyond the data limits or wraps around at the data limits. Neither is ideal. One solution is to pad the user-selected evaluation range by ~4 bandwidths on both sides when convolving and then discard the padded parts of the KDE. Instead of discarding them, following https://doi.org/10.1111/j.2517-6161.1971.tb00855.x and Section 2.10 of https://doi.org/10.1201/9781315140919, arviz reflects the part of the data set within 4 bandwidths of each boundary. (EDIT: for a normal kernel, the ISJ paper shows that this approach is equivalent to replacing the normal kernel with a diffusion kernel on the interval defined by the data range.)
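
To make the two points above concrete, here is a rough NumPy-only sketch, not arviz's actual code. The function names are made up for illustration, the ISJ bandwidth is assumed to be computed elsewhere and passed in, and the KDE is evaluated by direct summation rather than by the FFT-based convolution that both packages use.

```python
import numpy as np


def silverman_bandwidth(x):
    """Silverman's rule of thumb: 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    q25, q75 = np.percentile(x, [25, 75])
    scale = min(np.std(x), (q75 - q25) / 1.34)
    return 0.9 * scale * len(x) ** (-0.2)


def experimental_bandwidth(x, isj_bandwidth):
    """Average of Silverman's bandwidth and an ISJ bandwidth.

    `isj_bandwidth` is an assumption of this sketch: the ISJ estimate is
    not implemented here and must be supplied by the caller.
    """
    return 0.5 * (silverman_bandwidth(x) + isj_bandwidth)


def reflected_gaussian_kde(x, bw, grid_len=200, n_bw=4.0):
    """Gaussian KDE on [min(x), max(x)] with boundary reflection.

    Points within `n_bw` bandwidths of a boundary are mirrored across it,
    so the kernel mass that would leak past the data range is folded back.
    """
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()

    # Mirror only the points close enough to a boundary to matter.
    left = 2.0 * lo - x[x < lo + n_bw * bw]
    right = 2.0 * hi - x[x > hi - n_bw * bw]
    augmented = np.concatenate([left, x, right])

    # Evaluate only inside the data range; nothing is padded or discarded.
    grid = np.linspace(lo, hi, grid_len)
    z = (grid[:, None] - augmented[None, :]) / bw

    # Normalize by the original sample size, not the augmented one, so the
    # density still integrates to roughly 1 over [lo, hi].
    norm = len(x) * bw * np.sqrt(2.0 * np.pi)
    density = np.exp(-0.5 * z**2).sum(axis=1) / norm
    return grid, density
```

For roughly uniform data on [0, 1], an unreflected Gaussian KDE drops to about half the true density at the boundaries, whereas the reflected version stays close to 1 there.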

These features can and should probably be upstreamed to KernelDensity. However, we will probably still have our own kde method that wraps KernelDensity.kde so that we can change the default settings.

Other optional features that could be ported would be

  • adaptive KDE
  • circular KDE

but these could be left for future work, as they are probably not commonly used.

sethaxen added the enhancement label Aug 7, 2023