[WIP] Extended persistence and lower star filtrations #561
Looks great! Let's think about what comes next!
Following @wreise's demonstration that numerical errors can be introduced in the downward sweep if the input has 64-bit float precision (because
Note: In discussion with @wreise, we noted that in the case of extended persistence one cannot dispose of all zero-persistence pairs. Zero-persistence pairs created in "different sweeps" are essential bars in regular persistence! An example is an isolated point.
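To make the note above concrete, here is a minimal sketch of a filter that drops zero-persistence pairs only when both endpoints come from the same sweep. It assumes the `(n_features, 4)` layout from the PR description, where the fourth column is `1` for same-sweep pairs and `-1` otherwise. The helper name `drop_trivial_pairs` is hypothetical and is not part of this PR.

```python
import numpy as np


def drop_trivial_pairs(diagram):
    """Remove zero-persistence pairs that were born and died in the same sweep.

    Hypothetical helper: assumes columns are (birth, death, dimension, flag),
    with flag == 1 for same-sweep pairs and flag == -1 otherwise.
    """
    diagram = np.asarray(diagram, dtype=float)
    zero_persistence = diagram[:, 0] == diagram[:, 1]
    same_sweep = diagram[:, 3] == 1
    # Cross-sweep pairs are kept even when birth == death.
    return diagram[~(zero_persistence & same_sweep)]


diagram = np.array([
    [0.3, 0.3, 0, 1],   # zero persistence, same sweep: dropped
    [0.3, 0.3, 0, -1],  # e.g. an isolated point, different sweeps: kept
    [0.1, 0.7, 0, 1],   # non-trivial pair: kept
])
filtered = drop_trivial_pairs(diagram)
```

The exact equality test is only for illustration. Given the numerical issues mentioned in this thread, a real implementation would probably need a tolerance.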
I had a pretty thorough look, and IMO, it does what it claims to. :) The new additions should be covered by the tests, apart from the plotting method.
@wreise thanks for the thorough look! Of course there are still docs to write, but more importantly, we can't really ship this without intervening in some way on the downstream vectorization methods, even if it's just by saying "this is not supported with extended persistence" in some cases.
Another point is that the input validation is still inadequate (my fault), because this transformer should really only work with sparse input, and yet we don't throw a useful error if the input is dense.
I agree - the PR is far from ready :)
Reference issues/PRs
#337 #546
Types of changes
Description
Begins adding support for extended persistence via a new class `LowerStarFlagPersistence` and a new plotting function `plot_extended_diagram`. Does not yet address downstream processing of extended diagrams. There is also a new data structure for extended persistence diagrams. An extended persistence diagram is a 2D ndarray of shape `(n_features, 4)` where the first 3 columns are as for ordinary persistence (birth-death-dimension), and the fourth is either `1` or `-1`: `1` when the feature was born and died during the same "sweep", `-1` otherwise. This makes it possible to partition the extended diagram into the usual 4 portions. The extended persistence diagrams are obtained by "coning".
Numerical stability issues have not yet been addressed.
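As an illustration of the data structure described above, the following sketch splits an `(n_features, 4)` array into four groups using the fourth column together with the order of birth and death. The function name, the dictionary keys, and the rule "death >= birth" are assumptions made for this example, not the PR's API. They presumably line up with the usual ordinary, relative and two extended portions, but that mapping is not confirmed here.

```python
import numpy as np


def partition_extended_diagram(diagram):
    """Split an (n_features, 4) extended diagram into four groups.

    Hypothetical helper: columns are assumed to be
    (birth, death, dimension, flag) with flag in {1, -1}.
    """
    diagram = np.asarray(diagram, dtype=float)
    if diagram.ndim != 2 or diagram.shape[1] != 4:
        raise ValueError("expected an array of shape (n_features, 4)")
    birth, death, flag = diagram[:, 0], diagram[:, 1], diagram[:, 3]
    same_sweep = flag == 1          # born and died in the same sweep
    non_decreasing = death >= birth  # assumed rule for splitting each half
    return {
        "same_sweep_up": diagram[same_sweep & non_decreasing],
        "same_sweep_down": diagram[same_sweep & ~non_decreasing],
        "cross_sweep_up": diagram[~same_sweep & non_decreasing],
        "cross_sweep_down": diagram[~same_sweep & ~non_decreasing],
    }


example = np.array([
    [0.0, 1.0, 0, 1],
    [2.0, 1.5, 1, 1],
    [0.0, 2.0, 0, -1],
    [2.0, 0.5, 1, -1],
])
parts = partition_extended_diagram(example)
```

Keeping the sweep flag as a separate column means that the first three columns can still be read exactly like an ordinary persistence diagram.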
Checklist
`flake8` to check my Python changes.
`pytest` to check this on Python tests.