[ENH] OWSilhouettePlot: displays average silhouette #7092
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7092   +/-   ##
=======================================
  Coverage   88.72%   88.72%
=======================================
  Files         332      332
  Lines       73396    73406   +10
=======================================
+ Hits        65121    65131   +10
  Misses       8275     8275
Hm, what about putting this on the graph, say at the top left, in two lines? Silhouette scores for groups are already there. The second reason is that if it is a part of the figure, it is saved together with the figure.
        self.avg_silhouette_label.setText(
            f"<b>Silhouette:</b> {avg_score:.4f}")
    else:
        self.avg_silhouette_label.setText("<b>Silhouette:</b> N/A")
Must it be bold? I don't think we use bold anywhere else (or at least not often).
We use bold in the Data Info widget, and I copied the style from there. But I agree it sticks out, and I will correct it. Let me work on this in the coming days.
@@ -384,6 +399,8 @@ def _update(self):
            self.Warning.nan_distances(
                count_nandist, s="s" if count_nandist > 1 else "")

        self._update_avg_silhouette()  # Update the average silhouette display
Is this comment necessary?
I know: Copilot and similar tools add heaps and heaps of comments to explain the code, but they often just state the obvious. PEP 8 says they're distracting.
I was working on this in parallel with Blaž. Here is my proposed solution: #7106
Issue
Silhouette Plot displays the silhouette score per group, but not the average silhouette score, which can serve as a reference for judging whether a group is below or above the average.
Also, knowing the average silhouette score helps when judging the quality of a clustering that precedes the Silhouette Plot widget, say, in a workflow that combines Hierarchical Clustering or DBSCAN with Silhouette Plot.
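To make the comparison concrete, here is a minimal standalone sketch of how per-group silhouette means relate to the overall average. It is not the widget's code: the function names (`silhouette_samples`, `summarize`), the toy data, and the use of Euclidean distance are assumptions made for illustration, and it uses only the Python standard library.

```python
from collections import defaultdict
from math import dist
from statistics import mean


def silhouette_samples(points, labels):
    """Return one silhouette value per point, using Euclidean distance."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two groups")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton groups
            continue
        # a: mean distance to the other members of the point's own group
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other group
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


def summarize(points, labels):
    """Return the overall average silhouette and the mean per group."""
    scores = silhouette_samples(points, labels)
    per_group = defaultdict(list)
    for score, label in zip(scores, labels):
        per_group[label].append(score)
    return mean(scores), {g: mean(v) for g, v in per_group.items()}


# Toy data (assumed): group "a" is tight, group "b" has one stray point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (6.0, 6.0)]
labels = ["a", "a", "a", "b", "b", "b"]

average, group_means = summarize(points, labels)
for group, value in sorted(group_means.items()):
    side = "above" if value > average else "below"
    print(f"group {group}: {value:.4f} ({side} the average {average:.4f})")
```

With this toy data, the tight group ends up above the average and the group with the stray point below it, which is the kind of reading the average is meant to support.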
Description of changes
This pull request introduces an info box that reports the average silhouette score.
The style of reporting (the label in bold and the number in normal text) follows that of an Info widget. When input data is missing, the box displays N/A.
Unit tests should most likely be added. So far I have tested the widget by coupling it in a workflow with k-means, which also reports the average silhouette.
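As a starting point for such tests, one option would be to move the label formatting into a small pure function and test that directly. The sketch below is only an illustration under assumptions: `format_average_silhouette` is a hypothetical helper that does not exist in the widget, and the label is plain text rather than bold, following the review suggestion.

```python
import unittest


def format_average_silhouette(scores):
    """Hypothetical helper: build the label text from silhouette scores.

    Returns "Silhouette: N/A" when there are no scores (no input data).
    """
    if scores is None or len(scores) == 0:
        return "Silhouette: N/A"
    avg_score = sum(scores) / len(scores)
    return f"Silhouette: {avg_score:.4f}"


class TestAverageSilhouetteLabel(unittest.TestCase):
    def test_missing_data_shows_na(self):
        self.assertEqual(format_average_silhouette(None), "Silhouette: N/A")
        self.assertEqual(format_average_silhouette([]), "Silhouette: N/A")

    def test_average_has_four_decimals(self):
        self.assertEqual(
            format_average_silhouette([0.5, 0.25]), "Silhouette: 0.3750")
```

Saved to a file, the tests can be run with `python -m unittest`. A test of the widget itself would additionally need to check that the label is refreshed when the input changes, which this sketch does not cover.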
Includes