[ENH] OWSilhouettePlot: displays average silhouette #7092


Open: wants to merge 3 commits into master

BlazZupan commented May 23, 2025

Issue

Silhouette Plot displays the silhouette score for each group but not the average silhouette score, which can serve as a reference for judging whether a group is below or above average.

image

Knowing the average silhouette score also helps when judging the quality of a clustering that precedes the Silhouette Plot widget, say, in a workflow that combines Hierarchical Clustering or DBSCAN with Silhouette Plot.
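To illustrate the comparison described here, a minimal sketch (not the widget's code) of how per-group means can be judged against the overall average. The function name `compare_groups_to_average` and the sample values are hypothetical, and the per-sample silhouette scores are assumed to be already computed:

```python
from collections import defaultdict
from statistics import fmean


def compare_groups_to_average(scores, labels):
    """Return the overall mean silhouette and, per group, a tuple of
    (group mean, whether the group mean is at or above the overall mean).

    Hypothetical helper for illustration; `scores` are per-sample
    silhouette values and `labels` the matching group labels.
    """
    overall = fmean(scores)
    by_group = defaultdict(list)
    for score, label in zip(scores, labels):
        by_group[label].append(score)
    groups = {}
    for label, values in by_group.items():
        group_mean = fmean(values)
        groups[label] = (group_mean, group_mean >= overall)
    return overall, groups


# Made-up scores for two groups, chosen only to show the idea.
overall, groups = compare_groups_to_average(
    [0.75, 0.25, 0.5, 0.0], ["C1", "C1", "C2", "C2"])
print(f"average: {overall:.4f}")
for label, (mean, at_or_above) in groups.items():
    print(label, f"{mean:.4f}", "above/at average" if at_or_above else "below average")
```

With the average available as a reference, a group whose mean falls below it stands out immediately.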

Description of changes

This pull request introduces an info box that reports on the average silhouette score:

image

The style of reporting (the label in bold and the number in normal text) follows that of an Info widget. When input data is missing, the box displays N/A.
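A rough sketch of the label text described here, assuming a hypothetical helper `format_avg_silhouette` that is not part of the widget. The format strings match the ones in the reviewed diff; skipping NaN scores is an assumption made for this example only:

```python
import math
from statistics import fmean
from typing import Optional, Sequence


def format_avg_silhouette(scores: Optional[Sequence[float]]) -> str:
    """Build the info-box text: bold label, plain number, N/A without data.

    Hypothetical helper; ignoring NaN scores is an assumption, not
    necessarily what the widget does.
    """
    if scores is None:
        return "<b>Silhouette:</b> N/A"
    valid = [s for s in scores if not math.isnan(s)]
    if not valid:
        return "<b>Silhouette:</b> N/A"
    return f"<b>Silhouette:</b> {fmean(valid):.4f}"


print(format_avg_silhouette(None))
print(format_avg_silhouette([0.5, 0.25]))
```

Keeping the formatting in one small function would also make it easy to cover the N/A case in a unit test.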

Unit tests should most likely be added. So far I have tested the widget by connecting it in a workflow with k-means, which also reports the average silhouette:

image
Includes
  • Code changes
  • Tests
  • Documentation

@BlazZupan BlazZupan changed the title OWSilhouettePlot: displays average silhouette [Enh] OWSilhouettePlot: displays average silhouette May 23, 2025
@BlazZupan BlazZupan changed the title [Enh] OWSilhouettePlot: displays average silhouette [ENH] OWSilhouettePlot: displays average silhouette May 23, 2025

codecov bot commented May 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.72%. Comparing base (3c36b1c) to head (52453dc).

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7092   +/-   ##
=======================================
  Coverage   88.72%   88.72%           
=======================================
  Files         332      332           
  Lines       73396    73406   +10     
=======================================
+ Hits        65121    65131   +10     
  Misses       8275     8275           

janezd commented May 23, 2025

Hm, what about putting this on the graph, say at the top left, in two lines?

One reason is that the silhouette scores for groups are already there. The second is that if it is part of the figure, it is saved together with the figure.

    self.avg_silhouette_label.setText(
        f"<b>Silhouette:</b> {avg_score:.4f}")
else:
    self.avg_silhouette_label.setText("<b>Silhouette:</b> N/A")

Must it be bold? I don't think we use bold anywhere else (or at least not often).

BlazZupan (author) replied:

We used bold in the Data Info widget, and I copied the style from there. But I agree it sticks out, and I will correct it. Let me work on this in the following days.

@@ -384,6 +399,8 @@ def _update(self):
self.Warning.nan_distances(
count_nandist, s="s" if count_nandist > 1 else "")

self._update_avg_silhouette() # Update the average silhouette display

Is this comment necessary?

I know: Copilot and similar tools add heaps and heaps of comments to explain the code, but they often just state the obvious. PEP 8 says such comments are distracting.

ajdapretnar commented

I was working on this in parallel with Blaž. Here is my proposed solution: #7106
