fix: assign majority class label instead of index in majority voting #58
Last year I used this package for my bachelor thesis on comparing supervised vs. unsupervised SOMs. When I set supervised_iterations=0 the results were unexpectedly poor. For the thesis I worked around it with my own majority-voting implementation (based on your paper), which produced much better accuracy.
On the final day I found the bug in the library by chance, but I didn't have time to open a PR, so I'm submitting the fix now.
In SOMClassifier.py, the loop in _init_super_som assigns node_class to the index of the max count, not the label. The fix assigns the label that has the max count instead.
Now the node is classified by the most frequent label, not its index.
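The difference between the index of the largest count and the label it belongs to can be sketched as follows. This is a minimal illustration using NumPy, not the library's actual code: the helper names majority_index and majority_label and the sample array y_in_node are hypothetical.

```python
import numpy as np


def majority_index(labels):
    """Buggy variant: returns the position of the largest count."""
    _, counts = np.unique(labels, return_counts=True)
    return int(np.argmax(counts))


def majority_label(labels):
    """Fixed variant: returns the label that has the largest count."""
    values, counts = np.unique(labels, return_counts=True)
    return values[np.argmax(counts)]


# Hypothetical labels of the datapoints mapped to one node.
y_in_node = np.array([3, 7, 7, 7, 3])

print(majority_index(y_in_node))  # 1 (a position in the sorted unique labels)
print(majority_label(y_in_node))  # 7 (the most frequent label)
```

The two variants only agree when the labels happen to be 0, 1, 2, ... with no gaps among the datapoints in the node, which may be why the problem is easy to miss.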
BEFORE
AFTER
Small disclaimer
As this was quite some time ago, I haven't fully verified that this is the bug I originally found, but I'm pretty sure it is. The graphs are generated from the code below. It's a modified slice of the code I used for my bachelor thesis work, so it's not the clearest example.
Code