[MRG] ENH: safe-level SMOTE #626
laurallu wants to merge 5 commits into scikit-learn-contrib:master
Conversation
This pull request introduces 1 alert when merging bcc3069 into 321b751 (view on LGTM.com).

Codecov Report

    @@            Coverage Diff             @@
    ##           master     #626      +/-   ##
    ==========================================
    + Coverage   97.93%   97.98%   +0.05%
    ==========================================
      Files          83       84       +1
      Lines        4784     4911     +127
    ==========================================
    + Hits         4685     4812     +127
      Misses         99       99
    ==========================================

Continue to review full report at Codecov.

Thanks for the contribution. You will need to add tests to check that the new function gives the expected results.

Oh, I see that you mentioned it now :)

I just added some tests. Any suggestions?

This pull request introduces 1 alert when merging 394d686 into 321b751 (view on LGTM.com).

imblearn/over_sampling/_smote.py (outdated):

        sampling_strategy=BaseOverSampler._sampling_strategy_docstring,
        random_state=_random_state_docstring,
    )
    class SLSMOTE(BaseSMOTE):
imblearn/over_sampling/_smote.py (outdated):

        self.m_neighbors = m_neighbors

        def _assign_sl(self, nn_estimator, samples, target_class, y):
I would use the name _assign_safe_levels unless it hurts the readability in the calling code.

Thanks!

This pull request introduces 1 alert when merging fd11e32 into afbf781 (view on LGTM.com).

I've made changes accordingly. I think it's probably ready to go through a detailed review.

I would suggest moving this implementation into smote_variants. The idea behind this move is to benchmark the SMOTE variants on a common benchmark across a large number of datasets and include in imbalanced-learn only the versions that show an advantage. You can see the discussion and contribute to it: https://github.com/gykovacs/smote_variants/issues/14 @laurallu, would this strategy be fine with you?

Since Safe Level SMOTE already exists there, IMHO we should review @laurallu's PR and merge it in.

OK, let's do that. Let's open an issue to discuss the inclusion criterion and explain what we expect in the future. I will review this PR in the near future.

Thanks for pointing out smote_variants to me. I will check it out. I would love to see the inclusion criterion too, since I might code up more methods.
This is an implementation of the safe-level SMOTE proposed in the following paper:
C. Bunkhumpornpat, K. Sinapiromsaran, C. Lursinsap, "Safe-level-SMOTE: Safe-level-synthetic minority over-sampling technique for handling the class imbalanced problem," in T. Theeramunkong, B. Kijsirikul, N. Cercone, T.B. Ho (eds), Advances in Knowledge Discovery and Data Mining, PAKDD 2009, Lecture Notes in Computer Science, vol. 5476, Springer, Berlin, Heidelberg, pp. 475-482, 2009.
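For readers unfamiliar with the method, the core idea from the paper can be sketched as follows: each minority sample's "safe level" is the number of minority samples among its k nearest neighbors, and the interpolation gap between a sample and its minority neighbor is biased toward the safer of the two. The function below is an illustrative, self-contained sketch of those rules using scikit-learn's `NearestNeighbors`; the name `safe_level_smote` and its signature are assumptions for demonstration, not the PR's actual API (the PR implements the method inside an `SLSMOTE` class).

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def safe_level_smote(X, y, minority_class, k=5, random_state=0):
    """Illustrative sketch of safe-level SMOTE (Bunkhumpornpat et al., 2009).

    Hypothetical helper, not the PR's API: generates at most one synthetic
    sample per minority sample, choosing the interpolation gap from the
    paper's five safe-level-ratio cases.
    """
    rng = np.random.RandomState(random_state)
    X_min = X[y == minority_class]
    # Safe level: number of minority samples among each minority sample's
    # k nearest neighbors in the full dataset (self excluded via [:, 1:]).
    nn_all = NearestNeighbors(n_neighbors=k + 1).fit(X)
    idx = nn_all.kneighbors(X_min, return_distance=False)[:, 1:]
    safe_levels = (y[idx] == minority_class).sum(axis=1)
    # Candidate interpolation partners: nearest minority-class neighbors.
    nn_min = NearestNeighbors(n_neighbors=min(k + 1, len(X_min))).fit(X_min)
    idx_min = nn_min.kneighbors(X_min, return_distance=False)[:, 1:]

    synthetic = []
    for i, p in enumerate(X_min):
        j = rng.choice(idx_min[i])          # random minority neighbor n
        sl_p, sl_n = safe_levels[i], safe_levels[j]
        if sl_p == 0 and sl_n == 0:
            continue                        # both samples look like noise: skip
        if sl_n == 0:
            gap = 0.0                       # neighbor unsafe: duplicate p
        else:
            ratio = sl_p / sl_n
            if ratio == 1:
                gap = rng.uniform(0, 1)             # equally safe: anywhere
            elif ratio > 1:
                gap = rng.uniform(0, 1 / ratio)     # p safer: stay near p
            else:
                gap = rng.uniform(1 - ratio, 1)     # n safer: stay near n
        synthetic.append(p + gap * (X_min[j] - p))
    return np.asarray(synthetic)
```

Unlike vanilla SMOTE, which draws the gap uniformly from (0, 1), every case above pulls the synthetic point toward the sample with the higher safe level, so new points land in safer regions of the minority class.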
Todo list: