Commit 50ab116

Update examples for MultiMha
1 parent 3319755 commit 50ab116

3 files changed

+49
-5
lines changed

README.md

Lines changed: 39 additions & 0 deletions
@@ -236,6 +236,45 @@ print(feat_selector.SUPPORT)
236236
```
237237
Or, better still, read the documentation at: https://mafese.readthedocs.io/en/latest/
238238

239+
3) I get the following error:
240+
```python
241+
raise ValueError("Existed at least one new label in y_pred.")
242+
ValueError: Existed at least one new label in y_pred.
243+
```
244+
How to solve this?
245+
246+
+ This typically occurs in a classification problem with a small dataset that has many classes. For
247+
instance, the "Zoo" dataset contains only 101 samples but has 7 classes. If you split the dataset into
248+
training and testing sets at a ratio of around 80%-20%, one or more classes may appear in the testing set
249+
but not in the training set, and calculating the performance metrics then raises this error. A model cannot
250+
assign new data to a label it never saw during training, because it has no knowledge of that label.
251+
There are several solutions to this problem.
252+
253+
+ 1st: Use the SMOTE method to address imbalanced data and ensure that all classes have the same number of samples.
254+
255+
```python
256+
from imblearn.over_sampling import SMOTE
257+
import pandas as pd
258+
from mafese import Data
259+
260+
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
261+
X, y = dataset[:, 0:-1], dataset[:, -1]
262+
263+
X_new, y_new = SMOTE().fit_resample(X, y)
264+
data = Data(X_new, y_new)
265+
```
266+
267+
+ 2nd: Try a different random_state value in the split_train_test() function.
268+
```python
269+
import pandas as pd
270+
from mafese import Data
271+
272+
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
273+
X, y = dataset[:, 0:-1], dataset[:, -1]
274+
data = Data(X, y)
275+
data.split_train_test(test_size=0.2, random_state=10) # Try different random_state value
276+
```
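Before trying either fix, it can help to confirm that unseen labels really are the cause. The sketch below is only an illustration: `unseen_test_labels` is a hypothetical helper, not part of mafese, and the label arrays are made up. With mafese you would pass the label arrays produced by `split_train_test()`, assuming the test labels are exposed as `data.y_test` alongside `data.y_train`.

```python
import numpy as np


def unseen_test_labels(y_train, y_test):
    """Return the sorted labels that occur in y_test but never in y_train."""
    return np.setdiff1d(np.unique(y_test), np.unique(y_train))


# Made-up labels: classes 4 and 5 never appear in the training split.
y_train = np.array([0, 0, 1, 1, 2, 2, 3])
y_test = np.array([0, 1, 4, 5, 4])

missing = unseen_test_labels(y_train, y_test)
if missing.size > 0:
    print("Labels missing from the training set:", missing.tolist())
else:
    print("Every test label also occurs in the training set.")
```

An empty result means the error has some other cause, and changing `random_state` or resampling is unlikely to help.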
277+
239278

240279
For more usage examples please look at [examples](/examples) folder.
241280
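A further option, given here as a sketch rather than something the README above documents, is a stratified split, which keeps each class's proportion roughly equal in the training and testing sets. The example uses scikit-learn's `train_test_split` with `stratify=y` on plain arrays, since it is not stated here whether `split_train_test()` supports stratification. The class counts and features are synthetic, chosen only to mimic a small, imbalanced dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced labels: 101 samples spread over 7 classes (made-up data).
class_counts = [41, 20, 5, 13, 4, 8, 10]
y = np.repeat(np.arange(len(class_counts)), class_counts)
X = np.random.default_rng(0).normal(size=(y.size, 16))

# stratify=y keeps each class's share roughly equal in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Check that no label in the test set is missing from the training set.
unseen = set(y_test.tolist()) - set(y_train.tolist())
print("Unseen test labels:", sorted(unseen))
```

Stratification needs at least two samples in every class. A class with a single sample cannot be split this way and has to be dropped, merged, or oversampled first.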

examples/wrapper/exam_multimha.py

Lines changed: 7 additions & 2 deletions
@@ -11,12 +11,17 @@
 data.split_train_test(test_size=0.2)

 list_optimizers = ("OriginalWOA", "OriginalGWO", "OriginalTLO", "OriginalGSKA")
-list_paras = [{"epoch": 50, "pop_size": 30}, ]*4
+list_paras = [
+    {"name": "WOA", "epoch": 20, "pop_size": 30},
+    {"name": "GWO", "epoch": 20, "pop_size": 30},
+    {"name": "TLO", "epoch": 20, "pop_size": 30},
+    {"name": "GSKA", "epoch": 20, "pop_size": 30}
+]
 feat_selector = MultiMhaSelector(problem="classification", estimator="knn",
                                  list_optimizers=list_optimizers, list_optimizer_paras=list_paras,
                                  transfer_func="vstf_01", obj_name="AS")

-feat_selector.fit(data.X_train, data.y_train, n_trials=3, n_jobs=3, verbose=False)
+feat_selector.fit(data.X_train, data.y_train, n_trials=2, n_jobs=2, verbose=False)
 feat_selector.export_boxplot_figures()
 feat_selector.export_convergence_figures()

run_fs.py

Lines changed: 3 additions & 3 deletions
@@ -8,12 +8,12 @@
 from sklearn.svm import SVC


-data = get_dataset("Arrhythmia")
-data.split_train_test(test_size=0.2)
+data = get_dataset("ecoli")
+data.split_train_test(test_size=0.2, random_state=2)
 print(data.X_train.shape, data.X_test.shape) # (361, 279) (91, 279)

 feat_selector = MhaSelector(problem="classification", estimator="knn",
-                            optimizer="OriginalTLO", optimizer_paras=None,
+                            optimizer="OriginalTLO", optimizer_paras={"epoch": 50, "pop_size": 30},
                             transfer_func="vstf_01", obj_name="AS")
 feat_selector.fit(data.X_train, data.y_train, fit_weights=(0.9, 0.1), verbose=True)
 X_selected = feat_selector.transform(data.X_train)
