@@ -236,6 +236,45 @@ print(feat_selector.SUPPORT)
```
For full details, read the documentation at https://mafese.readthedocs.io/en/latest/

3) I got this error:
```python
raise ValueError("Existed at least one new label in y_pred.")
ValueError: Existed at least one new label in y_pred.
```
How can I solve it?

+ This error usually occurs when you are working on a classification problem with a small dataset that has many
classes. For instance, the "Zoo" dataset contains only 101 samples but has 7 classes. If you split the dataset into
training and testing sets with a ratio of around 80%-20%, there is a chance that one or more classes appear in the
testing set but not in the training set. As a result, you may encounter this error when the performance metrics are
calculated. A model cannot assign new data to a label it has no knowledge of. There are several solutions to this
problem.

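Before picking a fix, it can help to confirm which labels are actually missing from the training set. The snippet below is a minimal sketch in plain Python. `unseen_labels` is a hypothetical helper written for illustration and is not part of mafese, and the label lists are toy data.

```python
def unseen_labels(y_train, y_test):
    """Return the labels that occur in y_test but never in y_train."""
    return sorted(set(y_test) - set(y_train))


# Toy labels: class 6 appears only in the test split.
y_train = [0, 1, 1, 2, 3, 4, 5, 0, 2]
y_test = [1, 3, 6]

missing = unseen_labels(y_train, y_test)
print(missing)
```

An empty result means every test label was seen during training, so the split itself is not the cause of the error.
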
+ 1st: Use the SMOTE method to address imbalanced data and ensure that all classes have the same number of samples.

```python
from imblearn.over_sampling import SMOTE
import pandas as pd
from mafese import Data

# Load the dataset: the last column is the label, the rest are features
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]

# Oversample the minority classes so every class has the same number of samples.
# Very small classes may need a smaller k_neighbors than SMOTE's default of 5.
X_new, y_new = SMOTE().fit_resample(X, y)
data = Data(X_new, y_new)
```

+ 2nd: Use a different random_state value in the split_train_test() function.

```python
import pandas as pd
from mafese import Data

# Load the dataset: the last column is the label, the rest are features
dataset = pd.read_csv('examples/dataset.csv', index_col=0).values
X, y = dataset[:, 0:-1], dataset[:, -1]
data = Data(X, y)
data.split_train_test(test_size=0.2, random_state=10)  # Try different random_state values
```

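Trying seeds by hand can be automated. The sketch below illustrates the idea with a hand-rolled shuffle split in plain Python: it loops over candidate seeds and returns the first one for which every test label also appears in the training labels. `split_indices` and `find_seed` are hypothetical helpers, not mafese functions, and a seed found this way applies only to this toy split. With mafese, the same label check would have to be run on the splits that `split_train_test()` produces.

```python
import random


def split_indices(n_samples, test_size, seed):
    """Shuffle the sample indices with the given seed and cut off a test part."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(n_samples * test_size))
    return idx[n_test:], idx[:n_test]  # (train indices, test indices)


def find_seed(y, test_size=0.2, max_seed=100):
    """Return the first seed whose test labels all occur in the training labels."""
    for seed in range(max_seed):
        train_idx, test_idx = split_indices(len(y), test_size, seed)
        train_labels = {y[i] for i in train_idx}
        if all(y[i] in train_labels for i in test_idx):
            return seed
    return None  # No suitable seed among the candidates tried


# Toy labels: class 2 has only two samples, so some splits leave it out of training.
y = [0] * 10 + [1] * 10 + [2] * 2
seed = find_seed(y)
print(seed)
```

If no seed works within the budget, the rarest class probably has too few samples, and resampling as in the first solution is the more reliable route.
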
For more usage examples, please look at the [examples](/examples) folder.