This project focuses on exploring and classifying exoplanets using machine learning (ML) techniques.
- Project Pipeline:
  - Exploratory Data Analysis (EDA):
    - The project visualizes the data and checks it for anomalies, which form the basis for its assessments of the dataset.
    - The data is then cleaned and preprocessed to make it suitable for model training.
  - Modeling:
    - The K-Nearest Neighbors (KNN) algorithm is used for classification.
  - Performance Metrics:
    - The model is evaluated using ROC (Receiver Operating Characteristic) curves.
    - AUC (Area Under the Curve) is computed to summarize how well the model separates the classes.
    - A confusion matrix is used to break down the model's correct and incorrect predictions per class.
  - Handling Data Imbalance:
    - SMOTE (Synthetic Minority Over-sampling Technique) is applied to handle class imbalance and improve model performance.
- This is my first machine-learning project, so for certain concepts I have linked the YouTube videos I referred to (as inline comments) for a better understanding of the project.
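
To make the pipeline steps above more concrete, here is a small dependency-free sketch of the same ideas: oversampling the minority class, KNN scoring, a confusion matrix, and AUC. It is not the project's actual code, which may well rely on library implementations. The toy data, the function names (`knn_scores`, `confusion_matrix`, `roc_auc`, `smote_like_oversample`), `k=3`, and the 0.5 threshold are all assumptions made for illustration. The oversampler is also a simplification: it interpolates toward the single nearest minority neighbor, whereas SMOTE chooses among several nearest neighbors.

```python
import math
import random


def knn_scores(train_X, train_y, test_X, k=3):
    """For each test point, return the fraction of its k nearest neighbors labeled 1."""
    scores = []
    for x in test_X:
        # Sort training points by Euclidean distance to x and keep the k closest.
        nearest = sorted(zip(train_X, train_y),
                         key=lambda pair: math.dist(pair[0], x))[:k]
        scores.append(sum(label for _, label in nearest) / k)
    return scores


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties count as half a win. Equivalent to the area under the ROC curve.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smote_like_oversample(X, y, minority=1, seed=0):
    """Simplified SMOTE-style oversampling (needs at least 2 minority points).

    Adds synthetic minority points on the line segment between a random
    minority sample and its nearest minority neighbor until classes balance.
    """
    rng = random.Random(seed)
    minority_pts = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(X) - len(minority_pts)) - len(minority_pts)
    new_X, new_y = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        a = rng.choice(minority_pts)
        b = min((p for p in minority_pts if p is not a),
                key=lambda p: math.dist(a, p))
        t = rng.random()  # interpolation factor in [0, 1)
        new_X.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
        new_y.append(minority)
    return new_X, new_y


# Made-up 2-feature toy data: six class-0 points, two class-1 points.
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3),
           (0.8, 0.9), (1.3, 1.0), (3.0, 3.0), (3.2, 2.9)]
train_y = [0, 0, 0, 0, 0, 0, 1, 1]
test_X = [(1.0, 1.2), (3.1, 3.0), (0.9, 0.8), (2.9, 3.1)]
test_y = [0, 1, 0, 1]

# Balance the training set first, then score and evaluate on the test set.
bal_X, bal_y = smote_like_oversample(train_X, train_y)
scores = knn_scores(bal_X, bal_y, test_X, k=3)
preds = [1 if s >= 0.5 else 0 for s in scores]

print("confusion matrix (tn, fp, fn, tp):", confusion_matrix(test_y, preds))
print("AUC:", roc_auc(test_y, scores))
```

Note that the oversampling is applied only to the training data; the test set is left untouched so that the metrics reflect the original class distribution.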