Comparative Analysis of ML Models (KNN, SVM, Linear Regression) on Mushroom Dataset

Arfazrll/Mushroom-Exploration-Comparative-Analysis

🍄 Mushroom Exploration with ML Models


Project Description

Mushroom Dataset Exploration: Leveraging KNN, SVM, and Linear Models is a machine learning project that explores and analyzes the Mushroom Dataset from the UCI Machine Learning Repository. The dataset describes mushroom species by physical characteristics such as shape, color, surface texture, and odor. The goal is to classify each mushroom as either edible or poisonous.

This project implements multiple machine learning approaches for classification, focusing on the following models:

  • K-Nearest Neighbors (KNN): Classifies a sample by majority vote among its k closest training samples.
  • Support Vector Machine (SVM): Separates the classes with the hyperplane that has the maximum margin.
  • Linear Regression: Serves as a baseline model for performance comparison. Because it predicts a continuous value, its output has to be thresholded to produce a class label.
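As a rough illustration of how the three models could be set up side by side with scikit-learn, here is a minimal sketch. The toy data, the variable names, `n_neighbors=5`, the linear SVM kernel, and the 0.5 threshold on the linear regression output are all assumptions made for the example; the notebook may configure things differently.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Toy one-hot style features (hypothetical, not taken from the dataset).
# Label 1 = poisonous, 0 = edible; here column 0 alone decides the label.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 6)).astype(float)
y = X[:, 0].astype(int)

# Simple hold-out split: first 150 rows for training, the rest for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
svm = SVC(kernel="linear").fit(X_train, y_train)
linreg = LinearRegression().fit(X_train, y_train)

predictions = {
    "KNN": knn.predict(X_test),
    "SVM": svm.predict(X_test),
    # Linear regression returns real numbers, so threshold them (assumed 0.5).
    "Linear Regression": (linreg.predict(X_test) >= 0.5).astype(int),
}

# Fraction of test samples each model labels correctly.
accuracies = {
    name: float(np.mean(pred == y_test)) for name, pred in predictions.items()
}
for name, acc in accuracies.items():
    print(f"{name}: accuracy = {acc:.2f}")
```

KNN and SVM return class labels directly, while the regression baseline needs the extra thresholding step before it can be scored with the same classification metrics.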

🔑 Key Features

  • Data Preprocessing:

    • Handling categorical variables.
    • Addressing missing values.
    • Normalizing data for better model performance.
  • Data Exploration:

    • Visualizations to understand data distributions.
    • Identifying patterns and correlations between features.
  • Model Benchmarking:

    • Compare accuracy, precision, recall, and F1-score of each model.
  • Model Evaluation:

    • Analyzing performance using confusion matrix and classification report.
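The preprocessing steps listed above might look roughly like the following pandas sketch. The three-column table is a hypothetical stand-in for the real data, and filling missing values with the most frequent category is only one possible strategy, not necessarily the one used in the notebook.

```python
import numpy as np
import pandas as pd

# Tiny stand-in for the mushroom table (hypothetical rows and column names).
# "?" marks a missing value, as it does in the UCI copy of the dataset.
raw = pd.DataFrame({
    "class": ["e", "p", "e", "p"],
    "odor": ["n", "f", "a", "f"],
    "stalk-root": ["b", "?", "b", "e"],
})

# 1. Turn the "?" placeholder into a real missing value, then fill it
#    with the most frequent category of that column (assumed strategy).
clean = raw.replace("?", np.nan)
for column in clean.columns:
    clean[column] = clean[column].fillna(clean[column].mode()[0])

# 2. Separate the label and encode it as 0 (edible) / 1 (poisonous).
y = (clean["class"] == "p").astype(int)

# 3. One-hot encode the categorical features into numeric columns.
X = pd.get_dummies(clean.drop(columns="class"), dtype=float)

print(X.columns.tolist())
print(y.tolist())
```

One-hot columns already lie between 0 and 1, so an extra normalization step mainly matters if the categories are encoded as integers instead.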

🛠️ Technologies Used

  • Python: Programming language used for data analysis and machine learning.
  • Pandas: For data manipulation and preprocessing.
  • Scikit-learn: For implementing machine learning models (KNN, SVM, Linear Regression).
  • Matplotlib and Seaborn: For data visualization.
  • NumPy: For numerical computations.

🚀 How to Use

  1. Clone this repository:

    git clone <repository_url>
    cd <repository_folder>
  2. Run the Jupyter Notebook: Open the Jupyter Notebook file in your preferred environment (e.g., Google Colab or Jupyter Notebook).

  3. Work through the notebook:

    • Preprocess and clean the data.
    • Visualize the features and explore patterns.
    • Train and evaluate KNN, SVM, and Linear Regression models.

📊 Results and Evaluation

  • The notebook compares the models using the following metrics:

    • Accuracy
    • Precision
    • Recall
    • F1-score
  • The confusion matrix and classification report show which class each model misclassifies, which helps identify its strengths and weaknesses.
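To make the four metrics concrete, the sketch below computes them by hand from the confusion-matrix counts. The function name `binary_metrics` and the example labels are hypothetical; in practice scikit-learn's `confusion_matrix` and `classification_report` report the same quantities.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion counts and metrics for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels: 1 = poisonous (positive class), 0 = edible.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

report = binary_metrics(y_true, y_pred)
for name, value in report.items():
    print(name, value)
```

For this task recall on the poisonous class deserves particular attention, since a false negative means a poisonous mushroom labeled as edible.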


📝 Conclusion

This project was developed during my association with Cyber Academy at the Cyber-Physical Systems Lab. It demonstrates how mushroom characteristics can be used for classification and compares the strengths and weaknesses of commonly used machine learning algorithms.

By exploring the Mushroom Dataset, this project provides valuable insights into machine learning techniques and how they can be applied to real-world classification problems.

