Insurance_PredictionML

Machine Learning - Medical Insurance Claims Prediction

Refer to the attached ML_Models.pptx in this repository for the full business report.

Overview

This project focuses on leveraging machine learning techniques to enhance business decision-making. By analyzing real-world datasets, the goal is to develop predictive models that provide actionable insights for business operations. The primary objective is to improve model performance, business interpretability, and generalizability to future data.

Business Problem

Organizations increasingly rely on data-driven decision-making to optimize operations and improve customer experience. This project explores the following business case:

Medical Insurance Claims Prediction: Estimating claim costs based on patient demographics and medical history.

Data Sources

Insurance Company Dataset: Medical cost prediction dataset sourced from Kaggle.

Approach

Implemented and compared multiple machine learning models to optimize prediction accuracy and business interpretability. The following steps were undertaken:

1. Data Preprocessing

Handled missing values and performed exploratory data analysis (EDA).
Performed feature engineering and selection to improve model relevance.
Scaled numerical features and encoded categorical variables.
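
The preprocessing steps above can be sketched with a scikit-learn ColumnTransformer. This is a minimal illustration, not the notebook's actual code: the column names (age, bmi, children, sex, smoker, region), the median imputation strategy, and the demo rows are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; adjust them to match the actual dataset.
NUMERIC_COLS = ["age", "bmi", "children"]
CATEGORICAL_COLS = ["sex", "smoker", "region"]


def build_preprocessor():
    """Impute and scale numeric columns, one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # assumed strategy
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer(
        [
            ("num", numeric, NUMERIC_COLS),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL_COLS),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )


# Made-up rows for illustration only; not taken from the real dataset.
demo = pd.DataFrame({
    "age": [19, 45, np.nan, 33],
    "bmi": [27.9, 30.1, 22.5, np.nan],
    "children": [0, 2, 1, 3],
    "sex": ["female", "male", "male", "female"],
    "smoker": ["yes", "no", "no", "no"],
    "region": ["southwest", "northeast", "southeast", "northwest"],
})

features = build_preprocessor().fit_transform(demo)
print(features.shape)
```

Wrapping the steps in a single transformer means the same imputation, scaling, and encoding are fitted on training data only and then reused unchanged on new data.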

2. Model Development

Experimented with different machine learning models, ensuring a balance between accuracy and business applicability.

Regression Models Used:

Linear Regression
Ridge Regression
Lasso Regression
Decision Tree Regressor
Random Forest Regressor
Gradient Boosting Regressor
Stacking Regressor
Neural Network (MLP Regressor)
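
A comparison across these model families might look like the sketch below. It is not the notebook's code: the data is synthetic (make_regression), every hyperparameter is a placeholder, and scikit-learn's MLPRegressor stands in for the neural network so that the example stays self-contained.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; the real project uses the Kaggle insurance dataset.
X, y = make_regression(n_samples=300, n_features=6, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Placeholder hyperparameters, chosen only so the example runs quickly.
models = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "boosting": GradientBoostingRegressor(random_state=0),
    "mlp": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "stacking": StackingRegressor(
        estimators=[
            ("ridge", Ridge()),
            ("forest", RandomForestRegressor(n_estimators=50, random_state=0)),
        ],
        final_estimator=LinearRegression(),
    ),
}

# Fit every model on the training split and record its held-out R^2.
r2_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    r2_scores[name] = model.score(X_test, y_test)

for name, score in sorted(r2_scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:10s} R^2 = {score:.3f}")
```

Because the synthetic target is linear in the features, the ranking printed here says nothing about how the models rank on the insurance data.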

Model Evaluation Metrics Used:

Mean Squared Error (MSE)
Mean Absolute Error (MAE)
R² Score
Median Absolute Error

Results & Insights: Random Forest and Gradient Boosting performed best, with high R² scores and low error metrics.
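
The four evaluation metrics can be computed together with scikit-learn's metrics module. The helper name regression_report and the sample values below are hypothetical, chosen so the arithmetic is easy to check by hand.

```python
import numpy as np
from sklearn.metrics import (
    mean_absolute_error,
    mean_squared_error,
    median_absolute_error,
    r2_score,
)


def regression_report(y_true, y_pred):
    """Return the four evaluation metrics as a dictionary."""
    return {
        "mse": mean_squared_error(y_true, y_pred),
        "mae": mean_absolute_error(y_true, y_pred),
        "median_ae": median_absolute_error(y_true, y_pred),
        "r2": r2_score(y_true, y_pred),
    }


# Made-up claim costs and predictions, for illustration only.
y_true = np.array([1000.0, 2000.0, 3000.0, 4000.0])
y_pred = np.array([1100.0, 1900.0, 3300.0, 4000.0])

report = regression_report(y_true, y_pred)
for metric, value in report.items():
    print(f"{metric}: {value:.3f}")
# mse: 27500.000, mae: 125.000, median_ae: 100.000, r2: 0.978
```

Median Absolute Error is less sensitive than MSE to a few very large claims, which is one reason to report both.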

3. Model Optimization

To improve model performance, the following techniques were applied:

Hyperparameter tuning using GridSearchCV.
Feature selection using SelectKBest and recursive feature elimination (RFE).
Regularization techniques (L1/L2) for regression models.
Cross-validation to ensure robustness and generalizability.
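
The optimization techniques listed above can be combined in one scikit-learn pipeline, as in the sketch below. It assumes synthetic data and an arbitrary parameter grid; the actual grids, estimators, and feature counts used in the notebook are not stated here.

```python
from sklearn.datasets import make_regression
from sklearn.feature_selection import RFE, SelectKBest, f_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import Pipeline

# Synthetic stand-in data: 8 features, of which 4 carry signal.
X, y = make_regression(
    n_samples=200, n_features=8, n_informative=4, noise=5.0, random_state=0
)

# SelectKBest feeds an L2-regularized (Ridge) model.
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_regression)),
    ("model", Ridge()),
])

# Illustrative grid; the values are placeholders, not tuned settings.
param_grid = {
    "select__k": [2, 4, 8],
    "model__alpha": [0.1, 1.0, 10.0],
}

# 5-fold cross-validated grid search over both steps at once.
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="r2")
search.fit(X, y)
print("best params:", search.best_params_)

# Re-scoring on the same data is optimistic; a held-out set is safer.
cv_scores = cross_val_score(search.best_estimator_, X, y, cv=5, scoring="r2")
print("mean CV R^2:", cv_scores.mean())

# Recursive feature elimination with an L1-regularized (Lasso) estimator.
rfe = RFE(estimator=Lasso(alpha=0.1), n_features_to_select=4).fit(X, y)
print("features kept by RFE:", rfe.support_)
```

Placing feature selection inside the pipeline means it is refitted on each training fold, so the cross-validation scores are not inflated by selecting features on the full data.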

4. Business Insights, Impact, and Visuals

- Key observations and findings are documented at every step; refer to the code file directly for details.
- The slide deck is easiest to view by downloading the raw file.

5. Technologies Used

Python: Data processing and model development.
Scikit-learn: Machine learning models and hyperparameter tuning.
TensorFlow/Keras: Neural network implementation.
Pandas & NumPy: Data manipulation.
Matplotlib & Seaborn: Data visualization.

More information is available in the code file (ML_insurance.ipynb) and the attached slide deck (ML_Models.pptx).

Contact

For inquiries or collaborations, feel free to connect with me on LinkedIn (www.linkedin.com/in/himarohinimallina) or check out more of my work on GitHub (https://github.com/z5450851HimaMallina).

Thank you
