AutoInland-Vehicle-Insurance-Claim-Challenge

Introduction

Title	Text
Intro	In 2021, a Nigerian car insurance company ran a three-month competition on `Zindi`, an African data science competition platform. The organizer wanted to predict whether or not a client would submit a vehicle insurance claim in the next 3 months. More than 600 competitors took part.
Data	The dataset consisted of Train (12000 rows), Test (1200 rows), Sample_Submition, and Nigerian_State_LGA_Name.
Metrics	F1 score, used to evaluate the predictions.
ML Task	Binary classification.
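To make the metric concrete, here is a minimal sketch of the F1 score computed from raw counts in plain Python. The labels are made up for illustration, and the function name `f1_score_binary` is hypothetical; in practice a library routine such as scikit-learn's `f1_score` would normally be used.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1), computed from raw counts."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up): 2 claims among 10 clients.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
always_no = [0] * 10                          # never predicts a claim
one_hit = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]      # one correct claim, one false alarm

accuracy_always_no = sum(t == p for t, p in zip(y_true, always_no)) / len(y_true)
f1_always_no = f1_score_binary(y_true, always_no)
f1_one_hit = f1_score_binary(y_true, one_hit)
```

On this toy data the "always no claim" prediction looks accurate but scores an F1 of zero, which is why F1 is a reasonable choice when the positive class is rare.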

Problems

The dataset was imbalanced.
Some columns had missing values.
The Age column had outliers, including negative values.
Some rows were duplicated despite having distinct IDs.
The State and LGA columns contained incorrect names.
Some duplicated rows had different targets.
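The two duplicate-related problems above can be detected by comparing rows on everything except the ID. The following is a rough sketch in plain Python, not the code from the notebooks; the column names `ID`, `Age`, `Car`, and `target` and the sample rows are hypothetical.

```python
from collections import defaultdict

# Made-up rows; the column names are assumptions for illustration only.
rows = [
    {"ID": "a1", "Age": 30, "Car": "Saloon", "target": 0},
    {"ID": "b2", "Age": 30, "Car": "Saloon", "target": 0},  # duplicate of a1
    {"ID": "c3", "Age": 45, "Car": "JEEP", "target": 1},
    {"ID": "d4", "Age": 45, "Car": "JEEP", "target": 0},    # same features, other target
    {"ID": "e5", "Age": 52, "Car": "Truck", "target": 0},
]


def find_duplicates(rows, id_col="ID", target_col="target"):
    """Group rows by their feature values, ignoring the ID and the target.

    Returns two lists of ID groups: duplicates that agree on the target,
    and duplicates whose targets conflict.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(sorted(
            (k, v) for k, v in row.items() if k not in (id_col, target_col)
        ))
        groups[key].append(row)

    exact, conflicting = [], []
    for members in groups.values():
        if len(members) < 2:
            continue  # unique row, nothing to report
        ids = [m[id_col] for m in members]
        targets = {m[target_col] for m in members}
        (conflicting if len(targets) > 1 else exact).append(ids)
    return exact, conflicting


exact, conflicting = find_duplicates(rows)
```

Groups in `exact` can usually be collapsed to one row, while groups in `conflicting` need a decision (drop all, or keep one label) because they carry contradictory information.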

Solved

Used RandomOverSampler to oversample the minority class.
Tried imputing NaNs with Iterative-Imputer and KNN-Imputer.
Took the absolute value of Age to fix negative values.
Deleting duplicated rows lowered my F1 score on the public LB, so I kept them. The private LB later showed that I should have deleted them.
Used the Nigerian_State_LGA_Name dataset to correct the names in the LGA and State columns.
Likewise, I did not fix the duplicated rows with different targets.
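As an illustration of the name-correction step, the sketch below maps raw names onto a reference list using fuzzy matching from Python's standard library. This is an assumption about how such a correction could be done, not the method used in the notebooks; the `valid_states` list, the function name `correct_name`, and the 0.8 cutoff are all hypothetical.

```python
import difflib

# Hypothetical reference list; in the challenge this role was played by
# the Nigerian_State_LGA_Name file.
valid_states = ["Lagos", "Rivers", "Kano", "Ogun", "Oyo"]


def correct_name(raw, valid_names, cutoff=0.8):
    """Return the closest valid name, or the raw value if nothing is close."""
    if raw is None:
        return None
    # Compare case-insensitively, but return the canonical spelling.
    lookup = {name.lower(): name for name in valid_names}
    cleaned = raw.strip().lower()
    if cleaned in lookup:
        return lookup[cleaned]
    matches = difflib.get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else raw


fixed = [correct_name(s, valid_states) for s in [" lagos ", "Lagoss", "Rivrs", "Zamfara"]]
```

Values with no close match are returned unchanged so they can be reviewed by hand instead of being silently mapped to the wrong state.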

Unsolved

Did not pay attention to scaling, transforming, or feature selection, which led to overfitting.
Rather than following sound ML practice, I followed what the public LB told me about duplicated rows.
Did not use stacking or boosting ensembles effectively.
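One common alternative to trusting the public LB is a local stratified K-fold estimate. The sketch below shows the fold assignment in plain Python under made-up labels; the function name `stratified_folds` is hypothetical, and scikit-learn's `StratifiedKFold` provides the same idea ready-made.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_splits=5, seed=0):
    """Assign each sample a fold number so every fold keeps the class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across folds.
        for position, idx in enumerate(indices):
            fold_of[idx] = position % n_splits
    return fold_of


# Toy labels (made up): 10 claims among 100 clients.
labels = [1] * 10 + [0] * 90
fold_of = stratified_folds(labels, n_splits=5)

fold_sizes = [fold_of.count(k) for k in range(5)]
positives_per_fold = [
    sum(1 for i, f in enumerate(fold_of) if f == k and labels[i] == 1)
    for k in range(5)
]
```

Duplicated rows are best removed before splitting; otherwise copies of the same row can land in both the training and validation folds and inflate the local score.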

Algorithms Used

CatBoost for binary classification.
Iterative-Imputer with ExtraTrees for imputing missing values, after label-encoding the categorical columns.
RandomOverSampler for oversampling the minority class.
Others.
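To show what random oversampling does, here is a plain-Python stand-in that duplicates minority-class rows at random until the classes are balanced. It is a sketch of the idea only, not the imbalanced-learn implementation; the function name `random_oversample` and the toy data are hypothetical.

```python
import random


def random_oversample(X, y, seed=0):
    """Resample every class with replacement up to the majority class size."""
    rng = random.Random(seed)
    by_class = {}
    for features, label in zip(X, y):
        by_class.setdefault(label, []).append(features)

    target_size = max(len(samples) for samples in by_class.values())
    X_res, y_res = [], []
    for label, samples in by_class.items():
        # Draw extra copies at random; the majority class gets none.
        extra = [rng.choice(samples) for _ in range(target_size - len(samples))]
        for features in samples + extra:
            X_res.append(features)
            y_res.append(label)
    return X_res, y_res


# Toy training data (made up): one feature, 5 "no claim" rows and 1 "claim" row.
X_train = [[25], [31], [47], [52], [38], [60]]
y_train = [0, 0, 0, 0, 0, 1]
X_res, y_res = random_oversample(X_train, y_train)
```

Oversampling is usually applied to the training portion only, after any train/validation split, so that copied rows do not leak into the data used for scoring.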

