Titanic Machine Learning from Disaster: Classification model investigation

Objective: Fit a Logistic Regression model to the Titanic - Machine Learning from Disaster Kaggle competition data to predict passenger survival and measure accuracy. The performance of several classifier models was also compared; optimising for accuracy was not a focus because of assignment time limits.

Reference: Will Cukierski. Titanic - Machine Learning from Disaster. https://kaggle.com/competitions/titanic, 2012. Kaggle.

Summary: Conducted an exploratory data analysis and generated new Adult/Child and bucketed Fare features. Built and compared classification models (Logistic Regression, SVM, KNN, Decision Tree and Naive Bayes) by splitting the labelled train data into training and hold-out subsets. Because the dataset is imbalanced, with the majority of training records belonging to passengers who did not survive, the models struggled to correctly predict those who actually survived.
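The two engineered features mentioned in the summary could be derived along the following lines. This is a minimal sketch, not the notebook's code: the `Age` and `Fare` columns come from the Kaggle data, but the age cut-off of 18, the fare band edges, the median fill for missing ages, and the `AdultChild` / `FareBucket` column names are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Tiny made-up stand-in for train.csv; only the Age and Fare columns are used.
df = pd.DataFrame({
    "Age": [4.0, 22.0, 38.0, None, 15.0, 54.0],
    "Fare": [16.70, 7.25, 71.28, 8.05, 31.00, 51.86],
})

ADULT_AGE = 18  # assumed cut-off, not taken from the notebook

# Missing ages are filled with the median purely to keep the sketch simple.
age = df["Age"].fillna(df["Age"].median())
df["AdultChild"] = np.where(age >= ADULT_AGE, "Adult", "Child")

# Assumed fare bands; intervals are right-closed and the lowest edge is included.
fare_bins = [0, 10, 30, 100, float("inf")]
fare_labels = ["low", "medium", "high", "very_high"]
df["FareBucket"] = pd.cut(
    df["Fare"], bins=fare_bins, labels=fare_labels, include_lowest=True
)

print(df)
```

Fixed band edges keep the buckets easy to explain in a presentation; `pd.qcut` would be the alternative if equally sized buckets were preferred.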

Tools used: Python, pandas, numpy, matplotlib, seaborn, scikit-learn, Logistic Regression, SVM, KNN, Decision Tree, Naive Bayes
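As a rough illustration of how the five classifiers listed above can be compared on a hold-out split with scikit-learn, the sketch below trains each one and reports accuracy alongside recall for the positive class. It runs on synthetic data from `make_classification` rather than the Titanic features, and the class weights, split size, scaling step and default hyperparameters are assumptions, not the settings used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced stand-in for the prepared features;
# class 1 plays the role of "survived" (assumed proportions).
X, y = make_classification(
    n_samples=800, n_features=8, n_informative=4,
    weights=[0.62, 0.38], random_state=0,
)

# Stratified hold-out split so both subsets keep the same class balance.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Distance- and margin-based models are scaled; tree and Naive Bayes are not.
models = {
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_val)
    results[name] = {
        "accuracy": accuracy_score(y_val, pred),
        # Recall on class 1 shows how many actual positives were found.
        "recall_survived": recall_score(y_val, pred, pos_label=1),
    }
    print(f"{name}: accuracy={results[name]['accuracy']:.3f}, "
          f"recall(survived)={results[name]['recall_survived']:.3f}")
```

Reporting recall for the minority class next to accuracy makes it visible when a model scores well overall while still missing many of the positive cases.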

Files:

train.csv, test.csv - input data. Kaggle provides separate train and test data sets, and the test set has no target column: in the competition, predictions generated for the test data must be uploaded to obtain an accuracy score. Here, the train data was split and used for both modelling and testing. The test data went through exploratory data analysis but was not modelled.

Titanic_survival_prediction - Final DBoland.ipynb - Jupyter notebook containing the code

Titanic Survival Classification_DBoland.pptx - Presentation of data analysis, visualisation and logistic regression (View raw to view online and interact with links)

Titanic Survival Classification_DBoland.pdf - PDF version of the presentation of data analysis, visualisation and logistic regression

Model comparison.xlsx - Excel table comparing model parameters and results (View raw to view online)
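The test data was not modelled in this project, but the upload step mentioned for test.csv above would need a file with one prediction per passenger. The sketch below shows one way such a file could be assembled with pandas, assuming the competition's two-column `PassengerId`, `Survived` layout. The `build_submission` helper, the passenger IDs and the predictions are hypothetical and for illustration only.

```python
import pandas as pd


def build_submission(passenger_ids, predictions):
    """Return a two-column DataFrame: PassengerId and a 0/1 Survived prediction."""
    if len(passenger_ids) != len(predictions):
        raise ValueError("passenger_ids and predictions must have the same length")
    return pd.DataFrame({
        "PassengerId": list(passenger_ids),
        "Survived": [int(p) for p in predictions],
    })


# Made-up IDs and predictions, not real model output.
submission = build_submission([892, 893, 894], [0, 1, 0])

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_text = submission.to_csv(index=False)
print(csv_text)
```

Passing a file name such as `submission.to_csv("submission.csv", index=False)` would write the file to disk instead.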

Ddbol/Titanic_data_ML_classification