Bank Term Deposit Subscription Prediction: Portfolio Project

Portfolio Highlight: End-to-end ML pipeline for predicting term deposit subscriptions using advanced feature engineering and a tuned Artificial Neural Network (ANN).

This project demonstrates how to build a robust, interpretable, and business-relevant machine learning solution for bank marketing. The workflow—from data cleaning to ANN optimization—mirrors real-world data science best practices and is designed for portfolio presentation.

Project Overview

Goal: Predict which bank customers will subscribe to a term deposit using a modern ML pipeline, with a focus on business value and model interpretability.

Key Steps:

Data cleaning & feature engineering
Exploratory data analysis (EDA)
Feature selection
ANN model development & tuning
Evaluation & business recommendations

Data Overview & Preprocessing

Dataset: 41,188 samples, 21 columns (20 input features covering demographics, financials, and campaign data, plus the target). Target: y (term deposit subscription).

Sample Data:

"age";"job";"marital";"education";"default";"housing";"loan";"contact";"month";"day_of_week";"duration";"campaign";"pdays";"previous";"poutcome";"emp.var.rate";"cons.price.idx";"cons.conf.idx";"euribor3m";"nr.employed";"y"
56;"housemaid";"married";"basic.4y";"no";"no";"no";"telephone";"may";"mon";261;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
57;"services";"married";"high.school";"unknown";"no";"no";"telephone";"may";"mon";149;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
... (see full dataset)
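The sample rows are semicolon-separated with quoted strings, so the delimiter has to be set explicitly when loading. A minimal sketch, assuming pandas; because the real file path is not stated in this README, the two sample rows are read from an in-memory string instead:

```python
import io

import pandas as pd

# Header and rows copied from the sample above. The real file name is not
# given in this README, so the text is read from memory instead of from disk.
SAMPLE = '''"age";"job";"marital";"education";"default";"housing";"loan";"contact";"month";"day_of_week";"duration";"campaign";"pdays";"previous";"poutcome";"emp.var.rate";"cons.price.idx";"cons.conf.idx";"euribor3m";"nr.employed";"y"
56;"housemaid";"married";"basic.4y";"no";"no";"no";"telephone";"may";"mon";261;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
57;"services";"married";"high.school";"unknown";"no";"no";"telephone";"may";"mon";149;1;999;0;"nonexistent";1.1;93.994;-36.4;4.857;5191;"no"
'''

df = pd.read_csv(io.StringIO(SAMPLE), sep=";")  # fields are semicolon-separated
X = df.drop(columns="y")               # the 20 input columns
y = (df["y"] == "yes").astype(int)     # 1 = subscribed, 0 = did not subscribe
print(df.shape, X.shape, y.tolist())
```

For the full dataset, the in-memory buffer would be replaced by the path to the CSV under data/.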

Cleaning Steps:

Removed duplicates
Replaced "unknown" in categorical columns with mode
Encoded categorical features (one-hot, cyclic for months)
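The cleaning steps above can be sketched roughly as follows. This is an illustration on a toy frame, not the notebook's code: the helper names `clean` and `add_cyclic_month` are hypothetical, and computing the mode over known values only (so that "unknown" can never be chosen as the replacement) is an assumption.

```python
import numpy as np
import pandas as pd

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def clean(df: pd.DataFrame, categorical_cols: list) -> pd.DataFrame:
    """Drop duplicate rows, then replace "unknown" with each column's mode."""
    df = df.drop_duplicates().reset_index(drop=True)
    for col in categorical_cols:
        known = df.loc[df[col] != "unknown", col]
        if not known.empty:
            # mode() can return several values on a tie; take the first
            df[col] = df[col].replace("unknown", known.mode().iloc[0])
    return df


def add_cyclic_month(df: pd.DataFrame) -> pd.DataFrame:
    """Map month to a point on the unit circle so dec and jan end up adjacent."""
    index = df["month"].map({m: i for i, m in enumerate(MONTHS)})
    angle = 2 * np.pi * index / 12
    out = df.drop(columns="month")
    out["month_sin"] = np.sin(angle)
    out["month_cos"] = np.cos(angle)
    return out


# Toy data (made up for illustration): the last row duplicates the first
toy = pd.DataFrame({
    "job": ["services", "unknown", "services", "admin.", "services"],
    "month": ["may", "may", "jun", "dec", "may"],
    "age": [30, 41, 35, 52, 30],
})
cleaned = clean(toy, categorical_cols=["job"])
encoded = add_cyclic_month(cleaned)
print(encoded)
```

The sine/cosine pair keeps the distance between December and January as small as the distance between any other neighboring months, which a plain 0 to 11 integer code would not.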

Project Structure

.
├── data/                 # Raw and processed datasets
├── notebooks/            # Jupyter notebooks for exploration, modeling, and evaluation
├── docs/                 # Project documentation, reports, and presentations
├── screenshots/          # Visualizations for portfolio and reporting
├── README.md             # Project overview and guide
└── requirements.txt      # Python dependencies

Quick Start

Clone the repository:

git clone https://github.com/imaddde867/Bank-Term-Deposit-Prediction.git
cd Bank-Term-Deposit-Prediction

Set up your environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
pip install -r requirements.txt

Launch the Jupyter Notebook:
```
jupyter notebook notebooks/ML-Final.ipynb
```
Explore the notebook for the full workflow and code.

Usage

The main workflow is in notebooks/ML-Final.ipynb:

Data loading, cleaning, and EDA
Feature engineering and selection
ANN model building, tuning, and evaluation

See docs/ for reports and rationale.

Technical Deep Dive

Data Processing & Feature Engineering

Removed duplicates, handled 'unknown' values, encoded categoricals (one-hot, cyclic for months)
Scaled numerical features (StandardScaler, MinMaxScaler)
Ordinal encoding for education
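As a rough illustration of the ordinal encoding and scaling steps, the sketch below uses scikit-learn's StandardScaler on a toy frame. The education levels beyond the two visible in the sample data, and the ranking chosen for them, are assumptions, as are the toy values in the third row.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumed ranking from least to most schooling (not confirmed by the notebook)
EDUCATION_ORDER = [
    "illiterate", "basic.4y", "basic.6y", "basic.9y",
    "high.school", "professional.course", "university.degree",
]

toy = pd.DataFrame({
    "education": ["basic.4y", "high.school", "university.degree"],
    "age": [56, 57, 37],
    "duration": [261, 149, 226],
})

# Ordinal encoding: each level becomes its position in the ranking
rank = {level: i for i, level in enumerate(EDUCATION_ORDER)}
toy["education"] = toy["education"].map(rank)

# Standardize numeric columns to zero mean and unit variance.
# On real data, fit the scaler on the training split only and reuse it
# for validation and test data to avoid leakage.
numeric_cols = ["age", "duration"]
scaler = StandardScaler()
scaled = pd.DataFrame(
    scaler.fit_transform(toy[numeric_cols]),
    columns=numeric_cols,
    index=toy.index,
)
print(toy["education"].tolist())
print(scaled.round(3))
```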

Feature Selection

Correlation analysis to drop redundant features
Random Forest for feature importance (top 19 features retained)
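The two selection steps can be sketched as below, assuming scikit-learn. The helper names, the 0.9 correlation threshold, and the synthetic data are all hypothetical; the project itself kept the top 19 features, whereas this toy example keeps 2.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of every pair whose |correlation| exceeds threshold."""
    corr = X.corr().abs()
    # Keep only the strict upper triangle so each pair is inspected once
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    redundant = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=redundant)


def top_k_by_importance(X: pd.DataFrame, y, k: int, seed: int = 0) -> list:
    """Rank features by Random Forest impurity importance and keep the top k."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    ranked = pd.Series(forest.feature_importances_, index=X.columns)
    return ranked.sort_values(ascending=False).head(k).index.tolist()


# Synthetic data: "b" is a near copy of "a", "d" is pure noise
rng = np.random.default_rng(0)
a = rng.normal(size=400)
c = rng.normal(size=400)
X = pd.DataFrame({
    "a": a,
    "b": a + 0.01 * rng.normal(size=400),
    "c": c,
    "d": rng.normal(size=400),
})
y = (a + c > 0).astype(int)

reduced = drop_correlated(X)
selected = top_k_by_importance(reduced, y, k=2)
print(list(reduced.columns), selected)
```

Running the correlation filter first matters: a Random Forest splits importance between near-duplicate columns, which can push both below a genuinely weaker feature.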

Model Development: Artificial Neural Network (ANN)

Why ANN? Artificial Neural Networks (ANNs) can capture complex, non-linear relationships in high-dimensional data. In this project's experiments, the ANN outperformed the tree-based models that were tried, both in AUC and in generalization.

Architecture:

Input: 19 features
3 hidden layers (128, 64, 32 neurons), each with ReLU, BatchNorm, Dropout
Output: 1 neuron (sigmoid)
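To make the layer shapes concrete, here is a framework-neutral NumPy sketch of an inference-time forward pass through a network of this size. It is not the notebook's model: the weights are random, the BatchNorm statistics are identity placeholders, and the ordering Dense, ReLU, BatchNorm, Dropout is an assumption read from the list above.

```python
import numpy as np

LAYER_SIZES = [19, 128, 64, 32, 1]  # input, three hidden layers, output


def init_params(sizes, seed=0):
    """Random He-initialized weights; stands in for trained parameters."""
    rng = np.random.default_rng(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        layers.append({
            "W": rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out)),
            "b": np.zeros(n_out),
            # BatchNorm state (unused for the output layer); identity values
            # stand in for learned scale/shift and running statistics
            "gamma": np.ones(n_out), "beta": np.zeros(n_out),
            "mean": np.zeros(n_out), "var": np.ones(n_out),
        })
    return layers


def forward(x, layers, eps=1e-5):
    """Inference-mode pass: Dense -> ReLU -> BatchNorm per hidden layer."""
    h = x
    for layer in layers[:-1]:
        h = np.maximum(h @ layer["W"] + layer["b"], 0.0)
        h = layer["gamma"] * (h - layer["mean"]) / np.sqrt(layer["var"] + eps) + layer["beta"]
        # Dropout is the identity at inference time, so nothing happens here
    out = layers[-1]
    logits = h @ out["W"] + out["b"]
    return 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> subscription probability


def dense_param_count(sizes):
    """Weights plus biases of the dense layers only (BatchNorm excluded)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))


batch = np.random.default_rng(1).normal(size=(4, 19))  # 4 fake customers
probs = forward(batch, init_params(LAYER_SIZES))
n_params = dense_param_count(LAYER_SIZES)
print(probs.shape, n_params)  # (4, 1) 12929
```

With roughly 13,000 dense parameters the network is small relative to the dataset, which is part of why Dropout and early stopping are enough to keep it from overfitting.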

Training:

Early stopping, learning rate scheduling
5-fold cross-validation for a more stable accuracy estimate
Hyperparameter tuning (batch size, learning rate, dropout, optimizer)
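The interaction between early stopping and learning rate scheduling can be illustrated without any deep learning framework. The sketch below replays a made-up sequence of validation losses; the function name, the patience values, and the reduce-on-plateau rule are assumptions, since the README does not say which scheduler was used.

```python
def run_schedule(val_losses, lr=1e-3, lr_patience=2, lr_factor=0.5, stop_patience=4):
    """Replay validation losses and apply plateau-based LR decay and early stopping.

    Returns the best loss seen and a history of (epoch, loss, lr) tuples.
    """
    best = float("inf")
    epochs_since_best = 0
    history = []
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, epochs_since_best = loss, 0
        else:
            epochs_since_best += 1
            # Every lr_patience epochs without improvement, shrink the step size
            if epochs_since_best % lr_patience == 0:
                lr *= lr_factor
        history.append((epoch, loss, lr))
        if epochs_since_best >= stop_patience:
            break  # early stopping: no improvement for stop_patience epochs
    return best, history


# Made-up validation losses: improvement stalls after epoch 2
losses = [0.40, 0.35, 0.33, 0.34, 0.335, 0.34, 0.36, 0.37, 0.30]
best_loss, history = run_schedule(losses)
for epoch, loss, lr in history:
    print(f"epoch {epoch}: val_loss={loss:.3f} lr={lr:.6f}")
```

In this replay, training stops after epoch 6 and the last two losses are never reached. In a real run the weights from the best epoch would be restored at that point.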

Results and Analysis

Performance Metrics:

| Metric | Class 0 (No) | Class 1 (Yes) |
|---|---|---|
| Precision | 0.90 | 0.76 |
| Recall | 0.99 | 0.16 |
| F1-score | 0.94 | 0.27 |
| Support | 3636 | 482 |

Overall:

Accuracy: 0.8961
ROC AUC Score: 0.7777
5-fold CV Accuracy: 0.8992 (±0.0022)

Classification Report:

              precision    recall  f1-score   support

           0       0.90      0.99      0.94      3636
           1       0.76      0.16      0.27       482

    accuracy                           0.90      4118
   macro avg       0.83      0.58      0.61      4118
weighted avg       0.88      0.90      0.87      4118

Confusion Matrix:

[[3611   25]
 [ 403   79]]
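The reported metrics follow directly from this confusion matrix, and recomputing them is a useful sanity check. The sketch assumes the scikit-learn layout (rows are true classes, columns are predicted classes), which matches the support counts in the report.

```python
# Confusion matrix from above: rows = true class, columns = predicted class
cm = [[3611, 25],
      [403, 79]]
tn, fp = cm[0]
fn, tp = cm[1]
total = tn + fp + fn + tp

precision = tp / (tp + fp)   # of the customers flagged "yes", how many subscribed
recall = tp / (tp + fn)      # of the actual subscribers, how many were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / total
baseline = (tn + fp) / total  # accuracy of always predicting "no"

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"accuracy={accuracy:.4f} majority-class baseline={baseline:.4f}")
```

The baseline line is the important one: always predicting "no" would already score about 0.883, so an accuracy of 0.8961 says little on its own. Recall and precision on class 1 are the figures that matter for the campaign.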

Rationale for Model Selection

Why ANN?

Handles complex, non-linear relationships in mixed data
Outperformed tree-based models in AUC and generalization
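Regularization (Dropout, BatchNorm) and learning rate scheduling helped stabilize training
Despite class imbalance, positive predictions were 76% precise, giving marketing a short but reliable lead list (at the default threshold the model finds only 16% of actual subscribers)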
Probability outputs allow for business-driven threshold tuning
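Business-driven threshold tuning can be sketched as a search for the cutoff that maximizes campaign profit. Everything numeric here is hypothetical: the gain per subscription, the cost per call, and the synthetic probabilities are placeholders for the bank's real figures and the model's real outputs.

```python
import numpy as np


def campaign_profit(y_true, y_prob, threshold, gain, cost):
    """Profit from calling every customer scored at or above the threshold."""
    contacted = y_prob >= threshold
    converted = np.sum(contacted & (y_true == 1))
    return converted * gain - contacted.sum() * cost


def best_threshold(y_true, y_prob, gain, cost):
    """Scan thresholds 0.05, 0.10, ..., 0.95 and return the most profitable one."""
    grid = np.arange(5, 100, 5) / 100
    profits = [campaign_profit(y_true, y_prob, t, gain, cost) for t in grid]
    i = int(np.argmax(profits))
    return float(grid[i]), float(profits[i])


# Synthetic stand-in for model outputs: about 12% positives, with
# subscribers scoring higher on average than non-subscribers
rng = np.random.default_rng(0)
y_true = (rng.random(2000) < 0.12).astype(int)
y_prob = np.clip(0.10 + 0.35 * y_true + rng.normal(0.0, 0.15, size=2000), 0.0, 1.0)

threshold, profit = best_threshold(y_true, y_prob, gain=50.0, cost=5.0)
default_profit = campaign_profit(y_true, y_prob, 0.5, gain=50.0, cost=5.0)
print(f"best threshold={threshold:.2f} profit={profit:.0f} (at 0.5: {default_profit:.0f})")
```

When a subscription is worth much more than a call costs, the profitable threshold tends to sit well below 0.5, trading precision for the recall that the default cutoff gives up.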

Conclusion & Future Work

This project showcases a full ML pipeline for a real-world business problem, with a focus on ANN modeling and interpretability. The approach is generalizable to other imbalanced, high-dimensional business tasks.

Next Steps:

Explore advanced imbalance techniques (SMOTE, class weights)
Try ensemble models
Further feature engineering
Adjust classification threshold for business needs
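As a starting point for the class-weight idea, the common "balanced" heuristic sets each weight to n_samples / (n_classes * class_count). The sketch below applies it to the support counts from the results section purely for illustration; in practice the counts would come from the training labels, and the helper name is hypothetical.

```python
def balanced_class_weights(counts: dict) -> dict:
    """Weight each class inversely to its frequency: n / (k * n_class)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Support counts from the evaluation split above (0 = no, 1 = yes)
weights = balanced_class_weights({0: 3636, 1: 482})
print({label: round(w, 3) for label, w in weights.items()})  # {0: 0.566, 1: 4.272}
```

With these weights, one missed subscriber costs the loss function about as much as 7.5 misclassified non-subscribers, which pushes the model toward higher recall on class 1.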

License

MIT License. See LICENSE.
