This case study investigates payment defaults and seeks to answer the following business question:
Payment defaults are detrimental to the business and are a significant cost factor. Are there any key trends in the data which can help avoid default-prone customers in the future?
Repository contents:
- Clients.csv, Payments.csv, and other CSV exports: raw data used for analysis.
- Jupyter notebooks used in the final analysis:
  - ExploratoryAnalysis.ipynb: data exploration, summary statistics, visualizations, and initial insights.
  - LogBinomial.ipynb: log-binomial modelling alternatives and related diagnostics.
  - Notes.ipynb: analysis notes and observations.
- Aaron Galligan Case Study PowerPoint.pptx: a slide deck of findings aimed at stakeholders for the business in question.
- CaseStudy/: copies of the main notebooks for archival purposes.
Identify trends and predictors of payment default using the provided client and payments data. The goal is to surface actionable signals that help reduce future defaults by flagging higher-risk customers or informing changes to underwriting, pricing, or collection processes.
Key data files used in the analysis:
- Clients.csv: client-level attributes (demographics, entity type, etc.).
- Payments.csv: transaction and payment histories, including default flags or indicators.
Note: exported CSVs under Exported CSV's/ include useful aggregates such as percentage of defaults by entity type.
- Data cleaning and merging: handled missing values, normalized columns, and joined client and payment records to build an analysis dataset.
- Exploratory data analysis (EDA): computed default rates across categories (e.g., entity type), visualized distributions, and examined temporal trends.
- Modeling: trained logistic and log-binomial models to estimate relationships between predictors and default probability. Evaluated model performance with AUC/ROC, confusion matrices, and calibration checks.
- Diagnostics and interpretation: inspected coefficients, marginal effects, and partial dependence plots to identify strong predictors.
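The cleaning-and-merging step above can be sketched with pandas roughly as follows. The column names (`client_id`, `entity_type`, `amount`, `defaulted`) are illustrative assumptions, not the actual schemas of Clients.csv and Payments.csv:

```python
import pandas as pd

# Hypothetical stand-ins for Clients.csv and Payments.csv; real columns may differ.
clients = pd.DataFrame({
    "client_id": [1, 2, 3],
    "entity_type": ["LLC", None, "Sole Prop"],
})
payments = pd.DataFrame({
    "client_id": [1, 1, 2, 3],
    "amount": [100.0, None, 250.0, 75.0],
    "defaulted": [0, 0, 1, 0],
})

# Handle missing values and normalize columns before joining.
clients["entity_type"] = clients["entity_type"].fillna("Unknown")
payments["amount"] = payments["amount"].fillna(payments["amount"].median())

# Aggregate payments to the client level, then merge onto client attributes
# to build one analysis row per client.
payment_agg = payments.groupby("client_id").agg(
    total_paid=("amount", "sum"),
    ever_defaulted=("defaulted", "max"),
).reset_index()
analysis = clients.merge(payment_agg, on="client_id", how="left")
```

A left join keeps every client row even if a client has no payment records, which makes missing-history clients visible rather than silently dropping them.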
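The modeling-and-evaluation step might look like this minimal sketch. The data is synthetic and the two features (`late_payments`, `account_age`) are assumptions for illustration, not results from the notebooks:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the engineered analysis dataset.
rng = np.random.default_rng(0)
n = 1000
late_payments = rng.poisson(2, n)
account_age = rng.uniform(0, 10, n)
# Default probability rises with late payments, falls with account age
# (illustrative coefficients only).
logits = 0.6 * late_payments - 0.3 * account_age - 1.0
y = rng.binomial(1, 1 / (1 + np.exp(-logits)))
X = np.column_stack([late_payments, account_age])

# Hold out a test split, fit a logistic baseline, and evaluate with
# AUC/ROC and a confusion matrix, as described above.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression().fit(X_tr, y_tr)

auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
cm = confusion_matrix(y_te, model.predict(X_te))
print(f"AUC: {auc:.3f}")
print(cm)
```

Inspecting `model.coef_` afterwards gives the coefficient-level interpretation mentioned in the diagnostics step.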
The notebooks contain the full results and figures; high-level takeaways include:
- Certain entity types and client segments show higher default rates (see Exported CSV's/percentage of defaults by entity type.csv).
- Behavioral/payment-history signals (late payments, missed payments, or erratic payment patterns) are strong predictors of future default.
- Client-lifecycle features (newer accounts vs. established ones), aggregated balances, and prior delinquencies are also associated with elevated default risk.
- Logistic and log-binomial models provide similar directional results; model performance will vary with feature engineering and sampling choices.
- Use a risk-scoring model (e.g., logistic regression) in pre-screening to flag high-risk applicants. Periodically retrain with fresh data.
- Enrich models with payment-behavior features: time since last payment, frequency of late payments, and changes in payment amounts.
- Consider differentiated terms or higher deposits for higher-risk segments, and targeted collections strategies.
- Monitor metrics (default rate, model AUC, population stability) and run A/B tests before rolling out changes.
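The payment-behavior features recommended above could be derived from the payment histories along these lines. The payments table and its columns (`payment_date`, `days_late`, `amount`) are hypothetical examples of the kind of fields involved:

```python
import pandas as pd

# Hypothetical payments table; real column names may differ.
payments = pd.DataFrame({
    "client_id": [1, 1, 1, 2, 2],
    "payment_date": pd.to_datetime(
        ["2024-01-05", "2024-02-07", "2024-03-20", "2024-01-02", "2024-02-01"]),
    "days_late": [0, 5, 12, 0, 0],
    "amount": [100, 100, 80, 250, 250],
})

as_of = pd.Timestamp("2024-04-01")

# One row of behavioral features per client: recency, late-payment
# frequency, and variability in payment amounts.
features = payments.groupby("client_id").agg(
    days_since_last_payment=("payment_date", lambda d: (as_of - d.max()).days),
    late_payment_rate=("days_late", lambda d: (d > 0).mean()),
    amount_volatility=("amount", "std"),
).reset_index()
```

Computing everything relative to a fixed `as_of` date keeps the features reproducible and avoids leaking information from after the scoring date.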
Prerequisites:
- Python 3.8+ (recommended)
- Jupyter or JupyterLab
- Common data science libraries: pandas, numpy, scikit-learn, statsmodels, matplotlib, seaborn
- Feature engineering: create more temporal- and sequence-based features from payment histories.
- Advanced models: try gradient boosting or ensemble models and compare against logistic baselines.
- Deployment: wrap the scoring model in a simple service or batch job for periodic scoring.
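A periodic batch-scoring job like the one suggested above could be sketched as follows, assuming a fitted scikit-learn model and a feature frame per scoring run. The feature names (`late_rate`, `tenure`) and the threshold are illustrative assumptions:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

def score_batch(model, feature_frame, threshold=0.5):
    """Score a batch of clients and flag those above the risk threshold.

    The threshold is a business choice; 0.5 here is just a placeholder.
    """
    probs = model.predict_proba(feature_frame)[:, 1]
    return pd.DataFrame({
        "default_prob": probs,
        "high_risk": probs >= threshold,
    }, index=feature_frame.index)

# Toy fit purely to make the sketch runnable; in practice the model
# would be loaded from a retraining pipeline.
train = pd.DataFrame({"late_rate": [0.0, 0.1, 0.8, 0.9], "tenure": [5, 4, 1, 0]})
model = LogisticRegression().fit(train, [0, 0, 1, 1])

new_clients = pd.DataFrame({"late_rate": [0.05, 0.85], "tenure": [6, 0]})
scores = score_batch(model, new_clients)
```

Running this on a schedule (cron, Airflow, etc.) and logging the score distribution each run gives the population-stability monitoring mentioned earlier for free.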