This project performs Recency-Frequency-Monetary (RFM) analysis on a large online retail dataset from Kaggle/UCI to better understand customer purchase behavior and segment customers for targeted marketing.
The dataset contains over 500,000 transaction records from a UK-based online retailer between 2009 and 2011.
For each customer, I calculated:
- Recency: how recently the customer made a purchase
- Frequency: how often the customer made purchases
- Monetary: how much money the customer spent in total
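The three metrics above can be sketched with a single pandas `groupby`. This is a minimal illustration on made-up data, not the project's actual notebook code: the column names (`customer_id`, `invoice`, `invoice_date`, `quantity`, `price`) and the choice of a snapshot date one day after the last purchase are assumptions.

```python
import pandas as pd

# Tiny made-up sample; column names are assumptions, not the dataset's real schema.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "invoice": ["A1", "A2", "B1", "C1", "C2", "C3"],
    "invoice_date": pd.to_datetime([
        "2011-01-05", "2011-03-01", "2010-06-15",
        "2011-02-10", "2011-02-20", "2011-03-05",
    ]),
    "quantity": [2, 1, 5, 1, 3, 2],
    "price": [10.0, 20.0, 4.0, 15.0, 5.0, 30.0],
})

# Spend per transaction line
transactions["line_total"] = transactions["quantity"] * transactions["price"]

# Assumed reference point: one day after the most recent purchase in the data
snapshot_date = transactions["invoice_date"].max() + pd.Timedelta(days=1)

rfm = transactions.groupby("customer_id").agg(
    # Days since the customer's last purchase
    recency=("invoice_date", lambda d: (snapshot_date - d.max()).days),
    # Number of distinct invoices
    frequency=("invoice", "nunique"),
    # Total spend
    monetary=("line_total", "sum"),
)

print(rfm)
```

Counting distinct invoices rather than rows keeps a multi-item order from inflating Frequency.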
Each customer received a score from 1 to 5 on each metric using quantiles, with higher scores indicating more valuable customers (more recent, more frequent, higher spending). The three scores were then combined into RFM segments.
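One way to produce quantile-based scores like these is `pandas.qcut` with five bins. The sketch below uses made-up values; the helper name `quantile_score`, the column names, and the rank-before-binning step are assumptions for illustration, not necessarily what the notebook does.

```python
import pandas as pd

# Made-up RFM table for ten customers (values are illustrative only).
rfm = pd.DataFrame(
    {
        "recency": [5, 40, 90, 200, 365, 12, 150, 60, 300, 25],
        "frequency": [12, 4, 2, 1, 1, 8, 2, 3, 1, 6],
        "monetary": [900.0, 310.0, 120.0, 45.0, 20.0, 640.0, 80.0, 200.0, 35.0, 410.0],
    },
    index=pd.Index(range(1, 11), name="customer_id"),
)


def quantile_score(values, labels):
    """Split values into five equal-sized bins and return the bin label as an int.

    Ranking first (an assumption of this sketch) avoids duplicate bin edges
    when many customers share the same value.
    """
    return pd.qcut(values.rank(method="first"), q=5, labels=labels).astype(int)


# Lower recency (fewer days since last purchase) is better, so labels are reversed.
rfm["r_score"] = quantile_score(rfm["recency"], [5, 4, 3, 2, 1])
rfm["f_score"] = quantile_score(rfm["frequency"], [1, 2, 3, 4, 5])
rfm["m_score"] = quantile_score(rfm["monetary"], [1, 2, 3, 4, 5])

# Concatenate the three digits into one code, e.g. "555" for the best customers.
rfm["rfm_segment"] = (
    rfm["r_score"].astype(str) + rfm["f_score"].astype(str) + rfm["m_score"].astype(str)
)

print(rfm)
```

A side effect of ranking with `method="first"` is that customers with identical values can land in neighbouring bins; that trade-off keeps the bins equal in size.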
Based on RFM scores, customers were grouped into segments:
- Champions: recent, frequent, and high-spending customers
- Loyal Customers: repeat buyers with strong spending
- Potential Loyalists: new customers with promising behavior
- At Risk: previously active customers who are now inactive
- Lost: inactive customers with minimal value
These groups help businesses create targeted marketing strategies such as loyalty rewards for champions and win-back campaigns for at-risk customers.
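The mapping from scores to the segments listed above could look roughly like the following. The thresholds are hypothetical, chosen only to match the segment descriptions, and the `Other` catch-all is an addition of this sketch; the project's actual rules may differ.

```python
import pandas as pd


def assign_segment(r, f, m):
    """Map 1-5 R, F, M scores to a segment name (illustrative thresholds only)."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"            # recent, frequent, high-spending
    if r >= 3 and f >= 3 and m >= 3:
        return "Loyal Customers"      # repeat buyers with strong spending
    if r >= 4:
        return "Potential Loyalists"  # recent, but little history so far
    if r <= 2 and (f >= 3 or m >= 3):
        return "At Risk"              # used to be active, now inactive
    if r <= 2:
        return "Lost"                 # inactive and low value
    return "Other"                    # hypothetical catch-all for remaining combinations


# Made-up scores for five customers
scores = pd.DataFrame({
    "r_score": [5, 3, 5, 1, 1],
    "f_score": [5, 4, 1, 4, 1],
    "m_score": [5, 3, 1, 4, 1],
})

scores["segment"] = [
    assign_segment(r, f, m)
    for r, f, m in zip(scores["r_score"], scores["f_score"], scores["m_score"])
]

print(scores)
```

The rules are checked from most to least valuable, so a customer who qualifies for several segments gets the highest one.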
Matplotlib and Seaborn were used to create heatmaps showing the average R, F, and M scores by segment.
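A heatmap of this kind can be built from a per-segment mean table. The sketch below uses made-up scores; the column names, the helper `plot_segment_heatmap`, and the colour map are assumptions. The plotting libraries are imported inside the helper so that the summary table can be computed on its own.

```python
import pandas as pd

# Made-up scored customers (illustrative only)
scored = pd.DataFrame({
    "segment": ["Champions", "Champions", "Loyal Customers", "At Risk", "Lost", "Lost"],
    "r_score": [5, 4, 3, 2, 1, 1],
    "f_score": [5, 5, 4, 3, 1, 2],
    "m_score": [5, 4, 4, 3, 1, 1],
})

# Average R, F, M score per segment: one row per segment, one column per score
segment_profile = scored.groupby("segment")[["r_score", "f_score", "m_score"]].mean()


def plot_segment_heatmap(profile):
    """Draw the segment profile as an annotated heatmap and return the figure."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(6, 4))
    sns.heatmap(profile, annot=True, fmt=".1f", cmap="YlGnBu", vmin=1, vmax=5, ax=ax)
    ax.set_title("Average RFM scores by segment")
    fig.tight_layout()
    return fig


print(segment_profile)
# In a notebook: plot_segment_heatmap(segment_profile)
```

Fixing `vmin=1` and `vmax=5` keeps the colour scale tied to the score range rather than to whichever segments happen to be present.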
Project highlights:
- Cleaned and processed over 500,000 transaction records
- Calculated total spend and RFM metrics per customer
- Assigned quantile-based RFM scores for segmentation
- Grouped customers into actionable marketing segments
- Visualized segment profiles and distributions
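The cleaning step in the list above is not described in detail, so the following is only a plausible sketch on made-up rows. It assumes column names `Invoice`, `Quantity`, `Price`, and `Customer ID`, treats invoices starting with `C` as cancellations, and adds a hypothetical `TotalSpend` column; the real notebook may apply different rules.

```python
import pandas as pd

# Made-up raw rows; column names and invoice numbers are assumptions for illustration.
raw = pd.DataFrame({
    "Invoice": ["489434", "C489449", "489435", "489436", "489436"],
    "Quantity": [12, -1, 6, 0, 3],
    "Price": [6.95, 2.95, 0.0, 1.25, 4.5],
    "Customer ID": [13085.0, 16321.0, 13085.0, None, 13078.0],
})


def clean_transactions(df):
    """Keep only rows usable for RFM: known customer, real sale, positive amounts."""
    # Rows without a customer cannot be attributed to anyone
    df = df.dropna(subset=["Customer ID"])
    # Assumed convention: invoice numbers starting with "C" are cancellations
    is_cancelled = df["Invoice"].astype(str).str.startswith("C")
    df = df[~is_cancelled & (df["Quantity"] > 0) & (df["Price"] > 0)].copy()
    df["Customer ID"] = df["Customer ID"].astype(int)
    # Spend per transaction line (hypothetical column name)
    df["TotalSpend"] = df["Quantity"] * df["Price"]
    return df


clean = clean_transactions(raw)
print(clean)
```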
Tools used:
- Python
- Pandas: data manipulation
- Matplotlib: data visualization
- Seaborn: heatmaps and charts
- Jupyter Notebook: analysis environment
Syed Danish Ahmed
Aspiring Data Scientist | Computer Engineering Student
If you found this project useful, please star the repo. Your support is appreciated!
Dataset Source: Kaggle - https://www.kaggle.com/datasets/lakshmi25npathi/online-retail-dataset