Automated Detection & Cell-Type Classification Using Deep Learning

Developed binary and multiclass CNN models using histology images from 99 patients.

Accurate classification of cell nuclei in histopathology images is essential for early cancer diagnosis and pathology workflow efficiency. This project aims to automate the classification of colon cell nuclei using machine learning, with two core objectives:

1. Detect whether a nucleus is cancerous.
2. Classify each nucleus into its corresponding medically relevant cell type.

The goal is to reduce diagnostic workload and provide consistent, scalable support for pathology workflows.

Model Development Approach

1. Exploratory Data Analysis (EDA)

Examined class distribution, staining patterns, and morphological differences
Identified imbalance and variability across tissue samples
Visualised representative image patches to understand dataset quality and noise

2. Patient-Level Data Splitting

To avoid data leakage and ensure realistic evaluation, the dataset was split by patient, not by image. This prevents patches from the same patient appearing in both training and validation sets.
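
A minimal sketch of a patient-level split, using only the standard library. The function name `split_by_patient`, the 20% validation fraction, and the example patient IDs are assumptions for illustration, not the notebook's actual code.

```python
import random
from collections import defaultdict


def split_by_patient(patient_ids, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx) so that no patient appears in both sets."""
    # Group row indices by the patient they came from.
    rows_by_patient = defaultdict(list)
    for idx, pid in enumerate(patient_ids):
        rows_by_patient[pid].append(idx)

    # Shuffle patients (not images) with a fixed seed for reproducibility.
    patients = sorted(rows_by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of patients, keeping at least one for validation.
    n_val = max(1, round(len(patients) * val_fraction))
    val_patients = set(patients[:n_val])

    train_idx, val_idx = [], []
    for pid, rows in rows_by_patient.items():
        (val_idx if pid in val_patients else train_idx).extend(rows)
    return sorted(train_idx), sorted(val_idx)


# Hypothetical patient IDs, one entry per image patch.
patient_ids = [1, 1, 2, 2, 2, 3, 4, 4, 5, 5]
train_idx, val_idx = split_by_patient(patient_ids)
train_patients = {patient_ids[i] for i in train_idx}
val_patients = {patient_ids[i] for i in val_idx}
```

If scikit-learn is available, `sklearn.model_selection.GroupShuffleSplit` offers the same kind of grouped split, with the patient ID passed as the `groups` argument.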

3. Data Pipeline with Augmentation & Class Balancing

Applied image augmentations (flips, rotations, colour jittering) to improve generalisation. Implemented class-balancing strategies to address heavy imbalance across cell types. Built a reproducible preprocessing pipeline for loading and transforming patches.
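
One common class-balancing strategy is to weight each class inversely to its frequency. The sketch below assumes that approach; the README does not say which strategy was used, and the function name and integer labels are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, deliberately imbalanced labels for three classes.
labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(labels)
# Rare classes receive larger weights, so their errors count more in the loss.
```

This is the same heuristic scikit-learn applies for `class_weight="balanced"`. The resulting dictionary can typically be passed to a training loop as per-class loss weights.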

4. Performance Metrics Selection

Evaluated model performance using:

Accuracy, Precision, Recall, and F1-score
AUC for binary cancer detection
Confusion matrices to assess per-class behaviour

These metrics provide a robust understanding of both binary malignancy detection and multi-class cell type classification.
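
To make the listed metrics concrete, here is a small standard-library sketch of how they relate to a confusion matrix, with AUC computed as the probability that a positive sample outscores a negative one. The function names and the toy labels and scores are illustrative assumptions, not results from this project.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """matrix[t][p] counts samples of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Return {class: (precision, recall, f1)} from a confusion matrix."""
    n = len(matrix)
    metrics = {}
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # column total minus hits
        fn = sum(matrix[c]) - tp                       # row total minus hits
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[c] = (precision, recall, f1)
    return metrics


def binary_auc(y_true, scores):
    """AUC as P(positive score > negative score); ties count as half.

    Assumes at least one positive and one negative sample.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
metrics = per_class_metrics(cm)
auc = binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise AUC is quadratic in the number of samples, which is fine for a sketch; library implementations such as scikit-learn's `roc_auc_score` are preferable for real evaluation.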

5. Baseline Deep Learning Model — Custom CNN

Developed a baseline convolutional neural network (CNN) tailored for small image patches:

Convolution + pooling layers for feature extraction
Fully connected layers for classification
Softmax output for multi-class prediction
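
As an illustration of the final layer, the sketch below shows a numerically stable softmax that turns raw class scores (logits) into probabilities. The logits and the number of classes are hypothetical; in practice the deep learning framework supplies this operation.

```python
import math


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one nucleus over three classes.
probs = softmax([2.0, 1.0, 0.1])
# The predicted class is the index with the highest probability.
predicted = max(range(len(probs)), key=probs.__getitem__)
```

Subtracting the maximum logit leaves the result unchanged mathematically but avoids overflow when logits are large.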

6. Model Optimization

Identified overfitting and underfitting, and applied optimization techniques (dropout, regularization, etc.) to address them.
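
As one example of the techniques mentioned, the sketch below shows inverted dropout on a list of activations: each value is zeroed with probability `rate` during training and the survivors are rescaled so the expected activation stays the same. The function name, rate, and inputs are illustrative assumptions; a framework's built-in dropout layer would normally be used instead.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    # Kept values are divided by the keep probability to preserve the expectation.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the example is reproducible
activations = [1.0, 2.0, 3.0, 4.0]
dropped = dropout(activations, rate=0.5, rng=rng)    # training-time behaviour
unchanged = dropout(activations, rate=0.0, rng=rng)  # rate 0 leaves values intact
```

Dropout is applied only during training; at inference time the activations pass through unchanged.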

7. Model Performance and Robustness

Final Model Accuracy: Evaluated against the established goals and benchmarks.
Robustness and Generalizability: Assessed across different subsets and scenarios.
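
One way to examine robustness across subsets is to break accuracy down per patient, so that a weak result on particular patients is not hidden by the overall average. The sketch below assumes that approach; the function name and the toy predictions are hypothetical and do not reflect this project's results.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Return {group: accuracy} for predictions grouped by, e.g., patient ID."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for g, t, p in zip(groups, y_true, y_pred):
        totals[g] += 1
        correct[g] += int(t == p)
    return {g: correct[g] / totals[g] for g in totals}


# Toy data for illustration only.
patients = ["p1", "p1", "p2", "p2", "p2", "p3"]
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
per_patient = accuracy_by_group(patients, y_true, y_pred)
# The worst-performing group is a useful starting point for error analysis.
worst = min(per_patient, key=per_patient.get)
```

The same function works for any grouping column, such as cell type or data source, by passing different group labels.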

