Denoising autoencoder to remove annotations embedded in medical images

This repository is part of my thesis on detecting ovarian cancer using deep learning approaches, and I will soon make all repositories related to this project public.

Medical scans sometimes come with annotations drawn by the radiologist or doctor. These annotations can act as confounders for a deep learning method that aims to classify or segment these images.

For example, if radiologists draw arrows that point to benign tumors, a standard CNN will learn to recognize the arrows as indicators of a benign tumor. The arrows act as confounders, and the classifier will predict "benign" when it sees the arrows, even if the tumor is malignant.

How it works

Training process

Given a dataset that contains both clean images (medical scans without annotations) and noisy images (medical scans with annotations), artificially annotated images are generated from the clean images so that they resemble the real noisy images. This yields a dataset of paired clean and artificially annotated images, and the network learns to map each annotated image back to its clean counterpart.

To generate the dataset of artificially annotated images, you need to write a custom drawing function that adds annotations to the clean images in your dataset. This drawing function will of course depend on the nature of your images and of the annotations.
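As an illustration of what such a drawing function could look like, here is a minimal NumPy sketch that stamps one random straight line onto a grayscale image. It is not the function used in this repository: the name draw_random_marker, its parameters, and the assumption that images are 2D float arrays in [0, 1] are all hypothetical, and real annotations (arrows, calipers, text) would need a drawing routine tailored to them.

```python
import numpy as np


def draw_random_marker(clean, rng, intensity=1.0, max_len=40, half_width=1):
    """Return (noisy, mask): a copy of `clean` with one line drawn on it,
    and a boolean mask of the pixels that were overwritten.

    Hypothetical helper: assumes `clean` is a 2D float array in [0, 1].
    """
    h, w = clean.shape
    noisy = clean.copy()
    mask = np.zeros((h, w), dtype=bool)

    # Random start point, direction, and length of the line segment.
    y0, x0 = rng.integers(0, h), rng.integers(0, w)
    angle = rng.uniform(0, 2 * np.pi)
    length = rng.integers(max_len // 2, max_len + 1)

    # Sample points along the segment, clipped to the image bounds.
    t = np.arange(length)
    ys = np.clip(np.round(y0 + t * np.sin(angle)).astype(int), 0, h - 1)
    xs = np.clip(np.round(x0 + t * np.cos(angle)).astype(int), 0, w - 1)

    # Stamp a small square around each sampled point to give the line width.
    for y, x in zip(ys, xs):
        mask[max(y - half_width, 0): y + half_width + 1,
             max(x - half_width, 0): x + half_width + 1] = True

    noisy[mask] = intensity
    return noisy, mask
```

Calling draw_random_marker(clean, np.random.default_rng(0)) on each clean image gives one (annotated, clean) training pair per image. The returned mask is optional, but it is convenient if the loss needs to know which pixels belong to the annotation.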

Some examples

Original image	Inferred (removed annotations)

The network relies on a custom weighted loss function that I defined in losses.py. Because annotations occupy only a small part of the image, a plain cross-entropy or MSE loss is dominated by the background pixels and does not inherently address this class imbalance, which may lead to suboptimal performance.
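To make the idea of a weighted loss concrete, here is a minimal NumPy sketch of a weighted MSE. It is an assumption about the general shape of such a loss, not the contents of losses.py: the function name weighted_mse, the way the foreground mask is derived (pixels where the annotated and clean images differ), and the default weights are all hypothetical. The alpha and beta arguments play the roles described for loss_alpha and loss_beta.

```python
import numpy as np


def weighted_mse(pred, clean, noisy, alpha=10.0, beta=1.0, tol=1e-6):
    """Mean squared error with separate weights for annotated and background pixels.

    Hypothetical sketch: foreground is taken to be every pixel that the
    annotation changed, i.e. where `noisy` and `clean` differ.
    """
    foreground = np.abs(noisy - clean) > tol
    squared_error = (pred - clean) ** 2
    # alpha weights annotated (foreground) pixels, beta weights the background.
    weights = np.where(foreground, alpha, beta)
    return float((weights * squared_error).sum() / weights.sum())
```

For a 4x4 image with a single annotated pixel that the network fails to remove, the unweighted loss (alpha = beta = 1) is 1/16 = 0.0625, while alpha = 10 and beta = 1 give 10/25 = 0.4. The residual annotation is penalized far more heavily even though it covers only one pixel.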

Installation

Make a Conda environment (this has been developed and tested with Python 3.12.7) and run pip install -r requirements.txt

Usage

Dataset structure: folder with "clean" and "annotated" subdirectories. The image names in both directories should be the same (corresponding image pairs).
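Before training, it can help to check that every image has a counterpart in the other subdirectory. The following is a small standard-library sketch of such a check. The helper name list_image_pairs is hypothetical and is not part of this repository; only the "clean" and "annotated" directory names come from the layout described above.

```python
from pathlib import Path


def list_image_pairs(root):
    """Return sorted (annotated_path, clean_path) pairs matched by file name.

    Hypothetical helper: raises ValueError if a file in one subdirectory
    has no same-named counterpart in the other.
    """
    root = Path(root)
    clean = {p.name: p for p in (root / "clean").iterdir() if p.is_file()}
    annotated = {p.name: p for p in (root / "annotated").iterdir() if p.is_file()}

    # Names present in exactly one of the two directories.
    unmatched = sorted(set(clean) ^ set(annotated))
    if unmatched:
        raise ValueError(f"Files without a counterpart: {unmatched}")

    return [(annotated[name], clean[name]) for name in sorted(clean)]
```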

Training: python -m src.train

Visualize curves during training: in a separate console, run tensorboard --logdir=runs and navigate to localhost:6006 in a browser window.

Inference: python -m src.inference

For inference, make sure to change the dataset path and the model path in src/infer.py.

Config parameters

Training parameters:

loss_alpha: Weight of the loss term for annotated (foreground) parts of the image
loss_beta: Weight of the loss term for background parts of the image
resize_size: Dimensions to resize the image to
epochs: Number of epochs
lr: Learning rate (optimizer is SGD with momentum)
batch_size: Batch size
val_split: Fraction of images (between 0 and 1) placed in the validation set
Architecture: Either Autoencoder, or AutoencoderWithSkipConnections. Note that some higher-frequency information from the original image could be lost if you opt for Autoencoder instead of AutoencoderWithSkipConnections.
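
As a rough illustration of how these parameters fit together, here is a sketch that groups them in a dataclass and shows what val_split means for the dataset split. How the repository actually stores and reads its configuration is not shown here. The class name TrainConfig, the helper split_sizes, the tuple form of resize_size, and every default value are placeholders chosen for the example, not recommended settings.

```python
from dataclasses import dataclass

ARCHITECTURES = ("Autoencoder", "AutoencoderWithSkipConnections")


@dataclass
class TrainConfig:
    # All defaults below are placeholder values for illustration only.
    loss_alpha: float = 10.0            # weight for annotated (foreground) pixels
    loss_beta: float = 1.0              # weight for background pixels
    resize_size: tuple = (256, 256)     # assumed to be (height, width)
    epochs: int = 50
    lr: float = 1e-3
    batch_size: int = 16
    val_split: float = 0.2
    architecture: str = "AutoencoderWithSkipConnections"

    def __post_init__(self):
        if not 0.0 <= self.val_split <= 1.0:
            raise ValueError("val_split must be between 0 and 1")
        if self.architecture not in ARCHITECTURES:
            raise ValueError(f"architecture must be one of {ARCHITECTURES}")


def split_sizes(n_images, val_split):
    """Return (n_train, n_val) for a dataset of n_images."""
    n_val = int(round(n_images * val_split))
    return n_images - n_val, n_val
```

With val_split = 0.2 and 100 image pairs, split_sizes returns 80 training pairs and 20 validation pairs.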
