This repository provides the tools to train the models presented in the BCN 20000 dataset publication. The dataset itself is available for download on FigShare.
This repository includes a Python script that generates a master split file for the BCN_20K dataset. The master split file provides a reproducible way to split the dataset into training, validation, and testing sets across multiple folds.
The master split file used to obtain the published results is included in this repository. To regenerate it, run the script from the command line, providing the path to your input CSV file and the desired output path for the master split CSV file:
python create_master_split.py path/to/bcn_20k_train.csv path/to/output/master_split_file.csv
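For readers who want a feel for what a multi-fold master split looks like, the following is a minimal sketch and not the repository's `create_master_split.py`. It assumes a simple stratified scheme: images are shuffled within each class and dealt into `n_folds` buckets, and for each fold one bucket serves as test, the next as validation, and the rest as training. The function name `make_master_split`, the image names, and the class labels are hypothetical.

```python
import random
from collections import defaultdict


def make_master_split(labels, n_folds=5, seed=42):
    """Return {image_id: [split in fold 0, ..., split in fold n_folds - 1]}.

    `labels` maps each image id to its class label (hypothetical input format).
    """
    if n_folds < 3:
        raise ValueError("n_folds must be at least 3 to leave a training set")

    rng = random.Random(seed)  # fixed seed keeps the split reproducible

    # Group image ids by class so every bucket gets a similar class mix.
    by_class = defaultdict(list)
    for image_id, label in sorted(labels.items()):
        by_class[label].append(image_id)

    # Shuffle within each class, then deal ids round-robin into buckets.
    bucket_of = {}
    for label in sorted(by_class):
        ids = by_class[label]
        rng.shuffle(ids)
        for position, image_id in enumerate(ids):
            bucket_of[image_id] = position % n_folds

    # For fold k: bucket k is test, bucket k + 1 is validation, rest is train.
    splits = {}
    for image_id, bucket in bucket_of.items():
        row = []
        for fold in range(n_folds):
            if bucket == fold:
                row.append("test")
            elif bucket == (fold + 1) % n_folds:
                row.append("val")
            else:
                row.append("train")
        splits[image_id] = row
    return splits


# Toy example with made-up image names and labels.
labels = {f"img_{i:03d}.jpg": ("NV" if i % 3 else "MEL") for i in range(30)}
splits = make_master_split(labels, n_folds=5, seed=0)
```

Each image ends up in the test set of exactly one fold and the validation set of exactly one other fold, which is one common way to organize cross-validation; the actual script may use a different scheme.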
First, clone the repository and install the required dependencies:
git clone https://github.com/your-repository-url.git
cd your-repository-directory
pip install -r requirements.txt
The code for the cropping technique used on the dermatoscopies can be found at:
The CSV files listing the image filenames must be passed via the `--csv_dir` argument when executing the code.
To train a model, set the model name in `utils/settings.yaml` to one of the following:
Settings name | Model |
---|---|
res18 | ResNet 18 |
res34 | ResNet 34 |
res50 | ResNet 50 |
effb0 | EfficientNet b0 |
effb1 | EfficientNet b1 |
effb2 | EfficientNet b2 |
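As an illustration of how the settings names in the table could be resolved to model constructors, here is a hedged sketch. It assumes torchvision (version 0.13 or later) as the model source, which this README does not confirm; `MODEL_BUILDERS`, `resolve_builder_name`, and `build_model` are hypothetical names and not part of this repository.

```python
# Hypothetical mapping from the settings names above to torchvision
# constructor names. Using torchvision is an assumption of this sketch.
MODEL_BUILDERS = {
    "res18": "resnet18",
    "res34": "resnet34",
    "res50": "resnet50",
    "effb0": "efficientnet_b0",
    "effb1": "efficientnet_b1",
    "effb2": "efficientnet_b2",
}


def resolve_builder_name(settings_name):
    """Translate a settings name into a constructor name, or fail loudly."""
    try:
        return MODEL_BUILDERS[settings_name]
    except KeyError:
        valid = ", ".join(sorted(MODEL_BUILDERS))
        raise ValueError(
            f"Unknown model '{settings_name}'; expected one of: {valid}"
        ) from None


def build_model(settings_name, num_classes):
    """Instantiate an untrained model for the given settings name."""
    # Imported lazily so the name lookup above works without torchvision.
    import torchvision.models as tvm

    builder = getattr(tvm, resolve_builder_name(settings_name))
    return builder(weights=None, num_classes=num_classes)
```

Failing early on an unknown name gives a clearer error than a missing attribute deep inside the training code.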
In the same file you can change the proposed `learning_rate` and `regularization` values. The code saves a model every time it surpasses the highest balanced accuracy achieved so far on the validation set. Checkpoints are saved in `saved_models/`.
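The checkpointing rule described above can be sketched as follows. This is a minimal illustration, not the repository's training loop: balanced accuracy is computed as the mean per-class recall, and `BestCheckpoint` and its `save_fn` callback are hypothetical names.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recall over the classes present in y_true."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


class BestCheckpoint:
    """Call save_fn only when the validation score beats the best so far."""

    def __init__(self, save_fn):
        self.save_fn = save_fn
        self.best = float("-inf")

    def update(self, epoch, score):
        if score > self.best:
            self.best = score
            self.save_fn(epoch, score)
            return True
        return False


# Toy run: record which epochs would trigger a save.
saved_epochs = []
tracker = BestCheckpoint(lambda epoch, score: saved_epochs.append(epoch))
for epoch, score in enumerate([0.50, 0.60, 0.55, 0.70]):
    tracker.update(epoch, score)

# Class 0 recall is 2/3 and class 1 recall is 1, so the mean is 5/6.
example_score = balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1])
```

In a PyTorch training loop, `save_fn` could for instance wrap `torch.save(model.state_dict(), path)` with a path under `saved_models/`; the exact file naming used by this repository is not specified here.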