
Training model on 7T MP2RAGE images #65

valosekj opened this issue Jul 13, 2024 · 2 comments

valosekj commented Jul 13, 2024

This issue tracks the training of the model on 7T MP2RAGE images (hc-leipzig-7t-mp2rage).

Steps (originally posted in #63 (comment)):

Thanks for testing the model @KaterinaKrejci231054! The predictions for the inverted UNIT1 image (multiplied by -1) look promising! I believe we can leverage them for model training. I would take the following steps:

Note that the recommended nnUNet trainer is now nnU-Net ResEnc L; see here. We can then train two models: one with the default trainer and a second with the nnU-Net ResEnc L trainer.

TODO: describe the training, tagging @KaterinaKrejci231054
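
For reference, a hedged sketch of how these two trainings could be launched via the nnU-Net v2 CLI (called from Python here). The dataset ID and fold are placeholders, and the ResEnc L planner/plans identifiers are taken from the nnU-Net ResEnc presets documentation, so please verify them against the installed nnU-Net version:

```python
# Sketch: launch the two planned trainings (default trainer vs nnU-Net ResEnc L preset).
# DATASET_ID and FOLD are hypothetical placeholders.
import subprocess

DATASET_ID = "101"   # hypothetical nnU-Net dataset ID
FOLD = "0"

# 1) default planning + training
subprocess.run(["nnUNetv2_plan_and_preprocess", "-d", DATASET_ID, "--verify_dataset_integrity"], check=True)
subprocess.run(["nnUNetv2_train", DATASET_ID, "3d_fullres", FOLD], check=True)

# 2) ResEnc L preset: re-plan with the ResEnc L planner, then train with its plans
#    (planner/plans names per the nnU-Net ResEnc presets docs; double-check locally)
subprocess.run(["nnUNetv2_plan_and_preprocess", "-d", DATASET_ID, "-pl", "nnUNetPlannerResEncL"], check=True)
subprocess.run(["nnUNetv2_train", DATASET_ID, "3d_fullres", FOLD, "-p", "nnUNetResEncUNetLPlans"], check=True)
```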

KaterinaKrejci231054 commented Jul 22, 2024

Obtaining GT labels and nnUNet model training

1. Obtaining GT labels

The hc-leipzig-7t-mp2rage dataset does not provide ground-truth labels (rootlets segmentations), so the first step is to obtain GT labels for training.

I ran the T2w r20240523 model on inverted UNIT1 data (5 subjects) to obtain rootlets segmentations. These segmentations needed manual corrections (see below) to be considered ground-truth labels:

[figure: example of manual corrections to the predicted rootlets segmentation]

NOTE: Each manually corrected label fits all 3 contrasts (INV1, INV2 and UNIT1), which is a big advantage for us.

[figure: the corrected label overlaid on all 3 contrasts]
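
As a side note on the inverted images used above (UNIT1 multiplied by -1 so that the existing T2w model can be applied), a minimal sketch with nibabel; the file names are hypothetical placeholders:

```python
# Sketch: create a "UNIT1-neg" image by multiplying the UNIT1 volume by -1,
# so the T2w rootlets model can be run on it. File names are placeholders.
import nibabel as nib

unit1 = nib.load("sub-01_UNIT1.nii.gz")
neg_data = unit1.get_fdata() * -1                       # invert intensities

# keep the original affine/header so the result stays aligned with INV1/INV2/UNIT1
neg_img = nib.Nifti1Image(neg_data, unit1.affine, unit1.header)
nib.save(neg_img, "sub-01_UNIT1-neg.nii.gz")
```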

2. Model training

I created 4 datasets for training 4 models (see the dataset-layout sketch below):

  1. UNIT1 data only - model UNIT1
  2. INV1 data only - model INV1
  3. INV2 data only - model INV2
  4. UNIT1 + INV1 + INV2 data together - model MIX

[figure: overview of the 4 training datasets]
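
As referenced above, a hedged sketch of how one of these four nnU-Net v2 datasets (e.g. the UNIT1-only one) could be laid out. The dataset ID/name, paths and label values are placeholder assumptions; the MIX dataset would simply include the UNIT1, INV1 and INV2 images of each subject as separate training cases:

```python
# Sketch: minimal nnU-Net v2 raw-dataset layout and dataset.json for a single-contrast dataset.
import json
from pathlib import Path

raw = Path("nnUNet_raw/Dataset101_rootletsUNIT1")          # hypothetical dataset ID/name
(raw / "imagesTr").mkdir(parents=True, exist_ok=True)      # <case>_0000.nii.gz goes here
(raw / "labelsTr").mkdir(parents=True, exist_ok=True)      # <case>.nii.gz (GT labels) goes here

# placeholder label map: background + one consecutive integer per spinal level C2-C8;
# the real values must match the GT label convention used in this repo
labels = {"background": 0}
labels.update({f"rootlets_C{lvl}": i + 1 for i, lvl in enumerate(range(2, 9))})

dataset_json = {
    "channel_names": {"0": "UNIT1"},    # single input channel per training case
    "labels": labels,
    "numTraining": 4,                   # placeholder; must equal the number of cases in imagesTr
    "file_ending": ".nii.gz",
}
(raw / "dataset.json").write_text(json.dumps(dataset_json, indent=4))
```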

UNIT1 model training

[figure]

INV1 model training

[figure]

INV2 model training

[figure]

MIX model training

[figure]

3. Results - one testing subject

UNIT1 model

[gif: sub-17_testing_UNIT1]

INV1 model

[gif: sub-17_testing_INV1]

INV2 model

[gif: sub-17_testing_INV2]

Comparison: UNIT1 model vs MIX model

[comparison figure]

Comparison: INV1 model vs MIX model

[comparison figure]

Comparison: INV2 model vs MIX model

[comparison figure]

4. Results summary - one testing subject

[summary figure]

valosekj pushed a commit that referenced this issue Aug 29, 2024
KaterinaKrejci231054 commented:

nnUNet model training (10 training vs 15 training subjects, default vs increased patch-size)

More manual corrections were made (see hc-leipzig-7t-mp2rage_train-test_split.csv), and 6 subjects were excluded due to poor data quality. We then trained single-contrast models (based on UNIT1-neg images) and multi-contrast models (based on UNIT1, INV1 and INV2 images).

The new results compare:

  • single-contrast model vs multi-contrast model
  • default vs increased patch-size
  • number of training subjects (n=10 vs n=15)

UNIT1-neg models (single-contrast)

Training with default vs increased patch-size

Training log (Dataset026) graph with model settings: 15 training subjects, 1000 epochs, default patch size [192, 96, 128], fold 0

[figure: training_log_2024_8_21_12_01_25]

Training log (Dataset028) graph with model settings: 15 training subjects, 1000 epochs, increased patch size [352, 96, 128], fold 0

[figures: training_log_2024_8_21_12_08_18, training_log_2024_8_23_10_44_27]
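
For the increased patch size, a hedged sketch of one way it could be set, i.e. by editing the plans file of the preprocessed dataset before training; the path, dataset name and plans layout are assumptions based on the default nnU-Net v2 conventions:

```python
# Sketch: set the enlarged patch size ([352, 96, 128] instead of the planned [192, 96, 128])
# by editing the preprocessed dataset's plans file. Path and dataset name are placeholders.
import json
from pathlib import Path

plans_path = Path("nnUNet_preprocessed/Dataset028_rootletsUNIT1neg/nnUNetPlans.json")  # hypothetical
plans = json.loads(plans_path.read_text())

plans["configurations"]["3d_fullres"]["patch_size"] = [352, 96, 128]  # enlarge the first patch dimension

plans_path.write_text(json.dumps(plans, indent=4))
# Note: a larger patch usually needs more GPU memory; the batch size may have to be reduced.
```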

Impact of default vs increased patch-size (4 testing subjects)

Increasing the patch size (Dataset028) had a positive effect, particularly at the C2 and C3 levels. With the default patch size (Dataset026), training didn't start at these levels, resulting in a Dice score of 0 for testing subjects. In other levels, the performance on testing data was similar between the two models.

Results visualization (violin plot)

[figure: UNIT1_neg_plot_Dataset026_fold0_Dataset028_fold0]
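
For clarity, a minimal sketch of the per-level Dice computation underlying these comparisons, assuming both the GT and the prediction encode each spinal level as a distinct integer value; file names and the value encoding are placeholders:

```python
# Sketch: per-level Dice between a GT rootlets label and a model prediction.
import nibabel as nib
import numpy as np

gt = nib.load("sub-17_label-rootlets.nii.gz").get_fdata()   # hypothetical file names
pred = nib.load("sub-17_pred.nii.gz").get_fdata()

for level in range(2, 9):                       # spinal levels C2-C8 (placeholder value encoding)
    g, p = gt == level, pred == level
    denom = g.sum() + p.sum()
    dice = 2 * np.logical_and(g, p).sum() / denom if denom else np.nan
    print(f"C{level}: Dice = {dice:.3f}")
```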

Impact of increased patch size and different number of training subjects

Spinal level C2
Increasing the patch size led to earlier training initiation at the C2 level in both cases compared to the default patch size (lighter vs. darker colors). Additionally, increasing the number of training subjects also resulted in earlier training at the C2 level (dark blue vs. dark red).

[figure: spinal_level_2_validation_pseudo_dice_Dataset020_Dataset021_Dataset026_Dataset028]

Spinal level C3
Increasing the patch size and the number of training subjects has a less pronounced impact at the C3 level than at the C2 level.

[figure: spinal_level_3_validation_pseudo_dice_Dataset020_Dataset021_Dataset026_Dataset028]
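
A hedged sketch of how per-level validation pseudo-Dice curves like the ones above could be extracted from an nnU-Net training log. It assumes each epoch prints a "Pseudo dice [...]" line with one value per foreground label, so the regex and the label index may need adjusting to the actual log format:

```python
# Sketch: parse per-epoch validation pseudo-Dice for one spinal level from an nnU-Net training log.
import re
import matplotlib.pyplot as plt

LEVEL_INDEX = 0                                 # assumption: first foreground label = level C2
dice_per_epoch = []
with open("training_log_2024_8_21_12_08_18.txt") as f:     # hypothetical log file name
    for line in f:
        m = re.search(r"Pseudo dice \[(.*?)\]", line)
        if m:
            raw = m.group(1).replace("np.float32(", "").replace(")", "")
            values = [float(v) for v in raw.split(",")]
            dice_per_epoch.append(values[LEVEL_INDEX])

plt.plot(dice_per_epoch)
plt.xlabel("epoch")
plt.ylabel("validation pseudo-Dice (C2)")
plt.savefig("c2_pseudo_dice.png")
```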

Multi-contrast models (MIX)

NOTE: We treated the INV1, INV2 and UNIT1 images together as the multi-contrast dataset (each image is a separate training case, hence 45 training images from 15 subjects).

Training with default vs increased patch-size

Training log (Dataset027) graph with model settings: 45 training images, 2000 epochs, default patch size [192, 96, 128], fold 0

[figures: training_log_2024_8_21_12_03_51, training_log_2024_8_23_10_55_18]

Training log (Dataset029) graph with model settings: 45 training images, 2000 epochs, increased patch size [352, 96, 128], fold 0

[figure: training_log_2024_8_23_11_29_05]

Impact of default vs increased patch-size (4 testing subjects)

Increasing the patch size had a positive effect, particularly at the C2 and C3 levels. With the default patch size, training didn't start at these levels, resulting in a Dice score of 0 for the testing subjects. At the other levels, the performance on the testing data was similar between the two models.

UNIT1 testing images - results visualization (violin plot)

[figure: UNIT1_plot_Dataset027_fold0_Dataset029_fold0]

INV1 testing images - results visualization (violin plot)

[figure: inv-1_part-mag_MP2RAGE_plot_Dataset027_fold0_Dataset029_fold0]

INV2 testing images - results visualization (violin plot)

[figure: inv-2_part-mag_MP2RAGE_plot_Dataset027_fold0_Dataset029_fold0]
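
A hedged sketch of how such violin plots can be produced with seaborn from a long-format table of per-subject, per-level Dice scores; the CSV file and column names are hypothetical:

```python
# Sketch: violin plot of per-level Dice scores, grouped by model (e.g. Dataset027 vs Dataset029).
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# expected columns: subject, model, level (C2-C8), dice  -- hypothetical table
df = pd.read_csv("testing_dice_scores.csv")

ax = sns.violinplot(data=df, x="level", y="dice", hue="model", cut=0)
ax.set_xlabel("Spinal level")
ax.set_ylabel("Dice")
plt.tight_layout()
plt.savefig("dice_violinplot.png")
```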

Impact of increased patch size and different number of training subjects

Spinal level C2
Increasing the patch size led to earlier training initiation at the C2 level (lighter red vs. darker red).

[figure: spinal_level_2_validation_pseudo_dice_Dataset029_Dataset027_Dataset024]

Spinal level C3
Increasing the patch size led to earlier training initiation at the C3 level (lighter red vs. darker red). Additionally, increasing the number of training subjects also resulted in earlier training at the C3 level (dark blue vs. dark red).

[figure: spinal_level_3_validation_pseudo_dice_Dataset029_Dataset027_Dataset024]

Comparison: single-contrast vs multi-contrast model (UNIT1-neg vs UNIT1 data)

Model settings:

  • Dataset028 … 4 UNIT1-neg testing images, increased patch size [352, 96, 128], fold 0
  • Dataset029 … 4 UNIT1 testing images, increased patch size [352, 96, 128], fold 0

The performance of single-contrast and multi-contrast models is similar, but the multi-contrast model has the advantage of being directly applicable to the original MP2RAGE data. Unlike the UNIT1-neg single-contrast model, there’s no need to create inverse images.

Single-contrast vs multi-contrast model (violin plot)

[figure: UNIT1_plot_Dataset028_fold0_Dataset029_fold0]
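
To illustrate the practical advantage mentioned above (no inverse images needed), a hedged sketch of running the multi-contrast model directly on the original MP2RAGE test images with the standard nnU-Net v2 prediction CLI; the folders, dataset ID and fold are placeholders:

```python
# Sketch: predict rootlets on original MP2RAGE test images with the multi-contrast model.
import subprocess

subprocess.run(
    [
        "nnUNetv2_predict",
        "-i", "test_images/",      # folder with *_0000.nii.gz UNIT1/INV1/INV2 test images
        "-o", "predictions/",
        "-d", "029",               # hypothetical multi-contrast (MIX) dataset ID
        "-c", "3d_fullres",
        "-f", "0",
    ],
    check=True,
)
```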

Summary

  • Increasing the patch size primarily benefited training at the C2 and C3 levels for both the single-contrast model (UNIT1-neg) and the multi-contrast model.

  • At levels C4 to C8, there was no significant difference between the default and the increased patch size.

  • While the performance of the single-contrast and multi-contrast models is similar, the multi-contrast model has the advantage of being directly applicable to the original MP2RAGE data, without the need to create inverse images as required by the UNIT1-neg single-contrast model.
