Quick Start • Weights • Inference • Benchmark • Tuning • Citation
This repository contains the official code for the paper "MIRAGE: A multimodal foundation model and benchmark for comprehensive retinal OCT image analysis", led by José Morano and Hrvoje Bogunović from the CD-AIR lab of the Medical University of Vienna. The paper has been accepted for publication in npj Digital Medicine.
[arXiv](https://arxiv.org/abs/2506.08900)
MIRAGE is a multimodal foundation model for comprehensive retinal OCT/SLO image analysis. Trained on a large-scale multimodal dataset, it is designed to perform a wide range of tasks, including disease staging, diagnosis, and layer and lesion segmentation. MIRAGE is based on the MultiMAE architecture and is pretrained using a multi-task learning strategy. The model, based on ViT, is available in two sizes: MIRAGE-Base and MIRAGE-Large.
> [!IMPORTANT]
> All scripts and code are intended to run on Linux systems.
Overview of the proposed model (MIRAGE) and other general (DINOv2) and domain-specific (MedSAM, RETFound) foundation models. In contrast to existing unimodal foundation models, our approach utilizes multimodal self-supervised learning to train a Vision Transformer on a large dataset of paired multimodal retinal images, including optical coherence tomography (OCT), scanning laser ophthalmoscopy (SLO), and automatically generated labels for retinal layers. We evaluated the model on a comprehensive benchmark consisting of 19 tasks from 14 publicly available datasets and two private datasets, covering both OCT and SLO classification and segmentation tasks. Statistical significance was calculated using the Wilcoxon signed-rank test across all datasets. Our foundation model, MIRAGE, significantly outperforms state-of-the-art foundation models across all task types.
## Quick Start

For a quick start, use the provided script `prepare_env.py` to create a new Python environment, install the required packages, and download the model weights and the datasets.
> [!IMPORTANT]
> The script will download the model weights and the datasets, which are large files. Make sure you have enough disk space and a stable internet connection.
>
> In addition, if the system Python version is not 3.10.*, the script will install Python 3.10.16 (from source) in the same directory. It will also install PyTorch 2.5.1 (CUDA 11.8).
```bash
./prepare_env.py
```
> [!TIP]
> Run the script with the `-h` or `--help` flag to see the available options.
The models can be easily used with the `hf/mirage_hf.py` code, loading the weights from Hugging Face 🤗. The only requirements are the `torch`, `einops`, `huggingface_hub`, and `safetensors` packages.
```python
from huggingface_hub import PyTorchModelHubMixin

from mirage_hf import MIRAGEWrapper


class MIRAGEhf(MIRAGEWrapper, PyTorchModelHubMixin):
    def __init__(
        self,
        input_size=512,
        patch_size=32,
        modalities='bscan-slo',
        size='base',
    ):
        super().__init__(
            input_size=input_size,
            patch_size=patch_size,
            modalities=modalities,
            size=size,
        )


# For the MIRAGE model based on ViT-Base
model = MIRAGEhf.from_pretrained("j-morano/MIRAGE-Base")
# For the MIRAGE model based on ViT-Large
model = MIRAGEhf.from_pretrained("j-morano/MIRAGE-Large")
```
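Once instantiated, the model behaves like a regular PyTorch module. Below is a minimal, hypothetical smoke test: the input format (a dict with one tensor per modality, keyed as in the `modalities` argument) is an assumption here, so check `hf/mirage_hf.py` for the actual forward signature.

```python
import torch

# Hypothetical usage sketch. We assume the wrapper accepts a dict with
# one tensor per modality ('bscan' and 'slo'), each of shape
# (batch, channels, input_size, input_size); verify against
# hf/mirage_hf.py before relying on this.
model.eval()
inputs = {
    "bscan": torch.randn(1, 1, 512, 512),
    "slo": torch.randn(1, 1, 512, 512),
}
with torch.no_grad():
    features = model(inputs)
```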
> [!NOTE]
> The code has been tested with PyTorch 2.5.1 (CUDA 11.8) and Python 3.10.10.
Create a new Python environment and activate it:

```bash
python -m venv venv  # if not already created
source venv/bin/activate
```
Install the required packages:

```bash
pip install -r requirements.txt
```
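As a quick sanity check after installation, you can verify that the installed PyTorch matches the tested setup and that a GPU is visible:

```python
import torch

# Tested setup: PyTorch 2.5.1 (CUDA 11.8); other versions may work
# but are untested.
print(torch.__version__)
print(torch.cuda.is_available())  # True if a compatible GPU is visible
```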
## Weights

The model weights are available in the Model weights release on GitHub.

| Model | Link |
|---|---|
| MIRAGE-Base | Weights-Base |
| MIRAGE-Large | Weights-Large |
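A minimal sketch for inspecting a downloaded checkpoint, assuming the release asset is a standard PyTorch checkpoint; the file name below is hypothetical, and the saved dictionary layout may differ:

```python
import torch

# Hypothetical asset name: replace with the file downloaded from the
# GitHub release.
checkpoint = torch.load("MIRAGE-Base.pth", map_location="cpu")
# Checkpoints are often nested (e.g. under a 'model' key); inspect the
# keys before loading the state dict into the model.
print(checkpoint.keys() if isinstance(checkpoint, dict) else type(checkpoint))
```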
## Inference

The script `mirage_wrapper.py` provides a simple pipeline to load the model and run inference on a single sample. The sample is already included in the repository (`_example_images/`) and consists of a triplet of OCT, SLO, and layer segmentation images.

To run the inference, simply execute the script:

```bash
python mirage_wrapper.py
```
Check the code for more details.
## Benchmark

We provide all the publicly available datasets used in the benchmark, together with the data splits. See `docs/segmentation_benchmark.md` for more details on the segmentation benchmark, and `docs/classification_benchmark.md` for the classification benchmark.
Although we do not provide the pretraining data due to privacy concerns, we provide the code to pretrain MIRAGE on a multimodal dataset. Please check `docs/pretraining.md` for more details.
## Tuning

We provide the code to fine-tune MIRAGE and other state-of-the-art foundation models for OCT segmentation tasks. Please check `docs/segmentation_tuning.md` for more details.

We also provide the code to fine-tune the models for OCT and SLO classification tasks. More information can be found in `docs/classification_tuning.md`.
If you have any questions or find problems with the code, please open an issue on GitHub.
## Citation

If you find this repository useful, please consider giving it a star ⭐ and a citation 📝:

```bibtex
@misc{morano2025mirage,
  title={{MIRAGE}: Multimodal foundation model and benchmark for comprehensive retinal {OCT} image analysis},
  author={José Morano and Botond Fazekas and Emese Sükei and Ronald Fecso and Taha Emre and Markus Gumpinger and Georg Faustmann and Marzieh Oghbaie and Ursula Schmidt-Erfurth and Hrvoje Bogunović},
  year={2025},
  eprint={2506.08900},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2506.08900},
}
```
The models and associated code are released under the CC-BY-NC-ND 4.0 license and may only be used for non-commercial, academic research purposes with proper attribution. See LICENSE for more details.
MIRAGE code is mainly based on MultiMAE, along with timm, DeiT, DINO, MoCo-v3, BEiT, MAE-priv, MAE, mmsegmentation, MONAI, and RETFound. We thank the authors for making their code available.
- https://github.com/EPFL-VILAB/MultiMAE
- https://github.com/rwightman/pytorch-image-models/tree/master/timm
- https://github.com/facebookresearch/deit
- https://github.com/facebookresearch/dino
- https://github.com/facebookresearch/moco-v3
- https://github.com/microsoft/unilm/tree/master/beit
- https://github.com/BUPT-PRIV/MAE-priv
- https://github.com/facebookresearch/mae
- https://github.com/open-mmlab/mmsegmentation
- https://github.com/Project-MONAI/MONAI
- https://github.com/rmaphoh/RETFound_MAE