
MICCAI-FLARE-2025-Challenge-Task-5

Vision-Language Model for Multitask Medical Text Generation

This repository is the official implementation of Vision-Language Model for Multitask Medical Text Generation.

Environments and Requirements

  • Ubuntu 22.04
  • NVIDIA GeForce RTX 3090 24GB
  • CUDA 12.4
  • Python 3.9.21

To install requirements:

git clone https://github.com/HongkunSun/MICCAI-FLARE-2025-Challenge-Task-5.git
cd MICCAI-FLARE-2025-Challenge-Task-5
conda env create -f environment.yml
conda activate FLARE_2025_Challenge_Task_5_MTYW
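
After activating the environment, you can optionally verify that PyTorch sees the GPU (this assumes PyTorch is installed by environment.yml):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"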

Dataset

The pipeline expects the FLARE 2025 2D MLLM dataset to be organized in the following structure:

organized_dataset/
├── training/
│   ├── Retinography/
│   │   ├── retino/
│   │   │   ├── imagesTr/
│   │   │   └── retino_questions_train.json
│   │   └── fundus/
│   │       └── ...
│   └── ...
├── validation-hidden/
│   └── ...
└── validation-public/
    └── ...
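
As a quick sanity check before running anything, the snippet below (based only on the tree above; the root path is illustrative) confirms that the expected split directories exist:

from pathlib import Path

root = Path("/path/to/your/organized_dataset")
for split in ("training", "validation-hidden", "validation-public"):
    # Report which top-level splits are present under the dataset root.
    status = "found" if (root / split).is_dir() else "MISSING"
    print(f"{split}: {status}")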

Preprocessing

We organize the data by task type; the resulting JSON files are located in the all_learning_task_split directory. Each data entry's image path must be updated to point to your local copy of the dataset. We provide the script process_json.py for this purpose: it replaces the placeholder prefix organized_dataset with the actual dataset path.

Running the data preprocessing code:

python process_json.py --folder ./all_learning_task_split --out_folder ./all_learning_task_split --old organized_dataset --new /path/to/your/organized_dataset
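
Judging from the --old and --new flags, the rewrite amounts to a plain string substitution over every JSON file in the folder. A minimal sketch of that logic (the actual process_json.py may differ in details):

from pathlib import Path

old, new = "organized_dataset", "/path/to/your/organized_dataset"
folder = Path("./all_learning_task_split")
for jf in folder.glob("*.json"):
    # Replace the placeholder prefix in every image path of the file.
    jf.write_text(jf.read_text().replace(old, new))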

Prepare weights for the vision-language model

Llama2 Version

from huggingface_hub import snapshot_download

# Download the language-model weights; any local directory works.
local_dir_1 = "./weights/Llama-2-7b-chat-hf"  # example path
snapshot_download(repo_id="meta-llama/Llama-2-7b-chat-hf", local_dir=local_dir_1)

Then set line 8 of MICCAI-FLARE-2025-Challenge-Task-5/train_configs/medsiglip_llama2_7b_finetune.yaml to the path of Llama-2-7b-chat-hf.

MedSigLIP Version

from huggingface_hub import snapshot_download

# Download the vision-encoder weights.
local_dir_2 = "./weights/medsiglip-448"  # example path
snapshot_download(repo_id="google/medsiglip-448", local_dir=local_dir_2)

Then set line 9 of MICCAI-FLARE-2025-Challenge-Task-5/train_configs/medsiglip_llama2_7b_finetune.yaml to the path of medsiglip-448.
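
To double-check both edits, you can print the two lines in question; only the line numbers (8 and 9) come from the instructions above, so verify that the values they hold are your local weight paths:

from pathlib import Path

cfg = Path("train_configs/medsiglip_llama2_7b_finetune.yaml")
lines = cfg.read_text().splitlines()
print("line 8:", lines[7])  # should hold the Llama-2-7b-chat-hf path
print("line 9:", lines[8])  # should hold the medsiglip-448 path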

Training

Before training, you also need to set the ann_path field in each task's YAML file to the path of the corresponding JSON file. These YAML files are in the MICCAI-FLARE-2025-Challenge-Task-5/mtyw/configs/datasets directory.
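
To confirm every dataset config points at the right JSON file, the following lists each ann_path entry (the key name comes from the paragraph above):

from pathlib import Path

for cfg in sorted(Path("mtyw/configs/datasets").rglob("*.yaml")):
    for line in cfg.read_text().splitlines():
        # Print each config file alongside its annotation-path setting.
        if "ann_path" in line:
            print(f"{cfg}: {line.strip()}")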

To train the model in the paper, run this command:

python train.py

Inference

To infer the testing cases, run this command:

python inference_flare2025.py --dataset_path <path_to_validation-hidden> --ckpt <path_to_trained_model_pth> --llama_model local_dir_1 --vit_model local_dir_2 --output_file <path_to_output_json_file>
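
As a quick check on the generated predictions, you can load the output file and inspect its size (this assumes the output is standard JSON; the exact schema depends on the script):

import json

with open("/path/to/output.json") as f:
    preds = json.load(f)
print(type(preds).__name__, len(preds))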

Results

Our method achieves the following performance on the MICCAI FLARE 2025 Task 5 validation-hidden set:

Model name       classification  multi_label_classification  detection  regression
maiahmed         0.74            0.57                        0.82       11.84
mtyw (our team)  0.70            0.54                        0.80       13.63
lujiazho         0.68            0.17                        0.69       16.50
phucnlt          0.45            0.54                        0.85       22.89

Contributing

This project is licensed under the MIT License. We welcome contributions from the community! To contribute:

  1. Fork the repository.
  2. Open a Pull Request.

Acknowledgement

We thank the contributors of public datasets.
