Repository for evaluating medical foundation models. This repository contains the code and scripts to evaluate eight different foundation models. Two subfolders contain model-specific evaluation pipelines:
- `Ark+/`: evaluation scripts and assets for the Ark+ model evaluations.
- `MedImageInsights/`: evaluation scripts and assets for the MedImageInsights model evaluations.
Important: this repository standardizes on uv for environment creation and dependency installation. All instructions below and in sub-project READMEs assume you will use uv.
- From the repository root, sync the shared environment and tools used by the evaluations of six models (BiomedCLIP, CheXagent, MedSigLIP, RAD-DINO, DINOv2, SigLIP2):

      uv sync

- Activate the environment created by `uv`:

      source .venv/bin/activate

- Run the experiment scripts in numeric order (file prefixes like `1_0`, `1_1` denote sequence). Example:

      python 1_0_ptx_classification_experiments.py

- When finished, exit the environment with:

      deactivate

Notes:
- Only run accompanying `.sh` wrappers when they are present to automate steps (they typically exist alongside `.py` files with the same numeric prefix).
- Check a sub-project's `uv.lock` for the expected Python version (for example, `MedImageInsights` may require Python 3.8).
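To see the run order implied by the numeric prefixes before launching anything, a small helper can sort the scripts by prefix. This is a minimal sketch, not part of the repository: `order_by_prefix` is a hypothetical name, and it assumes every experiment script starts with a two-part `N_M_` prefix as in `1_0_ptx_classification_experiments.py`. It sorts numerically, so `1_10` comes after `1_2`.

```python
import re
from pathlib import Path

# Matches a two-part numeric prefix such as "1_0_" at the start of a filename.
# Assumption: all experiment scripts follow this naming pattern.
_PREFIX = re.compile(r"^(\d+)_(\d+)_")


def order_by_prefix(filenames):
    """Return the names that carry an N_M_ prefix, sorted numerically by (N, M)."""
    keyed = []
    for name in filenames:
        match = _PREFIX.match(Path(name).name)
        if match:  # names without a numeric prefix are skipped
            key = (int(match.group(1)), int(match.group(2)))
            keyed.append((key, name))
    return [name for _, name in sorted(keyed)]


if __name__ == "__main__":
    # List the prefixed scripts in the current directory in run order.
    for script in order_by_prefix(p.name for p in Path(".").glob("*.py")):
        print(script)
```

Run from the directory that holds the experiment scripts, it prints one filename per line, and each can then be run with `python <script>` inside the activated environment.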
This repository expects local copies of the datasets used for evaluation. Below are brief instructions for preparing two commonly used datasets.
- SIIM-ACR Pneumothorax
  - Source: https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation
  - Preparation steps:
    - Download the DICOM files from the Kaggle competition.
    - Convert DICOMs to PNGs (keep the same base filename) and save them into `{YOUR PATH}/train_png/`.
    - Convert the RLE masks provided by the competition to binary PNG masks (same base filename) and save them into `{YOUR PATH}/train_msk/`.
    - The `inputs` folder contains the files for 5-fold cross-validation. Each file (e.g., `input_train_ptx_cla_0.csv` for the first fold) includes the image paths and labels for that specific fold.
    - The same folder contains `ptx_volume_pct.csv`, which records the pneumothorax volumes.
- EmoryCXR
  - The institutional chest X-ray dataset used for cardiomegaly analysis is not publicly available as it may contain protected health information. Limited information about these datasets may be shared on reasonable request to Dr. Judy Gichoya at [email protected], subject to approval and appropriate confidentiality agreements.
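The RLE-to-mask conversion in the SIIM-ACR preparation steps can be sketched as follows. This is an illustration under stated assumptions, not the repository's own conversion code. It assumes the competition's relative run-length convention: space-separated `offset length` pairs, where each offset is counted from the end of the previous run, pixels are laid out column by column, and `-1` means no pneumothorax. Verify these assumptions against the competition's own mask utilities before relying on the output. `decode_relative_rle` is a hypothetical name.

```python
def decode_relative_rle(rle, width, height):
    """Decode a relative RLE string into `height` rows of `width` pixels (0 or 255).

    Assumptions (not confirmed by this repository): offsets are relative to the
    end of the previous run, and the flat pixel order is column-major.
    """
    values = [int(token) for token in rle.split()]
    flat = bytearray(width * height)  # all zeros = empty mask

    if values != [-1]:  # "-1" is assumed to mark an image without a mask
        if len(values) % 2 != 0:
            raise ValueError("RLE must contain offset/length pairs")
        position = 0
        for offset, length in zip(values[0::2], values[1::2]):
            position += offset          # skip background pixels
            end = position + length
            if end > len(flat):
                raise ValueError("run exceeds image size")
            flat[position:end] = b"\xff" * length  # mark foreground pixels
            position = end              # the next offset starts here

    # Convert the column-major flat buffer into row-major rows.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


if __name__ == "__main__":
    # Tiny 3x2 example: runs cover flat indices 1-2 and 4.
    for row in decode_relative_rle("1 2 1 1", width=3, height=2):
        print(row)
```

The resulting rows still need to be written out as a PNG with an imaging library of your choice, under the same base filename as the image, in `{YOUR PATH}/train_msk/`.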
- `Ark+/README.md`: Ark+ evaluation instructions (uses `uv`).
- `MedImageInsights/README.md`: MedImageInsights evaluation instructions (uses `uv`).
Proceed to the sub-project README for exact commands and experiment notes.