GitHub - f10409/Medical-Foundation-Model-Evaluation

Overview: running the model evaluations

This repository contains the code and scripts to evaluate eight medical foundation models. Six are evaluated from the repository root (see Quick start). The other two have model-specific evaluation pipelines in their own subfolders:

Ark+/ — evaluation scripts and assets for the Ark+ model evaluations.
MedImageInsights/ — evaluation scripts and assets for the MedImageInsights model evaluations.

Important: this repository standardizes on uv for environment creation and dependency installation. All instructions below and in sub-project READMEs assume you will use uv.

Quick start

From the repository root, sync the shared environment and tools used by the evaluations of six models (BiomedCLIP, CheXagent, MedSigLIP, RAD-DINO, DINOv2, SigLIP2):

uv sync

Activate the environment created by uv:

source .venv/bin/activate

Run the experiment scripts in numeric order (file prefixes like 1_0, 1_1 denote sequence). Example:

python 1_0_ptx_classification_experiments.py
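
The run order implied by the numeric prefixes can be listed with a small helper. This is a minimal sketch, not part of the repository: the function name ordered_experiment_scripts is hypothetical, and it assumes every experiment script follows the <major>_<minor>_<name>.py pattern seen in 1_0_ptx_classification_experiments.py.

```python
import re
from pathlib import Path

# Hypothetical helper (not part of the repository): list experiment scripts
# in the order implied by their numeric prefixes, e.g. 1_0 before 1_1 before 2_0.
PREFIX = re.compile(r"^(\d+)_(\d+)_.*\.py$")


def ordered_experiment_scripts(root="."):
    """Return script names directly under `root`, sorted by (major, minor) prefix."""
    found = []
    for path in Path(root).iterdir():
        match = PREFIX.match(path.name)
        if path.is_file() and match:
            # Compare prefixes as integers so that 10_0 sorts after 2_0.
            found.append((int(match.group(1)), int(match.group(2)), path.name))
    return [name for _, _, name in sorted(found)]
```

Calling ordered_experiment_scripts() from the repository root returns the matching .py files in prefix order. Files without a two-part numeric prefix (configs, notebooks, .sh wrappers) are ignored.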

When finished, exit the environment with:

deactivate

Notes:

Some steps also have a .sh wrapper with the same numeric prefix as the .py file (for example, 1_0_ptx_classification_experiments.sh). Use a wrapper to automate a step only when one is present.
Check a sub-project's uv.lock for the expected Python version (for example, MedImageInsights may require Python 3.8).
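
The Python version check can also be scripted. The sketch below assumes the lockfile contains a top-level requires-python line. The function name expected_python is hypothetical, and the regular expression reads only that one line rather than parsing the whole file.

```python
import re
from pathlib import Path

# Assumption: the lockfile carries a top-level line such as
#   requires-python = ">=3.8"
# Only that line is read; the rest of the file is not parsed.
REQUIRES_PYTHON = re.compile(r'^requires-python\s*=\s*"([^"]+)"', re.MULTILINE)


def expected_python(lock_path):
    """Return the requires-python specifier from a lockfile, or None if absent."""
    text = Path(lock_path).read_text(encoding="utf-8")
    match = REQUIRES_PYTHON.search(text)
    return match.group(1) if match else None
```

For example, expected_python("MedImageInsights/uv.lock") would return the version specifier recorded for that sub-project, provided the file exists at that path.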

Data

This repository expects local copies of the datasets used for evaluation. Below are brief instructions for preparing two commonly used datasets.

SIIM-ACR Pneumothorax
- Source: https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation
- Preparation steps:
  1. Download the DICOM files from the Kaggle competition.
  2. Convert DICOMs to PNGs (keep the same base filename) and save them into {YOUR PATH}/train_png/.
  3. Convert the RLE masks provided by the competition to binary PNG masks (same base filename) and save them into {YOUR PATH}/train_msk/.
  4. The inputs folder contains the files for 5-fold cross-validation. Each file (e.g., input_train_ptx_cla_0.csv for the first fold) includes the image paths and labels for that specific fold.
  5. The same folder contains ptx_volume_pct.csv, which records the pneumothorax volumes.
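
Step 3 above can be sketched as follows. This is not the repository's conversion code. The function name decode_relative_rle is hypothetical, and the encoding details are assumptions to verify against the competition's data description: alternating (offset, length) tokens with each offset measured from the end of the previous run, column-major pixel order, and "-1" for an image without pneumothorax. Writing the result to a PNG file is left out because it needs an imaging library.

```python
def decode_relative_rle(rle, width, height):
    """Decode a relative run-length string into a height x width mask of 0/255.

    Assumptions (verify against the competition's own description):
    - tokens alternate (offset, length), each offset counted from the end
      of the previous run;
    - pixels are numbered column by column (column-major order);
    - the string "-1" marks an image with no pneumothorax.
    """
    flat = [0] * (width * height)
    tokens = [int(t) for t in rle.split()]
    if tokens != [-1]:
        position = 0
        for offset, length in zip(tokens[0::2], tokens[1::2]):
            position += offset
            # Clamp the run so a malformed string cannot grow the buffer.
            end = min(position + length, len(flat))
            flat[position:end] = [255] * max(end - position, 0)
            position += length
    # Re-index the column-major flat buffer as mask[row][col].
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]
```

The returned nested list has one inner list per image row, so it can be handed to an imaging library of your choice and saved under the same base filename in {YOUR PATH}/train_msk/.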

EmoryCXR
- The institutional chest X-ray dataset used for cardiomegaly analysis is not publicly available because it may contain protected health information. Limited information about this dataset may be shared on reasonable request to Dr. Judy Gichoya at [email protected], subject to approval and appropriate confidentiality agreements.

Sub-project READMEs

Ark+/README.md — Ark+ evaluation instructions (uses uv).
MedImageInsights/README.md — MedImageInsights evaluation instructions (uses uv).

Proceed to the sub-project README for exact commands and experiment notes.

Folders and files

Ark+/
MedImageInsights/
inputs/
models/
transforms/
utils/
.gitignore
0_0_hugging_face.ipynb
0_1_sample_data_ptx_classification_segmentation.ipynb
0_2_sample_data_cm_classification.ipynb
0_3_sample_data_ptx_segmentation.ipynb
1_0_ptx_classification_experiments.py
1_0_ptx_classification_experiments.sh
1_1_cm_classification_experiments.py
1_1_cm_classification_experiments.sh
2_0_ptx_segmentation_experiments.py
2_0_ptx_segmentation_experiments.sh
2_1_cm_segmentation_experiments.py
2_1_cm_segmentation_experiments.sh
3_cross-val_performance.ipynb
4_0_calculate_ptx_volume.ipynb
4_1_ptx_cla_subgroup_analysis.ipynb
4_2_ptx_seg_subgroup_analysis.ipynb
5_0_stats.ipynb
5_1_stats_subgroup.ipynb
5_2_Visualize_seg_result.ipynb
LICENSE
README.md
__init__.py
config_cm_cla.py
config_cm_seg.py
config_ptx_cla.py
config_ptx_seg.py
pyproject.toml
table1.ipynb
uv.lock
