This repository provides CT-ScanGaze, the first publicly available dataset of expert radiologist gaze during CT analysis, and CT-Searcher, a transformer-based model for 3D scanpath prediction on CT volumes. Our work addresses a critical gap in understanding how radiologists visually examine 3D medical images during diagnostic procedures.
🎉 This work has been accepted as a highlight paper at ICCV 2025! 🎉
```bash
# Clone the repository
git clone https://github.com/UARK-AICV/CTScanGaze
cd CTScanGaze

# Create conda environment
conda create -n ctsearcher python=3.9
conda activate ctsearcher

# Install dependencies
pip install uv
uv pip install -r requirements.txt
```
CT-ScanGaze is the first publicly available eye gaze dataset focused on CT scan analysis. The dataset is available on Hugging Face.
Each data sample contains the following fields:
```python
{
    "name": str,      # CT scan identifier
    "subject": int,   # Radiologist ID
    "task": str,      # Task description
    "X": list,        # X coordinates of fixations
    "Y": list,        # Y coordinates of fixations
    "Z": list,        # Z coordinates (slice numbers)
    "T": list,        # Fixation durations in seconds
    "K": int,         # Fixation capture time
    "length": int,    # Scanpath length (number of fixations)
    "report": str,    # Radiology report for this CT
}
```
The other fields in the JSON are placeholders and can be ignored. Many reports appear to be duplicates because multiple CT volumes come from the same reading session for the same patient.
We also provide zip files containing all CT scans matching these identifiers, along with the corresponding radiological report for each scan.
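For example, a sample can be loaded and inspected like this (a minimal sketch; the file name `ctscangaze.json` is a placeholder, and we assume the dataset ships as a JSON list of such records):

```python
import json

# Placeholder file name; use the actual JSON file from the Hugging Face release.
with open("ctscangaze.json") as f:
    samples = json.load(f)

s = samples[0]
print(s["name"], "- radiologist", s["subject"], "-", s["length"], "fixations")
# Each fixation is an (x, y, slice, duration) tuple across the parallel lists.
for x, y, z, t in zip(s["X"], s["Y"], s["Z"], s["T"]):
    print(f"fixation at ({x}, {y}) on slice {z} for {t:.3f}s")
```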
Before running the scripts below, the CT features must first be extracted. Instructions for this step will be added later.
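Until the official instructions land, here is a purely hypothetical sketch of what this step might look like. The NIfTI input format, the HU windowing, the `.npy` output layout, and the `encoder` argument are all assumptions, not the authors' pipeline:

```python
import os
import numpy as np
import torch
import nibabel as nib  # assumes CT volumes are stored as NIfTI files

@torch.no_grad()
def extract_features(encoder, img_dir, feat_dir, device="cuda"):
    """Encode every CT volume in img_dir and save per-volume features to feat_dir."""
    os.makedirs(feat_dir, exist_ok=True)
    for fname in sorted(os.listdir(img_dir)):
        if not fname.endswith(".nii.gz"):
            continue
        vol = nib.load(os.path.join(img_dir, fname)).get_fdata()
        # Clip to a typical HU window and scale to [0, 1] (assumed preprocessing)
        vol = (np.clip(vol, -1000.0, 1000.0) + 1000.0) / 2000.0
        x = torch.from_numpy(vol).float()[None, None].to(device)  # (1, 1, D, H, W)
        feats = encoder(x)  # e.g. a Swin-style 3D backbone; output shape is model-specific
        out_path = os.path.join(feat_dir, fname.replace(".nii.gz", ".npy"))
        np.save(out_path, feats.squeeze(0).cpu().numpy())
```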
```bash
# Train the gaze predictor using the synthetic data
bash bash/train_semi.sh

# Train the gaze predictor on CT-ScanGaze only
bash bash/train_no_semi.sh
```
```bash
python src/train.py \
    --img_dir /path/to/ct/images \
    --feat_dir /path/to/swin_features \
    --fix_dir /path/to/gaze/data \
    --log_root runs/experiment_name \
    --epoch 40 \
    --batch 2
```
```bash
python src/test.py \
    --resume_dir runs/COCO_Search_baseline \
    --img_dir /path/to/test/ct/images \
    --feat_dir /path/to/test/features \
    --fix_dir /path/to/test/gaze/data
```
We use comprehensive 3D-adapted metrics for scanpath evaluation:
Scanpath-based Metrics:
- ScanMatch (SM): Spatial and temporal similarity with duration consideration
- MultiMatch (MM): Five-dimensional assessment (shape, direction, length, position, duration)
- String Edit Distance (SED): Sequence-based comparison using Levenshtein distance
Spatial-based Metrics:
- Correlation Coefficient (CC): Linear correlation between predicted and ground truth heatmaps
- Normalized Scanpath Saliency (NSS): Normalized saliency at fixation locations
- Kullback-Leibler Divergence (KLDiv): Distribution similarity measure
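As a rough illustration (not the official evaluation code), here are minimal numpy sketches of the spatial metrics and of SED, assuming `pred`/`gt` are 3D heatmaps indexed as `[z, y, x]` and scanpaths have been pre-quantized into strings of region labels:

```python
import numpy as np

def cc(pred, gt):
    # Pearson correlation between the flattened heatmaps
    return np.corrcoef(pred.ravel(), gt.ravel())[0, 1]

def nss(pred, fixations):
    # Mean z-scored saliency at ground-truth fixation voxels, given (x, y, z) tuples
    p = (pred - pred.mean()) / (pred.std() + 1e-8)
    return float(np.mean([p[z, y, x] for (x, y, z) in fixations]))

def kldiv(pred, gt, eps=1e-8):
    # KL divergence between heatmaps normalized to probability distributions
    p = pred.ravel() / (pred.sum() + eps)
    g = gt.ravel() / (gt.sum() + eps)
    return float(np.sum(g * np.log(g / (p + eps) + eps)))

def sed(a, b):
    # Levenshtein distance between two scanpath strings (single-row DP table)
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]
```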
The current codebase works as long as the paths and extracted features are prepared, but substantial refactoring is still needed. Remaining TODOs:
- Release the extracted CT features
- Clean and refactor the codebase
- Release the synthetic dataset
- Improve code comments and structure
If you find our work useful, please cite our paper:
```bibtex
@article{pham2025ct,
  title={CT-ScanGaze: A Dataset and Baselines for 3D Volumetric Scanpath Modeling},
  author={Pham, Trong-Thang and Awasthi, Akash and Khan, Saba and Marti, Esteban Duran and Nguyen, Tien-Phat and Vo, Khoa and Tran, Minh and Nguyen, Ngoc Son and Van, Cuong Tran and Ikebe, Yuki and others},
  journal={arXiv preprint arXiv:2507.12591},
  year={2025}
}
```
This material is based upon work supported by the National Science Foundation (NSF) under Award No. OIA-1946391 and Award No. 2223793 (EFRI BRAID), and by the National Institutes of Health (NIH) under Award No. 1R01CA277739-01.
This project is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). See the LICENSE file for details.
Primary Contact: Trong Thang Pham ([email protected])
For questions, feedback, or collaboration opportunities, feel free to reach out! I would love to hear from you if you have any thoughts or suggestions about this work.
Note: While we don't actively seek contributions to the codebase, we greatly appreciate and welcome feedback, discussions, and suggestions for improvements.
⭐ Star this repository if you find it useful! ⭐