What about an ECG signal foundation model?
Cardiovascular diseases are the leading cause of death worldwide, accounting for an estimated 17.9 million deaths annually, which is about 32% of all global deaths. Electrocardiograms (ECGs) play a crucial role in diagnosing these conditions, with over 300 million ECGs performed each year globally.
Despite the widespread use of ECGs, general-purpose models that can interpret ECG data effectively across diverse populations and conditions are still lacking. Our work presents D-BETA, a new approach that learns jointly from ECG signals and their corresponding textual reports, without requiring exact manual labels. D-BETA captures subtle details in each type of data and also learns how the two connect, which yields a stronger foundation model and more accurate decisions.
Across comprehensive evaluations, D-BETA consistently outperforms strong baselines on 100+ cardiac conditions, offering a scalable, self-supervised path toward accurate, label-efficient heart health AI worldwide.
This repo provides a quick example of running D-BETA in a zero-shot experiment on the CODE-15 test dataset. It is structured as follows:
.
├── configs
│ ├── config.json
├── data
│ ├── pretrain
│ ├── downstream
│ │ ├── code-test
│ │ │ └── data
│ │ ├── annotations
│ │ ├── ecg_tracings.hdf5
├── models
│ ├── modules
│ └── dbeta.py
├── infer.ipynb
└── README.md
First, we need to clone the project and prepare the environment as follows:
git clone https://github.com/manhph2211/D-BETA.git && cd D-BETA
conda create -n dbeta python=3.9
conda activate dbeta
pip install -r requirements.txt
Next, download the CODE-test data from here and put it into the data/downstream/code-test directory.
Then, download the pre-trained model from here and put it into the checkpoints directory.
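Before opening the notebook, it can help to confirm that the files from the steps above are in place. The following is a minimal sketch, not part of the repo: the helper name `missing_paths` is hypothetical, the paths are taken from the layout and download steps described above, and whether the notebook needs all of them is an assumption.

```python
from pathlib import Path

# Paths taken from the layout and download steps in this README.
# Assumption: the notebook expects all of them relative to the repo root.
EXPECTED_PATHS = [
    "configs/config.json",
    "data/downstream/code-test",
    "checkpoints",
]


def missing_paths(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All expected paths are present.")
```

Run it from the repo root; any path it reports as missing points to a download step that still needs to be completed.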
Finally, to run the code, use the example.ipynb notebook.
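For readers new to zero-shot evaluation, the sketch below shows one common way such an experiment is scored with paired signal and text encoders: each ECG embedding is compared with the embedding of a text prompt per class using cosine similarity, and the best-matching class is taken as the prediction. This illustrates the general idea only and is not confirmed to be the procedure used in the notebook. The embeddings are random placeholders rather than D-BETA outputs, and the class names, embedding size, and function names are assumptions chosen for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit length (eps avoids division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def zero_shot_scores(ecg_emb, text_emb):
    """Cosine similarity between ECG embeddings and class prompt embeddings.

    ecg_emb:  (n_records, dim) array
    text_emb: (n_classes, dim) array
    returns:  (n_records, n_classes) array of similarities
    """
    return l2_normalize(ecg_emb) @ l2_normalize(text_emb).T


# Placeholder inputs: in a real run these would come from the ECG and
# text encoders. Class names are illustrative only.
rng = np.random.default_rng(0)
class_names = ["atrial fibrillation", "sinus bradycardia", "sinus tachycardia"]
text_emb = rng.normal(size=(len(class_names), 16))

# Fake ECG embeddings: slightly noisy copies of the class embeddings,
# so the intended class of each record is known in advance.
true_labels = np.array([0, 2, 1, 0])
ecg_emb = text_emb[true_labels] + 0.01 * rng.normal(size=(len(true_labels), 16))

scores = zero_shot_scores(ecg_emb, text_emb)
pred = scores.argmax(axis=1)  # index of the most similar class per record
print([class_names[i] for i in pred])
```

Because no classifier is trained on labeled ECGs, adding a new condition in this setup only requires adding another text prompt.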
This research was supported by the Google South Asia & Southeast Asia research award.
We are also grateful for the valuable work provided by this repo and this repo.
If you find this work useful 😄, please consider citing our paper:
@misc{pham2025dbeta,
title={Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners},
author={Hung Manh Pham and Aaqib Saeed and Dong Ma},
year={2025},
url={https://arxiv.org/abs/2410.02131},
}