
Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners (ICML 2025)


Figure: Illustration of our contrastive masked ECG-language modeling technique.

🚀 Introduction

What about an ECG signal foundation model?

Cardiovascular diseases are the leading cause of death worldwide, accounting for an estimated 17.9 million deaths annually, which is about 32% of all global deaths. Electrocardiograms (ECGs) play a crucial role in diagnosing these conditions, with over 300 million ECGs performed each year globally.

Despite the widespread use of ECGs, there is a lack of general-purpose models that can effectively interpret ECG data across diverse populations and conditions. Our work presents D-BETA, a new approach that learns from ECG signals and their paired textual reports simultaneously, without requiring exact manual labels. D-BETA not only captures subtle details in each modality but also learns how they connect, yielding a stronger foundation model that makes more accurate decisions.

Across comprehensive evaluations, D-BETA consistently outperforms strong baselines on 100+ cardiac conditions, offering a scalable, self-supervised path toward accurate, label-efficient heart-health AI worldwide.
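At its core, contrastive masked ECG-language modeling aligns paired ECG and report embeddings while pushing apart mismatched pairs. The sketch below is a minimal NumPy illustration of a symmetric InfoNCE-style contrastive objective; the function name, embedding shapes, and temperature value are illustrative assumptions, not D-BETA's actual implementation (see `models/dbeta.py` for that).

```python
import numpy as np

def info_nce(ecg_emb, txt_emb, temperature=0.07):
    """Symmetric contrastive (InfoNCE) loss between paired embeddings.

    ecg_emb, txt_emb: (N, D) arrays; row i of each is a matched ECG-report pair.
    """
    # L2-normalize so dot products are cosine similarities
    ecg = ecg_emb / np.linalg.norm(ecg_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = ecg @ txt.T / temperature           # (N, N) similarity matrix
    idx = np.arange(len(logits))                 # diagonal entries are positives

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)     # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    # cross-entropy in both directions: ECG -> text and text -> ECG
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
e = rng.normal(size=(8, 16))
t = e + 0.1 * rng.normal(size=(8, 16))           # roughly aligned pairs
loss = info_nce(e, t)
```

Correctly paired batches should yield a lower loss than shuffled ones, which is what drives the encoders to align the two modalities.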

This repo provides a quick example of running D-BETA with a zero-shot experiment on the CODE-15 test dataset. It is structured as follows:

.
├── configs
│   └── config.json
├── data
│   ├── pretrain
│   └── downstream
│       └── code-test
│           └── data
│               ├── annotations
│               └── ecg_tracings.hdf5
├── models
│   ├── modules
│   └── dbeta.py
├── infer.ipynb
└── README.md

📖 Usage

First, we need to clone the project and prepare the environment as follows:

git clone https://github.com/manhph2211/D-BETA.git && cd D-BETA
conda create -n dbeta python=3.9
conda activate dbeta
pip install -r requirements.txt

Next, please download the CODE-test data from here and put it into the data/downstream/code-test directory.
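Once downloaded, you can sanity-check the tracings file with `h5py`. The snippet below writes and reads a tiny stand-in file; the dataset key `"tracings"` and the `(num_records, 4096, 12)` shape are assumptions based on the public CODE-test release, so inspect the real file with `list(f.keys())` to confirm.

```python
import os
import tempfile

import h5py
import numpy as np

# Write a tiny stand-in file mimicking the CODE-test layout (key and shape
# are assumptions; verify against the actual download).
path = os.path.join(tempfile.gettempdir(), "demo_tracings.hdf5")
with h5py.File(path, "w") as f:
    f.create_dataset("tracings", data=np.zeros((2, 4096, 12), dtype=np.float32))

# Read it back the same way you would read
# data/downstream/code-test/data/ecg_tracings.hdf5
with h5py.File(path, "r") as f:
    tracings = f["tracings"][:]   # (num_records, samples, leads)
```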

Then, we need to download the pre-trained model from here and put it into the checkpoints directory.

Finally, to run the code, we can just use the infer.ipynb notebook.
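Conceptually, zero-shot classification with an ECG-text model works by scoring an ECG embedding against embeddings of candidate diagnosis prompts. The sketch below uses random stand-in encoders to show the scoring logic only; `encode_ecg`, `encode_text`, and the prompt template are hypothetical, and the real encoders live in `models/dbeta.py`.

```python
import numpy as np

# Hypothetical stand-ins for the trained encoders (random projections here).
rng = np.random.default_rng(1)

def encode_ecg(signal):
    """(4096, 12) tracing -> (D,) embedding. Placeholder only."""
    return rng.normal(size=32)

def encode_text(prompt):
    """Diagnosis prompt -> (D,) embedding. Placeholder only."""
    return rng.normal(size=32)

labels = ["1st degree AV block", "right bundle branch block", "atrial fibrillation"]
z = encode_ecg(np.zeros((4096, 12)))
T = np.stack([encode_text(f"ECG showing {l}") for l in labels])

# Cosine similarity between the ECG and each candidate label prompt,
# then a softmax over labels to get zero-shot "probabilities".
sims = (T @ z) / (np.linalg.norm(T, axis=1) * np.linalg.norm(z))
probs = np.exp(sims) / np.exp(sims).sum()
pred = labels[int(np.argmax(probs))]
```

With trained encoders, the highest-scoring prompt is taken as the zero-shot prediction, so no labeled ECG training data is needed for new conditions.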

📝 Acknowledgments

This research was supported by the Google South Asia & Southeast Asia research award.

We are also thankful for the valuable work provided by related open-source repositories.

📄 Citation

If you find this work useful 😄, please consider citing our paper:

@misc{pham2025dbeta,
      title={Boosting Masked ECG-Text Auto-Encoders as Discriminative Learners}, 
      author={Hung Manh Pham and Aaqib Saeed and Dong Ma},
      year={2025},
      url={https://arxiv.org/abs/2410.02131}, 
}
