TimeMaster is a reinforcement-learning-enhanced framework for training time-series multimodal large language models (MLLMs). It enables structured, interpretable reasoning over visualized time-series signals and has been evaluated on real-world tasks such as EMG, ECG, and Human Activity Recognition (HAR) using Qwen2.5-VL-3B-Instruct.
- [2025.06.21] SFT model released. See link.
- [2025.06.21] Code released.
- [2025.06.16] Our paper on TimeMaster released. See link.
TimeMaster performs structured reasoning on time-series images using reinforcement learning with composite rewards. The framework integrates format, hard (accuracy), and soft rewards to improve classification, interpretability, and clinical insight generation.
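For intuition, the composite reward can be viewed as a weighted sum of its components. The sketch below is a minimal Python illustration, not the repo's implementation: the <think>/<answer> output tags, the weights, and the helper logic are all assumptions (the actual reward code lives in ./verl/utils/reward_score/, described later).

```python
import re

def format_reward(response: str) -> float:
    """1.0 if the response follows an assumed <think>...</think><answer>...</answer>
    template; the repo's actual required format may differ."""
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return 1.0 if re.fullmatch(pattern, response.strip(), re.DOTALL) else 0.0

def hard_reward(response: str, label: str) -> float:
    """1.0 if the extracted answer matches the ground-truth class label."""
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    return 1.0 if match and match.group(1).strip().lower() == label.lower() else 0.0

def composite_reward(response: str, label: str, soft_score: float) -> float:
    """Weighted combination of format, hard, and soft rewards (illustrative weights)."""
    return format_reward(response) + hard_reward(response, label) + 0.5 * soft_score
```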
conda create -n timemaster python=3.11 -y
conda activate timemaster
pip3 install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn==2.7.4.post1 --no-build-isolation
pip3 install -e .
pip3 install vllm==0.8.2
pip3 install -r requirements_timemaster.txt
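Before moving on, a quick import check (a sketch, not part of the repo's scripts) can confirm that the pinned dependencies installed correctly and the GPU is visible:

```python
# Sanity-check the freshly created environment.
import torch
import flash_attn  # noqa: F401  (verifies the flash-attn 2.7.4 build loads)
import vllm        # noqa: F401  (verifies the vLLM 0.8.2 install)

print(torch.__version__)          # expected: 2.6.0+cu124
print(torch.cuda.is_available())  # should print True on a CUDA 12.4 machine
```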
Currently, we provide the CTU dataset. Additional datasets will be released soon.
To preprocess the dataset, simply run the following script:
bash example/data_preprocess/ctu.sh
After successful execution, the following preprocessed data will be generated:
data/ctu_image/
├── images/
├── test/
├── train/
├── dataset_dict.json
├── test.parquet
└── train.parquet
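To confirm the preprocessing succeeded, you can peek at the generated splits with pandas (a sketch; the exact column schema is whatever the preprocessing script emits):

```python
import pandas as pd

# Inspect the preprocessed training split.
df = pd.read_parquet("data/ctu_image/train.parquet")
print(df.columns.tolist())   # schema produced by the preprocessing script
print(len(df), "training examples")
print(df.head(2))
```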
Download the SFT model from our Hugging Face repository using the command below:
huggingface-cli download langfeng01/TimeMaster-SFT-Qwen2.5-VL-3B-CTU --local-dir ./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/
This will download all model files into the ./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/ directory.
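Equivalently, the checkpoint can be fetched from Python with the huggingface_hub API (a minimal sketch mirroring the CLI command above):

```python
from huggingface_hub import snapshot_download

# Same effect as the huggingface-cli command above.
snapshot_download(
    repo_id="langfeng01/TimeMaster-SFT-Qwen2.5-VL-3B-CTU",
    local_dir="./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/",
)
```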
We offer two types of training:
- TimeMaster (SFT + RL): RL training initialized from a supervised fine-tuned (SFT) checkpoint. To use this, set MODEL_PATH=./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU in the script ./example/grpo_trainer/run_ctu.sh.
- TimeMaster (RL): RL training from scratch using the base model. To use this, set MODEL_PATH=Qwen/Qwen2.5-VL-3B-Instruct in the script ./example/grpo_trainer/run_ctu.sh.
After setting the appropriate MODEL_PATH, start the RL training by running:
bash example/grpo_trainer/run_ctu.sh
After training, the model checkpoint will be saved in: ./checkpoints/
To start evaluation, set EVAL=True in the script ./example/grpo_trainer/run_ctu.sh, then run the following command:
bash example/grpo_trainer/run_ctu.sh
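Beyond the scripted evaluation, a trained checkpoint can also be queried directly with Hugging Face Transformers. The snippet below is a sketch: the checkpoint path, image file, and prompt are placeholders rather than repo defaults.

```python
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

# Placeholder paths: point these at your trained checkpoint and a rendered
# time-series image produced by the preprocessing step.
ckpt = "./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(ckpt)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "data/ctu_image/images/example.png"},  # hypothetical file
        {"type": "text", "text": "Classify this signal and explain your reasoning."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```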
TimeMaster supports additional datasets beyond CTU, including EMG, ECG, HAR, RCW, and TEE.
To process these datasets, follow the same data preparation pipeline demonstrated in example/data_preprocess/ctu.sh.
The core reward functions are located in ./verl/utils/reward_score/:
- ctu.py: Implements format and accuracy rewards for the CTU dataset.
- emg_soft.py: Demonstrates a composite reward setup with three components: format, accuracy, and extension (the latter using the OpenAI API for soft evaluation, as sketched below).
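As a rough illustration of that extension component, the sketch below shows one way an OpenAI model can serve as a judge of reasoning quality. The model name, prompt, and 0-1 scoring scheme are assumptions; emg_soft.py is the authoritative implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def soft_reward(reasoning: str) -> float:
    """Grade reasoning quality on a 0-1 scale via an LLM judge
    (illustrative prompt and model; see emg_soft.py for the real logic)."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Rate the clinical soundness of this time-series "
                       "reasoning from 0 to 1. Reply with only the number.\n\n"
                       + reasoning,
        }],
    )
    try:
        return max(0.0, min(1.0, float(resp.choices[0].message.content.strip())))
    except (TypeError, ValueError):
        return 0.0
```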
If TimeMaster helps your research, we would appreciate it if you could cite our work:
@article{zhang2025timemaster,
  title={TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning},
  author={Zhang, Junru and Feng, Lang and Guo, Xu and Wu, Yuhan and Dong, Yabo and Xu, Duanqing},
  journal={arXiv preprint arXiv:2506.13705},
  year={2025}
}
We thank the veRL project for its foundational RL infrastructure and the Qwen2-VL-Finetune project for its support in SFT.