TimeMaster is a reinforcement-learning-enhanced framework for training time-series multimodal large language models (MLLMs). It enables structured, interpretable reasoning over visualized time-series signals and has been evaluated on real-world tasks such as EMG, ECG, and Human Activity Recognition (HAR) using Qwen2.5-VL-3B-Instruct.
- [2025.06.21] SFT model released. See link.
- [2025.06.21] Code released.
- [2025.06.16] Our paper on TimeMaster released. See link.
TimeMaster performs structured reasoning on time-series images using reinforcement learning with composite rewards. The framework integrates format, hard (accuracy), and soft rewards to improve classification, interpretability, and clinical insight generation.
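For intuition, the composite reward can be viewed as a weighted sum of its components. The sketch below is a minimal Python illustration, not the repo's implementation: the <think>/<answer> output tags, the weights, and the helper logic are all assumptions (the actual reward code lives in ./verl/utils/reward_score/, described later).

```python
import re

def format_reward(response: str) -> float:
    """1.0 if the response follows an assumed <think>...</think><answer>...</answer>
    template; the repo's actual required format may differ."""
    pattern = r"<think>.*?</think>\s*<answer>.*?</answer>"
    return 1.0 if re.fullmatch(pattern, response.strip(), re.DOTALL) else 0.0

def hard_reward(response: str, label: str) -> float:
    """1.0 if the extracted answer matches the ground-truth class label."""
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    return 1.0 if match and match.group(1).strip().lower() == label.lower() else 0.0

def composite_reward(response: str, label: str, soft_score: float) -> float:
    """Weighted combination of format, hard, and soft rewards (illustrative weights)."""
    return format_reward(response) + hard_reward(response, label) + 0.5 * soft_score
```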
conda create -n timemaster python=3.11 -y
conda activate timemaster
pip3 install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu124
pip3 install flash-attn==2.7.4.post1 --no-build-isolation
pip3 install -e .
pip3 install vllm==0.8.2
pip3 install -r requirements_timemaster.txt
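Before moving on, a quick import check (a sketch, not part of the repo's scripts) can confirm that the pinned dependencies installed correctly and the GPU is visible:

```python
# Sanity-check the freshly created environment.
import torch
import flash_attn  # noqa: F401  (verifies the flash-attn 2.7.4 build loads)
import vllm        # noqa: F401  (verifies the vLLM 0.8.2 install)

print(torch.__version__)          # expected: 2.6.0+cu124
print(torch.cuda.is_available())  # should print True on a CUDA 12.4 machine
```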
Currently, we provide the CTU dataset. Additional datasets will be released soon.
To preprocess the dataset, simply run the following script:
bash example/data_preprocess/ctu.sh
After successful execution, the following preprocessed data will be generated:
data/ctu_image/
├── images/
├── test/
├── train/
├── dataset_dict.json
├── test.parquet
└── train.parquet
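To confirm the preprocessing succeeded, you can peek at the generated splits with pandas (a sketch; the exact column schema is whatever the preprocessing script emits):

```python
import pandas as pd

# Inspect the preprocessed training split.
df = pd.read_parquet("data/ctu_image/train.parquet")
print(df.columns.tolist())   # schema produced by the preprocessing script
print(len(df), "training examples")
print(df.head(2))
```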
Download the SFT model from our Hugging Face repository using the command below:
huggingface-cli download langfeng01/TimeMaster-SFT-Qwen2.5-VL-3B-CTU --local-dir ./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/
This will download all model files into the ./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/ directory.
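Equivalently, the checkpoint can be fetched from Python with the huggingface_hub API (a minimal sketch mirroring the CLI command above):

```python
from huggingface_hub import snapshot_download

# Same effect as the huggingface-cli command above.
snapshot_download(
    repo_id="langfeng01/TimeMaster-SFT-Qwen2.5-VL-3B-CTU",
    local_dir="./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU/",
)
```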
We offer two types of training:
- TimeMaster (SFT + RL): RL training initialized from a supervised fine-tuned (SFT) checkpoint. To use this, set MODEL_PATH=./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU in the script ./example/grpo_trainer/run_ctu.sh.
- TimeMaster (RL): RL training from scratch using the base model. To use this, set MODEL_PATH=Qwen/Qwen2.5-VL-3B-Instruct in the script ./example/grpo_trainer/run_ctu.sh.
After setting the appropriate MODEL_PATH, start the RL training by running:
bash example/grpo_trainer/run_ctu.sh
After training, the model checkpoint will be saved in: ./checkpoints/
To start evaluation, set EVAL=True in the script ./example/grpo_trainer/run_ctu.sh, then run the following command:
bash example/grpo_trainer/run_ctu.sh
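Beyond the scripted evaluation, a trained checkpoint can also be queried directly with Hugging Face Transformers. The snippet below is a sketch: the checkpoint path, image file, and prompt are placeholders rather than repo defaults.

```python
import torch
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

# Placeholder paths: point these at your trained checkpoint and a rendered
# time-series image produced by the preprocessing step.
ckpt = "./checkpoints/TimeMaster-SFT-Qwen2.5-VL-3B-CTU"
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    ckpt, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(ckpt)

messages = [{
    "role": "user",
    "content": [
        {"type": "image", "image": "data/ctu_image/images/example.png"},  # hypothetical file
        {"type": "text", "text": "Classify this signal and explain your reasoning."},
    ],
}]
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)
out = model.generate(**inputs, max_new_tokens=512)
print(processor.batch_decode(
    out[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```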
TimeMaster supports additional datasets beyond CTU, including EMG, ECG, HAR, RCW, and TEE.
To process these datasets, follow the same data preparation pipeline demonstrated in example/data_preprocess/ctu.sh.
The core reward functions are located in ./verl/utils/reward_score/:
- ctu.py: Implements format and accuracy rewards for the CTU dataset.
- emg_soft.py: Demonstrates a composite reward setup with three components: format, accuracy, and extension (the latter using the OpenAI API for soft evaluation, as sketched below).
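As a rough illustration of that extension component, the sketch below shows one way an OpenAI model can serve as a judge of reasoning quality. The model name, prompt, and 0-1 scoring scheme are assumptions; emg_soft.py is the authoritative implementation.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def soft_reward(reasoning: str) -> float:
    """Grade reasoning quality on a 0-1 scale via an LLM judge
    (illustrative prompt and model; see emg_soft.py for the real logic)."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Rate the clinical soundness of this time-series "
                       "reasoning from 0 to 1. Reply with only the number.\n\n"
                       + reasoning,
        }],
    )
    try:
        return max(0.0, min(1.0, float(resp.choices[0].message.content.strip())))
    except (TypeError, ValueError):
        return 0.0
```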
If TimeMaster helps your research, we would appreciate it if you could cite our work:
@article{zhang2025timemaster,
  title={TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning},
  author={Zhang, Junru and Feng, Lang and Guo, Xu and Wu, Yuhan and Dong, Yabo and Xu, Duanqing},
  journal={arXiv preprint arXiv:2506.13705},
  year={2025}
}
We thank the veRL project for its foundational RL infrastructure and the Qwen2-VL-Finetune project for its support in SFT.