This is the official repository for our paper: Dynamic Early Exit in Reasoning Models.
DEER monitors model behavior at potential reasoning transition points and dynamically terminates generation of the next reasoning chain when the model exhibits high confidence in a trial answer. It is consistently effective across 11 cutting-edge reasoning LLMs of varying series and sizes, reducing CoT sequence length by 19.1% - 80.1% on average (depending on the model) while improving accuracy by 0.3% - 5.0%.
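As a rough illustration of this decision rule, here is a minimal, self-contained sketch (not the repository implementation): the stubbed `steps` iterable stands in for real transition-point detection and trial-answer scoring, and confidence is taken as the mean token probability of the trial answer.

```python
from typing import Iterable, List, Tuple

def answer_confidence(token_probs: List[float]) -> float:
    """Mean token probability of the trial answer, used as the confidence score."""
    return sum(token_probs) / len(token_probs)

def deer_early_exit(steps: Iterable[Tuple[str, List[float]]],
                    threshold: float = 0.95) -> str:
    """Accumulate reasoning chunks; stop as soon as a trial answer is confident enough."""
    chain: List[str] = []
    for chunk, trial_probs in steps:
        chain.append(chunk)
        if answer_confidence(trial_probs) >= threshold:
            break  # early exit: the remaining reasoning chains are skipped
    return "".join(chain)

# Toy run: confidence crosses the 0.95 threshold at the second transition point,
# so the third reasoning chain is never generated.
steps = [
    ("Let me try a substitution... Wait, ", [0.61, 0.72]),
    ("Actually, factoring works directly. ", [0.97, 0.99]),
    ("Alternatively, I could integrate by parts...", [0.99]),
]
print(deer_early_exit(steps))
```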
- [2025/05/20] Released DEER code for mathematical reasoning tasks (HuggingFace & vLLM).
- [Coming Soon] DEER for code generation tasks & Branch-Parallel Decoding Acceleration.
Results on 11 reasoning models with a 16k token budget. "Acc" denotes accuracy, "Tok" denotes token count, and "CR" denotes compression rate.

*Figure: experimental results presented as bar charts.*
```bash
git clone https://github.com/yourusername/DEER.git
cd DEER
pip install -r requirements.txt
```

Considering efficiency, we recommend reproducing the results using the code based on the vLLM framework.
```bash
CUDA_VISIBLE_DEVICES=1 python ../vllm-deer.py \
    --model_name_or_path "./DeepSeek-R1-Distill-Qwen-14B" \
    --dataset_dir "./data/" \
    --output_path "./outputs" \
    --dataset "math" \
    --threshold 0.95 \
    --max_generated_tokens 16000 \
    --think_ratio 0.6 \
    --batch_size 2000 \
    --policy avg1 \
    --dtype bfloat16 \
    --gpu-memory-utilization 0.9
```
or run:

```bash
bash ./bashes/bash-vllm-deer.sh
```

For Qwen3-series models, run:

```bash
CUDA_VISIBLE_DEVICES=1 python ../vllm-deer-qwen3.py \
    --model_name_or_path "./Qwen3-4B" \
    --dataset_dir "./data/" \
    --output_path "./outputs" \
    --dataset "math" \
    --threshold 0.95 \
    --max_generated_tokens 16000 \
    --think_ratio 0.8 \
    --batch_size 2000 \
    --dtype bfloat16 \
    --policy avg2 \
    --gpu-memory-utilization 0.9
```
or run:

```bash
bash ./bashes/bash-vllm-deer-qwen3.sh
```

In our experiments, we found that Qwen3-series models tend to be over-confident in their confidence predictions, so we made two modifications to the implementation for these models (both are sketched after the list below):
- The calculation of answer confidence was changed from the arithmetic mean to the geometric mean.
- An additional condition must be satisfied for early exit: the model must generate `</think>` after the trial answer.
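Below is a minimal, self-contained sketch of these two changes. It assumes that `--policy avg1` maps to the arithmetic mean and `avg2` to the geometric mean (consistent with the commands above, but verify against `vllm-deer-qwen3.py`); the function names are illustrative, not the repository's API.

```python
import math
from typing import List

def confidence(token_probs: List[float], policy: str = "avg1") -> float:
    """Aggregate the trial answer's per-token probabilities into one score."""
    if policy == "avg1":  # arithmetic mean (default models)
        return sum(token_probs) / len(token_probs)
    # "avg2": geometric mean, computed in log space for numerical stability.
    # A single low-probability token drags it down much harder, which counters
    # Qwen3's tendency toward over-confident trial answers.
    return math.exp(sum(math.log(p) for p in token_probs) / len(token_probs))

def qwen3_should_exit(token_probs: List[float], text_after_answer: str,
                      threshold: float = 0.95) -> bool:
    """Both conditions must hold: a confident answer AND a generated </think>."""
    return (confidence(token_probs, "avg2") >= threshold
            and "</think>" in text_after_answer)

print(confidence([0.99, 0.99, 0.30], "avg1"))  # 0.76  (arithmetic mean is forgiving)
print(confidence([0.99, 0.99, 0.30], "avg2"))  # ~0.67 (geometric mean is stricter)
```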
For inference using HuggingFace Transformers (without vLLM), run:

```bash
bash ./bashes/bash-vanilla-deer.sh
```

DEER currently supports evaluation on 7 reasoning benchmarks. The rule-based evaluation for these benchmarks is based on the code implementation from the LIMO project.
```bash
python ../check.py \
    --model_name_or_path "./DeepSeek-R1-Distill-Qwen-14B" \
    --data_name "math" \
    --generation_path "your_output.jsonl"
```
or run:

```bash
bash ./bashes/bash-check-correct.sh
```
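For intuition, here is a minimal sketch of what rule-based answer checking typically looks like for math benchmarks. The actual rules come from LIMO's evaluation code; the `\boxed{...}` extraction and string normalization below are illustrative assumptions, not that implementation.

```python
import re
from typing import Optional

def extract_boxed(text: str) -> Optional[str]:
    """Pull the last \\boxed{...} span out of a model generation."""
    matches = re.findall(r"\\boxed\{([^{}]*)\}", text)
    return matches[-1].strip() if matches else None

def is_correct(generation: str, gold: str) -> bool:
    """Compare the extracted prediction to the gold answer after light normalization."""
    pred = extract_boxed(generation)
    return pred is not None and pred.replace(" ", "") == gold.replace(" ", "")

print(is_correct(r"... so the answer is \boxed{42}.", "42"))  # True
```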
If you use DEER in your research, please cite our paper:

```bibtex
@article{yang2025dynamic,
  title={Dynamic Early Exit in Reasoning Models},
  author={Yang, Chenxu and Si, Qingyi and Duan, Yongjie and Zhu, Zheliang and Zhu, Chenyu and Lin, Zheng and Cao, Li and Wang, Weiping},
  journal={arXiv preprint arXiv:2504.15895},
  year={2025}
}
```

Join our WeChat group for discussions.