
Long-Chain Reasoning Distillation via Adaptive Prefix Alignment

This repository contains the source code for the paper: Long-Chain Reasoning Distillation via Adaptive Prefix Alignment.


• 🎯 Overview • ⚙️ Set Up • 🔧 P-ALIGN Pipeline • 📨 Contact

🎯Overview

P-ALIGN (Prefix ALIGN) is a distillation framework that leverages adaptive prefix alignment to improve student model reasoning. It truncates teacher Chains-of-Thought (CoTs) and selects concise, informative prefixes as supervision, effectively bridging the gap between teacher trajectories and student capacity. P-ALIGN enables student models to learn from high-quality CoTs without being hindered by redundancy or uncertainty.

⚙️Set Up

1. Python Environment.

Create a conda environment, then use git clone to download this project and install its dependencies:

conda create -n P-ALIGN python=3.10
conda activate P-ALIGN
git clone https://github.com/NEUIR/P-ALIGN.git
cd P-ALIGN
pip install -r requirements.txt --force-reinstall --no-deps --no-cache-dir

2. Install LLaMA-Factory.

Refer to https://github.com/hiyouga/LLaMA-Factory for detailed instructions.

conda create -n llama_factory python=3.10
conda activate llama_factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

📊 Data Preparation

1. Raw Training and Evaluation Data

Due to licensing restrictions, we do not redistribute the original training and evaluation datasets.
Please refer to the data/ directory for instructions on how to download and organize the raw data.

data/
├── raw/      # Instructions for obtaining original datasets
└── results/  # Model inference and evaluation results

2. Processed Data Release

To facilitate quick reproduction of our experiments, we release the processed data used in our method on P-ALIGN.

🔧P-ALIGN Pipeline

1. Adaptive Prefix Truncation via Binary Search

To extract the most concise yet sufficient reasoning prefix from long chains of thought, we adopt a binary search–based truncation strategy. Specifically, the student model performs self-evaluation over different prefix lengths to determine whether the current prefix is sufficient to solve the problem, allowing us to efficiently identify the minimal effective reasoning prefix.

bash scripts/Prefix_truncation.sh 
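The truncation step can be sketched as a standard binary search over prefix lengths. In this sketch, `is_sufficient` is a hypothetical stand-in for the student model's self-evaluation (in the actual pipeline, the judgment is made by the student model inside `scripts/Prefix_truncation.sh`), and sufficiency is assumed to be monotone in prefix length:

```python
def minimal_sufficient_prefix(cot_steps, is_sufficient):
    """Binary-search the shortest prefix of teacher reasoning steps
    that the student judges sufficient to solve the problem.

    cot_steps:     list of reasoning steps from the teacher CoT.
    is_sufficient: callable(prefix) -> bool, assumed monotone in
                   prefix length (stand-in for student self-evaluation).
    Returns the shortest sufficient prefix, or the full CoT if none is.
    """
    lo, hi = 1, len(cot_steps)
    best = len(cot_steps)
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_sufficient(cot_steps[:mid]):
            best = mid      # mid is sufficient; try something shorter
            hi = mid - 1
        else:
            lo = mid + 1    # mid is too short; search longer prefixes
    return cot_steps[:best]
```

Under the monotonicity assumption, this needs only O(log n) student evaluations per example instead of n, which is what makes truncation over long CoTs affordable.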

2. Prefix-based Alignment

To better align long chains of thought with the reasoning capacity of the student model, we further apply a prefix alignment strategy. This step ensures that the retained prefix not only remains sufficient, but is also well-matched to the student model’s inference ability.

bash scripts/Prefix_alignment.sh 
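As a rough illustration (not the exact format produced by the script), the aligned prefix can replace the full teacher CoT as the supervision target in an alpaca-style SFT record of the kind LLaMA-Factory consumes; the field layout and helper name here are assumptions:

```python
def build_sft_example(question, prefix_steps, answer):
    """Hypothetical sketch: the selected reasoning prefix, rather than
    the full teacher CoT, becomes the supervision target."""
    return {
        "instruction": question,
        "input": "",
        # Supervision = aligned prefix followed by the final answer.
        "output": "\n".join(prefix_steps) + "\nAnswer: " + answer,
    }
```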

3. Inference

During inference, we leverage vLLM to enable efficient and accelerated decoding, significantly improving inference speed while maintaining generation quality.

bash scripts/Inference.sh 

4. Evaluation

To validate the effectiveness of our approach, we evaluate model performance by matching generated outputs with ground-truth answers. We report pass@1 and pass@3 as the main evaluation metrics to measure overall reasoning performance.

bash scripts/Evaluation.sh 
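Pass@k is typically computed with the unbiased estimator of Chen et al. (2021): pass@k = 1 − C(n−c, k)/C(n, k), where n samples are drawn per problem and c of them are correct. A sketch (the repository's evaluation script may differ in details):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations with c correct,
    is correct."""
    if n - c < k:  # too few incorrect samples: every size-k draw hits
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

Averaging this quantity over all evaluation problems gives the reported pass@1 and pass@3 scores.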

📨Contact

If you have questions, suggestions, or bug reports, please email:
