This repository contains the source code for the paper: Long-Chain Reasoning Distillation via Adaptive Prefix Alignment.
• 🎯 Overview • ⚙️ Set Up • 🔧 P-ALIGN Pipeline • 📨 Contact
P-ALIGN is a distillation framework that leverages adaptive prefix alignment to improve student model reasoning. It truncates teacher Chains-of-Thought (CoTs) and selects concise, informative prefixes as supervision, effectively bridging the gap between teacher trajectories and student capacity. P-ALIGN enables student models to learn from high-quality CoTs without being hindered by redundancy or uncertainty.
Use git clone to download this project.
conda create -n P-ALIGN python=3.10
conda activate P-ALIGN
git clone https://github.com/NEUIR/P-ALIGN.git
cd P-ALIGN
pip install -r requirements.txt --force-reinstall --no-deps --no-cache-dir
Refer to https://github.com/hiyouga/LLaMA-Factory for detailed instructions.
conda create -n llama_factory python=3.10
conda activate llama_factory
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"
Due to licensing restrictions, we do not redistribute the original training and evaluation datasets.
Please refer to the data/ directory for instructions on how to download and organize the raw data.
data/
├── raw/ # Instructions for obtaining original datasets
└── results/ # Model inference and evaluation results
To facilitate quick reproduction of our experiments, we release the processed data used in our method on P-ALIGN.
To extract the most concise yet sufficient reasoning prefix from long chains of thought, we adopt a binary search–based truncation strategy. Specifically, the student model performs self-evaluation over different prefix lengths to determine whether the current prefix is sufficient to solve the problem, allowing us to efficiently identify the minimal effective reasoning prefix.
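The core idea can be summarized in a few lines. The Python sketch below is illustrative only: `student_can_solve` is a hypothetical placeholder for the student model's self-evaluation pass, not the repository's API, and the binary search assumes sufficiency is monotone in prefix length (if a prefix suffices, every longer prefix does too).

```python
# Illustrative binary-search truncation. `student_can_solve` is a placeholder
# for the student model's self-evaluation over a candidate prefix.

def find_minimal_prefix(cot_tokens, student_can_solve):
    """Return the shortest sufficient prefix of `cot_tokens`,
    or None if even the full chain of thought is insufficient."""
    if not student_can_solve(cot_tokens):
        return None
    lo, hi = 1, len(cot_tokens)
    while lo < hi:
        mid = (lo + hi) // 2
        if student_can_solve(cot_tokens[:mid]):
            hi = mid          # sufficient: try an even shorter prefix
        else:
            lo = mid + 1      # insufficient: a longer prefix is needed
    return cot_tokens[:lo]
```

This finds the minimal effective prefix in O(log n) self-evaluations instead of checking every prefix length.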
bash scripts/Prefix_truncation.sh
To better align long chains of thought with the reasoning capacity of the student model, we further apply a prefix alignment strategy. This step ensures that the retained prefix not only remains sufficient, but is also well matched to the student model's inference ability.
bash scripts/Prefix_alignment.sh
During inference, we leverage vLLM for efficient batched decoding, which significantly improves inference speed while maintaining generation quality.
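For reference, decoding with vLLM looks roughly like the sketch below; the model path and sampling settings here are placeholders, and scripts/Inference.sh contains the configuration used in our experiments.

```python
from vllm import LLM, SamplingParams

# Placeholder model path and sampling settings; see scripts/Inference.sh
# for the actual experimental configuration.
llm = LLM(model="path/to/distilled-student-model")
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=4096)

prompts = ["Solve step by step: if 3x + 5 = 20, what is x?"]
for output in llm.generate(prompts, params):
    print(output.outputs[0].text)
```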
bash scripts/Inference.sh
To validate the effectiveness of our approach, we evaluate model performance by matching generated outputs against ground-truth answers, reporting pass@1 and pass@3 as the main metrics of overall reasoning performance.
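As a reference for the metric, here is a minimal sketch of pass@k computed from per-sample correctness flags; the data layout is illustrative, not the evaluation script's actual format.

```python
def pass_at_k(results, k):
    """`results` maps each problem id to a list of booleans, one per sampled
    answer, marking whether it matches the ground truth. pass@k is the
    fraction of problems with at least one correct answer among the first
    k samples."""
    solved = sum(any(flags[:k]) for flags in results.values())
    return solved / len(results)

# Example: two problems, three sampled answers each.
results = {"q1": [False, True, True], "q2": [False, False, False]}
print(pass_at_k(results, 1))  # 0.0
print(pass_at_k(results, 3))  # 0.5
```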
bash scripts/Evaluation.sh
If you have questions, suggestions, or bug reports, please email:
