I'm running post-training on a pruned model. After post-training, the performance is degraded, e.g., MMLU drops to 24%. Is this expected?
MODEL=meta-llama/Llama-2-7b-hf
prune_ckpt_path='llama_prune'
tune_ckpt_path='model'
RATIO=0.10
# Pruning step
echo "[START] - Start Pruning with RATIO=$RATIO"
python hf_prune.py --base_model=$MODEL --pruning_ratio $RATIO --device cpu --eval_device cuda \
--block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
--block_attention_layer_start 4 --block_attention_layer_end 30 \
--save_ckpt_log_name $prune_ckpt_path --pruner_type taylor \
--taylor param_first --save_model
echo "[FINISH] - Finish Pruning Model"
# Tuning step
echo "[START] - Start Tuning with RATIO=$RATIO"
python post_training.py --prune_model $prune_ckpt_path/pytorch_model.bin --data_path yahma/alpaca-cleaned \
--output_dir $tune_ckpt_path --wandb_project llama_tune --lora_r 8 --num_epochs 2 \
--learning_rate 1e-4 --batch_size 64
echo "[FINISH] - Finish Tuning for RATIO=$RATIO"