Official implementation of LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models.
This repo contains the code for LoNAS, which is a pioneering method that leverages Neural Architecture Search (NAS) to explore a space of elastic low-rank adapters, effectively compressing large language models while maintaining or even enhancing performance, thus facilitating their use in resource-constrained environments. Please refer to our paper for more details.
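For intuition, here is a minimal PyTorch sketch (our illustration only, not the repository's implementation, which builds its elastic modules through NNCF/BootstrapNAS) of what an elastic low-rank adapter looks like: a LoRA branch trained at a maximum rank whose active rank can be shrunk by slicing the adapter matrices, yielding smaller candidate subnetworks.

```python
import torch
import torch.nn as nn

class ElasticLoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank adapter with an adjustable rank.

    Conceptual sketch only: LoNAS realizes elasticity via NNCF/BootstrapNAS,
    not with this class.
    """

    def __init__(self, base: nn.Linear, max_rank: int = 32, alpha: float = 64.0):
        super().__init__()
        self.base = base.requires_grad_(False)            # frozen pretrained weights
        self.lora_a = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, max_rank))
        self.scaling = alpha / max_rank
        self.active_rank = max_rank                       # the elastic dimension

    def set_rank(self, r: int) -> None:
        """Activate a sub-adapter (one candidate in the NAS search space)."""
        self.active_rank = r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        r = self.active_rank
        a, b = self.lora_a[:r, :], self.lora_b[:, :r]     # slice to the active rank
        return self.base(x) + (x @ a.T @ b.T) * self.scaling


# Example: wrap a projection layer and run it at two different adapter ranks.
layer = ElasticLoRALinear(nn.Linear(4096, 4096), max_rank=32)
x = torch.randn(1, 4096)
layer.set_rank(32)
y_full = layer(x)
layer.set_rank(8)          # smaller subnetwork, fewer adapter FLOPs
y_small = layer(x)
```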
Here is how to set up the environment for LoNAS from scratch:
```bash
pip install virtualenv
virtualenv lonas-env
source lonas-env/bin/activate

# install pytorch
pip install torch==2.1.2

# install dependencies
bash install.sh

# Note: please ignore the whitespace issues when applying the patch and running `install.sh`.
```
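After installation, a quick sanity check (an optional convenience snippet, not part of this repo) confirms that the expected PyTorch build is visible in the environment:

```python
# Optional environment sanity check.
import torch

print("torch:", torch.__version__)             # expected: 2.1.2
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```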
Taking the unified commonsense reasoning training as an example, please download the 15K instruction-following commonsense reasoning training data from LLM-Adapters.
Example command to train a super-adapter of LLaMA-7B using LoNAS:
```bash
python run_commonsense.py \
    --dataset_path commonsense_15k.json \
    --model_name_or_path yahma/llama-7b-hf \
    --do_train \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --num_train_epochs 6 \
    --warmup_steps 100 \
    --optim adamw_torch \
    --fp16 \
    --output_dir <path to super-adapter> \
    --logging_steps 20 \
    --save_strategy epoch \
    --save_total_limit 2 \
    --lora \
    --lora_r 32 \
    --lora_alpha 64 \
    --lora_dropout 0.1 \
    --target_modules q_proj,k_proj,v_proj,up_proj,gate_proj,down_proj \
    --nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json
```
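As a rough back-of-envelope estimate (ours, assuming the standard LLaMA-7B shapes of 32 decoder layers, hidden size 4096, and intermediate size 11008; the actual count depends on the elastic configuration), the maximal super-adapter above (`--lora_r 32` over six target modules) adds on the order of 70M trainable adapter parameters:

```python
# Back-of-envelope estimate of LoRA parameters for the command above.
# Assumes standard LLaMA-7B shapes; actual counts depend on the elastic config.
hidden, intermediate, layers, r = 4096, 11008, 32, 32

shapes = {  # module: (in_features, out_features)
    "q_proj":    (hidden, hidden),
    "k_proj":    (hidden, hidden),
    "v_proj":    (hidden, hidden),
    "up_proj":   (hidden, intermediate),
    "gate_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# Each adapted linear gets A: r x in_features and B: out_features x r.
per_layer = sum(r * (i + o) for i, o in shapes.values())
total = per_layer * layers
print(f"~{total / 1e6:.1f}M trainable adapter parameters")  # ≈ 71.6M
```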
The `nncf_config` argument indicates the NNCF configuration, which encompasses the search space for the elastic adapters and the target modules of the base model (e.g., `q_proj`).
The implementation of the elastic modules leverages the BootstrapNAS feature of OpenVINO™ NNCF.
We employ the stage LR scheduler within NNCF, so the learning rate schedule is specified in the NNCF configuration file rather than in the `TrainingArguments`. For instance:
"schedule": {
"list_stage_descriptions": [
{"train_dims": ["width"], "epochs": 6, "depth_indicator": 1, "width_indicator": 5, "init_lr": 3e-4, "epochs_lr": 6, "sample_rate": 1}
]
},
For more details on the stage scheduler, see BootstrapNAS.md.
After training, the weights of the trained super-adapter will be saved in the `--output_dir` directory.
All evaluation datasets can be downloaded from LLM-Adapters. Place them into the `datasets` directory:
```bash
git clone https://github.com/AGI-Edgerunners/LLM-Adapters.git
mv LLM-Adapters/dataset datasets
```
Example command to evaluate the trained super-adapter (heuristic subnetwork):
```bash
python run_commonsense.py \
    --dataset_path None \
    --model_name_or_path yahma/llama-7b-hf \
    --lora \
    --lora_weights <path to super-adapter> \
    --nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
    --do_test \
    --output_dir <path to results>
```
This command evaluates the performance of the heuristic subnetwork across eight commonsense reasoning tasks: BoolQ, PIQA, SIQA, HellaSwag, WinoG, Arc-e, Arc-c, and OBQA.
To discover more optimized subnetworks within the trained super-network, LoNAS further explores it with advanced search algorithms. We leverage OpenVINO™ NNCF, which conveniently supports various search algorithms; the search settings are configured in the NNCF config, for example:
"search": {
"algorithm": "NSGA2",
"batchnorm_adaptation": {
"num_bn_adaptation_samples": 0
},
"num_evals": 200,
"population": 5,
"ref_acc": 0.45,
"acc_delta": 0.01
}
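For intuition about what the multi-objective search optimizes, the toy sketch below (a conceptual illustration only, not NNCF's NSGA-II implementation) keeps the accuracy-vs-cost Pareto front among evaluated subnetworks, which is essentially the trade-off the search explores:

```python
# Toy illustration of the accuracy-vs-cost trade-off explored by the search.
# This is NOT NNCF's NSGA-II; it only shows the Pareto-front idea.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str        # subnetwork configuration id (hypothetical)
    accuracy: float  # validation accuracy (higher is better)
    gflops: float    # compute cost (lower is better)

def pareto_front(cands):
    """Keep candidates not dominated by any other (better accuracy AND lower cost)."""
    front = []
    for c in cands:
        dominated = any(
            o.accuracy >= c.accuracy and o.gflops <= c.gflops
            and (o.accuracy > c.accuracy or o.gflops < c.gflops)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c.gflops)

cands = [
    Candidate("subnet_a", 0.652, 1.40),
    Candidate("subnet_b", 0.648, 1.25),
    Candidate("subnet_c", 0.640, 1.45),   # dominated by subnet_a
    Candidate("subnet_d", 0.655, 1.60),
]
for c in pareto_front(cands):
    print(c)
```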
Further details can be found in BootstrapNAS.md. The following is an example command to run the search on the trained super-adapter:
```bash
python run_commonsense.py \
    --dataset_path commonsense_15k.json \
    --model_name_or_path yahma/llama-7b-hf \
    --lora \
    --lora_weights <path to super-adapter> \
    --val_set_size 1000 \
    --nncf_config nncf_config/unified_commonsense/nncf_lonas_llama_7b.json \
    --do_search \
    --output_dir <path to search results>
```
The argument `--val_set_size 1000` signifies the utilization of 1k validation samples to evaluate each discovered subnetwork. After running this command, the results of the 200 identified subnetworks (`"num_evals": 200` set in the `search` field of the NNCF config) will be placed in the `--output_dir` folder, including `search_progression.png` and `search_progression.csv`.
From these results, we can select the subnetwork configurations that best meet different requirements.
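For example, a small post-processing script can pick the most accurate subnetwork under a compute budget. The snippet below is only a sketch: the column names it assumes (`accuracy`, `macs`, `subnet_config`) may differ from the actual header of `search_progression.csv`, so adjust them to match your file.

```python
# Sketch: pick the best subnetwork under a compute budget from the search log.
# Column names ("accuracy", "macs", "subnet_config") are assumptions; check the
# header of your search_progression.csv and adjust accordingly.
import csv

budget_macs = 1.5e12
best = None

with open("search_progression.csv", newline="") as f:
    for row in csv.DictReader(f):
        acc, macs = float(row["accuracy"]), float(row["macs"])
        if macs <= budget_macs and (best is None or acc > float(best["accuracy"])):
            best = row

if best is not None:
    print("selected subnetwork:", best.get("subnet_config", best))
```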
Name | Tasks | Base Model |
---|---|---|
lonas-bert-base-glue | RTE, MRPC, STS-B, CoLA, SST2, QNLI, QQP, MNLI | bert-base-uncased |
lonas-llama-7b-commonsense | Commonsense Reasoning | yahma/llama-7b-hf |
lonas-bloomz-7b-math | Math Reasoning | bigscience/bloomz-7b1 |
Please refer to `running_commands` for all commands related to reproducing the paper's results.
- GLUE benchmark
Method | Trainable Parameter Ratio | GFLOPs | RTE | MRPC | STS-B | CoLA | SST-2 | QNLI | QQP | MNLI | AVG |
---|---|---|---|---|---|---|---|---|---|---|---|
LoRA | 0.27% | 11.2 | 65.85 | 84.46 | 88.73 | 57.58 | 92.06 | 90.62 | 89.41 | 83.00 | 81.46 |
LoNAS | 0.27% | 8.0 | 70.76 | 88.97 | 88.28 | 61.12 | 93.23 | 91.21 | 88.55 | 82.00 | 83.02 |
- Commonsense Reasoning
Method | Total Params. | TFLOPs | BoolQ | PIQA | SIQA | HellaSwag | WinoG | Arc-e | Arc-c | OBQA | Average |
---|---|---|---|---|---|---|---|---|---|---|---|
LoRA | 6.7B | 1.7 | 62.6 | 75.3 | 67.9 | 52.9 | 58.6 | 79.2 | 58.3 | 71.2 | 65.8 |
LoNAS | 5.6B | 1.4 | 62.9 | 73.0 | 68.7 | 51.4 | 63.9 | 72.3 | 58.5 | 71.0 | 65.2 |
- Math Reasoning
Method | Total Params. | TFLOPs | GSM8K | AQuA | MAWPS | SVAMP | Average |
---|---|---|---|---|---|---|---|
LoRA | 7.1B | 1.8 | 17.4 | 21.3 | 70.2 | 41.0 | 37.5 |
LoNAS | 6.1B | 1.5 | 18.6 | 22.0 | 76.5 | 31.8 | 37.2 |
```bibtex
@inproceedings{munoz-etal-2024-lonas,
    title = "{L}o{NAS}: Elastic Low-Rank Adapters for Efficient Large Language Models",
    author = "Munoz, Juan Pablo and
      Yuan, Jinjie and
      Zheng, Yi and
      Jain, Nilesh",
    editor = "Calzolari, Nicoletta and
      Kan, Min-Yen and
      Hoste, Veronique and
      Lenci, Alessandro and
      Sakti, Sakriani and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.940",
    pages = "10760--10776",
}
```
This work benefits from the following repositories: