# 0.17.0: SHiRA, MiSS, LoRA for MoE, and more

## Highlights

### New Methods

#### SHiRA

@kkb-code contributed Sparse High Rank Adapters (SHiRA, paper), which promise a potential gain in performance over LoRA; in particular, the concept loss observed when using multiple adapters is reduced. Since the adapters train only 1-2% of the weights and are inherently sparse, switching between adapters may be cheaper than with LoRA. (#2584)
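
As a rough illustration, a minimal usage sketch is shown below. It assumes that `ShiraConfig` is the exported config class and that it accepts `r` and `target_modules` like other PEFT tuner configs; consult the SHiRA documentation for the exact API.

```python
from transformers import AutoModelForCausalLM

from peft import ShiraConfig, get_peft_model  # ShiraConfig assumed to be the SHiRA config class

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Assumed parameters: r controls how many weights per targeted layer become trainable.
config = ShiraConfig(r=32, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(base_model, config)
peft_model.print_trainable_parameters()
```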

#### MiSS

@JL-er added a new PEFT method, MiSS (Matrix Shard Sharing), in #2604. This method is an evolution of Bone which, according to our PEFT method comparison benchmark, gives excellent results in terms of both performance and memory efficiency. If you haven't tried it yet, you should do so now.

At the same time, Bone will be deprecated in favor of MiSS and removed in PEFT v0.19.0. If you already have a Bone checkpoint, you can use `scripts/convert-bone-to-miss.py` to convert it into a MiSS checkpoint and proceed with training using MiSS.
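
A minimal sketch of how MiSS could be configured follows; it is hedged on the assumption that `MissConfig` is the exported config class and that it accepts `r` and `target_modules` in the same way as Bone and other tuner configs, so check the MiSS documentation for the exact parameters.

```python
from transformers import AutoModelForCausalLM

from peft import MissConfig, get_peft_model  # MissConfig assumed to be the MiSS config class

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Assumed parameters: r is the shard size; target_modules works as in LoRA.
config = MissConfig(r=64, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(base_model, config)
peft_model.print_trainable_parameters()
```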

### Enhancements

#### LoRA for `nn.Parameter`

LoRA is now able to target `nn.Parameter` directly (#2638, #2665)! Ever had a complicated `nn.Module` with promising parameters inside, but it was too custom to be supported by your favorite fine-tuning library? No worries, now you can target `nn.Parameter`s directly using the `target_parameters` config attribute, which works similarly to `target_modules`.

This option can be especially useful for models with Mixture of Experts (MoE) layers, as those often use `nn.Parameter`s directly and cannot be targeted with `target_modules`. For example, for the Llama4 family of models, use the following config to target the MoE weights:
```python
config = LoraConfig(
    ...,
    target_modules=[],  # <= prevent targeting any modules
    target_parameters=["feed_forward.experts.down_proj", "feed_forward.experts.gate_up_proj"],
)
```
Note that this feature is still experimental, as it comes with a few caveats, and it might therefore change in the future. Also, MoE weights with many experts can be quite huge, so expect higher memory usage compared to targeting regular `nn.Linear` layers.
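
For reference, here is a hedged end-to-end sketch of how such a config could be applied. The checkpoint name is illustrative; any Llama4 model (or other model whose MoE expert weights are stored as `nn.Parameter`) could be used instead.

```python
from transformers import AutoModelForCausalLM

from peft import LoraConfig, get_peft_model

# Illustrative Llama4 checkpoint (large and gated); substitute your own model.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-4-Scout-17B-16E-Instruct")

config = LoraConfig(
    target_modules=[],  # <= prevent targeting any modules
    target_parameters=["feed_forward.experts.down_proj", "feed_forward.experts.gate_up_proj"],
)
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```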

#### Injecting adapters based on a `state_dict`

Sometimes a PEFT adapter checkpoint exists but the corresponding PEFT config is, for whatever reason, not known. To inject the PEFT layers for such a checkpoint, you would usually have to reverse-engineer the corresponding PEFT config, most notably the `target_modules` argument, based on the `state_dict` from the checkpoint. This can be cumbersome and error prone. To avoid this, it is now also possible to call `inject_adapter_in_model` and pass the loaded `state_dict` as an argument:
```python
from safetensors.torch import load_file

from peft import LoraConfig, inject_adapter_in_model

model = ...
state_dict = load_file(<path-to-safetensors-file>)

lora_config = LoraConfig()  # <= no need to specify further
model = inject_adapter_in_model(lora_config, model, state_dict=state_dict)
```
Find more on `state_dict`-based injection in the docs.

## Changes

### Compatibility

A bug in prompt learning methods caused `modules_to_save` to be ignored. Classification tasks are especially affected, since they usually add the classification/score layer to `modules_to_save`. As a consequence, these layers were neither trained nor stored after training. This has now been corrected. (#2646)

## All Changes

- Bump version to 0.16.1.dev0 after release by @BenjaminBossan in #2632
- FEAT: Add GH action to deploy method comparison app by @BenjaminBossan in #2625
- enable FSDP example for model `hugging-quants/Meta-Llama-3.1-8B-Instr… by @kaixuanliu in #2626
- FIX: Create mask function signature change in transformers 4.53.1 by @BenjaminBossan in #2633
- FIX: Correctly skip AWQ test based on torch version by @BenjaminBossan in #2631
- FIX: Faulty OFT parameter device test by @BenjaminBossan in #2630
- Fix #2634: Allow peft_type to be a string by @githubnemo in #2635
- SHiRA Adapters by @kkb-code in #2584
- FIX: Prompt learning methods modules_to_save issue by @BenjaminBossan in #2646
- FIX: Error in workflow file to deploy method comparison app by @BenjaminBossan in #2645
- FEAT Allow LoRA to target nn.Parameter by @BenjaminBossan in #2638
- Update BibTeX entry by @cx-alberto-simoes in #2659
- FIX Prefix tuning after transformers PR 38635 by @BenjaminBossan in #2662
- make method comparison device agnostic, so it can expand to more accelerators like XPU by @yao-matrix in #2610
- Update tokenizer parameter in sfttrainer across multiple examples by @gapsong in #2664
- Update lora.md by @qgallouedec in #2666
- GPT2 compatible version of LLama-Adapters by @efraimdahl in #2643
- Method Comparison: Improve formatting/layout of table by @githubnemo in #2670
- ENH: Targeting multiple parameters on the same module by @BenjaminBossan in #2665
- Update extending vocab docs by @githubnemo in #2669
- FIX Failing target_parameters param usage count by @BenjaminBossan in #2676
- Fix trainable tokens with fsdp by @BenjaminBossan in #2681
- FIX: Small fixes to target_parameters by @BenjaminBossan in #2677
- TST: Add more HF Hub model caching by @BenjaminBossan in #2682
- FIX: Missing device map for facebook/opt-125m by @BenjaminBossan in #2675
- Fix not detecting regex-targeted embedding layer by @githubnemo in #2649
- Add MiSS as a replacement for Bone. by @JL-er in #2604
- [WIP] ENH: Adapter injection based on state_dict by @BenjaminBossan in #2637
- Release 0.17.0 by @BenjaminBossan in #2691

## New Contributors

- @kaixuanliu made their first contribution in #2626
- @kkb-code made their first contribution in #2584
- @cx-alberto-simoes made their first contribution in #2659
- @efraimdahl made their first contribution in #2643
Full Changelog: v0.16.0...v0.17.0