0.16.0: LoRA-FA, RandLoRA, C³A, and much more
Highlights
New Methods
LoRA-FA
In #2468, @AaronZLT added the LoRA-FA optimizer to PEFT. This optimizer is based on AdamW and increases the memory efficiency of LoRA training. This means that you can train LoRA with less memory, or, within the same memory budget, use higher LoRA ranks and potentially get better results.
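A minimal sketch of how this can be used; the helper name `create_lorafa_optimizer` and its arguments follow the LoRA-FA example shipped with PEFT, but treat them as assumptions and check that example for the exact signature:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from peft.optimizers import create_lorafa_optimizer  # assumed import location

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)

# LoRA-FA keeps the LoRA A matrices frozen and only updates B, which is where the
# memory savings over running AdamW on all LoRA parameters come from.
optimizer = create_lorafa_optimizer(model=model, r=16, lora_alpha=32, lr=7e-5)
```

The resulting optimizer can be used in a custom training loop or passed to `Trainer` via its `optimizers=(optimizer, scheduler)` argument.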
RandLoRA
Thanks to @PaulAlbert31, a new PEFT method called RandLoRA was added to PEFT (#2464). Similar to VeRA, it uses non-learnable random low-rank matrices that are combined through learnable matrices. This way, RandLoRA can approximate full-rank updates of the weights. Training models quantized with bitsandbytes is supported.
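A minimal usage sketch; the config class name `RandLoraConfig` and its parameters are assumptions based on #2464, so check the RandLoRA documentation for the exact API:

```python
from transformers import AutoModelForCausalLM
from peft import RandLoraConfig, get_peft_model  # class name assumed from #2464

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
# r controls the rank of the shared, frozen random bases; target_modules works like in LoRA.
config = RandLoraConfig(r=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```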
C³A
@Phoveran added Circular Convolution Adaptation, C³A, in #2577. This new PEFT method can overcome the low-rank limitation of methods such as LoRA while remaining fast and memory efficient.
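A minimal usage sketch; the config class name `C3AConfig` and the `block_size` argument are assumptions based on #2577, so consult the C³A documentation for the exact API:

```python
from transformers import AutoModelForCausalLM
from peft import C3AConfig, get_peft_model  # class name assumed from #2577

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
# C³A is parameterized by the block size of the circular convolution rather than a rank;
# the block size is typically expected to divide the dimensions of the targeted weights.
config = C3AConfig(block_size=64, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```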
Enhancements
- Thanks to @gslama12 and @SP1029, LoRA now supports `Conv2d` layers with `groups != 1`. This requires the rank `r` to be divisible by `groups`. See #2403 and #2567 for context; a usage sketch follows this list.
- @dsocek added support for Intel Neural Compressor (INC) quantization to LoRA in #2499.
- DoRA now supports `Conv1d` layers thanks to @EskildAndersen (#2531).
- Passing `init_lora_weights="orthogonal"` now enables orthogonal weight initialization for LoRA (#2498).
- @gapsong brought us Quantization-Aware LoRA training (QALoRA) in #2571. This can make QLoRA training more efficient; please check the included example. Right now, only GPTQ is supported.
- There has been a big refactor of Orthogonal Finetuning (OFT) thanks to @zqiu24 (#2575). This makes the PEFT method run more quickly and require less memory. It is, however, incompatible with old OFT checkpoints. If you have old OFT checkpoints, either pin the PEFT version to `<0.16.0` or retrain them with the new PEFT version.
- Thanks to @keepdying, LoRA hotswapping with compiled models no longer leads to CUDA graph re-records (#2611). A hotswapping sketch also follows this list.
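As referenced in the grouped-convolution item above, here is a minimal sketch of applying LoRA to a `Conv2d` layer with `groups != 1`. The toy module is made up purely for illustration; the one real constraint from #2403/#2567 is that `r` must be divisible by `groups`.

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class ToyNet(nn.Module):
    """Hypothetical model, only used to have a grouped Conv2d layer to target."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(8, 16, kernel_size=3, groups=4)
        self.flatten = nn.Flatten()
        self.head = nn.Linear(16 * 30 * 30, 2)  # assumes 32x32 inputs

    def forward(self, x):
        return self.head(self.flatten(self.conv(x)))

# r=8 satisfies the requirement that the rank be divisible by groups=4.
config = LoraConfig(r=8, target_modules=["conv"])
model = get_peft_model(ToyNet(), config)
model.print_trainable_parameters()
```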
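The hotswapping improvement in the last item is easiest to see in context. Below is a sketch under the assumption that the helpers are exposed in `peft.utils.hotswap` with these names and arguments (check the hotswapping docs for the exact API); the adapter paths are placeholders:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter, prepare_model_for_compiled_hotswap

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = PeftModel.from_pretrained(base, "path/to/adapter_0")  # placeholder path

# Pad LoRA ranks/scalings up front so differently sized adapters can be swapped in
# later without shape changes that would force recompilation.
prepare_model_for_compiled_hotswap(model, target_rank=32)
model = torch.compile(model)

# Swap the weights of a second adapter into the existing one in place; with #2611
# this no longer triggers CUDA graph re-records.
hotswap_adapter(model, "path/to/adapter_1", adapter_name="default")
```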
Changes
Compatibility
- #2481: The value of `requires_grad` of `modules_to_save` is now set to `True` when used directly with `inject_adapter`. This is relevant for PEFT integrations, e.g. Transformers or Diffusers.
- Due to a big refactor of vision language models (VLMs) in Transformers, the model architecture has been slightly adjusted. One consequence of this is that a PEFT prompt learning method applied to `vlm.language_model` will no longer work; please apply it to `vlm` directly (see #2554 for context, and the sketch after this list). Moreover, the refactor results in different checkpoints. We managed to ensure backwards compatibility in PEFT, i.e. old checkpoints can be loaded successfully. There is, however, no forward compatibility, i.e. loading checkpoints trained after the refactor is not possible with package versions from before the refactor. In this case, you need to upgrade PEFT and transformers. More context in #2574.
- #2579: There have been bigger refactors in Transformers concerning attention masks. This required some changes on the PEFT side which can affect prompt learning methods. For prefix tuning specifically, this can result in numerical differences, but overall performance should be the same. For other prompt learning methods, numerical values should be the same, except if the base model uses 4d attention masks, like Gemma. If you load old prompt learning checkpoints, please double-check that they still perform as expected, especially if they were trained on Gemma or similar models. If not, please re-train them or pin PEFT and transformers to previous versions (`<0.16.0` and `<4.52.0`, respectively).
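As noted in the VLM item above, prompt learning should now target the whole VLM. A minimal sketch; the checkpoint name and config values are illustrative only:

```python
from transformers import AutoModelForImageTextToText
from peft import PromptTuningConfig, get_peft_model

vlm = AutoModelForImageTextToText.from_pretrained("Qwen/Qwen2.5-VL-3B-Instruct")  # illustrative checkpoint
config = PromptTuningConfig(task_type="CAUSAL_LM", num_virtual_tokens=20)

# Before the transformers VLM refactor, the config was often applied to the submodule:
# peft_model = get_peft_model(vlm.language_model, config)  # no longer works
peft_model = get_peft_model(vlm, config)  # apply it to the VLM directly
```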
All Changes
- Bump version and minor instruction fix by @githubnemo in #2439
- FIX for ConvNd layers using the groups argument. by @gslama12 in #2403
- DOC: Tip on how to merge with DeepSpeed by @BenjaminBossan in #2446
- Fix incorrect link in docs by @kenning in #2444
- Fix typos by @omahs in #2447
- Refactor to better support LoRA variants by @BenjaminBossan in #2443
- enable 5 test cases on XPU by @yao-matrix in #2442
- FIX: Faulty test that results in nan weights by @BenjaminBossan in #2448
- Fix sft example script trl and env var by @BenjaminBossan in #2454
- LoRA variant init now also receives kwargs by @BenjaminBossan in #2455
- Fix #2450: Revamp adapter_state_dict_* methods by @githubnemo in #2456
- Method comparison evaluation suite by @githubnemo in #2395
- Bump version to reflect patch release by @githubnemo in #2461
- The paper on the Bone structure has been updated by @JL-er in #2312
- CI: More caching in tests by @BenjaminBossan in #2472
- fix gpu tests by @jiqing-feng in #2471
- Fix compare results by @jiqing-feng in #2473
- fix error_factor for xpu by @jiqing-feng in #2475
- Fix: Multiple PEFT methods have issues with models loaded in float16 or bfloat16 by @BenjaminBossan in #2433
- TST Refactor tests to make them simpler by @BenjaminBossan in #2462
- Use Python 3.9 as RUFF target version and apply fixes by @cyyever in #2483
- FIX Deleting adapters on auxiliary modules by @BenjaminBossan in #2466
- fix args by @real-zhangzhe in #2474
- ENH Add default target_modules for Llama4 by @BenjaminBossan in #2480
- [Feature Request] Add LoRA-FA to PEFT by @AaronZLT in #2468
- TST Refactor (continued) of encoder tests by @BenjaminBossan in #2478
- FIX: Error when merging LoRA bias with scale != 1 by @BenjaminBossan in #2489
- FIX: X-LoRA error when targeting different modules by @BenjaminBossan in #2488
- Fix: the evaluation_strategy is deprecated by @yuanwu2017 in #2487
- Testing common uses situational HF_HUB_OFFLINE by @githubnemo in #2490
- MNT: Update HF Hub download kwargs by @BenjaminBossan in #2492
- FIX Multi GPU tests: explicit device map by @BenjaminBossan in #2484
- Fix #2477: Regression accessing `modules_to_save` by @githubnemo in #2481
- make test_lora_use_dora_linear pass on XPU by @yao-matrix in #2493
- TST: AQLM test no longer x-fails by @BenjaminBossan in #2506
- TST make 3 flaky test cases always pass on XPU by @yao-matrix in #2503
- FIX: CPT should not be tested with sequence classification by @BenjaminBossan in #2507
- Update Docker image builds for torch 2.7+cu126 by @matthewdouglas in #2514
- Feature: RandLora integration into peft by @PaulAlbert31 in #2464
- LORA/MODEL: Use max rank of pattern for `add_weighted_adapter` by @Beinsezii in #2512
- fix typo for skipping test by @jiqing-feng in #2519
- docs typo: fix links by @imba-tjd in #2517
- Add INC dispatcher by @dsocek in #2499
- ENH: Add default Qwen3 target modules by @BenjaminBossan in #2522
- MNT: Pin GitHub action hashes for security by @BenjaminBossan in #2521
- TST: Refactor remaining common tests to use pytest by @BenjaminBossan in #2491
- ENH: Add tests, docs, types for scaling methods by @BenjaminBossan in #2526
- TST Mark AutoAWQ as xfail for now by @BenjaminBossan in #2529
- FIX Prompt learning issue with 4d attention mask by @BenjaminBossan in #2458
- FIX: Use correct argument name in MultiheadAttention forward by @BenjaminBossan in #2510
- Method comparison: Support more options for the optimizer by @BenjaminBossan in #2479
- Randlora documentation and some example usage by @PaulAlbert31 in #2524
- added support for Conv1d for DoRA by @EskildAndersen in #2531
- Fix #2535: Prevent adapters targeting themselves by @githubnemo in #2539
- Fix typos by @omahs in #2544
- Use HF Papers by @qgallouedec in #2542
- Address changes in transformers VLM architecture by @githubnemo in #2554
- CI: Handle errors with MacOS and transformers by @BenjaminBossan in #2561
- Fix zizmor warnings about unpinned docker images by @githubnemo in #2565
- align xpu behavior w/ cuda by @yao-matrix in #2551
- LORA/MODEL: Discard `rank_pattern`, `rank_alpha` for `add_weighted_adapter` by @Beinsezii in #2550
- fix inconsistent variable naming in load_adapter by @pranav-gade in #2553
- Prevent applying LoRA to disallowed modules in Mamba-based architectures by @dhiaEddineRhaiem in #2562
- TST: Refactor unittest to pytest style custom tests by @BenjaminBossan in #2573
- Simple variant application test by @githubnemo in #2572
- `prepare_model_for_gradient_checkpointing` protected to public by @qgallouedec in #2569
- Optimize isinstance Check in LoraParallelLinear by @JavaZeroo in #2576
- FIX: Generation nightly CI failing due to gemma by @BenjaminBossan in #2580
- FIX: Correctly determine no_split_modules by @BenjaminBossan in #2570
- ENH: Orthogonal LoRA layer initialization (2) by @BenjaminBossan in #2498
- ENH: Method comparison improve logging by @BenjaminBossan in #2591
- DOC Update README, contributing.md, GH templates by @BenjaminBossan in #2588
- Input sanitizer for benchmark result renderer by @githubnemo in #2594
- Add Makefile + results for MetaMathQA task by @githubnemo in #2593
- Track number of (trainable) parameters for MetaMathQA by @githubnemo in #2598
- ENH: Method comparison allow full finetuning by @BenjaminBossan in #2597
- enable some left out cases on XPU, all enabled cases pass by @yao-matrix in #2596
- FIX: Transformers VLM architecture changes by @BenjaminBossan in #2574
- Enable XPU regression tests with deterministic by @jiqing-feng in #2600
- Results with number of parameters + full fine tuning by @githubnemo in #2602
- Add support for Quantization-Aware Low-Rank Adaptation (QALoRA) by @gapsong in #2571
- OFT: several improvements to make OFT faster and more memory efficient by @zqiu24 in #2575
- FIX: Trainable tokens error with DeepSpeed ZeRO3 by @BenjaminBossan in #2605
- ENH Method comparison: temporary and cancelled result files include timestamp by @BenjaminBossan in #2617
- FIX: Avoid CUDA Graph re-record when hotswapping LoRAs. by @keepdying in #2611
- FIX Account for attention mask being a dict, fix generate issues with gemma by @BenjaminBossan in #2579
- TST Skip (more) failing MacOS tests by @BenjaminBossan in #2620
- FIX Update signature for resolve_lora_variant by @BenjaminBossan in #2618
- [FEAT] Add C3A Support by @Phoveran in #2577
- FIX for #2549 - modify lora_B definition for conv layers with groups by @SP1029 in #2567
- FIX: Type annotation error in PEFT method comparison script by @BenjaminBossan in #2628
- FIX CI Multi-GPU tests require device_map by @BenjaminBossan in #2612
- TST Update diffusers hotswap tests by @BenjaminBossan in #2619
- Auto-tagging of PEFT models by @githubnemo in #2599
New Contributors
- @kenning made their first contribution in #2444
- @omahs made their first contribution in #2447
- @yao-matrix made their first contribution in #2442
- @cyyever made their first contribution in #2483
- @real-zhangzhe made their first contribution in #2474
- @AaronZLT made their first contribution in #2468
- @yuanwu2017 made their first contribution in #2487
- @PaulAlbert31 made their first contribution in #2464
- @Beinsezii made their first contribution in #2512
- @imba-tjd made their first contribution in #2517
- @dsocek made their first contribution in #2499
- @EskildAndersen made their first contribution in #2531
- @pranav-gade made their first contribution in #2553
- @dhiaEddineRhaiem made their first contribution in #2562
- @JavaZeroo made their first contribution in #2576
- @gapsong made their first contribution in #2571
- @keepdying made their first contribution in #2611
- @SP1029 made their first contribution in #2567
Full Changelog: v0.15.2...v0.16.0