Description
This request proposes integrating DeLoRA (Decoupled Low-rank Adaptation), as described in the ICLR 2025 accepted paper:
paper: https://arxiv.org/abs/2503.18225
code: https://github.com/ExplainableML/DeLoRA
Motivation
DeLoRA tackles finetuning in a Frobenius-norm-bounded setup: this prevents divergence from the pretrained model, effectively decoupling the learning of angles and magnitudes.
This is done by:
- normalization of the BA low-rank matrices, which bounds the updates' Frobenius norm
- a (learnable) scaling λ, which controls the update's boundary/magnitude
- layer-wise scaling by ||W||, to match each update's norm to the original weights' norm (mimicking multiplicative finetuning).
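The three ingredients above can be sketched as follows. This is a minimal, illustrative NumPy version (not the official API): it assumes a per-rank normalization of the BA components, a scalar λ, and Frobenius norms throughout; the function name and signature are ours.

```python
import numpy as np

def delora_delta(W, A, B, lam):
    """Illustrative DeLoRA-style update (a sketch, not the official code).

    Each rank-1 component b_i a_i^T is normalized to (at most) unit
    Frobenius norm, so the sum of r components has norm <= r; scaling by
    lam * ||W||_F / r then bounds ||delta_W||_F by lam * ||W||_F.
    """
    r = A.shape[0]                     # A: (r, d_in), B: (d_out, r)
    w_norm = np.linalg.norm(W)         # layer-wise ||W||_F scaling
    delta = np.zeros_like(W)
    for i in range(r):
        outer = np.outer(B[:, i], A[i, :])
        delta += outer / (np.linalg.norm(outer) + 1e-8)  # normalize each component
    return (lam * w_norm / r) * delta  # lam is the (learnable) scaling
```

By construction, the returned update satisfies ||ΔW||_F ≤ λ·||W||_F, which is the norm-boundedness property the bullets above describe.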

The method might feel quite similar to DoRA (given the shared goal of decoupling angular from magnitude learning), but it presents key differences. DoRA:
- applies the normalization and scaling operations to the fully finetuned weights $W + \Delta W$
- performs the normalization on the column space of the weight matrices

Conversely, DeLoRA:
- introduces the normalization and scaling operations directly on the weight updates $\Delta W$, more effectively preventing divergence from the pretrained model
- normalizes the inner low-dimensional space, which implicitly enforces a Frobenius-norm boundary

While, in theory, making the scaling parameter λ learnable does not prevent divergence, divergence does not occur in practice (see the figure below, showing performance and weight norms as the learning rate varies).
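The contrast between the two normalization targets can be made concrete with a toy sketch. Both functions are simplified illustrations under our own naming (not code from either library): the DoRA side normalizes columns of the merged weights and rescales them by a learned magnitude vector, while the DeLoRA side normalizes the update itself before merging, so the update norm stays bounded no matter how far BA drifts during training.

```python
import numpy as np

def dora_style_merge(W, delta_W, m):
    """DoRA-style: normalize the column space of the merged weights W + delta_W,
    then rescale each column by a learned magnitude vector m (simplified sketch)."""
    V = W + delta_W
    return m * V / np.linalg.norm(V, axis=0, keepdims=True)

def delora_style_merge(W, B, A, lam):
    """DeLoRA-style: normalize the update delta_W = B @ A itself (here with a
    single whole-matrix Frobenius normalization for brevity), bounding
    ||delta||_F by lam * ||W||_F before adding it to the pretrained weights."""
    BA = B @ A
    delta = lam * np.linalg.norm(W) * BA / (np.linalg.norm(BA) + 1e-8)
    return W + delta
```

In the DeLoRA-style merge the distance from the pretrained `W` is explicitly capped, which is the mechanism behind the divergence-prevention claim above.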

As a result, DeLoRA achieves better decoupling and, consequently, greater robustness. In addition, the parameter λ can be initialized arbitrarily, yielding countless norm-bounded variants.
Your Contribution
The implementation in https://github.com/ExplainableML/DeLoRA is based on peft, and we would be pleased to submit a pull request. We welcome any suggestions or guidance on this.