FEAT Add sine-LoRA #2434 #2457
@@ -0,0 +1,5 @@
+Collecting oauthlib
+Using cached oauthlib-3.2.2-py3-none-any.whl.metadata (7.5 kB)
+Using cached oauthlib-3.2.2-py3-none-any.whl (151 kB)
+Installing collected packages: oauthlib
+Successfully installed oauthlib-3.2.2
@@ -297,6 +297,14 @@ class LoraConfig(PeftConfig):
             ranks. Right now, DoRA only supports linear and Conv2D layers. DoRA introduces a bigger overhead than pure
             LoRA, so it is recommended to merge weights for inference. For more information, see
             https://arxiv.org/abs/2402.09353.
+        use_sinelora (`bool`):
+            Enable 'Sine Activated Low-Rank Adaptation' (Sine-LoRA). This technique applies a sine activation to the
+            low-rank adapter update, which can boost the effective rank of the low-rank matrices and enhance their
+            capacity. For more information, see https://arxiv.org/pdf/2403.19243.
+        sinelora_frequency (`float`):
+            The frequency factor for the sine activation. If not specified, it defaults to 200.
+        sinelora_scaling (`float`):
+            The scaling factor for the sine activation. If not specified, it defaults to sqrt(in_features).
Review comment on lines +307 to +308 (the sinelora_scaling docstring): If this value is optional, it should be marked as type `Optional[float]`.

A second comment on the same lines asks to change the type of `sinelora_scaling` accordingly.
         layer_replication (`List[Tuple[int, int]]`):
             Build a new stack of layers by stacking the original model layers according to the ranges specified. This
             allows expanding (or shrinking) the model without duplicating the base model weights. The new layers will
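To make the docstring concrete, here is a minimal sketch of the idea behind Sine-LoRA. The exact formula is an assumption based on the paper and the defaults above (frequency 200, scaling sqrt(in_features)), not on this PR's final code: the low-rank product B @ A is passed element-wise through a sine activation and divided by the scaling factor.

```python
import math
import torch


def sinelora_delta_weight(
    lora_A: torch.Tensor,          # shape (r, in_features)
    lora_B: torch.Tensor,          # shape (out_features, r)
    frequency: float = 200.0,      # sinelora_frequency
    scaling: float | None = None,  # sinelora_scaling; None -> sqrt(in_features)
) -> torch.Tensor:
    """Hypothetical Sine-LoRA delta weight: sin(frequency * B @ A) / scaling."""
    in_features = lora_A.shape[1]
    if scaling is None:
        scaling = math.sqrt(in_features)
    # Applying sin element-wise to the low-rank product raises its effective rank
    # compared to the plain product B @ A used by standard LoRA.
    return torch.sin(frequency * (lora_B @ lora_A)) / scaling
```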
@@ -493,6 +501,32 @@ class LoraConfig(PeftConfig):
             )
         },
     )
+    use_sinelora: bool = field(
+        default=False,
+        metadata={
+            "help": (
+                "Enable 'Sine Activated Low-Rank Adaptation' (Sine-LoRA). This technique applies a sine activation "
+                "to the low-rank adapter update, which can boost the effective rank of the low-rank matrices and "
+                "enhance their capacity. For more information, see https://arxiv.org/pdf/2403.19243."
+            )
+        },
+    )
+    sinelora_frequency: float = field(
+        default=200.0,
+        metadata={
+            "help": (
+                "The frequency factor for the sine activation. If not specified, it defaults to 200."
+            )
+        },
+    )
+    sinelora_scaling: float = field(
+        default=None,
+        metadata={
+            "help": (
+                "The scaling factor for the sine activation. If not specified, it defaults to sqrt(in_features)."
+            )
+        },
+    )
     # Enables replicating layers in a model to expand it to a larger model.
     layer_replication: Optional[list[tuple[int, int]]] = field(
         default=None,
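For reference, this is how the new options would be used once the PR lands. The field names are taken from the diff above; the base model and target modules are illustrative, and the exact runtime behavior of the variant is still under review.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Hypothetical usage of the Sine-LoRA options added by this PR.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    use_sinelora=True,         # enable the sine activation on the low-rank update
    sinelora_frequency=200.0,  # frequency factor (default per the diff)
    sinelora_scaling=None,     # None -> sqrt(in_features), per the docstring
)
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
```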
@@ -597,6 +631,7 @@ def __post_init__(self):
                 )
             if self.use_dora:
                 raise ValueError("The argument lora_bias=True is not supported for DoRA, please pass use_dora=False")

         # Using post training conversion of modified base weights to restore their initial values PiSSA/CorDA/OLoRA cannot
         # be correctly done when using rslora + rank_pattern/alpha_pattern. We can't really know if the user intends
@@ -14,7 +14,7 @@
from __future__ import annotations

from typing import Any

import math
import torch
from accelerate.utils.imports import is_xpu_available
from torch import nn
@@ -287,3 +287,34 @@ class DoraConv3dVariant(_DoraConvNdVariant):
     def init(module: Conv3d, adapter_name: str, **kwargs: Any) -> None:
         dora_layer = DoraConv3dLayer(fan_in_fan_out=False)
         _DoraConvNdVariant.init_convd_variant(module, adapter_name, dora_layer=dora_layer)


+class SineLoraLinearVariant(LoraVariant):
+    @staticmethod
+    def init(module: Linear, adapter_name: str) -> None:
Suggested change:

-    def init(module: Linear, adapter_name: str) -> None:
+    def init(module: Linear, adapter_name: str, **kwargs) -> None:

With PR #2455 now merged, init() receives all the parameters that update_layer receives.
Reply from the author: hmmmm, I did not use that. Do you think that is ok?
Reviewer: I don't think so, no. Currently the tests do not work because of the changes necessary in Linear.__init__ and Embedding.__init__. Once those changes are in place, you'll see that calls to init() complain about unexpected arguments. That's because all the config args are passed to init, and without the wildcard **kwargs you would have to define them all (which we don't want, of course).

Also, you need a place to set module.sinelora_scaling and module.sinelora_frequency. This is the place, from the kwargs, e.g. module.sinelora_frequency = kwargs['sinelora_frequency']. For sinelora_scaling you need to check whether kwargs['sinelora_scaling'] is None.
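Putting the reviewer's points together, a minimal sketch of what the variant's init() could look like. The attribute names come from the thread, the sqrt(in_features) fallback from the config docstring; LoraVariant, Linear, and module.in_features are assumed to be available from the surrounding LoRA variants module.

```python
import math
from typing import Any

# Sketch only: LoraVariant and Linear are assumed to be the classes already
# used by the other variants in this module.


class SineLoraLinearVariant(LoraVariant):
    @staticmethod
    def init(module: Linear, adapter_name: str, **kwargs: Any) -> None:
        # Store the sine hyperparameters on the module so that the forward
        # and merge code can read them later.
        module.sinelora_frequency = kwargs["sinelora_frequency"]
        if kwargs["sinelora_scaling"] is None:
            # Fall back to sqrt(in_features), per the config docstring.
            module.sinelora_scaling = math.sqrt(module.in_features)
        else:
            module.sinelora_scaling = kwargs["sinelora_scaling"]
```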
Separate review comment: We're missing a return here.
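The comment most likely refers to the method that applies the adapter during the forward pass, which is not shown in this diff. The following is a hypothetical sketch of how the sine-activated update could be computed there, with the return statement in place; the function name and signature are illustrative only.

```python
import torch
import torch.nn.functional as F


def sinelora_forward_delta(
    x: torch.Tensor,
    lora_A_weight: torch.Tensor,  # shape (r, in_features)
    lora_B_weight: torch.Tensor,  # shape (out_features, r)
    frequency: float,
    scaling: float,
) -> torch.Tensor:
    """Hypothetical forward-path contribution of a Sine-LoRA adapter."""
    # Build the sine-activated delta weight and apply it to the input.
    # Note the explicit `return`, which the review comment says was missing.
    delta_w = torch.sin(frequency * (lora_B_weight @ lora_A_weight)) / scaling
    return F.linear(x, delta_w)
```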
Another review comment: Remove :)