FEAT Add sine-LoRA #2434 #2457
base: main
Conversation
Hey, thanks for providing an early draft. This is already on the right track, although there are still some bits missing.
I would suggest focusing on making one thing work first (e.g., the integration of SineLoraLinearVariant) so that you can write tests for it, and then extending it to other layer types (e.g., Embedding).
For testing I'd suggest adding a test case to test_custom_models.py (copying one of the LoRA test cases in TEST_CASES with use_sine_lora: True is sufficient for now) so that you have a broad number of test cases to execute. Say you added this line
("Vanilla MLP LoRA + SineLoRA", "MLP", LoraConfig, {"target_modules": ["lin0", "lin1"]}),
you can then invoke the tests using
pytest -k SineLoRA tests
Feel free to ping if you have any questions :)
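For illustration, a minimal sketch of what such a TEST_CASES entry could look like once the config flag exists; the flag name use_sinelora follows later comments in this thread and is an assumption, not settled API:

```python
# Hypothetical addition to TEST_CASES in tests/test_custom_models.py.
# The "use_sinelora" flag name is assumed from later comments in this PR.
(
    "Vanilla MLP LoRA + SineLoRA",
    "MLP",
    LoraConfig,
    {"target_modules": ["lin0", "lin1"], "use_sinelora": True},
),
```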
src/peft/tuners/lora/layer.py (outdated)
from .variants import SineLoraLinearVariant
return SineLoraLinearVariant()
This should reference the (not yet existing) SineLoraEmbeddingVariant since we're in the Embedding class. But this code is good for Linear.resolve_lora_variant :) You can use it there!
Effectively every class that overrides modules in the model (Linear, Embedding, Conv2d, ...) needs its own variant implementation and resolve_lora_variant implementation, but we can keep it at Linear and Embedding for now if you want.
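For illustration, a rough sketch of what the Embedding-side override could look like, assuming a SineLoraEmbeddingVariant is added to variants.py; only the sine-LoRA branch is shown and the exact signature of resolve_lora_variant is taken from this thread, not verified against the final code:

```python
# Sketch only: Embedding.resolve_lora_variant mirroring the Linear version above,
# but returning the (not yet existing) SineLoraEmbeddingVariant.
def resolve_lora_variant(self, *, use_sinelora: bool = False, **kwargs):
    if use_sinelora:
        from .variants import SineLoraEmbeddingVariant
        return SineLoraEmbeddingVariant()
    return None  # fall back to plain LoRA (DoRA handling omitted in this sketch)
```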
I corrected it. Please check my new PR.
Hmm can you check again? I don't see changes in this regard :/
I pushed again. I only implemented Linear and Embedding for now; is that ok?
Yep, that's totally fine. We can add support for convolutions later once Linear and Embedding work as expected.
src/peft/tuners/lora/variants.py (outdated)
class SineLoraLinearVariant(LoraVariant):
    @staticmethod
    def init(module: Linear, adapter_name: str) -> None:
Suggested change:
-    def init(module: Linear, adapter_name: str) -> None:
+    def init(module: Linear, adapter_name: str, **kwargs) -> None:
With PR #2455 now merged, init() receives all the parameters that update_layer receives.
Hmm, I did not use that; do you think that is ok?
I don't think so, no.
Currently the tests do not work because of the changes necessary in Linear.__init__ and Embedding.__init__. Once the changes are in place you'll see that calls to init will complain about unexpected arguments passed to init(). That's because all the config args are passed to init, and without the wildcard **kwargs you have to define them all (which we don't want, of course).
Also, you need a place to set module.sinelora_scaling and module.sinelora_frequency. This is the place to do it, from the kwargs, e.g.
module.sinelora_frequency = kwargs['sinelora_frequency']
For sinelora_scaling you need to check whether kwargs['sinelora_scaling'] is None.
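To make that concrete, here is a minimal sketch of how the variant's init could consume those kwargs; the sqrt(in_features) fallback follows the config docstring quoted further down, and the import paths and attribute names are assumptions rather than the final implementation:

```python
import math

from peft.tuners.lora.layer import Linear  # import path assumed for this sketch
from peft.tuners.lora.variants import LoraVariant  # import path assumed for this sketch


class SineLoraLinearVariant(LoraVariant):
    @staticmethod
    def init(module: Linear, adapter_name: str, **kwargs) -> None:
        # init() receives all update_layer arguments (per PR #2455), so the
        # sine-LoRA hyperparameters can be read from the kwargs here.
        module.sinelora_frequency = kwargs["sinelora_frequency"]
        sinelora_scaling = kwargs["sinelora_scaling"]
        if sinelora_scaling is None:
            # documented default: sqrt(in_features) when not specified
            sinelora_scaling = math.sqrt(module.in_features)
        module.sinelora_scaling = sinelora_scaling
```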
Great progress! Maybe I'm missing some changes, but I didn't see an update regarding the Embedding vs Linear layer variants. Could it be you missed a git push?
=3 (outdated)
Collecting oauthlib
Using cached oauthlib-3.2.2-py3-none-any.whl.metadata (7.5 kB)
Using cached oauthlib-3.2.2-py3-none-any.whl (151 kB)
Installing collected packages: oauthlib
Successfully installed oauthlib-3.2.2
Remove :)
Hey, for Linear and Embedding this looks like it is almost ready.
Please remove the following files that were probably wrongly checked in:
=3
posters/.$Untitled Diagram.drawio.dtmp
posters/Untitled Diagram.drawio
posters/Untitled Diagram.drawio.pdf
Please run make style to fix any style and code issues.
I think it would be worthwhile to add a test that compares plain LoRA with SineLoRA and makes sure that the output is different (e.g., for extreme scaling values so that we don't have to train for long). You could create tests/test_lora_variant_sinelora.py and add such a test there (see the sketch after this comment). I think it would be quite useful for making sure that the code works as intended :)
To test the Embedding layer implementation you could add
("Embedding + transformers Conv1D 2 LoRA + SineLoRA", "EmbConv1D", LoraConfig, {"target_modules": ["emb"], "use_sinelora": True}),
to the custom models testcase you already added.
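A rough sketch of what such a comparison test could look like; the use_sinelora flag on LoraConfig and the tiny model are assumptions for illustration, not the final API:

```python
# tests/test_lora_variant_sinelora.py -- sketch only, assuming a `use_sinelora`
# flag on LoraConfig as discussed in this PR.
import torch
from torch import nn

from peft import LoraConfig, get_peft_model


class TinyMLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(8, 8)
        self.lin1 = nn.Linear(8, 2)

    def forward(self, x):
        return self.lin1(self.lin0(x))


def test_sinelora_output_differs_from_plain_lora():
    x = torch.randn(4, 8)

    def build(use_sinelora):
        torch.manual_seed(0)  # same base weights and LoRA init for both models
        config = LoraConfig(
            target_modules=["lin0", "lin1"],
            init_lora_weights=False,  # non-zero lora_B so the adapter changes the output
            use_sinelora=use_sinelora,
        )
        return get_peft_model(TinyMLP(), config)

    out_lora = build(False)(x)
    out_sinelora = build(True)(x)
    assert not torch.allclose(out_lora, out_sinelora)
```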
sinelora_scaling (`float`):
    The scaling factor for the sine activation. If not specified, it will be set to the default value of sqrt(in_features).
If this value is optional, it should be marked as type Optional[float]
Change the type of sinelora_scaling here to Optional[float], as it is defined in code.
@githubnemo Hi, I am still a bit confused about how to add module.sinelora_frequency = kwargs['sinelora_frequency'] and sinelora_scaling. I get a KeyError when I put module.sinelora_frequency = kwargs['sinelora_frequency'] in class SineLoraLinearVariant(LoraVariant).
Hi @yipingji Sorry for being unclear. I tried highlighting the issue in #2457 (comment). The init() call only receives the parameters that update_layer receives, so you have to thread the new parameters (sinelora_frequency, sinelora_scaling) through update_layer and the layer __init__ methods the same way you did when adding the use_sinelora flag.
For example, the change for layer.py:
diff --git a/src/peft/tuners/lora/layer.py b/src/peft/tuners/lora/layer.py
index cb09608..efd8d6a 100644
--- a/src/peft/tuners/lora/layer.py
+++ b/src/peft/tuners/lora/layer.py
@@ -183,6 +183,8 @@ class LoraLayer(BaseTunerLayer):
use_rslora,
use_dora: bool = False,
use_sinelora: bool = False,
+ sinelora_frequency = 200.0,
+ sinelora_scaling = None,
lora_bias: bool = False,
):
# collect the kwargs
@@ -574,6 +576,8 @@ class Linear(nn.Module, LoraLayer):
use_dora: bool = False,
lora_bias: bool = False,
use_sinelora: bool = False,
+ sinelora_frequency: float = 200.0,
+ sinelora_scaling: Optional[float] = None,
**kwargs,
) -> None:
super().__init__()
@@ -591,6 +595,8 @@ class Linear(nn.Module, LoraLayer):
use_dora=use_dora,
lora_bias=lora_bias,
use_sinelora=use_sinelora,
+ sinelora_frequency=sinelora_frequency,
+ sinelora_scaling=sinelora_scaling,
)
self.is_target_conv_1d_layer = is_target_conv_1d_layer

I hope that helps! To test the Embedding layer implementation you could add
("Embedding + transformers Conv1D 2 LoRA + SineLoRA", "EmbConv1D", LoraConfig, {"target_modules": ["emb"], "use_sinelora": True}),
to the custom models testcase you already added.
Hi @githubnemo, apologies for the previous update; it is my first time opening a PR in such a large codebase. I was just wondering if my current one is correct.
> Hi @githubnemo, apologies for the previous update; it is my first time opening a PR in such a large codebase. I was just wondering if my current one is correct.
No worries, you're doing fine :) Thanks for your work!
Please make sure to test your changes regularly by running pytest:
# from the root of the repository
pytest -k SineLoRA tests
I left some comments regarding the forward implementations.
I think we need to implement merge and unmerge as well, but they should be basically equal to the way LoRA is doing it.
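If it helps, here is a rough sketch of what the sine-LoRA delta weight for Linear might look like, assuming the merged weight should reproduce a sine-activated low-rank product analogous to the forward pass discussed in this thread; the formula and helper name are assumptions, not the final implementation:

```python
import torch


def get_sinelora_delta_weight(module, adapter_name: str) -> torch.Tensor:
    # Sketch only: delta weight mirroring a forward pass of the form
    # sin(frequency * B @ A) / sinelora_scaling * lora_scaling.
    lora_A = module.lora_A[adapter_name].weight  # (r, in_features)
    lora_B = module.lora_B[adapter_name].weight  # (out_features, r)
    lora_scaling = module.scaling[adapter_name]
    delta = torch.sin(module.sinelora_frequency * (lora_B @ lora_A)) / module.sinelora_scaling
    return delta * lora_scaling  # merge: W += delta, unmerge: W -= delta
```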
src/peft/tuners/lora/variants.py (outdated)
sine_output = (
    module._embed(x)
    @ torch.sin(module.sinelora_frequency * lora_embedding_A.weight.T @ lora_embedding_B.weight.T)
    / module.sinelora_scaling
    * lora_scaling
)
I wonder, is this correct? The way I see it, _embed takes two parameters, x and weight, but we're only supplying x here. weight should probably be lora_embedding_A.T?
src/peft/tuners/lora/variants.py (outdated)
result = result + sine_output
We're missing a return here.
@yipingji gentle ping :)
Sorry for the late update. I am working on a conference deadline at the moment and will push an update soon ;)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Hi @githubnemo, how can I reopen the PR?
@yipingji I've reopened the PR and added the WIP tag so the stale bot will not bother us.
@githubnemo I just updated; please have a look ;)
Gentle ping @githubnemo ;)
Thanks for the update! I think we're getting close to finishing this.
Very important: make sure to run make style and pytest -k SineLoRA tests before pushing changes and asking for a review. Both the tests and the linter are failing right now. You can use these tools to iterate on your side before asking for a review; that way the review will be a lot faster :)
I suggested adding a test case for the sinelora_scaling parameter; if it helps, I encourage you to add more such test cases to remove the remaining bugs even faster!
from .variants import DoraConv1dVariant
elif use_sinelora:
    from .variants import SineLoraConv1dVariant
return None
- We're missing the return statements here, so resolve_lora_variant always returns None.
- There doesn't seem to exist a SineLoraConv1dVariant yet.

If you have an implementation for conv*d I'd suggest adding it. If you don't, maybe it is worthwhile to skip it for now and undo the changes in the Conv* layers.
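For reference, the same branch with the missing returns in place would look roughly like this; SineLoraConv1dVariant is still hypothetical and would have to be implemented first:

```python
# Sketch only: resolve_lora_variant branch with the return statements added.
if use_dora:
    from .variants import DoraConv1dVariant
    return DoraConv1dVariant()
elif use_sinelora:
    from .variants import SineLoraConv1dVariant  # does not exist yet
    return SineLoraConv1dVariant()
return None
```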
init_lora_weights,
use_rslora,
use_dora,
use_sinelora,
Make sure to change the update_layer call in ConvNd.__init__ as well (it currently misses all the sinelora arguments). But since we're skipping convolutions for now, I suggest removing it entirely.
merged_weight = orig_weight + delta_weight
if not torch.isfinite(merged_weight).all():
    raise ValueError(f"NaNs detected in merged weights for adapter {active_adapter}")
module._cache_store(f"{active_adapter}-delta_weight", delta_weight)
I don't think we need to cache the delta weights?
The same goes for merge_unsafe and the Embedding implementation.
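As a sketch, the same merge path without the cache call could simply be (assuming unmerge recomputes the delta weight instead of reading it from a cache):

```python
# Sketch only: NaN-guarded merge without caching the delta weight.
merged_weight = orig_weight + delta_weight
if not torch.isfinite(merged_weight).all():
    raise ValueError(f"NaNs detected in merged weights for adapter {active_adapter}")
return merged_weight  # assumes the surrounding merge helper returns the merged weight
```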
def init(module: Embedding, adapter_name: str, **kwargs) -> None:
    module.sinelora_frequency = kwargs["sinelora_frequency"]

    sinelora_scaling = kwargs["sinelora_scaling"]
Should probably be module.sinelora_scaling = kwargs["sinelora_scaling"].
Please add a test case for this scenario, e.g.:
("Embedding + transformers Conv1D LoRA + SineLoRA 2", "EmbConv1D", LoraConfig, {"target_modules": ["emb"], "use_sinelora": True, "sinelora_scaling": 100}),
class DoraEmbeddingLayer(DoraLinearLayer):
    def forward(self, x, *, lora_A, lora_B, scaling, base_layer, embed_fn):
        """
        For DoRA, calculate the extra output from LoRA with DoRA applied. This should be added on top of the base layer
        output.
        """
        lora_weight = (lora_A @ lora_B).T
        magnitude = self.weight
        weight = base_layer.weight
        weight_norm = self.get_weight_norm(weight, lora_weight.detach(), scaling)
        # see section 4.3 of DoRA (https://arxiv.org/abs/2402.09353)
        # "[...] we suggest treating ||V +∆V ||_c in
        # Eq. (5) as a constant, thereby detaching it from the gradient
        # graph. This means that while ||V + ∆V ||_c dynamically
        # reflects the updates of ∆V , it won’t receive any gradient
        # during backpropagation"
        weight_norm = weight_norm.detach()
        mag_norm_scale = magnitude / weight_norm
        result_dora = mag_norm_scale * (embed_fn(x, lora_A) @ lora_B) * scaling
        return mag_norm_scale, result_dora
Where does this come from?
Hi,
I just implemented sine-LoRA in variants.py and changed "resolve_lora_variant" a bit in layer.py.