[WIP] Update `LoraConfig` for KaSA implementation #2698

iambogeumkim · 2025-08-02T05:41:15Z

I was delayed in updating the code because I was focusing on company work, but now I'm planning to resume the project in earnest. If I have any questions about implementing the code, may I continue to ask you?

I apologize for opening a new pull request, as the previous one was closed 🥲 Thank you for your understanding.

BenjaminBossan

Thank you for resuming your work on KaSA.

Implementation-wise, we need to take a different approach. Right now, KaSA is just added to the normal LoRA code, but we only want to activate it if the user opts in. Therefore, it should be implemented in a separate class, something like KasaVariant, in peft/tuners/lora/variants.py. Please check how DoRA is implemented and use a similar approach, as I have detailed in my previous comment. If anything is unclear, feel free to ask.

github-actions · 2025-09-01T15:04:02Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

BenjaminBossan · 2025-09-01T15:12:53Z

gentle ping @NSBG

iambogeumkim · 2025-09-02T12:14:03Z

Thank you for your alert!

I spent some time looking over the KaSA paper and code to get ready for more serious work, but it does seem pretty difficult 🥲 My goal is to upload code that's ready for review before the end of September, so I'm going to try even harder.

Right now, I'm stuck at the 'Extend LoRA variant resolution' stage you mentioned. Honestly, this seems like the most important part, but it's hard for me to figure out where to start—specifically, which file and class I should work on first. Could you help me with this?

BenjaminBossan · 2025-09-02T14:04:09Z

That's great to see, thanks for picking this back up.

> Right now, I'm stuck at the 'Extend LoRA variant resolution' stage you mentioned. Honestly, this seems like the most important part, but it's hard for me to figure out where to start, specifically which file and class I should work on first. Could you help me with this?

You're already on the right track: you added KasaLinearVariant, which is the most important step. Some changes are definitely required there, since part of the code is only relevant for DoRA and can be removed for KaSA. But we can leave that as is for now.

Next, about resolving the variants. As a first step, let's revert the changes you made to lora/layer.py and start fresh. We don't need a self.use_kasa attribute; self.use_dora only exists for backwards compatibility, because we didn't have LoRA variants when we first implemented DoRA.

Then let's look at these lines in lora.Linear:

peft/src/peft/tuners/lora/layer.py

Lines 636 to 642 in a3197b1

    
    def resolve_lora_variant(self, *, use_dora: bool, **kwargs) -> Optional[LoraVariant]:
        if not use_dora:
            return None

        from .variants import DoraLinearVariant

        return DoraLinearVariant()

Here we need to extend the functionality to add KaSA. The updated method could be something like:

    def resolve_lora_variant(self, *, use_dora: bool, use_kasa: bool, **kwargs) -> Optional[LoraVariant]:
        if use_dora and use_kasa:
            raise ValueError("Cannot use DoRA and KaSA at the same time, please choose only one.")

        variant = None
        if use_dora:
            from .variants import DoraLinearVariant

            variant = DoraLinearVariant()
        elif use_kasa:
            ...

        return variant

Does that make sense? Similarly, we'd have to update the resolve_lora_variant methods of other LoRA layers, depending on whether they work with KaSA or not (I'm not sure if KaSA works with Conv2d etc.).

I would suggest that you work on this as a next step, then we'll see what else needs to be done.
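
One way the elided `elif use_kasa:` branch could be filled in is sketched below. This is only an illustration: the variant classes are empty stand-ins so the snippet runs without PEFT, `KasaLinearVariant` is the class name used in this PR, and the free function `resolve_linear_variant` is a hypothetical stand-in for `lora.Linear.resolve_lora_variant`.

```python
from typing import Optional


class LoraVariant:
    """Stand-in for PEFT's LoraVariant base class (no behavior needed here)."""


class DoraLinearVariant(LoraVariant):
    """Stand-in for the real DoRA variant."""


class KasaLinearVariant(LoraVariant):
    """Stand-in for the KaSA variant added in this PR."""


def resolve_linear_variant(*, use_dora: bool, use_kasa: bool, **kwargs) -> Optional[LoraVariant]:
    # Reject the ambiguous combination before doing anything else.
    if use_dora and use_kasa:
        raise ValueError("Cannot use DoRA and KaSA at the same time, please choose only one.")

    if use_dora:
        return DoraLinearVariant()
    if use_kasa:
        return KasaLinearVariant()
    # Neither flag set: plain LoRA, so no variant is needed.
    return None
```

In the real method the imports from `.variants` would stay inside the branches, as in the DoRA-only version quoted above.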

iambogeumkim · 2025-09-04T14:57:54Z

wow I really appreciate your sincere feedback. I'll read your advice carefully and then move forward 🤗

iambogeumkim · 2025-09-08T18:39:23Z

@BenjaminBossan I modified the code in the files below based on what you explained. Please give me feedback if there are parts that still need fixing, and then we can discuss the next steps.

1. variants.py

Completed updates to the methods of the KasaLinearVariant class

2. layer.py

In the LoraLayer class, added self.use_kasa[adapter_name] = use_kasa inside the update_layer method
In the Linear class, added KaSA handling logic inside the get_delta_weight method

BenjaminBossan

Thanks for integrating my feedback. I gave this another review and noted the next few changes that are necessary. Please check my comments.

Apart from this, the branch is now encountering merge conflicts. Could you please bring your fork up-to-date with the remote and then merge with, or rebase on, the latest main branch from PEFT? If you have questions on how to resolve the merge conflicts, don't hesitate to ask.

Furthermore, please always run make style on your changes before pushing to make our linter happy.

More of a note for myself: Since KaSA updates the base weights of the model, we will have to take extra care to ensure that it works correctly when saving and loading the adapter.

BenjaminBossan · 2025-09-09T13:44:26Z

src/peft/tuners/lora/layer.py


        """
-        return None
+        if use_dora and use_kasa:


Let's undo the changes in this method body and return None. Since this KaSA layer is implemented for Linear only, the logic should be added to lora.Linear.resolve_lora_variant instead.

Also, we should update the resolve_lora_variant methods of the other layer types like lora.Embedding.resolve_lora_variant to accept the use_kasa argument but raise an error if it's True. Otherwise, users may add it to non-supported layers and not notice that it doesn't actually do anything there.
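
A minimal sketch of that guard for an unsupported layer type might look as follows. The function name `resolve_embedding_variant`, the error text, and the string returned for the DoRA case are placeholders chosen so the snippet runs without PEFT; they are not taken from the PR.

```python
from typing import Optional


def resolve_embedding_variant(*, use_dora: bool, use_kasa: bool = False, **kwargs) -> Optional[object]:
    # KaSA is only implemented for Linear in this PR, so fail loudly instead of
    # silently ignoring the flag on an unsupported layer type.
    if use_kasa:
        raise ValueError("KaSA is not supported for Embedding layers; it is only implemented for Linear.")
    if not use_dora:
        return None
    # The real method would return the DoRA embedding variant here; a plain
    # string marker keeps this sketch free of PEFT imports.
    return "dora-embedding-variant"
```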

BenjaminBossan · 2025-09-09T13:45:08Z

src/peft/tuners/lora/layer.py

+        ############ kasa ############# 
+        self.lora_diag[adapter_name] = nn.Parameter(torch.randn(r), requires_grad=True)
+
+        weight = self.get_base_layer().weight
+        dtype = weight.dtype
+        svd_rank = self.in_features - r
+        weight = weight.to(torch.float32)
+        U, S, Vh = torch.linalg.svd(weight.data, full_matrices=False)
+        U_principle, S_principle, Vh_principle = U[:, :svd_rank], S[:svd_rank], Vh[:svd_rank, :]
+        self.get_base_layer().weight.data = (U_principle @ torch.diag(S_principle) @ Vh_principle).to(dtype)

+        #########################


All of this can be removed, since it's part of KasaLinearVariant.init, right?
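
For reference, the truncation performed by the snippet above can be written out as follows. This only restates the quoted code, with $k$ standing for `svd_rank`; the code computes it in float32 and casts the result back to the original dtype.

```latex
W = U \,\operatorname{diag}(S)\, V^{H}, \qquad k = \texttt{in\_features} - r,
\qquad
W \leftarrow U_{[:,\,:k]} \,\operatorname{diag}\!\left(S_{[:k]}\right) V^{H}_{[:k,\,:]}
```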

BenjaminBossan · 2025-09-09T13:47:30Z

src/peft/tuners/lora/variants.py

+        # initialize lora_diag
+        module.lora_diag[adapter_name] = nn.Parameter(torch.randn(module.r[adapter_name]), requires_grad=True)
+
+        # SVD


Let's add a reference here, so that we know the origin:
# see https://github.com/juyongjiang/KaSA/blob/f85e88c22d0fa4cb8ab2923d7c2bf1bbec152da3/peft/src/peft/tuners/lora/layer.py#L132

    # initialize lora_diag
    module.lora_diag[adapter_name] = nn.Parameter(torch.randn(module.r[adapter_name]), requires_grad=True)
    # see https://github.com/juyongjiang/KaSA/blob/f85e88c22d0fa4cb8ab2923d7c2bf1bbec152da3/peft/src/peft/tuners/lora/layer.py#L132
    # SVD

I put it here. How does it look?

BenjaminBossan · 2025-09-09T13:51:30Z

src/peft/tuners/lora/variants.py

+    @staticmethod
+    def merge_safe(module: Linear, active_adapter: str, orig_weight: torch.Tensor) -> torch.Tensor:
+        delta_weight = module.get_delta_weight(active_adapter)
+        return orig_weight + delta_weight
+
+    @staticmethod
+    def merge_unsafe(module: Linear, active_adapter: str, orig_weight: torch.Tensor) -> None:
+        delta_weight = module.get_delta_weight(active_adapter)
+        orig_weight.data += delta_weight
+
+    @staticmethod
+    def unmerge(module: Linear, active_adapter: str, orig_weight: torch.Tensor) -> torch.Tensor:
+        delta_weight = module.get_delta_weight(active_adapter)
+        return orig_weight - delta_weight


KaSA should have an influence on the merged weights, should it not?

Although this PR is closed, it seems I've incorporated everything else except for this comment (of course, you'd have to look at the code). Could you explain this question in more detail?
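
One possible reading of the question: the three methods above reuse `module.get_delta_weight`, which for plain LoRA is `scaling * B @ A`. The KaSA forward pass (quoted further down in this review) places a diagonal matrix between `A` and `B`, so a merged weight that leaves it out would not reproduce the unmerged output. The toy example below illustrates this with plain Python lists and made-up numbers; `matmul`, `scale`, `lora_delta`, and `kasa_delta` are illustrative names, not PEFT functions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def scale(m, s):
    """Multiply every entry of matrix m by the scalar s."""
    return [[s * v for v in row] for row in m]


# Toy shapes: in_features=2, out_features=2, r=2 (all values are arbitrary).
A = [[1.0, 2.0], [3.0, 4.0]]     # plays the role of lora_A.weight, shape (r, in)
B = [[1.0, 0.0], [1.0, 1.0]]     # plays the role of lora_B.weight, shape (out, r)
diag = [[2.0, 0.0], [0.0, 0.5]]  # diagonal matrix built from lora_diag
scaling = 0.5

lora_delta = scale(matmul(B, A), scaling)                # delta without the diagonal
kasa_delta = scale(matmul(matmul(B, diag), A), scaling)  # delta including the diagonal

x = [[1.0], [1.0]]  # one input as a column vector
# Unmerged KaSA path: B(diag(A x)) * scaling
forward = scale(matmul(B, matmul(diag, matmul(A, x))), scaling)
```

Here `matmul(kasa_delta, x)` matches `forward`, while `matmul(lora_delta, x)` does not, which is why the merge and unmerge methods would need a delta weight that includes the diagonal term.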

BenjaminBossan · 2025-09-09T13:53:07Z

src/peft/tuners/lora/variants.py

+            x = dropout(x)
+
+        # KaSA calculation
+        lora_output = lora_B(torch.einsum('ijk,kl->ijl', lora_A(x), diag)) * scaling


Again, let's add a reference:
# see https://github.com/juyongjiang/KaSA/blob/f85e88c22d0fa4cb8ab2923d7c2bf1bbec152da3/peft/src/peft/tuners/lora/layer.py#L602C21-L602C110

    # KaSA calculation
    # see https://github.com/juyongjiang/KaSA/blob/f85e88c22d0fa4cb8ab2923d7c2bf1bbec152da3/peft/src/peft/tuners/lora/layer.py#L602C21-L602C110
    lora_output = lora_B(torch.einsum('ijk,kl->ijl', lora_A(x), diag)) * scaling
    return result + lora_output

I inserted this near where the actual calculation logic begins, rather than just in an empty space. I think this is a bit better.
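
In index notation, the einsum in this line contracts the rank dimension of `lora_A(x)` with `diag`. Assuming `diag` is the square matrix $D = \operatorname{diag}(\sigma)$ built from `lora_diag` (its construction is not shown in this excerpt), the contraction reduces to scaling each rank component by its own $\sigma_l$:

```latex
h = \texttt{lora\_A}(x), \qquad
\big(h D\big)_{ijl} = \sum_{k} h_{ijk}\, D_{kl} = h_{ijl}\,\sigma_{l},
\qquad
\texttt{lora\_output} = \texttt{scaling} \cdot \texttt{lora\_B}(h D)
```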

iambogeumkim · 2025-09-16T17:02:08Z

@BenjaminBossan oh I didn't mean to close the branch, but it seems to have closed while I was merging with the main branch. I guess I'll have to open a new PR, right? 😰

+) when I tried to sync with the main branch, I ended up discarding all my commits, so did that cause it to close?

BenjaminBossan · 2025-09-17T08:55:18Z

> oh I didn't mean to close the branch, but it seems to have closed while I was merging with the main branch. I guess I'll have to open a new PR, right? 😰
>
> +) when I tried to sync with the main branch, I ended up discarding all my commits, so did that cause it to close?

I don't know what happened, but I could re-open the PR and there are some changes visible. Can you double check that everything looks as expected? If for some reason it's not what you expect, you can create a new PR and push your local branch.

iambogeumkim · 2025-09-17T09:08:15Z

I usually handle merges in the terminal, and I suspect the pull request was closed because I accidentally wiped the commit history while using the 'Sync fork' feature on GitHub. I'll be more careful in the future. Thanks for reopening it.

I'll review the changes and open a new PR if needed. Sorry to keep bothering you with this.

BenjaminBossan · 2025-09-18T09:35:02Z

> I'll review the changes and open a new PR if needed. Sorry to keep bothering you with this.

No worries. If the diff on this PR looks good, let me know and I'll do a review. Only open a new PR if for some reason, the code here does not correspond to what it should be.

github-actions · 2025-11-29T15:03:59Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

iambogeumkim · 2025-11-29T15:05:16Z

Check

iambogeumkim · 2025-12-06T08:07:36Z

@BenjaminBossan

I've addressed the points you mentioned, applied make style, and resolved the conflicts. Let me know if anything else needs to be updated.

Regarding the SVD value caching, I gave it some thought and realized I was stuck on the idea that 'caching is always efficient.' Since the base weights are already updated in the first adapter even when using multiple KaSA adapters, I realized we can simply reuse those values subsequently. So, I modified the code to skip the calculation as you suggested.
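
The "skip the calculation" idea can be sketched roughly like this. How the PR actually tracks that the truncation already happened is not shown in this thread, so the marker attribute `_kasa_svd_applied`, the helper `truncate_once`, and the stub class are all hypothetical.

```python
class BaseLayerStub:
    """Stand-in for a base linear layer; `weight` is just a list here."""

    def __init__(self, weight):
        self.weight = weight


def truncate_once(layer, truncate):
    """Apply `truncate` to the base weight only for the first KaSA adapter."""
    # `_kasa_svd_applied` is a hypothetical marker attribute, not a PEFT name.
    if getattr(layer, "_kasa_svd_applied", False):
        return False  # an earlier KaSA adapter already truncated the base weight
    layer.weight = truncate(layer.weight)
    layer._kasa_svd_applied = True
    return True
```

Adding a second KaSA adapter to the same layer would then reuse the already-truncated weight instead of running the SVD again.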

BenjaminBossan

Thanks for the new updates. We just merged another LoRA variant, which created merge conflicts with your PR, but they should be easy to resolve. Could you please take care of that? Thanks.

BenjaminBossan · 2025-12-08T11:13:40Z

tests/test_initialization.py

+        config1 = LoraConfig(
+            r=8,
+            target_modules=["linear"],
+            init_lora_weights=True,


You can remove this line, as it's irrelevant.

BenjaminBossan · 2025-12-08T11:13:45Z

tests/test_initialization.py

+        config2 = LoraConfig(
+            r=16,
+            target_modules=["linear"],
+            init_lora_weights=True,


You can remove this line, as it's irrelevant.

    # src/peft/tuners/lora/model.py
    if len(self.peft_config) > 1:
        kasa_count = sum(1 for cfg in self.peft_config.values() if cfg.use_kasa)
        non_kasa_count = len(self.peft_config) - kasa_count
        if kasa_count > 0 and non_kasa_count > 0:
            raise ValueError("KaSA adapters cannot be mixed with other adapter types.")

I understood this to mean that since it's handled in this section, it's irrelevant elsewhere. Is my understanding correct?

Oh, this was a misunderstanding. I meant that the single line I commented on (init_lora_weights=True,) can be removed, the test as a whole is good to keep :) Please restore these tests.

ah okay haha

I changed the tests back :) !

iambogeumkim · 2025-12-08T13:54:41Z

I applied what you mentioned and resolved the conflicts. Please take a look!

BenjaminBossan

PR is close to the finish line. I found a small issue, please check. Also, once ready to commit, please call make style.

BenjaminBossan · 2025-12-10T10:16:49Z

src/peft/tuners/lora/model.py

+        if (len(self.peft_config) > 1) and (config.bias != "none"):
+            raise ValueError(
+                f"{self.__class__.__name__} supports only 1 adapter with bias. When using multiple adapters, "
+                "set bias to 'none' for all adapters."
+            )


Let's remove this and call super()._check_new_adapter_config(config) instead.
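
The delegation pattern being asked for could look roughly like the sketch below. Both classes are simplified stand-ins: the parent is assumed to own the multi-adapter bias check (mirroring the lines removed above), and the configs are plain `SimpleNamespace` objects rather than real `LoraConfig` instances.

```python
from types import SimpleNamespace


class BaseTunerStub:
    """Stand-in for the parent class, assumed to own the shared bias check."""

    def __init__(self):
        self.peft_config = {}

    def _check_new_adapter_config(self, config):
        if len(self.peft_config) > 1 and config.bias != "none":
            raise ValueError(
                f"{self.__class__.__name__} supports only 1 adapter with bias. "
                "When using multiple adapters, set bias to 'none' for all adapters."
            )


class LoraModelStub(BaseTunerStub):
    def _check_new_adapter_config(self, config):
        # Reuse the shared check instead of duplicating it in the subclass.
        super()._check_new_adapter_config(config)
        # LoRA-specific checks (for example the KaSA mixing check) would follow here.


model = LoraModelStub()
model.peft_config = {
    "first": SimpleNamespace(bias="none"),
    "second": SimpleNamespace(bias="all"),
}
```

Calling `model._check_new_adapter_config(model.peft_config["second"])` raises the parent's `ValueError`, so the subclass no longer needs its own copy of the check.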

iambogeumkim · 2025-12-11T15:28:38Z

Is this the final step? Please let me know if there's anything else needed.

HuggingFaceDocBuilderDev · 2025-12-11T16:27:48Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan · 2025-12-11T16:42:39Z

@iambogeumkim Could you please run make style?

iambogeumkim · 2025-12-12T02:21:49Z

I did run make style, but it looks like one file was missed. I re-ran it and pushed the commit.

Also, thank you for your patience with all my questions, even the trivial ones. I know I might have been a bit of a bother 😅 Since this was my first code contribution, I learned so much thanks to your guidance. Wishing you a warm and happy holiday season!

BenjaminBossan · 2025-12-12T12:51:43Z

> I did run make style, but it looks like one file was missed. I re-ran it and pushed the commit.

Something doesn't seem to work right, as the formatter is still complaining. These changes should resolve it:

modified   src/peft/tuners/lora/config.py
@@ -764,8 +764,9 @@ class LoraConfig(PeftConfig):
                 "singular value decomposition (SVD) with knowledge-aware singular values to dynamically "
                 "activate parametric knowledge according to its relevance to downstream tasks."
             )
-        }
+        },
     )
+
     def to_dict(self):
         """
         Returns the configuration for your adapter model as a dictionary. Removes runtime configurations.
modified   tests/test_custom_models.py
@@ -1265,10 +1265,12 @@ def _skip_tests_with_multiple_adapters_with_target_parameters(config_cls, config
     if (config_cls == LoraConfig) and config_kwargs.get("target_parameters"):
         pytest.skip("LoRA with multiple adapters with target_parameters is not supported")
 
+
 def _skip_test_disable_adapters(config_cls, config_kwargs):
     if (config_cls == LoraConfig) and config_kwargs.get("use_kasa"):
         pytest.skip("KaSA modifies base weights, so adapter disable test is skipped")
 
+
 class MLP(nn.Module):
     def __init__(self, bias=True):
         super().__init__()

> Also, thank you for your patience with all my questions, even the trivial ones. I know I might have been a bit of a bother 😅 Since this was my first code contribution, I learned so much thanks to your guidance. Wishing you a warm and happy holiday season!

Don't worry, there's a first time for everyone. Happy to hear that you learned a lot.

iambogeumkim · 2025-12-13T06:23:49Z

I double-checked if there were any unpushed files related to KaSA. Aside from those two files, everything seems to be pushed, so it should be ready to be merged now.

BenjaminBossan · 2025-12-15T12:12:47Z

@iambogeumkim There are a bunch of failing tests because Embedding.update_layer and _ConvNd.update_layer need to be passed the use_kasa argument. For this, you need to update their __init__ methods. Could you please update those? Once you finish, you can run pytest tests/ -k kasa to check locally if the tests now pass.
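
The kind of change described here might look like the following sketch. `EmbeddingStub` is a hypothetical, heavily simplified stand-in for `lora.Embedding`; the real `__init__` and `update_layer` take many more arguments, and only the forwarding of `use_kasa` is the point.

```python
class EmbeddingStub:
    """Hypothetical, heavily simplified stand-in for lora.Embedding."""

    def __init__(self, adapter_name: str, r: int, use_dora: bool = False, use_kasa: bool = False):
        self.use_dora = {}
        self.use_kasa = {}
        # Forward the flag explicitly; leaving it out is what makes
        # update_layer fail with a missing-argument TypeError.
        self.update_layer(adapter_name, r, use_dora=use_dora, use_kasa=use_kasa)

    def update_layer(self, adapter_name: str, r: int, *, use_dora: bool, use_kasa: bool) -> None:
        # Record the per-adapter flags, as the Linear layer does in this PR.
        self.use_dora[adapter_name] = use_dora
        self.use_kasa[adapter_name] = use_kasa
```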

iambogeumkim · 2026-01-08T14:16:41Z

I’ve updated layer.py as you suggested, and I’ve confirmed that all local tests are passing. I’ve also run the make style command. I hope everything looks good now, but please let me know if there’s anything else I should address.

BenjaminBossan · 2026-01-09T12:46:45Z

Thanks for the latest changes. There are still some errors, this time caused by X-LoRA. I checked, and the issue is that X-LoRA models can have PEFT configs that contain both normal LoRA and X-LoRA configs. Since X-LoRA configs don't have a .use_kasa attribute, this check fails:

kasa_count = sum(1 for cfg in self.peft_config.values() if cfg.use_kasa)

It's a bit of an edge case, but let's add if isinstance(cfg, LoraConfig) and it should work.
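
A guarded version of the count might look like the sketch below. The two config classes are stand-ins so the snippet runs without PEFT; in the real code the `isinstance` check would be against `LoraConfig`, and `count_kasa` is a hypothetical helper name.

```python
class LoraConfigStub:
    """Stand-in for LoraConfig; only the use_kasa flag matters here."""

    def __init__(self, use_kasa: bool = False):
        self.use_kasa = use_kasa


class XLoraConfigStub:
    """Stand-in for an X-LoRA config, which has no use_kasa attribute."""


def count_kasa(peft_config: dict) -> int:
    # The isinstance test short-circuits, so cfg.use_kasa is never read on
    # configs that do not define it.
    return sum(1 for cfg in peft_config.values() if isinstance(cfg, LoraConfigStub) and cfg.use_kasa)


configs = {
    "kasa": LoraConfigStub(use_kasa=True),
    "plain": LoraConfigStub(),
    "xlora": XLoraConfigStub(),
}
```

Without the guard, the generator would raise `AttributeError` as soon as it reached the X-LoRA entry.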

github-actions · 2026-02-02T15:15:46Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

githubnemo · 2026-02-02T19:42:47Z

not stale

iambogeumkim added 2 commits May 18, 2025 01:37

Add KaSA implementation to layer.py

387ad2a

Add use_kasa argument to LoraConfig

ae00a34

iambogeumkim marked this pull request as draft August 2, 2025 05:45

BenjaminBossan requested changes Aug 5, 2025

iambogeumkim added 3 commits September 2, 2025 20:54

Add use_kasa parameter to Linear class

6588f1a

Add KasaLinearVariant class (just copy of DoraLinearVariant class) in…

a824ac9

… variants.py

Add kasa description

05e4e07

iambogeumkim added 5 commits September 5, 2025 02:37

Remove unnecessary self.kasa

d1e7e43

[WIP] update KasaLinearVariant class with SVD implementation

9e53b9f

Modify merge/unmerge method in KasaLinearVariant class

aa37111

update KasaLinearVariant class with SVD implementation

9cfe65c

fix type in init method

f9d7cc7

BenjaminBossan requested changes Sep 9, 2025

iambogeumkim added 2 commits September 16, 2025 22:19

delete unnecessary part in layer.py

84813a3

add original reference in layer.py

39abcad

iambogeumkim closed this Sep 16, 2025

iambogeumkim force-pushed the peft-kasa branch from 39abcad to 20a9829 Compare September 16, 2025 16:53

merge main to peft-kasa

06f76d8

BenjaminBossan reopened this Sep 17, 2025

re-add KaSA implementation to variants.py

0043ae3

iambogeumkim and others added 4 commits December 6, 2025 14:34

Implement tests to ensure KaSA adapters cannot be mixed with other ad…

39cf1f9

…apter types, enhancing compatibility checks in the initialization process.

Refactor KaSA adapter compatibility check to simplify logic and impro…

283ff0a

…ve readability in LoraModel class.

Refactor KasaLinearVariant class to improve code readability and ensu…

bfe8996

…re SVD is applied only once, while also cleaning up whitespace in multiple locations.

Merge branch 'main' into peft-kasa

cbc5b0c

BenjaminBossan reviewed Dec 8, 2025

iambogeumkim and others added 2 commits December 8, 2025 22:36

Merge branch 'main' into peft-kasa

14fa9d7

Remove tests for mixing KaSA adapters with other adapter types in Tes…

b6aae1e

…tLoraInitialization, simplifying the test suite and focusing on essential compatibility checks.

Add tests to validate that KaSA adapters cannot be mixed with other a…

951b6b2

…dapter types in TestLoraInitialization, ensuring compatibility checks are enforced in both configurations.

BenjaminBossan requested changes Dec 10, 2025

Refactor LoraModel's adapter configuration check to utilize the super…

f20b983

…class method, improving code clarity and ensuring consistent behavior across adapter types.

Refactor resolve_lora_variant method for improved readability by adju…

a7b8ba6

…sting formatting and line breaks in LoraLayer class.

BenjaminBossan added the wait-transformers-v5 Don't merge before transformers v5 release. label Dec 12, 2025

Fix formatting in LoraConfig and update test cases to skip incompatib…

a9763ff

…le configurations with multiple adapters, enhancing clarity and maintainability.

BenjaminBossan removed the wait-transformers-v5 Don't merge before transformers v5 release. label Dec 16, 2025

Add 'use_kasa' parameter to Embedding and _ConvNd classes for enhance…

3498758

…d adapter configuration support.
