@YinBo0927

We are integrating FeRA (Frequency-Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning), proposed in this paper (https://www.arxiv.org/abs/2511.17979). The method uses a frequency-energy indicator and a soft router to adaptively blend multiple LoRA experts based on the latent's frequency components, for effective diffusion model adaptation. A minimal sketch of the routing idea follows the checklist below.

  • Added `fera` under `tuners`
  • Updated the mapping and imports with the FeRA changes
  • Added fine-tuning examples on Stable Diffusion (text-to-image generation)
  • All quality checks pass.
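
To give a quick idea of the mechanism, here is a minimal, illustrative sketch of the routing. The class and helper names are simplified for this description and do not match the PR's code one-to-one:

```python
import torch
import torch.nn as nn


class FrequencyEnergyRouter(nn.Module):
    """Illustrative soft router over LoRA experts (simplified, not the PR's exact code)."""

    def __init__(self, num_experts: int, temperature: float = 0.7):
        super().__init__()
        self.temperature = temperature
        self.mlp = nn.Sequential(nn.Linear(1, 16), nn.SiLU(), nn.Linear(16, num_experts))

    @staticmethod
    def frequency_energy(z_t: torch.Tensor) -> torch.Tensor:
        # Fraction of the latent's spectral energy outside the low-frequency band.
        spec = torch.fft.fftshift(torch.fft.fft2(z_t.float()), dim=(-2, -1)).abs() ** 2
        _, _, H, W = spec.shape
        h, w = H // 4, W // 4
        low = spec[..., H // 2 - h : H // 2 + h, W // 2 - w : W // 2 + w].sum(dim=(-3, -2, -1))
        total = spec.sum(dim=(-3, -2, -1)) + 1e-8
        return (1.0 - low / total).unsqueeze(-1)  # (B, 1)

    def forward(self, z_t: torch.Tensor) -> torch.Tensor:
        e_t = self.frequency_energy(z_t)      # frequency-energy indicator
        logits = self.mlp(e_t) / self.temperature
        return torch.softmax(logits, dim=-1)  # (B, num_experts) soft routing weights


router = FrequencyEnergyRouter(num_experts=4)
weights = router(torch.randn(2, 4, 64, 64))   # e.g. SD latents of shape (B, 4, 64, 64)
```

Each FeRA layer then blends its LoRA experts with these weights instead of applying a single fixed LoRA update.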

@BenjaminBossan (Member)

For more context: #2930

@BenjaminBossan (Member)

Thanks for the PR @YinBo0927. Please make sure to only check in files that really belong to the PR.

@YinBo0927 (Author)

> Thanks for the PR @YinBo0927. Please make sure to only check in files that really belong to the PR.

Thanks for the mention! I have deleted the logs file.

@YinBo0927 marked this pull request as ready for review on December 11, 2025, 15:48
@BenjaminBossan (Member) left a comment

Thanks for this PR to add FeRA to PEFT. I did a first review of the code, not super in-depth yet, but let's get the basics working first.

Apart from these comments, could you please:

  1. Explain how the frequency-energy consistency loss comes into play. From my understanding, this is an important part, but I don't see it in the code right now.
  2. For each file, ensure that the PEFT copyright note is added to the start of the file using the current year.

# FeRA: Frequency–Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning
[![arXiv](https://img.shields.io/badge/arXiv-2511.17979-b31b1b.svg)](https://arxiv.org/abs/2511.17979)

## A light Example
@BenjaminBossan (Member)

Let's add a sentence or two explaining what is happening here.

@@ -0,0 +1,35 @@
# FeRA: Frequency–Energy Constrained Routing for Effective Diffusion Adaptation Fine-Tuning
[![arXiv](https://img.shields.io/badge/arXiv-2511.17979-b31b1b.svg)](https://arxiv.org/abs/2511.17979)
@BenjaminBossan (Member)

I don't think the svg is needed here.

@@ -0,0 +1,24 @@
import torch
@BenjaminBossan (Member)

Let's extend the example to include proper training of the model.
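
For example, something roughly like the following. This is only a sketch: `FeRAConfig`, its field names after the renames suggested below, and the `base_model.prepare_forward` call are assumptions based on this PR and may need adjusting:

```python
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler, UNet2DConditionModel
from peft import FeRAConfig, get_peft_model  # assumes FeRAConfig is exported via the mapping changes

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")
scheduler = DDPMScheduler.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="scheduler")

config = FeRAConfig(r=4, alpha=8, target_modules=["to_q", "to_k", "to_v", "to_out.0"])
unet = get_peft_model(unet, config)
optimizer = torch.optim.AdamW(unet.parameters(), lr=1e-4)

# Dummy batch of precomputed VAE latents and text embeddings; a real example would use a dataset.
dataloader = [(torch.randn(2, 4, 64, 64), torch.randn(2, 77, 768))]

for latents, text_embeds in dataloader:
    noise = torch.randn_like(latents)
    timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy_latents = scheduler.add_noise(latents, noise, timesteps)

    unet.base_model.prepare_forward(noisy_latents)  # let the router see the current latent
    pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeds).sample

    loss = F.mse_loss(pred, noise)  # standard epsilon-prediction objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```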

default=4,
metadata={"help": "The rank of the LoRA experts."},
)
lora_alpha: float = field(
@BenjaminBossan (Member)

Let's rename this to just `alpha`.

This is the configuration class to store the configuration of a [`FeRAModel`].
"""

rank: int = field(
@BenjaminBossan (Member)

Let's rename this to `r`.

f"Target module {target} is not supported. Currently, only `torch.nn.Linear` is supported."
)

def prepare_forward(self, z_t: torch.Tensor):
@BenjaminBossan (Member)

It is unclear to me when I need to call this. Once before inference? Also before training? Let's properly document this method.
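
Something along the lines of the docstring below would already help; the wording here is just my guess at the intended semantics, please correct it:

```python
def prepare_forward(self, z_t: torch.Tensor) -> None:
    """Compute the routing weights for the current noisy latent ``z_t``.

    (Guess at the semantics, to be confirmed:) call this once per denoising step,
    right before the model's forward pass, both during training and during each
    step of the sampling loop at inference time, so that the FeRA layers route
    based on the latent they are about to process.
    """
```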

default=0.7,
metadata={"help": "Temperature for the Softmax in the router."},
)
fecl_weight: float = field(
@BenjaminBossan (Member)

I don't see this argument being used anywhere.


routing_weights = self.router(e_t) # (B, num_experts)

count = 0
@BenjaminBossan (Member)

`count` is not being used.

self.num_experts = {}
self.kwargs = kwargs

self.current_routing_weights = None
@BenjaminBossan (Member)

As is, `current_routing_weights` is directly set on the module. I wonder if it should be added as a (non-persistent) buffer instead. Otherwise, calls like `module.to(0)` won't affect `current_routing_weights`.
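
For illustration, a minimal sketch of the buffer variant (the layer and method names here are made up, not the PR's):

```python
import torch
import torch.nn as nn


class FeRALayerSketch(nn.Module):
    """Minimal sketch of the buffer approach (not the PR's actual layer)."""

    def __init__(self) -> None:
        super().__init__()
        # A non-persistent buffer stays out of the state_dict but is moved by .to()/.cuda().
        self.register_buffer("current_routing_weights", None, persistent=False)

    def set_routing(self, weights: torch.Tensor) -> None:
        # Assignment goes through nn.Module.__setattr__, which updates the registered buffer.
        self.current_routing_weights = weights


layer = FeRALayerSketch()
layer.set_routing(torch.rand(2, 4))
layer.to("cpu")  # with a GPU, layer.to(0) would now also move the routing weights
```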

@BenjaminBossan (Member)

There is a better way to test these things in PEFT. For this, add FeRA to the test matrix in our existing tests. Start here:

TEST_CASES = [

Add some FeRA settings to the end of this list; check the other PEFT methods in that list to see how it's done. This test file can then be removed, as those existing tests already cover what's needed.

Once you've done this, you can run the tests with `pytest tests/test_custom_models.py -k fera`. You will probably see a lot of failures caused by FeRA not supporting merging; for those tests, we will have to add skips, but we can discuss that later once the tests are there.
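
The new entries would look roughly like the existing ones, e.g. (the config kwargs here are placeholders; use whatever `FeRAConfig` actually accepts after the renames above):

```python
# appended to TEST_CASES in tests/test_custom_models.py
("Vanilla MLP 1 FeRA", "MLP", FeRAConfig, {"target_modules": "lin0"}),
("Vanilla MLP 2 FeRA", "MLP", FeRAConfig, {"target_modules": ["lin0", "lin1"]}),
```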

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan (Member)

ping @YinBo0927
