Integrate MonteCLoRA (TMLR 2025 accepted) into PEFT #2943
base: main
Conversation
BenjaminBossan left a comment
Thanks for this PR to add MonteCLoRA to PEFT.
I went through the paper but must confess that I don't understand all of the math in there. I will trust you on the correctness of the implementation when it comes to this.
Regarding the PEFT integration, I think we should take a bit of a different approach. Right now, you inherit from `LoraLayer`, `LoraModel`, etc. The reason is that you can reuse most of the existing code. However, this approach can be problematic, as other code may do checks like `if isinstance(layer, LoraLayer)`, assuming it gets a normal LoRA layer.
Checking the actual changes you added, I thus have a different proposal: let's implement MonteCLoRA as a "LoRA variant". This way, users can use a normal LoRA model but with MonteCLoRA under the hood. Check the existing LoRA variants to get an idea of what is required for this.
From a user's perspective, it would look something like this:
```python
monte_clora_config = MonteCLoraConfig(...)
lora_config = LoraConfig(..., use_monteclora=monte_clora_config)  # by default use_monteclora = False
model = get_peft_model(model, lora_config)  # <= uses the normal LoraModel
```
Please check if that makes sense to you.
Also, while working on this, please ensure that the copyright notice is present in all new files and with the current year. Once you finish, don't forget to call `make style`.
There are some other areas that need work, but let's get these out of the way first and then we can discuss the rest.
```python
# ----------------------------------------------------------------------------
# We subclass Trainer to inject the Variational Loss (KLD + Entropy)
# into the training loop.
class MonteCLoRATrainer(Trainer):
```
It should be fine to add this to the PEFT package proper, so that users don't need to copy & paste this code. A good place would be the `helpers.py` file. Also, let's define a separate mixin class that provides `compute_loss`, plus a `MonteCLoRATrainer` that inherits from `Trainer` and this mixin. The reason for that is so that users who use a different `Trainer` class, say `SFTTrainer`, can use the mixin.
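For illustration, a minimal sketch of that mixin pattern could look like the following; the helper `gather_variational_loss` and the per-layer `variational_loss` attribute are hypothetical placeholders for however the MonteCLoRA layers end up exposing their KLD/entropy terms:

```python
import torch
from transformers import Trainer


def gather_variational_loss(model: torch.nn.Module):
    # Hypothetical helper: sum the per-layer KLD/entropy terms that the
    # MonteCLoRA layers are assumed to store under a `variational_loss` attribute.
    terms = [m.variational_loss for m in model.modules() if hasattr(m, "variational_loss")]
    return sum(terms) if terms else 0.0


class MonteCLoRALossMixin:
    """Adds the MonteCLoRA variational loss (KLD + entropy) on top of the task loss."""

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        loss, outputs = super().compute_loss(model, inputs, return_outputs=True, **kwargs)
        loss = loss + gather_variational_loss(model)
        return (loss, outputs) if return_outputs else loss


class MonteCLoRATrainer(MonteCLoRALossMixin, Trainer):
    """Drop-in `Trainer` variant that applies the mixin."""
```

Users of a different trainer could then combine the mixin themselves, e.g. `class MonteCLoRASFTTrainer(MonteCLoRALossMixin, SFTTrainer): ...`.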
```python
def __getattr__(name):
    if (name == "Linear8bitLt") and is_bnb_available():
```
This is all unnecessary, right? Let's remove this.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Gentle ping @victor7246
Hi @BenjaminBossan,
We have implemented MonteCLoRA as a LoRA variant. We are integrating Monte-Carlo-based robust LoRA (MonteCLoRA), accepted at TMLR 2025 (https://openreview.net/forum?id=2HFmicB8kh). The method uses probabilistic priors and a mixture of LoRA components for robust estimation of low-rank adapters.
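As a rough illustration of the "mixture of LoRA components" idea, here is a schematic sketch; this is not the paper's exact formulation, and the module name, expert count, and Dirichlet parameterization are assumptions made purely for clarity:

```python
import torch
import torch.nn as nn


class MixtureOfLoRALinear(nn.Module):
    """Schematic mixture-of-LoRA layer: several low-rank experts whose combined
    update is weighted by mixture weights sampled from a learned Dirichlet prior."""

    def __init__(self, base: nn.Linear, r: int = 8, n_experts: int = 4, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.scaling = alpha / r
        self.lora_A = nn.ParameterList(
            [nn.Parameter(torch.randn(r, base.in_features) * 0.01) for _ in range(n_experts)]
        )
        self.lora_B = nn.ParameterList(
            [nn.Parameter(torch.zeros(base.out_features, r)) for _ in range(n_experts)]
        )
        # Learnable log-concentration of the Dirichlet distribution over experts.
        self.log_concentration = nn.Parameter(torch.zeros(n_experts))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Sample mixture weights; rsample keeps the sampling step differentiable.
        weights = torch.distributions.Dirichlet(self.log_concentration.exp()).rsample()
        delta = sum(
            w * (x @ A.t() @ B.t()) for w, A, B in zip(weights, self.lora_A, self.lora_B)
        )
        return self.base(x) + self.scaling * delta
```

In the actual method, the variational loss (KLD + entropy) referenced in the `MonteCLoRATrainer` snippet above would regularize such sampled mixture distributions during training.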