WaveFT method added into tuners #2560
Conversation
Thanks for the PR. I just did a quick skim and saw that the majority of added files are inside of the PyTorch-Wavelet-Toolbox-Custom directory.
gentle ping @Bilican
Hi @BenjaminBossan, sorry for the late update, I was caught up with finals at university. I've now removed all the unnecessary parts from the PyTorch-Wavelet-Toolbox-Custom directory as suggested. Let me know if there's anything else you'd like me to adjust!
Thank you for the updates. There is, however, still a bunch of code inside PyTorch-Wavelet-Toolbox-Custom. This doesn't work: everything that's required for users who install PEFT must live inside of src/peft. I added a comment to that effect. Please check all the files, functions, and classes in PyTorch-Wavelet-Toolbox-Custom/ and determine whether they're required for WaveFT to work. If not, delete them. If they are, move them to src/peft/tuners/waveft/<file> with descriptive file names. From there, you can import those functions.
Please check PR #2577, a recent PR that demonstrates how the code should be organized.
src/peft/tuners/waveft/layer.py (outdated)
# Get the directory containing the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
# Construct path to the target directory relative to current file
target_path = os.path.normpath(os.path.join(current_dir, "../../PyTorch-Wavelet-Toolbox-Custom/src"))
This needs to be avoided, as it makes installing the PEFT package impossible. Instead, everything from PyTorch-Wavelet-Toolbox-Custom that is needed for WaveFT to work properly in PEFT should be moved to src/peft/tuners/waveft. Then you can import the functions directly. Just as an example, if you move wavelets_learnable.py here, you can do: from .wavelets_learnable import WaveletFilter.
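To illustrate the suggested change, here is a minimal sketch; the sys.path handling in the "before" part is inferred from the snippet above rather than copied from the PR, and the module name follows the reviewer's example:

# Before (inferred): layer.py reaches outside the installed package, which breaks pip installs.
# current_dir = os.path.dirname(os.path.abspath(__file__))
# target_path = os.path.normpath(os.path.join(current_dir, "../../PyTorch-Wavelet-Toolbox-Custom/src"))
# sys.path.append(target_path)  # assumption: the path was then added to sys.path

# After: wavelets_learnable.py lives in src/peft/tuners/waveft/, so a package-relative import is enough.
from .wavelets_learnable import WaveletFilter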
Hi @BenjaminBossan, I’ve implemented the changes you pointed out. Please let me know if there’s anything else you’d like me to revise or improve.
Thanks a lot for cleaning up, this looks much better now.
For now, I only did a quick review, so I haven't checked the details. I tried to make WaveFT run but ran into multiple issues, which I have commented on below.
I think a good next step would be to add WaveFT to our test suite and ensure that it passes. This way, we know that the basics are working. The easiest way to achieve this is to add an entry to the test cases in tests/test_custom_models.py, something like:
##########
# WaveFT #
##########
(
"Vanilla MLP 1 WaveRec", "MLP", WaveFTConfig, {"target_modules": "lin0", "n_frequency": 8},
)
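For context, this entry would sit inside the parametrized test case list in tests/test_custom_models.py, roughly as sketched below; the list name and the neighbouring entries are assumptions based on how other tuners are registered there:

# tests/test_custom_models.py (sketch; surrounding structure assumed)
TEST_CASES = [
    # ... existing entries for LoRA, IA3, and the other tuners ...
    ##########
    # WaveFT #
    ##########
    (
        "Vanilla MLP 1 WaveRec", "MLP", WaveFTConfig, {"target_modules": "lin0", "n_frequency": 8},
    ),
]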
After this, run the tests with pytest tests/test_custom_models.py -k waveft and check that the tests are passing.
Also, for each new file, please ensure that the copyright notice is there and that it has the current year. Finally, once you have made your changes, run make style to ensure that the linter is happy.
Why is the file named waverec2.py and not waverec.py?
Because it is the inverse discrete wavelet transform in 2 dimensions, not 1. But if necessary, I can change the name.
I see. In that case, would waverec2d.py make more sense? Or at least, add a comment to explain the name.
src/peft/tuners/waveft/waverec2.py (outdated)
from typing import Any, List, Literal, NamedTuple, Optional, Protocol, Union, cast, overload

import numpy as np
import pywt
This package is not a PEFT dependency, so this import will fail. I don't know exactly what package that is (possibly PyWavelets), but we very rarely add new external dependencies, and only if absolutely necessary, so I think we should remove it.
Hi @BenjaminBossan, I’ve implemented the necessary changes (removed dependency, fixed naming convention, added tests, and fixed copyright notices) and ran the tests which seem to work without problem. Please let me know if there’s anything else you’d like me to revise or improve.
Thanks @Bilican. There have been some changes on the main branch which cause merge conflicts, but they should be easy to resolve. Could you please merge with/rebase on the latest main branch and resolve the conflicts?
Hello @BenjaminBossan. I have resolved the merge conflicts. Hopefully everything works now.
Thanks for the updates, this is taking a nice shape. I reviewed the code in-depth and provided some feedback. Admittedly, I only skimmed the core WaveFT implementation, as this is outside my domain of expertise, but I trust that this code faithfully replicates the paper.
Besides these changes, we will need a few steps to finish the PR:
- Add at least one example (can be a copy of an existing one)
- Add documentation
- Extend testing (see my comment)
- Add an experimental setting to the PEFT method comparison suite
Also, once you finish, please always call make style before committing your code.
Hi @BenjaminBossan, I’ve implemented the necessary changes, hopefully without missing any of the points you mentioned, and ran the tests which seem to work without problem. Please let me know if there’s anything else you’d like me to revise or improve. I will now proceed with adding an experimental setting for the PEFT method comparison suite. Could you please elaborate on the other steps I need to take (adding documentation and adding at least one example)?
Thanks for the updates, nice work. The PR is shaping up quite nicely. I still had a few comments, but nothing serious.
> ran the tests which seem to work without problem
I ran the tests locally and most of them pass indeed. However, I do get failures when running with CUDA:
FAILED tests/test_decoder_models.py::TestDecoderModels::test_generate[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator
FAILED tests/test_decoder_models.py::TestDecoderModels::test_generate_pos_args[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator
FAILED tests/test_decoder_models.py::TestDecoderModels::test_disable_adapter[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator
Can you replicate those errors?
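For reference, an invocation along the lines of pytest tests/test_decoder_models.py -k "WaveFTConfig and Gemma3" should select just those cases; the -k filter expression here is a guess derived from the test IDs above, not a command from the thread.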
> I will now proceed with adding an experimental setting for the PEFT method comparison suite. Could you please elaborate on the other steps I need to take (adding documentation and adding at least one example)?
Regarding docs and examples, it is easiest if you check past PRs that added new PEFT methods. Just as an example, check the additions to docs/ and examples/ from the SHiRA PR: https://github.com/huggingface/peft/pull/2584/files
Just take inspiration from that and adapt it to WaveFT. Of course, you can pick another example than the one in shira_finetuning.py. Basically, check if one of the existing ones in examples/ is suitable, then create a copy and adapt it for WaveFT.
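As a rough sketch of what such an example could boil down to (illustrative only: the model name is reused from the test IDs above, and the config values are placeholders rather than recommended settings):

# Hypothetical minimal WaveFT usage, loosely following the structure of existing PEFT examples
from transformers import AutoModelForCausalLM
from peft import WaveFTConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("hf-internal-testing/tiny-random-Gemma3ForCausalLM")
config = WaveFTConfig(n_frequency=8, target_modules=["q_proj", "v_proj"])  # placeholder values
model = get_peft_model(base_model, config)
model.print_trainable_parameters()
# ... dataset preparation and a Trainer/training loop would follow, as in the existing examples ...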
As for adding an experiment to the PEFT benchmark, check out the README we have there; it should explain what is needed.
Finally, once you're finished with all your changes, don't forget to run make style.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
@Bilican Do you still intend to work on this PR?
Sorry for the late reply @BenjaminBossan, I was away for my wedding. I do intend to continue working on this PR promptly.
No worries about the delay. And congratulations on your wedding.
Hello @BenjaminBossan. Firstly, I want to thank you for your kind words and patience. I couldn't replicate the CUDA errors for those scripts. They pass in my case, and I have my environment details below. Environment:
I have addressed all of the missing parts; hopefully everything is as you requested. Please let me know if there’s anything else you’d like me to revise or improve.
Thanks for the latest updates, we're progressing quite well and there isn't much left to do. There is one somewhat bigger task related to deduplicating code, but it's mostly about deleting some methods, so it should hopefully not be too much work.
When you're finished with your changes, please remember to call make style.
> I couldn't replicate the CUDA errors for those scripts.
I tried again and the errors are gone, not sure why these tests failed earlier for me.
src/peft/tuners/waveft/model.py (outdated)
class WaveFTModel(BaseTuner):
    prefix: str = "waveft_"
    tuner_layer_cls: type[BaseTunerLayer] = WaveFTLayer
We recently merged a big PR to remove a lot of duplicated code in the various model.py files (#2771). This unfortunately means that there are a few changes needed here, but overall it will make the code much simpler. Here are the steps you need to take:
- merge with/rebase on the latest main branch of PEFT
- add a class attribute here: target_module_mapping = TRANSFORMERS_MODELS_TO_WAVEFT_TARGET_MODULES_MAPPING
- you can completely remove the following methods (a sketch of the resulting class follows the list):
_check_new_adapter_config
_check_target_module_exists
_mark_only_adapters_as_trainable
__getattr__
get_peft_config_as_dict
_set_adapter_layers
enable_adapter_layers
disable_adapter_layers
set_adapter
_prepare_adapter_config
_unload_and_optionally_merge
merge_and_unload
unload
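Putting these steps together, the slimmed-down class might look roughly like the sketch below; only the three class attributes come from this review, while the import paths and the note about the remaining methods are assumptions:

# src/peft/tuners/waveft/model.py (sketch after the deduplication refactor; import paths assumed)
from peft.tuners.tuners_utils import BaseTuner, BaseTunerLayer
from peft.utils import TRANSFORMERS_MODELS_TO_WAVEFT_TARGET_MODULES_MAPPING  # assumed location

from .layer import WaveFTLayer


class WaveFTModel(BaseTuner):
    prefix: str = "waveft_"
    tuner_layer_cls: type[BaseTunerLayer] = WaveFTLayer
    target_module_mapping = TRANSFORMERS_MODELS_TO_WAVEFT_TARGET_MODULES_MAPPING

    # Only WaveFT-specific logic (e.g. _create_and_replace) would remain here;
    # the methods listed above are inherited from BaseTuner after the refactor.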
tests/testing_common.py (outdated)
if config.peft_type not in (
    "LORA",
    "ADALORA",
    "IA3",
    "BOFT",
    "OFT",
    "VERA",
    "FOURIERFT",
    "HRA",
    "VBLORA",
    "RANDLORA",
    "SHIRA",
    "BONE",
    "C3A",
    "WAVEFT",
    "MISS",
):
Remove
Hello @BenjaminBossan, I made the changes you have requested. Also fixed the styling issues. Hopefully everything is okay now. Please let me know if there’s anything else you’d like me to revise or improve.
Hi @Bilican, for some reason something didn't go quite right with your latest merge/rebase. This PR now shows a bunch of commits and code diffs that are not related to your PR, which makes it extremely difficult to review. Could you please try to merge/rebase again and see if this hopefully fixes the problem? If that doesn't work, in the worst case you need to open a new PR, but most of the time, some git magic can fix these types of problems.
- Introduced `tuner_layer_cls` attribute in `WaveFTModel` to specify the layer class.
- Updated `set_adapter` method to call the parent class implementation.
Hello @BenjaminBossan. I rebuilt this branch on top of huggingface:main and force-pushed a clean history. I cherry-picked only the WaveFT commits plus my 3 follow-up fixes. The PR diff should now only show WaveFT-related changes. Hopefully everything is okay now. Please let me know if there’s anything else you’d like me to revise or improve.
Thanks for cleaning up the branch and the latest updates, the PR is almost good to go. I had a few more comments, but nothing major. Please also run make style before committing your changes.
I tested the example on single GPU and on two GPUs with FSDP and it worked.
Co-authored-by: Benjamin Bossan <[email protected]>
Hello @BenjaminBossan, hopefully this PR is good to go. Please let me know if there’s anything else you’d like me to revise or improve.
Thanks a lot for the latest changes, the PR looks great now and I have no further comments. Again, thanks for your patience, good work.
PS: Failing CI tests are unrelated.
@Bilican Before merging, I wanted to make a run on our MetaMath benchmark using the included experiment config for WaveFT. Unfortunately, both times I tried, I got an OOM error on my 24GB GPU (somewhere between 3500 and 3750 steps). From a plot of the sampled memory, it doesn't look like a memory leak, rather just spiky behavior; probably this particular step contained especially long sequences or something like that. Overall, this is not a huge deal and we can still merge. But before that, I just wanted to share this with you, maybe you have some idea for how to improve the memory situation.
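As an aside, one generic way to narrow down such spikes is to log peak memory per training step; this is a diagnostic sketch, not code from the PR:

# Hypothetical helper for tracking peak CUDA memory per training step
import torch

def log_peak_memory(step: int) -> None:
    if not torch.cuda.is_available():
        return
    peak_gib = torch.cuda.max_memory_allocated() / 1024**3
    print(f"step {step}: peak allocated {peak_gib:.2f} GiB")
    torch.cuda.reset_peak_memory_stats()  # measure each step's peak independently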
Hey @BenjaminBossan, thanks so much for your patience throughout this process! I’m not entirely sure what’s causing the memory usage spike; since I usually run my tests on 40GB or 80GB GPUs, I haven’t encountered this issue myself. It’s definitely interesting. I don’t think there’s a quick fix right now, but if I find anything that could improve the situation, I’ll open a new pull request.
Thank you, I like to hear that.
No worries, I merged the PR as is. If you find anything, just open an issue or a new PR.
Hello @BenjaminBossan,
There is no need, we will run it on our own hardware to keep the results comparable with the other ones. This is not an automatic process, one of the PEFT maintainers needs to do this manually. We'll get to it soon.
Initial request for #2552