Conversation

@Bilican (Contributor) commented May 30, 2025

Initial request for #2552

@BenjaminBossan (Member)

Thanks for the PR. I just did a quick skim and saw that the majority of added files are inside of the PyTorch-Wavelet-Toolbox-Custom directory. Are these files needed? We need to organize this change to fit the structure of PEFT, so if there is useful stuff in there, like tests, examples, etc., could you please rearrange it accordingly? As an example, please check the organization of PR #2172.

@BenjaminBossan (Member)

gentle ping @Bilican

@Bilican (Contributor, Author) commented Jun 17, 2025

Hi @BenjaminBossan, sorry for the late update — I was caught up with finals at university. I've now removed all the unnecessary parts from the PyTorch-Wavelet-Toolbox-Custom directory as suggested. Let me know if there's anything else you'd like me to adjust!

@BenjaminBossan (Member) left a comment

Thank you for the updates. There is, however, still a bunch of code inside PyTorch-Wavelet-Toolbox-Custom. This doesn't work: everything that's required for users who install PEFT must live inside src/peft. I added a comment to that effect. Please check whether the files, functions, and classes in PyTorch-Wavelet-Toolbox-Custom/ are required for WaveFT to work. If not, delete them. If they are, move them to src/peft/tuners/waveft/<file> with descriptive file names. From there, you can import those functions.

Please check #2577, a recent PR that demonstrates how the code should be organized.

# Get the directory containing the current file
current_dir = os.path.dirname(os.path.abspath(__file__))
# Construct path to the target directory relative to current file
target_path = os.path.normpath(os.path.join(current_dir, "../../PyTorch-Wavelet-Toolbox-Custom/src"))

@BenjaminBossan (Member) commented:

This needs to be avoided, as it makes installing the PEFT package impossible. Instead, everything from PyTorch-Wavelet-Toolbox-Custom that is needed for WaveFT to work properly in PEFT should be moved to src/peft/tuners/waveft. Then you can import the functions directly. Just as an example, if you move wavelets_learnable.py here, you can do: from .wavelets_learnable import WaveletFilter.
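
As a rough illustration of what that could look like (only wavelets_learnable.py is named above; the other file names are assumptions based on how other PEFT tuners are laid out):

    # Illustrative layout; file names other than wavelets_learnable.py are
    # assumptions following the usual PEFT tuner structure.
    #
    #   src/peft/tuners/waveft/
    #       __init__.py
    #       config.py
    #       layer.py
    #       model.py
    #       wavelets_learnable.py   # moved here from PyTorch-Wavelet-Toolbox-Custom
    #
    # From layer.py or model.py in that directory, the relative import becomes:
    from .wavelets_learnable import WaveletFilter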

@Bilican (Contributor, Author) commented Jun 23, 2025

Hi @BenjaminBossan, I’ve implemented the changes you pointed out. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member) left a comment

Thanks a lot for cleaning up, this looks much better now.

For now, I only did a quick review, so I haven't checked the details. I tried to make WaveFT run but ran into multiple issues, which I have commented on below.

I think a good next step would be to add WaveFT to our test suite and ensure that it passes. This way, we know that the basics are working. The easiest way to achieve this is to add an entry to the test cases in tests/test_custom_models.py, something like:

    ##########
    # WaveFT #
    ##########
    (
        "Vanilla MLP 1 WaveRec", "MLP", WaveFTConfig, {"target_modules": "lin0", "n_frequency": 8},
    )

After this, run the tests with pytest tests/test_custom_models.py -k waveft and check that the tests are passing.

Also, please ensure that each new file has the copyright notice with the current year. Finally, once you have made your changes, run make style to ensure that the linter is happy.

@BenjaminBossan (Member) commented:

Why is the file named waverec2.py and not waverec.py?

@Bilican (Contributor, Author) commented:

Because it is the inverse discrete wavelet transform in 2 dimensions, not 1. But if necessary, I can change the name.

@BenjaminBossan (Member) commented:

I see, in that case, would waverec2d.py make more sense? Or at least, add a comment to explain the name.

from typing import Any, List, Literal, NamedTuple, Optional, Protocol, Union, cast, overload

import numpy as np
import pywt

@BenjaminBossan (Member) commented:

This package is not a PEFT dependency, so this import will fail. I don't know exactly what package that is, maybe this one, but we very rarely add new external dependencies, only if absolutely necessary, so I think we should remove it.

@Bilican (Contributor, Author) commented Jul 16, 2025

Hi @BenjaminBossan, I’ve implemented the necessary changes (removed the dependency, fixed the naming convention, added tests, and fixed the copyright notices) and ran the tests, which seem to work without problems. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member)

Thanks @Bilican. There have been some changes on the main branch which cause merge conflicts, but they should be easy to resolve. Could you please merge with/rebase on the latest main branch and resolve the conflicts?

@Bilican (Contributor, Author) commented Jul 21, 2025

Hello @BenjaminBossan. I have resolved the merge conflicts. Hopefully everything works now.

@BenjaminBossan (Member) left a comment

Thanks for the updates, this is taking a nice shape. I reviewed the code in depth and provided some feedback. Admittedly, I only skimmed the core WaveFT implementation, as this is outside my domain of expertise, but I trust that this code faithfully replicates the paper.

Besides these changes, we will need a few steps to finish the PR:

  1. Add at least one example (can be a copy of an existing one)
  2. Add documentation
  3. Extend testing (see my comment)
  4. Add an experimental setting to the PEFT method comparison suite

Also, once you finish, please always call make style before committing your code.

@Bilican (Contributor, Author) commented Aug 4, 2025

Hi @BenjaminBossan, I’ve implemented the necessary changes, hopefully without missing any of the points you mentioned, and ran the tests, which seem to work without problems. Please let me know if there’s anything else you’d like me to revise or improve. I will now proceed with adding an experimental setting for the PEFT method comparison suite. Could you please elaborate on the other steps I need to take (adding documentation and at least one example)?

@BenjaminBossan (Member) left a comment

Thanks for the updates, nice work. The PR is shaping up quite nicely. I still had a few comments, but nothing serious.

> ran the tests, which seem to work without problems

I ran the tests locally and most of them pass indeed. However, I do get failures when running with CUDA:

FAILED tests/test_decoder_models.py::TestDecoderModels::test_generate[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator
FAILED tests/test_decoder_models.py::TestDecoderModels::test_generate_pos_args[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator
FAILED tests/test_decoder_models.py::TestDecoderModels::test_disable_adapter[WaveFTConfig-config_kwargs18-hf-internal-testing/tiny-random-Gemma3ForCausalLM] - torch._dynamo.exc.Unsupported: Dynamic shape operator

Can you replicate those errors?

> I will now proceed with adding an experimental setting for the PEFT method comparison suite. Could you please elaborate on the other steps I need to take (adding documentation and at least one example)?

Regarding docs and examples, it is easiest if you check past PRs that added new PEFT methods. Just as an example, check the addition to the docs/ and examples/ from the SHiRA PR:

https://github.com/huggingface/peft/pull/2584/files

Just take inspiration from that and adapt it to WaveFT. Of course, you can pick another example than the one in shira_finetuning.py. Basically, check if one of the existing ones in examples/ is suitable and then create a copy and adapt it for WaveFT.
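
As a rough sketch only (not the actual example script), the core of a WaveFT example could look like the following; the base model name, n_frequency value, and target_modules list are placeholders, not recommendations:

    # Minimal WaveFT usage sketch, modeled on existing PEFT examples.
    # "facebook/opt-125m", n_frequency=1000, and the target_modules list are
    # placeholders for illustration only.
    from transformers import AutoModelForCausalLM
    from peft import WaveFTConfig, get_peft_model

    base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
    config = WaveFTConfig(n_frequency=1000, target_modules=["q_proj", "v_proj"])
    model = get_peft_model(base_model, config)
    model.print_trainable_parameters()
    # ... train with your usual Trainer / training loop, then
    # model.save_pretrained("waveft-adapter") stores only the adapter weights.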

As for adding an experiment to the PEFT benchmark, check out the README we have there, it should explain what is needed.

Finally, once you're finished with all your changes, don't forget to run make style.


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

@BenjaminBossan (Member)

@Bilican Do you still intend to work on this PR?

@Bilican (Contributor, Author) commented Sep 4, 2025

Sorry for the late reply @BenjaminBossan, I was away for my wedding. I do intend to continue working on this PR promptly.

@BenjaminBossan (Member)

No worries about the delay. And congratulations on your wedding.

@Bilican (Contributor, Author) commented Sep 25, 2025

Hello @BenjaminBossan. First, I want to thank you for your kind words and patience. I couldn't replicate the CUDA errors for those scripts; they pass in my case, and my environment details are below.

Environment:

  • PyTorch: 2.7.0+cu126
  • CUDA: 12.6
  • Python: 3.13.3
  • Platform: linux

I have addressed all of the missing parts; hopefully everything is as you requested. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member) left a comment

Thanks for the latest updates, we're progressing quite well and not much is left to do. There is a somewhat bigger task related to deduplicating code, but it's mostly about deleting some methods, so it should hopefully not be too much work.

When you're finished with your changes, please remember to call make style.

> I couldn't replicate the CUDA errors for those scripts.

I tried again and the errors are gone, not sure why these tests failed earlier for me.

class WaveFTModel(BaseTuner):
    prefix: str = "waveft_"
    tuner_layer_cls: type[BaseTunerLayer] = WaveFTLayer

@BenjaminBossan (Member) commented:

We recently merged a big PR to remove a lot of duplicated code in the various model.py files (#2771). This unfortunately means that a few changes are needed here, but overall it will make the code much simpler. Here are the steps you need to take (a sketch of the resulting class follows the list):

  1. merge with/rebase on the latest main branch of PEFT
  2. add a class attribute here: target_module_mapping = TRANSFORMERS_MODELS_TO_WAVEFT_TARGET_MODULES_MAPPING
  3. you can completely remove the following methods:
  • _check_new_adapter_config
  • _check_target_module_exists
  • _mark_only_adapters_as_trainable
  • __getattr__
  • get_peft_config_as_dict
  • _set_adapter_layers
  • enable_adapter_layers
  • disable_adapter_layers
  • set_adapter
  • _prepare_adapter_config
  • _unload_and_optionally_merge
  • merge_and_unload
  • unload
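
For reference, a minimal sketch of what WaveFTModel could look like after these steps, reusing only names that already appear in this PR (WaveFTLayer and the target-module mapping constant); method bodies are omitted:

    # Sketch only: the class attributes now carry what the removed boilerplate
    # methods used to provide, and the generic logic is inherited from BaseTuner.
    class WaveFTModel(BaseTuner):
        prefix: str = "waveft_"
        tuner_layer_cls: type[BaseTunerLayer] = WaveFTLayer
        target_module_mapping = TRANSFORMERS_MODELS_TO_WAVEFT_TARGET_MODULES_MAPPING

        # WaveFT-specific hooks such as _create_and_replace stay in this class;
        # the methods listed above can simply be deleted.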

Comment on lines 1589 to 1606

if config.peft_type not in (
    "LORA",
    "ADALORA",
    "IA3",
    "BOFT",
    "OFT",
    "VERA",
    "FOURIERFT",
    "HRA",
    "VBLORA",
    "RANDLORA",
    "SHIRA",
    "BONE",
    "C3A",
    "WAVEFT",
    "MISS",
):

@BenjaminBossan (Member) commented:

Remove


@Bilican (Contributor, Author) commented Sep 30, 2025

Hello @BenjaminBossan, I made the changes you requested and also fixed the styling issues. Hopefully everything is okay now. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member)

Hi @Bilican, for some reason something didn't go quite right with your latest merge/rebase. This PR now shows a bunch of commits and code diffs that are not related to your PR, which makes it extremely difficult to review. Could you please try to merge/rebase again and see if that fixes the problem? If it doesn't, in the worst case you will need to open a new PR, but most of the time some git magic can fix these types of problems.

Ahmet Bilican added 9 commits October 1, 2025 14:26
- Introduced `tuner_layer_cls` attribute in `WaveFTModel` to specify the layer class.
- Updated `set_adapter` method to call the parent class implementation
@Bilican (Contributor, Author) commented Oct 1, 2025

Hello @BenjaminBossan. I rebuilt this branch on top of huggingface:main and force-pushed a clean history. I cherry-picked only the WaveFT commits plus my 3 follow-up fixes. The PR diff should now only show WaveFT-related changes. Hopefully everything is okay now. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member) left a comment

Thanks for cleaning up the branch and the latest updates, the PR is almost good to go. I had a few more comments, but nothing major. Please also run make style before committing your changes.

@BenjaminBossan (Member) commented:

I tested the example on single GPU and on two GPUs with FSDP and it worked.

@Bilican (Contributor, Author) commented Oct 2, 2025

Hello @BenjaminBossan, hopefully this PR is good to go. Please let me know if there’s anything else you’d like me to revise or improve.

@BenjaminBossan (Member) left a comment

Thanks a lot for the latest changes, the PR looks great now and I have no further comments. Again, thanks for your patience, good work.

PS: Failing CI tests are unrelated.

@BenjaminBossan (Member)

@Bilican Before merging, I wanted to make a run on our MetaMath benchmark using the included experiment config for WaveFT. Unfortunately, both times I tried, I got an OOM error on my 24GB GPU (somewhere between 3500 and 3750 steps). Here is a plot of the sampled memory:

[Plot: sampled GPU memory over the course of the run]

It doesn't look like a memory leak, rather just spiky behavior and probably this particular step contained especially long sequences or something like that.

Overall, this is not a huge deal and we can still merge. But before that, I just wanted to share this with you, maybe you have some idea for how to improve the memory situation.
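
In case it helps to narrow down which steps spike, here is a minimal per-step logging sketch (an illustration only, not part of the benchmark code):

    # Illustrative per-step CUDA memory logging to locate spiky steps.
    import torch

    def log_cuda_memory(step: int, device: int = 0) -> None:
        allocated = torch.cuda.memory_allocated(device) / 2**30   # GiB currently allocated
        peak = torch.cuda.max_memory_allocated(device) / 2**30    # GiB peak since last reset
        print(f"step {step}: allocated={allocated:.2f} GiB, peak={peak:.2f} GiB")
        torch.cuda.reset_peak_memory_stats(device)                # track the peak per step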

@Bilican (Contributor, Author) commented Oct 6, 2025

Hey @BenjaminBossan, thanks so much for your patience throughout this process!
I’m really happy to see this PR finally merged. I genuinely enjoyed working with you — your comments were super thoughtful and helped me improve a lot. Appreciate it!

I’m not entirely sure what’s causing the memory usage spike; since I usually run my tests on 40GB or 80GB GPUs, I haven’t encountered this issue myself. It’s definitely interesting. I don’t think there’s a quick fix right now, but if I find anything that could improve the situation, I’ll open a new pull request.

@BenjaminBossan merged commit b0954e0 into huggingface:main on Oct 7, 2025 (4 of 13 checks passed)

@BenjaminBossan (Member)

> I’m really happy to see this PR finally merged. I genuinely enjoyed working with you — your comments were super thoughtful and helped me improve a lot. Appreciate it!

Thank you, I like to hear that.

> I’m not entirely sure what’s causing the memory usage spike; since I usually run my tests on 40GB or 80GB GPUs, I haven’t encountered this issue myself. It’s definitely interesting. I don’t think there’s a quick fix right now, but if I find anything that could improve the situation, I’ll open a new pull request.

No worries, I merged the PR as is. If you find anything, just open an issue or a new PR.

@Bilican (Contributor, Author) commented Oct 11, 2025

Hello @BenjaminBossan,
I realized that the WaveFT results for the MetaMath benchmark are not shown. If possible, I can run it on my own GPU and commit the results so that they can be compared to the other methods without a problem.

@BenjaminBossan (Member)

> I realized that the WaveFT results for the MetaMath benchmark are not shown. If possible, I can run it on my own GPU and commit the results so that they can be compared to the other methods without a problem.

There is no need, we will run it on our own hardware to keep the results comparable with the other ones. This is not an automatic process; one of the PEFT maintainers needs to do this manually. We'll get to it soon.
