
Commit e657707

FEAT Add C3A (Circular Convolution Adaptation) (#2577)
Add new PEFT method C³A (Circular Convolution Adaptation). From "Parameter-Efficient Fine-Tuning via Circular Convolution": https://arxiv.org/abs/2407.19342
1 parent 4562926 commit e657707

File tree

23 files changed (+1445, −4 lines)

docs/source/_toctree.yml

Lines changed: 2 additions & 0 deletions
@@ -126,6 +126,8 @@
     title: Trainable Tokens
   - local: package_reference/randlora
     title: RandLora
+  - local: package_reference/c3a
+    title: C3A
 
   title: Adapters
 - sections:
docs/source/package_reference/c3a.md

Lines changed: 43 additions & 0 deletions
@@ -0,0 +1,43 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.
-->
# C3A: Parameter-Efficient Fine-Tuning via Circular Convolution

[C3A](https://huggingface.co/papers/2407.19342) is a parameter-efficient fine-tuning technique that leverages Circular Convolution to achieve high-rank adaptation within reasonable resource limits.
Note that you should use a much larger learning rate (LR) for C3A than for other methods. For example, an LR of 1e-1 is a good starting point for C3A. In addition, a much smaller weight decay should be used. You can refer to the `method_comparison` folder for more details.
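For illustration, here is a minimal optimizer setup following this advice; the `1e-1` learning rate comes from the text above, while the `1e-5` weight decay is only an assumed stand-in for "much smaller":

```python
import torch
from torch import nn

# Stand-in module; in practice, pass the trainable parameters of the
# C3A-wrapped model (see the usage sketch further below).
model = nn.Linear(768, 768)

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad),
    lr=1e-1,            # much larger than typical LoRA learning rates
    weight_decay=1e-5,  # assumed value; the text only says "much smaller"
)
```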
The `block_size` affects both the number of tunable parameters and the performance. As a starting point, choose a common divisor of $d_1$ and $d_2$ (the input and output sizes of the target layer) near $\frac{\sqrt{d_1\times d_2}}{r}$, where $r$ is the LoRA rank you would use for this task.
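As a worked example of this heuristic, the sketch below (the helper names are hypothetical) enumerates the valid block sizes, i.e. the common divisors of $d_1$ and $d_2$, and picks the one closest to $\frac{\sqrt{d_1\times d_2}}{r}$:

```python
from math import gcd

def valid_block_sizes(d1: int, d2: int) -> list[int]:
    """Common divisors of d1 and d2, i.e. the block sizes C3A accepts."""
    g = gcd(d1, d2)
    return [b for b in range(1, g + 1) if g % b == 0]

def suggest_block_size(d1: int, d2: int, r: int) -> int:
    """Pick the valid block size closest to sqrt(d1 * d2) / r."""
    target = (d1 * d2) ** 0.5 / r
    return min(valid_block_sizes(d1, d2), key=lambda b: abs(b - target))

# A 4096x4096 linear layer where you would have used LoRA with r=16:
print(suggest_block_size(4096, 4096, 16))  # -> 256
```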
C3A currently has the following constraints:

- Only `nn.Linear` layers are supported.
- Quantized layers are not supported.
- The block size must be a common divisor of both the input and output sizes of the target layers.

If these constraints don't work for your use case, consider other methods instead.
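Putting this together, here is a minimal usage sketch: `C3AConfig` is documented below and `get_peft_model` is the standard PEFT entry point, but the example model and the `target_modules` value are illustrative assumptions:

```python
from transformers import AutoModelForCausalLM
from peft import C3AConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# block_size must be a common divisor of the input and output sizes of every
# targeted nn.Linear layer (here, 768x768 attention projections).
config = C3AConfig(
    block_size=64,                        # see the sizing heuristic above
    target_modules=["q_proj", "v_proj"],  # assumed standard PEFT argument
)
model = get_peft_model(base, config)
model.print_trainable_parameters()
```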
The abstract from the paper is:

> Low-Rank Adaptation (LoRA) has gained popularity for fine-tuning large foundation models, leveraging low-rank matrices $\mathbf{A}$ and $\mathbf{B}$ to represent weight changes (i.e., $\Delta \mathbf{W} = \mathbf{B} \mathbf{A}$). This method reduces trainable parameters and mitigates heavy memory consumption associated with full delta matrices by sequentially multiplying $\mathbf{A}$ and $\mathbf{B}$ with the activation. Despite its success, the intrinsic low-rank characteristic may limit its performance. Although several variants have been proposed to address this issue, they often overlook the crucial computational and memory efficiency brought by LoRA. In this paper, we propose Circular Convolution Adaptation (C3A), which not only achieves high-rank adaptation with enhanced performance but also excels in both computational power and memory utilization. Extensive experiments demonstrate that C3A consistently outperforms LoRA and its variants across various fine-tuning tasks.
## C3AConfig

[[autodoc]] tuners.c3a.config.C3AConfig

## C3AModel

[[autodoc]] tuners.c3a.model.C3AModel
