Does T5 actually utilize Encoder prefixes in PrefixTuning? past_key_values seems ignored by the Encoder. #2974
Unanswered · david900125 asked this question in Q&A
Thanks for opening this discussion. Your observation is correct: in encoder-decoder models, prefix tuning does not affect the encoder; the kv cache (aka `past_key_values`) never reaches it. I have asked internally if there is a different way we could adapt the encoder, but possibly there is no solution. I'll update once I have an answer.
System Info
Description
I am exploring Prefix-Tuning on a Seq2Seq model (T5/Flan-T5) and I noticed a potential discrepancy between the paper's implementation and the current codebase regarding the Encoder's prefix.
According to the [Prefix-Tuning paper](https://arxiv.org/abs/2101.00190), prefixes should be prepended to both the Encoder and the Decoder for Encoder-Decoder architectures.
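For context, here is a toy sketch of the mechanism the paper describes (my own simplification, not the paper's or any library's code; names and shapes are illustrative): learned prefix key/value states are prepended in every attention layer of both stacks before attention is computed.

```python
import torch
import torch.nn.functional as F

def attention_with_prefix(q, k, v, prefix_k, prefix_v):
    # Prepend learned prefix key/value states before computing attention,
    # as Prefix-Tuning prescribes for every layer of both the encoder and
    # the decoder (simplified: single head, no masking, no dropout).
    k = torch.cat([prefix_k, k], dim=0)  # (prefix_len + seq_len, d)
    v = torch.cat([prefix_v, v], dim=0)
    scores = (q @ k.T) / (k.shape[-1] ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Toy shapes: 6 real tokens, 4 virtual (prefix) tokens, hidden size 16.
q, k, v = (torch.randn(6, 16) for _ in range(3))
out = attention_with_prefix(q, k, v, torch.randn(4, 16), torch.randn(4, 16))
print(out.shape)  # torch.Size([6, 16])
```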
I configured my PEFT model with `num_transformer_submodules=2` to generate prompts for both components. However, tracing the code:

1. `peft/src/peft/peft_model.py` (`PeftModelForSeq2SeqLM`): the code generates the prompt and passes it in via `kwargs["past_key_values"]`.
2. `transformers/models/t5/modeling_t5.py` (`T5Model`): the `forward` method separates the execution flow. It calls `self.encoder` without passing `past_key_values`. Reference line: [modeling_t5.py#L991 (v5.0.0rc1)](https://github.com/huggingface/transformers/blob/v5.0.0rc1/src/transformers/models/t5/modeling_t5.py#L991)
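For reference, a minimal sketch of the setup described above (the model id, inputs, and exact config arguments are illustrative and may need adjusting across `peft`/`transformers` versions):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

peft_config = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,
    num_transformer_submodules=2,  # request a prefix for the encoder as well as the decoder
)
model = get_peft_model(base, peft_config)

batch = tokenizer("translate English to German: Hello world", return_tensors="pt")
labels = tokenizer("Hallo Welt", return_tensors="pt").input_ids

# PeftModelForSeq2SeqLM builds the prefix and forwards it as past_key_values,
# but T5's forward calls self.encoder without past_key_values, so only the
# decoder stack ever sees the virtual tokens.
outputs = model(**batch, labels=labels)
print(outputs.loss)
```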
Question
It seems that even if `num_transformer_submodules=2` is set, the `past_key_values` intended for the Encoder are effectively discarded before reaching the encoder's attention layers.

1. Does this mean that `PrefixTuning` on T5 effectively only works on the Decoder?
2. Is there a mechanism in `transformers` that automatically handles this injection that I might have missed?

Any clarification would be appreciated. Thanks!
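For completeness, here is one way to check at runtime that the encoder never receives the prefix (a sketch, not authoritative: it assumes torch >= 2.0 forward hooks with `with_kwargs`, and reuses the `base`, `model`, `batch`, and `labels` objects from the sketch above):

```python
import torch

received = {}

def spy(module, args, kwargs, output):
    # Record whether the T5 encoder stack was given any past_key_values at all.
    received["encoder_past_key_values"] = kwargs.get("past_key_values")

# `base` is the underlying T5 model; prompt-learning methods do not replace its
# modules, so this is the same encoder that runs inside the PEFT forward pass.
handle = base.get_encoder().register_forward_hook(spy, with_kwargs=True)
with torch.no_grad():
    model(**batch, labels=labels)
handle.remove()

print(received)  # expected: {'encoder_past_key_values': None} -> the encoder prefix is dropped
```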