
Enable "apply_lora_to_output" in models with tied embedding #1960

Open
felipemello1 opened this issue Nov 7, 2024 · 0 comments
Labels
community help wanted We would love the community's help completing this issue

Comments

@felipemello1
Contributor

felipemello1 commented Nov 7, 2024

Many model families with small models (<=3B parameters) use tied embeddings, meaning that the output projection reuses the same weight as the input tok_embeddings. Examples are Gemma, Qwen and Llama 3.2.

These models currently don't support "apply_lora_to_output". This happens because, in the past, we used to pass a lambda function as the output projection, e.g. `lambda x: x @ tok_embeddings.weight`.

Recently, we changed this and started passing the TiedLinear module instead. We need TiedLinear so that the shared weight works well with FSDP and other techniques.
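
For context, a tied output projection is conceptually just a module that reuses the embedding weight at forward time instead of owning its own parameter. A minimal sketch (illustrative only, not torchtune's exact TiedLinear implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TiedOutput(nn.Module):
    """Illustrative tied output projection: it owns no weight of its own and
    reuses the token embedding's weight at forward time."""

    def __init__(self, tok_embeddings: nn.Embedding):
        super().__init__()
        self.tok_embeddings = tok_embeddings  # shared, not copied

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Projects hidden states [..., embed_dim] to logits [..., vocab_size]
        # using the embedding matrix; same math as the old lambda, but wrapped
        # in a module so FSDP can track the shared parameter.
        return F.linear(x, self.tok_embeddings.weight)
```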

This task is to enable LoRA on top of this TiedLinear, just as we do for nn.Linear in models that do not have tied embeddings, e.g. Llama 3.1.
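
One possible shape for this (a rough sketch, not a proposed API; the class name, init scheme and scaling follow the usual LoRA recipe and are assumptions): keep the tied projection frozen and add trainable low-rank factors on top, analogous to what LoRALinear does for nn.Linear.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRATiedLinear(nn.Module):
    """Sketch: LoRA on top of a tied output projection.

    The base projection reuses the (frozen) embedding weight; only the
    low-rank factors lora_a and lora_b are trained.
    """

    def __init__(self, tok_embeddings: nn.Embedding, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.tok_embeddings = tok_embeddings  # tied weight, shared with the input embedding
        self.scaling = alpha / rank
        self.lora_a = nn.Linear(tok_embeddings.embedding_dim, rank, bias=False)
        self.lora_b = nn.Linear(rank, tok_embeddings.num_embeddings, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # standard LoRA init: adapter starts as a no-op

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = F.linear(x, self.tok_embeddings.weight)       # tied projection to vocab logits
        update = self.lora_b(self.lora_a(x)) * self.scaling  # trainable low-rank update
        return base + update
```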

After adding this feature, the configs for Llama 3.2, Qwen and Gemma models have to be updated to include the flag.
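
For example, from the builder side (assuming the LoRA builders for these models expose the flag once the feature lands; the builder name below follows torchtune's existing naming convention and may differ):

```python
from torchtune.models.llama3_2 import lora_llama3_2_1b

# Hypothetical usage once this issue is resolved: apply_lora_to_output=True
# would attach a LoRA adapter to the tied output projection as well.
model = lora_llama3_2_1b(
    lora_attn_modules=["q_proj", "v_proj"],
    apply_lora_to_output=True,
    lora_rank=8,
    lora_alpha=16,
)
```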

felipemello1 added the community help wanted label on Nov 7, 2024