Description
In the LongLoRA paper, for example, they fully train both the embedding and the norm layers while still applying LoRA to the self-attention layers. Our recipes currently set only the LoRA parameters to trainable, but it shouldn't be too hard to support passing additional trainable layers to that function from the config, e.g. similar to our usage of custom_sharded_layers. See the sketch below.
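
A rough sketch of what this could look like in the recipe. The helper name, the `extra_trainable` list, and the `lora_` parameter-name convention are illustrative assumptions, not the actual API:

```python
from typing import List

import torch.nn as nn


def set_trainable_params_with_extras(
    model: nn.Module,
    extra_trainable: List[str],  # hypothetical config field, e.g. ["tok_embeddings", "norm"]
) -> None:
    """Freeze everything except LoRA adapter params and any extra layers named in the config."""
    for name, param in model.named_parameters():
        is_lora = "lora_" in name  # assumes adapter params are named lora_a / lora_b
        is_extra = any(layer_name in name for layer_name in extra_trainable)
        param.requires_grad = is_lora or is_extra
```

From the config side this could be driven by a simple list of layer names (e.g. `extra_trainable: [tok_embeddings, norm]`), passed through to the recipe the same way custom_sharded_layers is today.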