OOM with Gemma3-27B-IT on DPO (8×A100, Deepspeed) #2555
Unanswered · Kshitiz-Khandel asked this question in Q&A
Replies: 1 comment · 7 replies
Comment:

Hey! I noticed you used adapter: lora. Try this change:

-adapter: lora
+adapter: qlora

Let me know what happens after you set this.
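In config terms, that suggestion amounts to a one-line change (a minimal sketch; every other setting from the config quoted in the question below is assumed unchanged). The idea behind it is the standard QLoRA setup: train the LoRA adapters on top of the 4-bit quantized base weights that load_in_4bit already requests.

```yaml
# Sketch of the suggested change only; all other settings from the
# question's config are assumed unchanged.
load_in_4bit: True   # already set in the question's config
adapter: qlora       # previously: lora
```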
Original question:
I'm encountering out-of-memory (OOM) errors when fine-tuning google/gemma-3-27b-it with DPO on 8×A100 (40 GB) GPUs, using DeepSpeed ZeRO-3. Here's my config:
base_model: google/gemma-3-27b-it
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: False
load_in_4bit: True
chat_template: gemma3
rl: dpo
datasets:
  - type: chat_template.default
    field_messages: "prompt"  ## messages
    field_chosen: "chosen"
    field_rejected: "rejected"
    message_property_mappings:
      role: role
      content: content
    roles:
      user: ["user"]
      assistant: ["assistant"]
      system: ["system"]
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./outputs/dpo2/gemma3/experiment2/

sequence_len: 1024
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 16
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 2
micro_batch_size: 2
num_epochs: 10
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

bf16: auto
tf32: false

#gradient_checkpointing: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: false
deepspeed: ./deepspeed_configs/zero3.json

warmup_steps: 10
evals_per_epoch: 4
saves_per_epoch: 1
weight_decay: 0.0
shuffle: True

plugins:
cut_cross_entropy: true
liger_rope: true
liger_rms_norm: true
liger_glu_activation: true
liger_layer_norm: true

gradient_checkpointing: "offload"
lora_mlp_kernel: true
lora_qkv_kernel: true
lora_o_kernel: true
However, enabling the lora_*_kernel options results in a model-compatibility error: these flags expect the model to be a PeftModelForCausalLM, but Gemma3 is instantiated as an AutoModelForCausalLM.
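For reference, these are the kernel-patch flags from the config above that trigger that error. As an untested assumption (not something confirmed in this thread), leaving them off sidesteps the PeftModelForCausalLM requirement at the cost of the fused LoRA kernels:

```yaml
# The three flags that trigger the PeftModelForCausalLM check.
# Assumption (untested here): setting them to false skips the LoRA kernel
# patching entirely, trading the fused kernels for compatibility.
lora_mlp_kernel: false   # was: true
lora_qkv_kernel: false   # was: true
lora_o_kernel: false     # was: true
```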
Any help on this would be appreciated, @NanoCode012.