Commit 983b674
[Qwen3]: If qwen3 is used along with peft config, peft adds opcl obj not ingested further (#926)

## Summary

Fixes #925.

Fixes the `TypeError: liger_fused_linear_cross_entropy() got an unexpected keyword argument 'return_dict'` that occurs when using Liger Kernel with PEFT and the transformers `Trainer`. The `return_dict` parameter is a standard transformers argument that controls the output format (`ModelOutput` vs. tuple). When PEFT wraps a Liger Kernel model, this parameter is forwarded through `**kwargs` all the way down to `liger_fused_linear_cross_entropy()`, which does not accept it, so training crashes.

This PR adds `kwargs.pop("return_dict", None)` in all affected model files to remove the parameter before it reaches the loss calculation functions.

## Details

Root cause:

- transformers `Trainer` passes `return_dict` in the model inputs.
- The PEFT wrapper forwards all kwargs to the base model.
- Liger Kernel model implementations pass `**kwargs` to `LigerForCausalLMLoss()`.
- This propagates to `liger_fused_linear_cross_entropy()`, which does not accept `return_dict`.

## Testing Done

- Verified the fix resolves the `TypeError` with Qwen3 + PEFT + transformers `Trainer`.
- Confirmed training runs complete successfully without the error.
- Hardware Type: H100 × 8 GPUs
- [ ] run `make test` to ensure correctness
- [ ] run `make checkstyle` to ensure code style
- [ ] run `make test-convergence` to ensure convergence
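To make the failure mode concrete, here is a minimal, self-contained sketch (not the actual Liger Kernel source; `fused_loss` and `lce_forward` below are simplified stand-ins for `liger_fused_linear_cross_entropy()` and the model's forward) of how `return_dict` leaks through `**kwargs` and why popping it fixes the crash:

```python
# Stand-in for liger_fused_linear_cross_entropy(): it has a fixed keyword
# signature and raises TypeError on any unexpected keyword argument.
def fused_loss(hidden_states, labels, ignore_index=-100):
    # Placeholder "loss"; the point is the fixed signature, not the math.
    return sum(h for h, l in zip(hidden_states, labels) if l != ignore_index)


# Stand-in for the patched model forward. Trainer -> PEFT wrapper -> forward:
# return_dict arrives here inside **kwargs.
def lce_forward(hidden_states, labels=None, **kwargs):
    kwargs.pop("return_dict", None)  # the fix: strip it before the loss call
    return fused_loss(hidden_states, labels, **kwargs)


# Without the pop() above, this call would raise:
# TypeError: fused_loss() got an unexpected keyword argument 'return_dict'
print(lce_forward([1.0, 2.0], labels=[0, -100], return_dict=True))
```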
1 parent 4c32ab6 commit 983b674

File tree

1 file changed: +2 −0 lines
  • src/liger_kernel/transformers/model/qwen3.py

src/liger_kernel/transformers/model/qwen3.py — 2 additions, 0 deletions

```diff
@@ -83,6 +83,8 @@ def lce_forward(
     kept_hidden_states = hidden_states[:, slice_indices, :]
 
     shift_labels = kwargs.pop("shift_labels", None)
+    # Remove output-control parameters that shouldn't be passed to loss functions
+    kwargs.pop("return_dict", None)
     logits = None
     loss = None
     token_accuracy = None
```
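For context, a hedged sketch of the setup that previously triggered the error, assuming the usual Liger Kernel patching entry point for Qwen3 (`apply_liger_kernel_to_qwen3`) and a standard PEFT + `Trainer` run; the model checkpoint, LoRA targets, and dataset are illustrative placeholders:

```python
from liger_kernel.transformers import apply_liger_kernel_to_qwen3  # assumed entry point
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

# Patch Qwen3 so its forward uses the fused linear cross-entropy loss path.
apply_liger_kernel_to_qwen3(fused_linear_cross_entropy=True)

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-0.6B")  # illustrative checkpoint
model = get_peft_model(
    model,
    LoraConfig(r=8, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"),
)

# Before this fix, Trainer passed return_dict through the PEFT wrapper's
# **kwargs into lce_forward, and training crashed with:
#   TypeError: liger_fused_linear_cross_entropy() got an unexpected
#   keyword argument 'return_dict'
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out"),
    train_dataset=train_dataset,  # any tokenized causal-LM dataset
)
trainer.train()
```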
