
Conversation

KaparthyReddy

This PR addresses a RuntimeError that occurs when loading W8A8 quantized Qwen2.5-VL models.

Changes:

  • Modified _init_weights in Qwen2_5_VLPreTrainedModel to safely skip non-floating tensors (e.g., int8 quantized weights) during initialization.
  • Ensures float tensors are initialized normally while int8 (and other non-float) tensors are skipped, preventing the RuntimeError raised by calling normal_() on integer dtypes.
  • Improves compatibility for loading quantized models without vLLM and allows safe custom hooks for research purposes.

Note: This PR does not include the tester file test_init_weights_safe.py; that file exists only in the fork for private testing.

Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_5_vl

@Rocketknight1
Member

Can you link the issue this is fixing?

