
Stopped working with Qwen Image Edit quantized model #42215

@AlpineVibrations

Description


System Info

Transformers and Diffusers installed from the main branch on GitHub.
Running on a 5090, a 4090, and an RTX 4000 Ada on RunPod.

I have been using Qwen Image Edit for a little while now, and all of a sudden it crashes when I spin up the same Docker container on a new machine.
In every case the container pulls the main branch of Diffusers fresh from GitHub on spin-up.

I have rendered hundreds of images with this same Docker container, which installs the libraries and runs the same pipeline script... any ideas?

It seems like this PR introduced the bug:
#41580

I tried using the older Transformers 4.57.1, but I'm not sure it supports Qwen Image Edit...
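Since the container pulls main fresh on spin-up, each new machine can land on a different commit. A small sketch I can add to the script to log which versions actually got resolved (nothing special, just the standard version attributes):

```python
# Log the library versions the container actually resolved at spin-up,
# since "main branch pulled fresh" can differ from machine to machine.
import diffusers
import transformers

print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
```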

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

```python
from diffusers import QwenImageEditPlusPipeline, FlowMatchEulerDiscreteScheduler, FlowMatchHeunDiscreteScheduler
import os, io, time, sys, json
import torch
from PIL import Image
from typing import List, Dict, Optional, Any
from utils import json_to_render, json_to_sampler_options, RenderStatus, assign_loras, callback_on_step_end, load_image_sources

model_name = "aifx-art/Qwen-Image-Edit-2509-Q4"
dtype = torch.bfloat16

# Load quantized pipeline
pipeline = QwenImageEditPlusPipeline.from_pretrained(
    # quantized_model_dir,
    model_name,
    torch_dtype=dtype,
)
print("Quantized pipeline loaded.")

pipeline.scheduler = FlowMatchEulerDiscreteScheduler.from_config(
    pipeline.scheduler.config,
)

print("Scheduler", pipeline.scheduler)

# Pick device (MPS for Mac, CUDA for Linux/Windows with GPU)
if torch.backends.mps.is_available():
    pipeline = pipeline.to("mps")
elif torch.cuda.is_available():
    pipeline = pipeline.to("cuda")
else:
    pipeline = pipeline.to("cpu")

pipeline.enable_model_cpu_offload()
pipeline.set_progress_bar_config(disable=None)

generator = None
if render.seed is not None:
    generator = torch.Generator(device=pipeline.device).manual_seed(render.seed)

images = load_image_sources(render)

inputs = {
    "image": images if images else None,
    "prompt": render.pos,
    "negative_prompt": render.neg,
    "num_inference_steps": render.steps,
    "generator": generator,
    "num_images_per_prompt": 1,
    "width": render.width,
    "height": render.height,
    "callback_on_step_end": callback_on_step_end,
    "true_cfg_scale": render.guidance,
    # "guidance_scale": 1.0,
}

with torch.inference_mode():
    output = pipeline(**inputs)
output_image = output.images[0]

print("saving file", render.filename)
output_image.save(render.filename, format="PNG")
```

Logs

```
  pipeline = QwenImageEditPlusPipeline.from_pretrained(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "huggingface_hub/utils/_validators.py", line 89, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/diffusers/pipelines/pipeline_utils.py", line 1021, in from_pretrained
    loaded_sub_model = load_sub_model(
                       ^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 876, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 270, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4122, in from_pretrained
    model, missing_keys, unexpected_keys, mismatched_keys, offload_index, error_msgs = cls._load_pretrained_model(
                                                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/transformers/modeling_utils.py", line 4275, in _load_pretrained_model
    missing_keys, unexpected_keys, mismatched_keys, misc = convert_and_load_state_dict_in_model(
                                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.12/site-packages/transformers/core_model_loading.py", line 621, in convert_and_load_state_dict_in_model
    raise ValueError("This quantization method is gonna be supported SOOOON")
ValueError: This quantization method is gonna be supported SOOOON
```
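From the traceback it looks like it fails while loading the quantized text encoder sub-model. A minimal sketch that might reproduce it without the full pipeline (assuming the quantized text encoder sits in a `text_encoder` subfolder of that repo, which I haven't verified):

```python
# Hypothetical minimal repro: load only the text encoder the pipeline trips over.
# Assumes the repo stores it under the "text_encoder" subfolder.
import torch
from transformers import Qwen2_5_VLForConditionalGeneration

text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "aifx-art/Qwen-Image-Edit-2509-Q4",
    subfolder="text_encoder",
    torch_dtype=torch.bfloat16,
)
```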

Expected behavior

I tried using Transformers 4.57.1 and I get other errors; it looks like Qwen Image Edit may not be supported on that older version?

```
  File "/usr/lib/user/resources/diffusers/proc-qwen-image-edit-q4.py", line 83, in <module>
    output = pipeline(**inputs)
             ^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 700, in __call__
    prompt_embeds, prompt_embeds_mask = self.encode_prompt(
                                        ^^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 318, in encode_prompt
    prompt_embeds, prompt_embeds_mask = self._get_qwen_prompt_embeds(prompt, image, device)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/diffusers/pipelines/qwenimage/pipeline_qwenimage_edit_plus.py", line 262, in _get_qwen_prompt_embeds
    outputs = self.text_encoder(
              ^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/user/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
```