
HiDream running into issues with group offloading at the block-level #11307


Closed
sayakpaul opened this issue Apr 14, 2025 · 2 comments · Fixed by #11375

@sayakpaul
Member

Code:
https://gist.github.com/sayakpaul/558e8efd239d831d8c9d19962ae6e13d

Error:

Traceback (most recent call last):
  File "/fsx/sayak/diffusers/check_hidream.py", line 115, in <module>
    latents = pipe(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/pipelines/hidream_image/pipeline_hidream_image.py", line 782, in __call__
    noise_pred = self.transformer(
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 148, in new_forward
    output = function_reference.forward(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hidream_image.py", line 801, in forward
    enc_hidden_state = self.caption_projection[i](enc_hidden_state)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/hooks/hooks.py", line 148, in new_forward
    output = function_reference.forward(*args, **kwargs)
  File "/fsx/sayak/diffusers/src/diffusers/models/transformers/transformer_hidream_image.py", line 398, in forward
    hidden_states = self.linear(caption)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
  File "/fsx/sayak/miniconda3/envs/diffusers/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 125, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat2 in method wrapper_CUDA_mm)
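
For context, the failing setup boils down to block-level group offloading with a CUDA stream. A minimal sketch (assuming diffusers' apply_group_offloading and the HiDream-ai/HiDream-I1-Dev checkpoint; the actual reproducer is in the gist linked above):

```python
import torch
from diffusers import HiDreamImageTransformer2DModel
from diffusers.hooks import apply_group_offloading

# Only the transformer is loaded here for brevity; the gist runs the full
# HiDreamImagePipeline, but the traceback above points at the transformer.
transformer = HiDreamImageTransformer2DModel.from_pretrained(
    "HiDream-ai/HiDream-I1-Dev", subfolder="transformer", torch_dtype=torch.bfloat16
)

# Block-level group offloading with a CUDA stream: this is the combination
# that ends in the cuda:0 / cpu device-mismatch error shown above.
apply_group_offloading(
    transformer,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="block_level",
    num_blocks_per_group=1,
    use_stream=True,
)
```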

Cc: @asomoza @a-r-r-o-w

a-r-r-o-w self-assigned this Apr 14, 2025
@a-r-r-o-w
Member

After looking into this, the error seems to occur only when low_cpu_mem_usage=True. Setting it to False in the reproducer script works as expected. I'll open a PR to try and fix the behaviour for the True case.

@a-r-r-o-w
Member

Edit: sorry, I wrote the wrong argument above by mistake. I meant that it does not error out with use_stream=False but does error out with use_stream=True. When it does error out, i.e. with use_stream=True, it does not matter what low_cpu_mem_usage is set to.

See https://huggingface.slack.com/archives/C065E480NN9/p1745226277598909?thread_ts=1744298274.325979&cid=C065E480NN9
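
In other words, a sketch of the current workaround (same assumptions as the sketch in the issue body): keep block-level offloading but disable the stream.

```python
# Reuses the `transformer` and imports from the sketch in the issue body.
# With use_stream=False the run completes as expected; low_cpu_mem_usage
# can be either True or False in this case.
apply_group_offloading(
    transformer,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="block_level",
    num_blocks_per_group=1,
    use_stream=False,
)
```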
