
Cancel does not work #5

Open
picobyte opened this issue Oct 4, 2023 · 3 comments

Comments

@picobyte

picobyte commented Oct 4, 2023

I have a Tesla M10 (4 GPUs, passively cooled) that overheats easily. At 95 °C a GPU becomes unusable until reboot. Controlling which GPU is active and which one cools down via your extension has several problems:

  • Cancel queue does not work. If you cancel a txt2img job, the process keeps running.
  • URL from string does not work if you link it in via a node. I was using an increment node and the GWAS extension to activate either the GPUs on ports 29170 and 29172 or those on 29171 and 29173, but feeding the URL in as a string input gives a NoneType error at ComfyUI/custom_nodes/ComfyUI_NetDist/nodes/remote_control.py:165 (see the sketch after this list).
    Sorry, I lost the exact error message, but it was about new_prompt[i] being None.
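Purely for illustration, this is the kind of None guard that would avoid that traceback; the helper name and the shape of new_prompt are guesses on my part, not the actual remote_control.py code:

```python
# Hypothetical guard, not the real ComfyUI_NetDist code. Assumes new_prompt is a
# list of per-remote prompt dicts that can end up holding None when the URL
# arrives through a node link instead of the widget.
def safe_entries(new_prompt, fallback_prompt):
    safe = []
    for i, entry in enumerate(new_prompt):
        if entry is None:
            print(f"[NetDist] new_prompt[{i}] is None, falling back to the local prompt")
            safe.append(fallback_prompt)
        else:
            safe.append(entry)
    return safe
```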
@city96
Owner

city96 commented Oct 5, 2023

Huh, interesting idea to round-robin the GPUs. Might even work on my cards. They don't reach shutdown temp but they do thermal throttle (P40s).

Anyway, I don't know if custom nodes get notified when a workflow gets cancelled, but I'll try to figure something out. I can only realistically mess with my multi-GPU setup on the weekend so I'll try to get back to you on this.

(There's a "rewrite" branch but I'm not sure that fixes either of your issues.)

@picobyte
Author

picobyte commented Oct 6, 2023

For the temperature issue I'll try a workaround using temperature protection. If you're interested, this is the attempted workflow to switch GPUs: multi_gpu_test.json (currently not working). Possibly this can be done better; I am just starting out with ComfyUI.
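A minimal sketch of what that temperature check could look like outside of ComfyUI, using pynvml (nvidia-ml-py); the 80 °C cut-off is an arbitrary value I picked, nothing here comes from the extension:

```python
# Sketch: pick the coolest GPU of the M10 before queuing work to it.
# Assumes pynvml (pip install nvidia-ml-py); the threshold is made up.
import pynvml

def coolest_gpu(max_temp_c=80):
    pynvml.nvmlInit()
    try:
        temps = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            temps.append((pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU), i))
        temp, index = min(temps)
        # Return None if even the coolest card is too hot, i.e. let everything cool down.
        return index if temp < max_temp_c else None
    finally:
        pynvml.nvmlShutdown()

if __name__ == "__main__":
    print("coolest usable GPU:", coolest_gpu())
```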

However, I also wonder what the benefit is of one workflow controlling the GPUs versus running ComfyUI multiple times (see the launcher sketch after the footnote). I think it would be better if dedicated tasks were dispatched to distinct GPUs, e.g. one GPU for adding noise, another for the UNET, one for reconstruction, and maybe one for preview image generation[1], or something like that. Alternatively, subsequent cycles could run on distinct GPUs. I mean this only as my (naive) concept of the ideal distribution of work. Or maybe averaging(?) of parallel-run cycles or so.
[1] https://huggingface.co/blog/stable_diffusion
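For the "running ComfyUI multiple times" option, this is roughly how I would launch one instance per GPU, assuming ComfyUI's --port and --cuda-device arguments; the path and ports are placeholders:

```python
# Rough sketch: launch one ComfyUI instance per GPU of the M10, each on its own
# port. Assumes ComfyUI accepts --port and --cuda-device; adjust the path.
import subprocess
import sys

COMFY_DIR = "/path/to/ComfyUI"  # placeholder
BASE_PORT = 29170               # matches the ports mentioned above

procs = []
for gpu in range(4):
    procs.append(subprocess.Popen(
        [sys.executable, "main.py",
         "--port", str(BASE_PORT + gpu),
         "--cuda-device", str(gpu)],
        cwd=COMFY_DIR,
    ))

for p in procs:
    p.wait()
```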

@city96
Owner

city96 commented Oct 7, 2023

Okay, so I tried making a round-robin node to switch the URLs, but // is interpreted as a comment... I'll get back to this once I find out where that logic lives in ComfyUI.

(screenshot attached)
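For reference, one possible shape of such a node against the standard custom-node interface; treating the URL as a plain STRING output is an assumption (the NetDist remote nodes may expect something else), and it doesn't solve the // quirk in the widget:

```python
# Sketch of a round-robin URL picker as a ComfyUI custom node.
# Assumes the downstream remote node accepts the URL as a STRING input.
class RoundRobinURL:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {
            "urls": ("STRING", {"default": "http://127.0.0.1:29170,http://127.0.0.1:29171"}),
            "index": ("INT", {"default": 0, "min": 0, "max": 0xFFFFFFFF}),
        }}

    RETURN_TYPES = ("STRING",)
    FUNCTION = "pick"
    CATEGORY = "remote"

    def pick(self, urls, index):
        # Split a comma-separated list and cycle through it by index.
        candidates = [u.strip() for u in urls.split(",") if u.strip()]
        return (candidates[index % len(candidates)],)

NODE_CLASS_MAPPINGS = {"RoundRobinURL": RoundRobinURL}
```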

As for the cancel, I added some simple logic to clear the queue before starting a new job. This isn't optimal, since the running job keeps going even after you cancel it. I guess I could break it out into a separate "cancel all jobs" node, but it would be much cleaner if there were a way for custom nodes to be notified when a workflow is canceled/interrupted. I already asked comfy, so I guess we'll just have to wait for now.
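In the meantime, a stop-gap sketch of what a "cancel all jobs" node could boil down to over plain HTTP, assuming the remotes expose the stock /queue and /interrupt endpoints (this is not the exact code on the branch):

```python
# Sketch: clear the pending queue and interrupt the running job on each remote
# ComfyUI instance, using the stock /queue and /interrupt HTTP routes.
import requests

def cancel_remote_jobs(remote_urls, timeout=5):
    for url in remote_urls:
        base = url.rstrip("/")
        requests.post(f"{base}/queue", json={"clear": True}, timeout=timeout)  # drop pending jobs
        requests.post(f"{base}/interrupt", timeout=timeout)                    # stop the running one

cancel_remote_jobs(["http://127.0.0.1:29170", "http://127.0.0.1:29171"])
```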

(Sorry, the readme is still a mess, I'll try to clean it up and then I'll merge the rewrite branch into the main one if everything works.)
