-
It isn't as simple as that. Speeds vary depending on settings, and the GPU is only part of it. I speed-test by running a simple training job with batch sizes 1, 2, 4 and 6 at whatever dim I'm interested in, then divide each time by the batch size. The result is curious: batch size 4 comes out fastest per sample. You'd expect the scaling to be constant, but it isn't: batch size 6 is slower than 4 yet faster than 2, and batch size 1 is the slowest of all. What's going on? I don't know exactly, but I do see a small CPU spike after every GPU cycle. Since I can't keep everything in GPU VRAM (I only have 16 GB of it), my gradient checkpoints go to disk, and I suspect that going back and forth is where the delay comes from. The only test you can do is literally testing: you can't know what you're getting, because you don't know what the exact hardware on the other side is or what else it is doing.
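A minimal sketch of that kind of batch-size sweep (not Kohya's actual training loop; it assumes only PyTorch and a CUDA GPU, and uses a dummy MLP as a stand-in workload) could look like this:

```python
# Minimal batch-size sweep sketch: times a dummy training step and reports
# both seconds per iteration and seconds per sample.
import time
import torch
import torch.nn as nn

def time_step(model, batch_size, dim=1024, steps=20, device="cuda"):
    """Return average seconds per optimizer step for one batch size."""
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    x = torch.randn(batch_size, dim, device=device)
    y = torch.randn(batch_size, dim, device=device)
    # Warm-up so one-time CUDA setup does not skew the measurement.
    for _ in range(3):
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
        opt.zero_grad()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(steps):
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()
        opt.zero_grad()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / steps

model = nn.Sequential(
    nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024)
).to("cuda")
for bs in (1, 2, 4, 6):
    sec = time_step(model, bs)
    # Dividing by the batch size gives the per-sample cost, which is the
    # number that makes the odd scaling visible.
    print(f"batch {bs}: {sec:.3f} s/it, {sec / bs:.3f} s/sample")
```

Raw s/it always grows with batch size; it's the per-sample column that shows the non-monotonic scaling described above.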
-
Hi all,
would it be possible (or maybe it exists already) to have a small Kohya script somewhere in the tools/ folder that does nothing else than dry-run a short training job, just to report a baseline training speed on your current machine (X seconds / iteration)?
Motivation:
I use RunPod to train models, as many others do (because I only own an office laptop). I always choose a Pod with an RTX 3090, but the performance still differs a lot: on some days I get 1.5 sec/it, and sometimes it gets really bad and drops to ~4.5 sec/it. I would like to test this before setting up Kohya with all its dependencies and requirements, to see as quickly as possible whether I caught a "good" or a "bad" Pod.
The training checkpoints themselves are completely irrelevant and wouldn't even need to be saved.
What do you think?
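For illustration, a rough sketch of the kind of standalone dry-run benchmark I have in mind (this is not an existing Kohya tool; it assumes only that torch is installed, uses a small synthetic conv workload, and saves nothing to disk, so the numbers are only useful for comparing pods, not as real LoRA training speeds):

```python
# Quick GPU dry-run benchmark: times a fixed synthetic workload and prints s/it.
import time
import torch
import torch.nn as nn

def benchmark(steps=50, batch_size=2, resolution=512, device="cuda"):
    # Small conv stack as a stand-in workload; no checkpoints are written.
    model = nn.Sequential(
        nn.Conv2d(4, 128, 3, padding=1),
        nn.SiLU(),
        nn.Conv2d(128, 128, 3, padding=1),
        nn.SiLU(),
        nn.Conv2d(128, 4, 3, padding=1),
    ).to(device)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    latents = torch.randn(
        batch_size, 4, resolution // 8, resolution // 8, device=device
    )

    # Warm-up steps are excluded from the timing.
    for _ in range(5):
        nn.functional.mse_loss(model(latents), latents).backward()
        opt.step()
        opt.zero_grad()
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(steps):
        nn.functional.mse_loss(model(latents), latents).backward()
        opt.step()
        opt.zero_grad()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / steps

if __name__ == "__main__":
    print(f"{benchmark():.2f} s/it on {torch.cuda.get_device_name(0)}")
```

Something this small could be run right after the Pod boots, before installing the full Kohya requirements, to decide whether to keep or discard the instance.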