support for gpu queue #3642
Conversation
gputils is required for gpu queue management
Codecov Report

Attention: Patch coverage is

@@ Coverage Diff @@
##           master    #3642   +/- ##
=======================================
  Coverage   63.44%   63.45%
=======================================
  Files         308      308
  Lines       40891    40921    +30
  Branches     5657     5665     +8
=======================================
+ Hits        25945    25966    +21
- Misses      13910    13916     +6
- Partials     1036     1039     +3
Just to check my understanding: in this model, a GPU-enabled job gets exclusive access to one full GPU, so the GPU queue simply compares the number of available GPUs against the number of GPU-enabled jobs? There's no notion of a job acquiring multiple GPUs or partial GPUs?

From some quick searching, it's at least possible (though I don't know how common) to write programs that utilize multiple GPUs, so I think we should allow nodes to be tagged with multiple GPU threads. If the CPU usage of a process is negligible, I think it would be reasonable to say:

myproc = pe.Node(ProcessInterface(), n_threads=0, n_gpus=2)
In the current implementation the user specifies how many GPU slots (n_gpu_procs) the plugin should manage, and the plugin reserves those slots based on the node.n_threads property. If you think it's useful, we can allow the user to specify different values of "gpu_procs" and "cpu_procs" for each node.
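To make the discussion concrete, here is a minimal sketch of the slot bookkeeping being described: the scheduler tracks free CPU and GPU slots and only admits a node when both its CPU and GPU requests fit, raising an error for requests that can never fit. This is an illustration only, not the PR's actual code; the class name `SlotTracker` and its methods are hypothetical.

```python
class SlotTracker:
    """Hypothetical sketch of CPU/GPU slot accounting in a MultiProc-style scheduler."""

    def __init__(self, n_procs, n_gpu_procs):
        # Total capacity configured by the user (e.g. via plugin_args).
        self.n_procs = n_procs
        self.n_gpu_procs = n_gpu_procs
        # Slots currently free.
        self.free_procs = n_procs
        self.free_gpu_procs = n_gpu_procs

    def can_run(self, n_threads, n_gpus=0):
        # A node is runnable only if both its CPU and GPU requests fit.
        return n_threads <= self.free_procs and n_gpus <= self.free_gpu_procs

    def acquire(self, n_threads, n_gpus=0):
        # Mirror the plugin's behavior of rejecting nodes that exceed capacity.
        if n_threads > self.n_procs or n_gpus > self.n_gpu_procs:
            raise RuntimeError("node requests more slots than the plugin manages")
        if not self.can_run(n_threads, n_gpus):
            return False  # not enough free slots right now; try again later
        self.free_procs -= n_threads
        self.free_gpu_procs -= n_gpus
        return True

    def release(self, n_threads, n_gpus=0):
        # Return slots when the node finishes.
        self.free_procs += n_threads
        self.free_gpu_procs += n_gpus
```

With separate per-node "cpu_procs" and "gpu_procs" values, a CPU-negligible GPU node would simply call `acquire(0, 2)` instead of consuming a CPU slot.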
I wrote a simpler implementation of this old pull request to handle a queue of threads to be executed on the GPU.
The user can specify the maximum number of parallel GPU threads with the plugin option n_gpu_procs.
The MultiProc plugin will raise an exception if a node requires more threads than allowed, in the same way as for classic CPU threads.
Note that in this implementation any GPU node also allocates a CPU slot (is that necessary? We can change that behavior).
Moreover, the plugin doesn't check that the system actually has a CUDA-capable GPU (we can add such a check if you think we need it).
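If such a check is wanted, one cheap and dependency-free heuristic is to look for the `nvidia-smi` binary on the PATH, since it ships with the CUDA driver stack. This is a sketch of one possible approach, not code from this PR; the function name `has_cuda_gpu` is hypothetical, and a library such as GPUtil could be used instead for a more thorough check.

```python
import shutil

def has_cuda_gpu():
    """Heuristic check: nvidia-smi on the PATH suggests an NVIDIA driver
    (and therefore, usually, a CUDA-capable GPU) is installed."""
    return shutil.which("nvidia-smi") is not None
```

The plugin could call this once at startup and emit a warning (or raise) when n_gpu_procs > 0 but no GPU is detected.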