Description
When I run the following code to get GPU process information:
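import psutil
import pynvml  # NVML Python bindings

UNIT = 1024 * 1024  # bytes per MiB

pynvml.nvmlInit()
gpuDeriveInfo = pynvml.nvmlSystemGetDriverVersion()
gpuDeviceCount = pynvml.nvmlDeviceGetCount()
for i in range(gpuDeviceCount):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)  # get the handle for GPU i; later calls go through this handle
    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)  # get info on all processes running on the GPU
    for pidInfo in pidAllInfo:
        pidUser = psutil.Process(pidInfo.pid).username()  # look up the owning user via psutil
        print("Process pid:", pidInfo.pid, "user:", pidUser,
              "GPU memory used:", pidInfo.usedGpuMemory / UNIT, "MiB")  # GPU memory used by this pid
pynvml.nvmlShutdown()  # finally, shut down NVML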
but I get errors like this:
Traceback (most recent call last):
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/nvidia-430/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "gpu_info.py", line 21, in <module>
    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)  # get info on all processes running on the GPU
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.NVMLError_FunctionNotFound: Function Not Found
Here is the nvidia-smi information:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.64       Driver Version: 430.64       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 20%   26C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 20%   28C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 20%   24C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 20%   27C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
The version of nvidia-ml-py is 11.495.46. So why did this happen?
Activity
fbcotter commented on Apr 20, 2023
This is also happening on the latest version, which now tries to call nvmlDeviceGetComputeRunningProcesses_v3 for me with NVIDIA driver version 470. I think the older function nvmlDeviceGetComputeRunningProcesses is still available when I try it. Maybe we could add a try-except here?
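
A rough sketch of what such a try-except fallback might look like, written as a generic helper rather than a patch to pynvml itself. The helper name `call_first_available` and the two `get_processes_*` functions are hypothetical stand-ins so the example runs without a GPU; `FunctionNotFound` plays the role of `pynvml.NVMLError_FunctionNotFound` from the traceback above.

```python
from typing import Any, Callable, Sequence, Tuple, Type


def call_first_available(
    candidates: Sequence[Callable[..., Any]],
    *args: Any,
    missing: Tuple[Type[BaseException], ...] = (AttributeError,),
) -> Any:
    """Call each candidate in order and return the first result that does
    not fail with one of the `missing` exception types."""
    last_error = None
    for func in candidates:
        try:
            return func(*args)
        except missing as exc:
            # This variant is not available; remember why and try the next.
            last_error = exc
    if last_error is None:
        raise ValueError("no candidate functions were given")
    raise last_error


class FunctionNotFound(Exception):
    """Stand-in for pynvml.NVMLError_FunctionNotFound (hypothetical here)."""


def get_processes_v3(handle):
    # Hypothetical stand-in: simulates a driver that lacks the _v3 symbol.
    raise FunctionNotFound("nvmlDeviceGetComputeRunningProcesses_v3")


def get_processes_v1(handle):
    # Hypothetical stand-in: simulates the older call succeeding.
    return [{"pid": 1234, "usedGpuMemory": 512 * 1024 * 1024}]


procs = call_first_available(
    [get_processes_v3, get_processes_v1],
    "gpu0",
    missing=(FunctionNotFound,),
)
print(procs[0]["pid"])
```

With the real bindings, one would presumably pass `missing=(pynvml.NVMLError_FunctionNotFound,)` and list the versioned wrappers from newest to oldest. Whether a given nvidia-ml-py release exposes a Python wrapper for the unversioned symbol is an assumption that would need checking against the installed version.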