
undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2 #43

Open
@kinredon

Description


When I run the following code to get GPU process information:

import psutil
import pynvml  # import the NVML bindings

UNIT = 1024 * 1024

pynvml.nvmlInit()
gpuDeriveInfo = pynvml.nvmlSystemGetDriverVersion()

gpuDeviceCount = pynvml.nvmlDeviceGetCount()

for i in range(gpuDeviceCount):
    handle = pynvml.nvmlDeviceGetHandleByIndex(i)  # get the handle for GPU i; later calls use this handle

    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)  # processes running on this GPU
    for pidInfo in pidAllInfo:
        pidUser = psutil.Process(pidInfo.pid).username()  # look up the process owner via psutil
        print("process pid:", pidInfo.pid, "user:", pidUser,
              "GPU memory used:", pidInfo.usedGpuMemory / UNIT, "MiB")  # memory used by this pid

pynvml.nvmlShutdown()  # finally shut down NVML

but I get an error like this:

Traceback (most recent call last):
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/nvidia-430/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "gpu_info.py", line 21, in <module>
    pidAllInfo = pynvml.nvmlDeviceGetComputeRunningProcesses(handle)#获取所有GPU上正在运行的进程信息
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/mnt/data0/home/dengjinhong/miniconda3/envs/python3/lib/python3.6/site-packages/pynvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.NVMLError_FunctionNotFound: Function Not Found

Here is the nvidia-smi information:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.64       Driver Version: 430.64       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 20%   26C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 20%   28C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:82:00.0 Off |                  N/A |
| 20%   24C    P8     9W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 20%   27C    P8     8W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

The installed nvidia-ml-py version is 11.495.46. Why does this happen?
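For reference, a quick ctypes check can confirm whether the driver's libnvidia-ml actually exports the versioned symbol. This is only a diagnostic sketch; the library path is taken from the traceback above and may differ on other systems.

import ctypes

# Probe the driver library from the traceback for the NVML symbols.
lib = ctypes.CDLL("/usr/lib/nvidia-430/libnvidia-ml.so.1")
for name in ("nvmlDeviceGetComputeRunningProcesses",
             "nvmlDeviceGetComputeRunningProcesses_v2"):
    # ctypes raises AttributeError (as in the traceback) when a symbol is missing
    print(name, "->", "exported" if hasattr(lib, name) else "missing")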

Activity

fbcotter commented on Apr 20, 2023

This also happens on the latest version, which for me (NVIDIA driver 470) now tries to call nvmlDeviceGetComputeRunningProcesses_v3. I think the older function nvmlDeviceGetComputeRunningProcesses is still available when I try it. Maybe we could add a try-except here?
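For illustration, a minimal user-side sketch of that idea (the helper name is made up; it simply swallows the missing-symbol error rather than implementing a real fallback inside pynvml):

import pynvml

def list_compute_procs(handle):
    # Hypothetical helper: prefer the normal pynvml call, but tolerate older
    # drivers whose libnvidia-ml does not export the newer _v2/_v3 symbols.
    try:
        return pynvml.nvmlDeviceGetComputeRunningProcesses(handle)
    except pynvml.NVMLError_FunctionNotFound:
        # Older driver: no per-process query available through this wrapper,
        # so report an empty list instead of crashing.
        return []

A proper fix inside pynvml would presumably fall back to the unversioned nvmlDeviceGetComputeRunningProcesses entry point before giving up, as suggested above.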
