Higher-level application

The PyNVML bindings are great for doing all GPU information management from Python, but they are almost entirely a one-to-one copy of the C API. This can be a barrier for Python users, who have to work out from the [NVML API documentation](https://docs.nvidia.com/deploy/nvml-api/index.html) what the API provides and then which types need to be passed, etc. We currently use PyNVML in both [Distributed](https://github.com/dask/distributed) and [Dask-CUDA](https://github.com/rapidsai/dask-cuda), but the two uses overlap, which leads to code duplication.

I feel one way to reduce code duplication and make things easier for new users is to provide a "High-level PyNVML library" that takes care of users' basic needs. For example, I would imagine something like the following (but not limited to it) being available (implementation omitted for simplicity):

```python
from typing import Optional, Union


class Handle:
    """A handle to a GPU device.

    Parameters
    ----------
    index: int, optional
        Integer representing the CUDA device index to get a handle to.
    uuid: bytes or str, optional
        UUID of a CUDA device to get a handle to.

    Raises
    ------
    ValueError
        If neither `index` nor `uuid` are specified or if both are specified.
    """
    def __init__(
        self, index: Optional[int] = None, uuid: Optional[Union[bytes, str]] = None
    ):
        ...  # implementation omitted

    @property
    def free_memory(self) -> int:
        """
        Free memory of the CUDA device.
        """

    @property
    def total_memory(self) -> int:
        """
        Total memory of the CUDA device.
        """

    @property
    def used_memory(self) -> int:
        """
        Used memory of the CUDA device.
        """
```
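
To make the interface above a bit more concrete, here is a minimal sketch of how it could sit on top of the existing bindings. It assumes `pynvml` is installed and relies only on `nvmlInit`, `nvmlDeviceGetHandleByIndex`, `nvmlDeviceGetHandleByUUID` and `nvmlDeviceGetMemoryInfo`. The `_handle` attribute, the lazy import and the decision to call `nvmlInit` in the constructor are my own illustrative choices, not a settled design. One caveat: as far as I know, the index NVML expects is its own device index, which is not necessarily the CUDA device index when `CUDA_VISIBLE_DEVICES` is set, so a real implementation would probably need to translate between the two.

```python
from typing import Optional, Union


class Handle:
    """Sketch of a GPU handle backed by pynvml (illustrative, not final)."""

    def __init__(
        self, index: Optional[int] = None, uuid: Optional[Union[bytes, str]] = None
    ):
        # Exactly one of `index` / `uuid` must be given.
        if (index is None) == (uuid is None):
            raise ValueError("Specify exactly one of `index` or `uuid`.")

        # Imported lazily (assumption) so the module can be imported on
        # machines without pynvml or an NVIDIA driver.
        import pynvml

        self._nvml = pynvml
        # This sketch never calls nvmlShutdown; a real library would need
        # to decide who owns NVML initialization and teardown.
        pynvml.nvmlInit()
        if index is not None:
            # NOTE: treated as an NVML index here (see caveat above).
            self._handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        else:
            if isinstance(uuid, str):
                uuid = uuid.encode()
            self._handle = pynvml.nvmlDeviceGetHandleByUUID(uuid)

    def _memory(self):
        # Returns a struct with `total`, `free` and `used` fields, in bytes.
        return self._nvml.nvmlDeviceGetMemoryInfo(self._handle)

    @property
    def free_memory(self) -> int:
        """Free memory of the device, in bytes."""
        return self._memory().free

    @property
    def total_memory(self) -> int:
        """Total memory of the device, in bytes."""
        return self._memory().total

    @property
    def used_memory(self) -> int:
        """Used memory of the device, in bytes."""
        return self._memory().used
```

On a machine with a GPU, usage would then be as simple as `Handle(index=0).free_memory`, with no need to look up NVML types first.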

There would be more to cover than the above, such as getting the number of GPUs available in the system, whether a GPU currently has a context created, whether a handle refers to a MIG instance or a physical GPU, etc. Additionally, we would have simple tools that are generally useful, for example [a small tool I wrote long ago to measure NVLink bandwidth and peak memory](https://gist.github.com/pentschev/2e2e3fe8059240f2679b6f7002faa891), and whatever else fits in the scope of a "High-level PyNVML library" that can make our users' lives easier.
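
As a rough illustration of two of those extras, the sketch below counts the GPUs and checks whether a given process shows up as a compute process on a device. The function names `device_count` and `has_context` and the injectable `nvml` argument are hypothetical (the latter only exists so the logic can be exercised without a GPU). The NVML calls themselves (`nvmlDeviceGetCount`, `nvmlDeviceGetComputeRunningProcesses`) are existing bindings. Treating "appears in the compute process list" as "has a context" is an approximation on my part.

```python
import os


def device_count(nvml=None) -> int:
    """Return the number of GPUs visible to NVML."""
    if nvml is None:
        import pynvml as nvml  # lazy import; assumes pynvml is installed
    nvml.nvmlInit()
    try:
        return nvml.nvmlDeviceGetCount()
    finally:
        nvml.nvmlShutdown()


def has_context(index: int, pid=None, nvml=None) -> bool:
    """Approximate check: does `pid` run a compute process on GPU `index`?

    `pid` defaults to the current process.
    """
    if nvml is None:
        import pynvml as nvml  # lazy import; assumes pynvml is installed
    pid = os.getpid() if pid is None else pid
    nvml.nvmlInit()
    try:
        handle = nvml.nvmlDeviceGetHandleByIndex(index)
        # Each entry describes one compute process and exposes a `pid` field.
        procs = nvml.nvmlDeviceGetComputeRunningProcesses(handle)
        return any(p.pid == pid for p in procs)
    finally:
        nvml.nvmlShutdown()
```

Both helpers initialize and shut down NVML on every call, which keeps the sketch simple; a real library would likely manage that once, centrally.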

So to begin this discussion, I would like to know how people like @rjzamora and @kenhester feel about this idea. Would this fit in the scope of this project? Are there any impediments to adding such a library to this project/repository?

Also cc @quasiben for vis.
