Description
Observed Behavior
When model actors are created through Xorbits actor pools, nvidia-smi lists lingering processes whose GPU memory entry reads 0MiB, even though device memory is actually occupied.
# nvidia-smi showing "0MiB" processes
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 964159 C ...sheng/ModelService/.venv/bin/python 0MiB |
| 1 N/A N/A 963976 C ...sheng/ModelService/.venv/bin/python 0MiB |
+-----------------------------------------------------------------------------------------+
# Actual GPU memory utilization (409MiB shown)
+-----------------------------------------+------------------------+----------------------+
| 1 NVIDIA L20 On | 00000000:02:00.0 Off | Off |
| N/A 57C P0 85W / 350W | 409MiB / 49140MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
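To cross-check the two views above without reading the tables by eye, per-process usage can be queried in CSV form. The following is only a minimal sketch: it assumes `nvidia-smi` is on `PATH`, and the helper names (`parse_compute_apps`, `zero_memory_pids`, `query_compute_apps`) are hypothetical, not part of any library.

```python
import subprocess

# nvidia-smi query that prints one "pid, used_memory" row per compute process.
QUERY = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' rows into a list of (pid, MiB or None)."""
    rows = []
    for line in csv_text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        pid, mem = parts
        # Non-numeric values such as "[N/A]" are kept as None.
        rows.append((int(pid), int(mem) if mem.isdigit() else None))
    return rows


def zero_memory_pids(rows):
    """Return the sorted PIDs that are listed with exactly 0 MiB."""
    return sorted({pid for pid, mib in rows if mib == 0})


def query_compute_apps():
    """Run nvidia-smi and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

Calling `zero_memory_pids(query_compute_apps())` on the affected machine should return the PIDs that appear with a 0MiB entry, which can then be compared against the sub-pool process IDs.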
**My code:**
import os

import xoscar as xo  # assumption: `xo` is the xoscar package


async def create_worker_actor_pool(address: str) -> "xo.MainActorPoolType":
    # Use "forkserver" everywhere except Windows, which only supports "spawn".
    subprocess_start_method = "forkserver" if os.name != "nt" else "spawn"
    return await xo.create_actor_pool(
        address=address,
        n_process=0,
        auto_recover="process",
        subprocess_start_method=subprocess_start_method,
    )


async def test_create_model_actor():
    # Set up the main pool; sub-pools are appended below.
    main_pool = await create_worker_actor_pool("localhost:9999")

    # Create sub-pools, each pinned to a different CUDA device.
    sub_pool_1 = await main_pool.append_sub_pool(env={"CUDA_VISIBLE_DEVICES": "1"})
    actor_1 = await xo.create_actor(ModelActor, address=sub_pool_1, ...)

    sub_pool_2 = await main_pool.append_sub_pool(env={"CUDA_VISIBLE_DEVICES": "0"})
    actor_2 = await xo.create_actor(ModelActor, address=sub_pool_2, ...)