Can PQL-D be run on one GPU?

I ran the following command:
```
python scripts/train_pql.py task=FrankaCubeStack algo.num_gpus=1 algo.p_learner_gpu=0 algo.v_learner_gpu=0 algo.distl=True algo.cri_class=DistributionalDoubleQ
```

However, I get the following error:

```
(PQLVLearner pid=88771) CUDA error: device-side assert triggered
(PQLVLearner pid=88771) CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
(PQLVLearner pid=88771) For debugging consider passing CUDA_LAUNCH_BLOCKING=1
(PQLVLearner pid=88771) Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
(PQLVLearner pid=88771) Traceback (most recent call last):
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/ray/_private/serialization.py", line 404, in deserialize_objects
(PQLVLearner pid=88771)     obj = self._deserialize_object(data, metadata, object_ref)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/ray/_private/serialization.py", line 270, in _deserialize_object
(PQLVLearner pid=88771)     return self._deserialize_msgpack_data(data, metadata_fields)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/ray/_private/serialization.py", line 225, in _deserialize_msgpack_data
(PQLVLearner pid=88771)     python_objects = self._deserialize_pickle5_data(pickle5_data)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/ray/_private/serialization.py", line 215, in _deserialize_pickle5_data
(PQLVLearner pid=88771)     obj = pickle.loads(in_band)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/storage.py", line 414, in _load_from_bytes
(PQLVLearner pid=88771)     return torch.load(io.BytesIO(b), weights_only=False)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/serialization.py", line 1114, in load
(PQLVLearner pid=88771)     return _legacy_load(
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/serialization.py", line 1348, in _legacy_load
(PQLVLearner pid=88771)     result = unpickler.load()
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/serialization.py", line 1281, in persistent_load
(PQLVLearner pid=88771)     obj = restore_location(obj, location)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/serialization.py", line 414, in default_restore_location
(PQLVLearner pid=88771)     result = fn(storage, location)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/serialization.py", line 392, in _deserialize
(PQLVLearner pid=88771)     return obj.to(device=device)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/storage.py", line 187, in to
(PQLVLearner pid=88771)     return _to(self, device, non_blocking)
(PQLVLearner pid=88771)   File "/home/stao/miniforge3/envs/pql/lib/python3.8/site-packages/torch/_utils.py", line 90, in _to
(PQLVLearner pid=88771)     untyped_storage.copy_(self, non_blocking)
(PQLVLearner pid=88771) RuntimeError: CUDA error: device-side assert triggered
(PQLVLearner pid=88771) CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
(PQLVLearner pid=88771) For debugging consider passing CUDA_LAUNCH_BLOCKING=1
(PQLVLearner pid=88771) Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
(PQLVLearner pid=88771)
```
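
The log itself suggests passing `CUDA_LAUNCH_BLOCKING=1`, since the reported stack trace may not point at the kernel that actually failed. Below is a minimal sketch of rerunning the same command with that variable set. The helper names `debug_env` and `run_debug` are hypothetical and not part of the PQL repository, and the sketch assumes the Ray worker processes inherit the launcher's environment, which may not hold for every Ray setup.

```python
import os
import shlex
import subprocess

# The exact command from the report above, unchanged.
TRAIN_CMD = (
    "python scripts/train_pql.py task=FrankaCubeStack algo.num_gpus=1 "
    "algo.p_learner_gpu=0 algo.v_learner_gpu=0 algo.distl=True "
    "algo.cri_class=DistributionalDoubleQ"
)


def debug_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    Hypothetical helper: copies `base_env` (or os.environ when omitted) and
    sets CUDA_LAUNCH_BLOCKING=1, the variable named in the error message.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def run_debug():
    """Launch the training command with the debug environment.

    Assumption: child processes (including Ray workers) inherit this
    environment. Returns the process exit code.
    """
    result = subprocess.run(shlex.split(TRAIN_CMD), env=debug_env(), check=False)
    return result.returncode
```

Calling `run_debug()` from the repository root should reproduce the failure, ideally with a traceback that points closer to the operation that triggered the assert. Synchronous launches slow training down, so this is only meant for debugging.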

I'm aware that the instructions use the default number of GPUs, so maybe that is the cause? Any help on this is appreciated!
