Open
Description
Hi, I was running geoarches.evaluation.eval_multistep according to the documentation here. The script automatically sets the device to 'cuda' on line 136 if it is available in the environment, and then fails with the error below.
Traceback (most recent call last):
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 150, in as_variable
obj = Variable(dims_, data_, *attrs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 380, in __init__
dims=dims, data=as_compatible_data(data, fastpath=fastpath), attrs=attrs
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 295, in as_compatible_data
data = np.asarray(data)
^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torch/_tensor.py", line 1194, in __array__
return self.numpy()
^^^^^^^^^^^^
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/workspace/geoarches/geoarches/evaluation/eval_multistep.py", line 301, in <module>
main()
File "/workspace/geoarches/geoarches/evaluation/eval_multistep.py", line 270, in main
labelled_metric_output = metric.compute()
^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torchmetrics/metric.py", line 699, in wrapped_func
value = _squeeze_if_scalar(compute(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/geoarches/geoarches/metrics/metric_base.py", line 153, in compute
outputs = metric.compute()
^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/torchmetrics/metric.py", line 699, in wrapped_func
value = _squeeze_if_scalar(compute(*args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/geoarches/geoarches/metrics/label_wrapper.py", line 232, in compute
return self._convert(self.metric.compute())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/geoarches/geoarches/metrics/label_wrapper.py", line 220, in _convert
ds = xr.Dataset(
^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/core/dataset.py", line 389, in __init__
variables, coord_names, dims, indexes, _ = merge_data_and_coords(
^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/structure/merge.py", line 1082, in merge_data_and_coords
return merge_core(
^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/structure/merge.py", line 707, in merge_core
collected = collect_variables_and_indexes(aligned, indexes=indexes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/structure/merge.py", line 370, in collect_variables_and_indexes
variable = as_variable(variable, name=name, auto_convert=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/envs/py_3.12/lib/python3.12/site-packages/xarray/core/variable.py", line 152, in as_variable
raise error.__class__(
TypeError: Variable 'rankhist': Could not convert tuple of form (dims, data[, attrs, encoding]): (['prediction_timedelta', 'variable', 'rank'], tensor([[[2.7105, 1.1833, 0.9830, ..., 0.9851, 1.1592, 2.5510],
[2.4572, 1.0356, 0.8853, ..., 1.0872, 1.2597, 2.2862],
[4.0011, 1.3853, 1.0745, ..., 0.8121, 0.9961, 2.5702],
[2.8899, 1.1743, 0.9555, ..., 1.1012, 1.4019, 3.4411]],
...,
[[2.2738, 1.3107, 1.0814, ..., 1.1106, 1.3861, 2.6040],
[2.1716, 1.2771, 1.0447, ..., 1.1664, 1.4339, 2.5566],
[2.5498, 1.1100, 0.8741, ..., 1.2165, 1.7302, 4.7467],
[2.7417, 1.3088, 1.0228, ..., 1.1436, 1.5601, 4.1013]]],
device='cuda:0')) to Variable.
Changing 'cuda' to 'cpu' solved the issue and did not hinder performance significantly, but you may want to look into this because the script fails when run according to the documentation.
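A possible library-side alternative to forcing the device to 'cpu' would be to copy the metric outputs to host memory just before they reach xr.Dataset, since the traceback shows the conversion failing in _convert in label_wrapper.py. Below is a minimal sketch of that idea, not the actual geoarches code. The helper names to_host_array and host_data_vars are hypothetical, and I am assuming the data passed to xr.Dataset is a dict of (dims, data) tuples, as the error message suggests. The tensor check is duck-typed so the sketch runs without torch installed.

```python
import numpy as np


def to_host_array(value):
    """Return `value` as a NumPy array, copying tensors to host memory first.

    Hypothetical helper, not part of geoarches.
    """
    # torch.Tensor exposes both .detach() and .cpu(); checking for them
    # avoids importing torch in this sketch.
    if hasattr(value, "detach") and hasattr(value, "cpu"):
        value = value.detach().cpu()
    return np.asarray(value)


def host_data_vars(data_vars):
    """Map {name: (dims, data)} to the same structure with host-side data.

    Assumes each entry is a (dims, data) tuple, as in the 'rankhist'
    variable shown in the error message.
    """
    return {
        name: (dims, to_host_array(data))
        for name, (dims, data) in data_vars.items()
    }
```

With something like this, the call in _convert could pass host_data_vars(...) to xr.Dataset, so the metrics could still be accumulated on the GPU and only the final, small outputs would be copied to the CPU.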