Replies: 2 comments 1 reply
-
Hey @RayZ0rr ! Nice to see there's interest in using arrow for this.
In the VariableShapeTensor Python PR we propose
    dims_np = pa.array(shapes[i], pa.list_(pa.int32(), 2))[0].values.to_numpy()
    o = data_np.reshape(dims_np)
-
FixedSizeListArray is more memory efficient (it doesn't require an offsets buffer like the ListArray) and we use FixedSizeListArray in the VariableShapeTensorArray specification for storing shapes. So if you'll switch to
Sorry, my example was not great.
-
I want to use objects like NumPy ndarrays or PyTorch tensors in which at least one dimension varies in size. For example, consider a list of 2D point clouds. Each point cloud has shape (N, 2), where N can differ from one point cloud to the next. pyarrow.FixedShapeTensorType doesn't work for this use case, and the VariableShapeTensor implementations 1 and 2 have not been merged yet. While waiting for these merges, I have implemented this in the following way, aiming for zero-copy retrieval of the original list of variable-shape tensors from the pyarrow table.

Use two columns: one of type pyarrow.ListType with the child type the same as the dtype of the tensor, and another of type pyarrow.ListType with an int32 child.

Name one column "points_val" and the other "points_shape". Each element of "points_val" is the flattened list of values of a single tensor (view(-1) or reshape(-1)). Each element of "points_shape" holds the shape of that tensor.

Does anyone know of a better way, or think this is not zero-copy?