File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/arrow_writer.py", line 626, in write_batch
arrays.append(pa.array(typed_sequence))
File "pyarrow/array.pxi", line 255, in pyarrow.lib.array
File "pyarrow/array.pxi", line 117, in pyarrow.lib._handle_arrow_array_protocol
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/arrow_writer.py", line 258, in __arrow_array__
out = cast_array_to_feature(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 1798, in wrapper
return func(array, *args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 2006, in cast_array_to_feature
arrays = [
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 2007, in <listcomp>
_c(array.field(name) if name in array_fields else null_array, subfeature)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 1798, in wrapper
return func(array, *args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 2066, in cast_array_to_feature
casted_array_values = _c(array.values, feature.feature)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 1798, in wrapper
return func(array, *args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 2103, in cast_array_to_feature
return array_cast(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 1798, in wrapper
return func(array, *args, **kwargs)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/table.py", line 1949, in array_cast
raise TypeError(f"Couldn't cast array of type {_short_str(array.type)} to {_short_str(pa_type)}")
TypeError: Couldn't cast array of type string to null
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/load.py", line 2084, in load_dataset
builder_instance.download_and_prepare(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/builder.py", line 925, in download_and_prepare
self._download_and_prepare(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/builder.py", line 1649, in _download_and_prepare
super()._download_and_prepare(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/builder.py", line 1001, in _download_and_prepare
self._prepare_split(split_generator, **prepare_split_kwargs)
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/builder.py", line 1487, in _prepare_split
for job_id, done, content in self._prepare_split_single(
File "/home/ubuntu/miniconda3/lib/python3.10/site-packages/datasets/builder.py", line 1644, in _prepare_split_single
raise DatasetGenerationError("An error occurred while generating the dataset") from e
datasets.exceptions.DatasetGenerationError: An error occurred while generating the dataset
Describe the bug
Loading the dataset fails with the traceback above.
datasets==3.5.1
What's wrong?
Its inner JSON structure is like
I'm 100% sure all the JSON files satisfy the above-mentioned format.
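One way to double-check this claim: the traceback goes through a struct field and then a list's values before failing with `string to null`, which would be consistent with a nested list that is empty or all-null in some records and holds strings in others. That is only a guess from the traceback, not a confirmed cause. The sketch below uses only the standard library to report every nested path whose observed JSON types differ across records. The helper names (`collect_types`, `find_mixed_paths`, `load_records`) and the assumption of one JSON document per `*.json` file are hypothetical and not part of `datasets`.

```python
import json
from collections import defaultdict
from pathlib import Path


def collect_types(value, path, seen):
    """Record the JSON type observed at each nested path."""
    if isinstance(value, dict):
        for key, child in value.items():
            collect_types(child, f"{path}.{key}" if path else key, seen)
    elif isinstance(value, list):
        if not value:
            # An empty list gives type inference nothing to work with.
            seen[path + "[]"].add("empty")
        for child in value:
            collect_types(child, path + "[]", seen)
    else:
        seen[path].add("null" if value is None else type(value).__name__)


def find_mixed_paths(records):
    """Return the paths that were seen with more than one type."""
    seen = defaultdict(set)
    for record in records:
        collect_types(record, "", seen)
    return {path: sorted(types) for path, types in seen.items() if len(types) > 1}


def load_records(folder):
    """Yield one parsed document per *.json file (assumed layout)."""
    for file in sorted(Path(folder).glob("*.json")):
        with open(file, encoding="utf-8") as handle:
            yield json.load(handle)
```

Calling `find_mixed_paths(load_records("path/to/json/folder"))` (the folder name is a placeholder) returns an empty dict when every path has a single consistent type. Any path listed with `null` or `empty` next to `str` would be a candidate for the failing cast.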
Steps to reproduce the bug
Expected behavior
Load the dataset successfully, with the above-mentioned JSON format and WebP images.
Environment info
Copy-and-paste the text below in your GitHub issue.
datasets version: 3.5.1
huggingface_hub version: 0.30.2
fsspec version: 2025.3.0