JSON export is empty when there is blob type #2014

Closed
@cfahlgren1

Description

What happens?

For example, take the Parquet file for this dataset, which has the following schema:

0 id VARCHAR false null false
1 prompt VARCHAR false null false
2 chosen STRUCT(bytes BLOB, path VARCHAR) false null false
3 rejected STRUCT(bytes BLOB, path VARCHAR) false null false
4 chosen_model VARCHAR false null false
5 rejected_model VARCHAR false null false
6 evolution VARCHAR false null false
7 category VARCHAR false null false
8 sub_category VARCHAR false null false

The JSON export seems to produce 5 empty rows:

copy (select * from dataset limit 5) to 'output.json';

However, the equivalent CSV and Parquet exports work properly, for example:

copy (select * from dataset limit 5) to 'output.csv';

To Reproduce

You can reproduce this with the following queries in the DuckDB Shell:

https://shell.duckdb.org/#queries=v0,create-view-dataset-as-select-*-from-read_parquet('https%3A%2F%2Fhuggingface.co%2Fdatasets%2Fdata%20is%20better%20together%2Fopen%20image%20preferences%20v1%20binarized%2Fresolve%2Fmain%2Fdata%2Ftrain%2000001%20of%2000005.parquet')~,copy-(select-*-from-dataset-limit-5)-to-'output.csv'~,copy-(select-*-from-dataset-limit-5)-to-'output.json'~,.files-download-output.json,.files-download-output.csv
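To narrow this down without the remote dataset, a smaller check along the following lines might help. This is only a sketch: the table name `blob_repro`, the output file names, and the sample values are hypothetical, and it has not been confirmed that this triggers the same empty output. It builds one row whose `chosen` column has the same `STRUCT(bytes BLOB, path VARCHAR)` shape as in the schema above, then exports it to JSON with and without that column so the two files can be compared.

```sql
-- Hypothetical table mirroring the struct-with-BLOB column from the schema above.
create table blob_repro as
select
    'row-1' as id,
    {'bytes': '\xAA\xBB'::BLOB, 'path': 'a.png'} as chosen;

-- Same export pattern as in the report, run against the small table.
copy (select * from blob_repro) to 'blob_repro.json';

-- Variant that leaves out the struct column, for comparison.
copy (select id from blob_repro) to 'blob_repro_no_blob.json';
```

If only the first file comes out empty, that would point at the BLOB field inside the struct rather than at the remote Parquet source.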

Browser/Environment:

Chrome Version 136

Device:

Mac

DuckDB-Wasm Version:

@duckdb/[email protected] (latest DuckDB Shell)

DuckDB-Wasm Deployment:

shell.duckdb.org

Full Name:

Caleb Fahlgren

Affiliation:

Hugging Face
