ValueError: Columns defined in both source dataset and generation config #444

@sn0rkmaiden

I am trying to create a large dataset with the script provided in the README.md. Here is my script:

hf jobs uv run --namespace $NAMESPACE \
  -s HF_TOKEN=$HF_TOKEN \
  https://github.com/huggingface/aisheets/raw/refs/heads/main/scripts/extend_dataset/with_inference_client.py \
  snork-maiden/ToySnacks snork-maiden/snacksbench \
  --config https://huggingface.co/datasets/snork-maiden/ToySnacks/raw/main/config.yml \
  --num-rows 100

I get the ValueError in the title about overlapping columns, and I don't understand what the problem is. I just want to create a larger dataset.
Could someone please explain how to solve this? Thanks in advance.
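
Going by the error text alone, the check seems to compare the column names in the source dataset with the column names defined in the generation config. Below is a minimal sketch of that comparison, in case it helps to spot which names collide. The helper and the column names are hypothetical, and this is not the script's actual code:

```python
def find_overlapping_columns(source_columns, config_columns):
    """Return the column names that appear in both lists, in source order."""
    config_set = set(config_columns)
    return [name for name in source_columns if name in config_set]


# Hypothetical column names, for illustration only; the real names would come
# from the source dataset and from config.yml.
source_columns = ["snack", "description"]
config_columns = ["description", "rating"]

overlap = find_overlapping_columns(source_columns, config_columns)
print(overlap)
```

If the result is non-empty, those are the names that exist on both sides.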
