README training section refers to non-existent scripts and directories #870

@DvrkRain

Description

Problem: The training section of the README references scripts and directories that do not exist in the repo (e.g., `scripts/target/08`). Since it states "But you can use your own datasets", doesn't that imply that `scripts/target/08` should already contain datasets from the author or development team? Will retraining the model fail if these paths are not specified?

aibolit/README.md

Lines 270 to 277 in c8a8d91

6. You have to specify train and test dataset: set the `HOME_TRAIN_DATASET`
environment variable
for train dataset and the `HOME_TEST_DATASET` environment variable for test
dataset.
Usually, these files are in `scripts/target/08` directory after dataset
collection (if you have not skipped it).
But you can use your own datasets.

Suggestion 1: Add the datasets (if they aren't private) to the GitHub repository.
Suggestion 2: Add the datasets (if they aren't private) to HuggingFace and build a dataset pull pipeline. There are some docs for that.
Suggestion 3: Find open datasets to serve as train and test examples and build a dataset pull pipeline.
Suggestion 4: Rewrite the sentence to make it clearer.
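
To make the "will it fail?" question concrete, here is a minimal sketch of a pre-flight check that could run before training. It is not taken from the aibolit code base: the function name `resolve_dataset_paths` and the error wording are hypothetical, and it assumes both variables are meant to point at regular files, as the README wording ("these files") suggests. Only the variable names `HOME_TRAIN_DATASET` and `HOME_TEST_DATASET` come from the README.

```python
import os
from pathlib import Path

# Variable names taken from the README; everything else here is hypothetical.
REQUIRED_VARS = ("HOME_TRAIN_DATASET", "HOME_TEST_DATASET")


def resolve_dataset_paths(env=None):
    """Return {variable name: Path}, or raise one error listing every problem.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    problems = []
    paths = {}
    for var in REQUIRED_VARS:
        value = env.get(var)
        if not value:
            problems.append(f"{var} is not set")
            continue
        path = Path(value)
        # Assumption: each variable points at a single dataset file.
        if not path.is_file():
            problems.append(f"{var} points to a missing file: {path}")
            continue
        paths[var] = path
    if problems:
        raise RuntimeError("Cannot start training: " + "; ".join(problems))
    return paths
```

With a check like this, a user who skipped dataset collection would get one message naming both missing variables instead of a failure somewhere inside the training step.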

Labels: bug (Something isn't working), good first issue (Good for newcomers), good-title (The title was checked by ChatGPT), help wanted (Extra attention is needed)