✨ Add dataset processors #184

Merged
merged 38 commits into from
Nov 14, 2024

Commits on Oct 18, 2024

  1. 1b9969d
  2. 7980c1e
  3. f29e9d2

Commits on Oct 19, 2024

  1. b86a2e0
  2. 0bef124
  3. 687301a
  4. d41d9f1
  5. Merge remote-tracking branch 'origin/datasets-map-processing' into datasets-map-processing

    # Conflicts:
    #	hezar/data/dataset_processors/sequence_labeling_processor.py
    arxyzan committed Oct 19, 2024
    5664078
  6. 64e511d
  7. 8f1f0b2
  8. ✨ Add OCR dataset processor

    arxyzan committed Oct 19, 2024
    9c6c0a4
  9. 18e67e8
  10. bb412f4
  11. 2318680

Commits on Oct 22, 2024

  1. dae7d31
  2. 705f197
  3. fc5c52d

Commits on Oct 25, 2024

  1. 95ce482
  2. 79687c9
  3. b9a2efb
  4. ✨ Return non-batched output when input is not batched in `Tokenizer` and `ImageProcessor`

    Only applies when `return_tensors="list"`
    arxyzan committed Oct 25, 2024
    7265263
  5. ✏️ Minor renamings

    arxyzan committed Oct 25, 2024
    e04da6e
  6. 95b566c

Commits on Oct 31, 2024

  1. 57870ed
  2. f37a80e
  3. 2c32b67
  4. 4cdf79d
  5. 6e2e3fa
  6. a1bd4c2
  7. 6288c07
  8. 🩹 Return decoder_attention_mask instead of attention_mask in image captioning dataset processor

    arxyzan committed Oct 31, 2024
    1e54f81

Commits on Nov 7, 2024

  1. d1515a6
  2. f6ae426
  3. e14d106

Commits on Nov 8, 2024

  1. 2b147ff

Commits on Nov 14, 2024

  1. aa12363
  2. 07ab703
  3. Merge branch 'refs/heads/main' into datasets-map-processing

    # Conflicts:
    #	hezar/data/data_collators.py
    #	hezar/preprocessors/tokenizers/tokenizer.py
    arxyzan committed Nov 14, 2024
    4866756