Releases: xzuyn/Florence-2ner

0.2

13 Jul 16:38
ce50869
  • CPU or disk activation offloading, plus a "hybrid" offloading method: up to x MB of activations stay on the GPU, the next y MB overflow to CPU, and anything beyond that is moved to disk, so activations spill to the fastest tier that still has room.
  • unsloth's fixed gradient accumulation
  • rouge, google_bleu, and meteor evaluation metrics
  • overall code cleanup
  • reworked dataset config method
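
The hybrid offloading tiers described in the first item can be illustrated with a small sketch. This is not the project's implementation: `assign_tiers`, its parameters, and the greedy first-fit rule are all hypothetical, and it only decides where each activation would live, without moving any tensors.

```python
from typing import List


def assign_tiers(
    sizes_mb: List[float],
    gpu_budget_mb: float,  # the "x MB" GPU budget (hypothetical parameter name)
    cpu_budget_mb: float,  # the "y MB" CPU budget (hypothetical parameter name)
) -> List[str]:
    """Assign each saved activation to the fastest tier with room left.

    Assumption: activations are placed in the order they are saved, and a
    tier is skipped as soon as the next activation would exceed its budget.
    """
    gpu_used = 0.0
    cpu_used = 0.0
    placement: List[str] = []
    for size in sizes_mb:
        if gpu_used + size <= gpu_budget_mb:
            gpu_used += size
            placement.append("gpu")
        elif cpu_used + size <= cpu_budget_mb:
            cpu_used += size
            placement.append("cpu")
        else:
            # Disk is treated as unbounded in this sketch.
            placement.append("disk")
    return placement


if __name__ == "__main__":
    # Five 100 MB activations, 200 MB GPU budget, 150 MB CPU budget.
    print(assign_tiers([100, 100, 100, 100, 100], 200, 150))
    # → ['gpu', 'gpu', 'cpu', 'disk', 'disk']
```

Because the rule is first-fit, a small activation saved later can still land on the GPU if it fits in the remaining budget; a stricter overflow policy would stop using a tier once it has been exceeded.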