Implements Offline APO #106

Closed
jdchang1 wants to merge 207 commits into main from wensun/offline_apo
jdchang1 (Collaborator) commented Jul 7, 2025

Main Changes:

  • Added a non-preference-based offline RL track
  • Offline Dataloader
  • Offline Model (HF/MPT)
  • Offline forward / loss

Backwards-incompatible change:
The offline_rl callback now refers to the single prompt/response case.
The pairwise_offline_rl callback is what offline_rl referred to previously.
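
For anyone updating existing run configs, the rename above could be applied with a small helper along these lines. This is only a sketch, not part of the PR: it assumes callbacks are listed by name as keys under a top-level `callbacks` mapping, and `migrate_callbacks` and `RENAMED_CALLBACKS` are hypothetical names introduced here for illustration.

```python
from copy import deepcopy

# Assumed mapping from the pre-PR callback name to its post-PR name.
RENAMED_CALLBACKS = {"offline_rl": "pairwise_offline_rl"}


def migrate_callbacks(config: dict) -> dict:
    """Return a copy of a pre-PR config with old callback names renamed.

    Assumes (hypothetically) that callbacks live under config["callbacks"]
    as a mapping of callback name -> keyword arguments.
    """
    migrated = deepcopy(config)
    callbacks = migrated.get("callbacks")
    if not callbacks:
        return migrated

    renamed = {}
    for name, kwargs in callbacks.items():
        new_name = RENAMED_CALLBACKS.get(name, name)
        # Refuse to merge two entries into one name silently.
        if new_name != name and new_name in callbacks:
            raise ValueError(
                f"config already contains {new_name!r}; it may be migrated already"
            )
        renamed[new_name] = kwargs
    migrated["callbacks"] = renamed
    return migrated


old_config = {"callbacks": {"offline_rl": {}, "lr_monitor": {}}}
new_config = migrate_callbacks(old_config)
print(list(new_config["callbacks"]))
```

The helper should only be run on configs written before this change: a config written afterwards may use offline_rl intentionally for the single prompt/response case, and renaming it would be wrong.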

Working runs are:
apo-single-stream-openr1-gBSwx1

6 participants