Add KTO support for preference tuning #1538

Open
wants to merge 3 commits into
base: main
from

Conversation

efsiatras

Description

This PR introduces support for KTO (Kahneman-Tversky Optimization).
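
For context, a minimal sketch of the per-example KTO loss as formulated in the KTO paper may help reviewers. It is not the code added by this PR. The function name `kto_example_loss`, its parameters, and the default values are hypothetical, and the KL reference term is passed in as a plain number rather than estimated from a batch.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def kto_example_loss(
    policy_logp: float,   # log-probability of the completion under the policy
    ref_logp: float,      # log-probability of the completion under the reference model
    desirable: bool,      # True for a "thumbs up" example, False for "thumbs down"
    kl_ref: float = 0.0,  # reference point (KL estimate); assumed given here
    beta: float = 0.1,    # hypothetical default
    lambda_d: float = 1.0,  # weight for desirable examples (hypothetical default)
    lambda_u: float = 1.0,  # weight for undesirable examples (hypothetical default)
) -> float:
    """Illustrative per-example KTO loss: lambda_y - v(x, y)."""
    # Implied reward: log-ratio of policy to reference.
    reward = policy_logp - ref_logp
    if desirable:
        value = lambda_d * sigmoid(beta * (reward - kl_ref))
        return lambda_d - value
    value = lambda_u * sigmoid(beta * (kl_ref - reward))
    return lambda_u - value


# A desirable completion the policy prefers more than the reference does.
print(kto_example_loss(-1.0, -3.0, desirable=True))
# The same completion labeled undesirable is penalized more heavily.
print(kto_example_loss(-1.0, -3.0, desirable=False))
```

Unlike pairwise methods, this loss needs only a binary desirable/undesirable label per example, which is why the KTO dataset format differs from paired preference data.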

Related issues

Related to #1316.

Before submitting

  • This PR only changes documentation. (You can ignore the following checks in that case)
  • Did you read the Pull Request guidelines in the contributor guideline?
  • Did you link the issue(s) related to this PR in the section above?
  • Did you add / update tests where needed?

Reviewers

At least one review from a member of oumi-ai/oumi-staff is required.

@kaisopos
Contributor

@efsiatras to automatically fix all these formatting issues you can run:
pre-commit run --all-files --show-diff-on-failure

@efsiatras efsiatras force-pushed the kto-implementation branch from 92f14c7 to 6be7dd7 Compare March 14, 2025 02:27
@efsiatras efsiatras force-pushed the kto-implementation branch from 6be7dd7 to 7d3689c Compare March 14, 2025 09:32
@efsiatras
Author

Thank you! I also fixed the loading of the KTO dataset. It appears to run successfully, but it would be great if someone could test it further with larger models or longer training periods. Apologies for the force push.

@kaisopos kaisopos requested review from wizeng23, nikg4 and optas and removed request for nikg4 March 15, 2025 07:42
@wizeng23 wizeng23 requested a review from taenin March 17, 2025 22:45
@wizeng23
Contributor

Adding oncall Matthew for review

3 participants