
KEP-2170: Design Trainer for the LLM Runtimes #2321

Open
andreyvelich opened this issue Nov 5, 2024 · 0 comments

As part of the Kubeflow Training V2 work, we should design and implement a custom Trainer to fine-tune the LLMs that we plan to support via TrainingRuntimes in Kubeflow upstream.

We should discuss whether the LLM Trainer implementation should use native PyTorch APIs or HuggingFace Transformers.

The Trainer should allow users to configure LoRA, QLoRA, FSDP, and other important configurations.
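To make the discussion concrete, here is a minimal sketch of the kind of configuration surface such a Trainer could expose. All names below (`TrainerConfig`, `LoraConfig`, field names) are hypothetical illustrations for this proposal, not part of any existing Kubeflow or HuggingFace API:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical configuration surface for the LLM Trainer.
# Names are illustrative only; they do not correspond to an existing API.

@dataclass
class LoraConfig:
    r: int = 8                                   # rank of the low-rank update matrices
    lora_alpha: int = 16                         # scaling factor applied to the update
    lora_dropout: float = 0.05
    target_modules: tuple = ("q_proj", "v_proj") # attention projections to adapt

@dataclass
class TrainerConfig:
    model_uri: str
    peft: Optional[LoraConfig] = None  # LoRA adapter settings, if any
    quantize_4bit: bool = False        # QLoRA: quantize the frozen base weights to 4-bit
    fsdp: bool = False                 # shard the model with FullyShardedDataParallel

    def validate(self) -> None:
        # QLoRA only makes sense together with a LoRA adapter config.
        if self.quantize_4bit and self.peft is None:
            raise ValueError("4-bit quantization (QLoRA) requires a LoRA config")

cfg = TrainerConfig(
    model_uri="hf://meta-llama/Llama-3.1-8B",
    peft=LoraConfig(r=16),
    quantize_4bit=True,
    fsdp=True,
)
cfg.validate()
print(cfg.peft.r, cfg.quantize_4bit, cfg.fsdp)
```

A structure like this would let the same runtime express plain LoRA (no quantization), QLoRA (LoRA plus 4-bit base weights), and FSDP sharding independently, which is one argument for designing the config schema before committing to PyTorch-native or Transformers-based internals.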

Useful resources:

Part of: #2170

cc @saileshd1402 @deepanker13 @kubeflow/wg-training-leads

Love this feature?

Give it a 👍. We prioritize the features with the most 👍 reactions.
