Add Flytekit-Lepton Plugin for Lepton AI Inference Endpoints #3328
Review comments (outdated, resolved) on:
plugins/flytekit-dgxc-lepton/flytekitplugins/dgxc_lepton/connector.py
plugins/flytekit-dgxc-lepton/flytekitplugins/dgxc_lepton/endpoint_helpers.py
This commit introduces a new flytekit plugin for seamless integration with Lepton AI for deploying and managing AI inference endpoints within Flyte workflows.

Key Features:
- Support for multiple deployment types (vLLM, NIM, SGLang, custom containers)
- Dynamic endpoint creation and lifecycle management
- Storage mount configuration for model caching
- Environment variable and secret management
- Auto-scaling configuration support
- Comprehensive example usage patterns

Main Components:
- LeptonConfig: Task configuration class for Lepton deployments
- LeptonEndpointTask: Flyte task for creating inference endpoints
- LeptonConnector: Backend connector for managing Lepton resources
- Examples: Basic, advanced, and vLLM inference patterns

Signed-off-by: ansjindal <[email protected]>
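To make the feature list above more concrete, here is a minimal, self-contained sketch of how a task configuration covering those features might be modeled. It does not use or reproduce the plugin's actual `LeptonConfig`; the names `DeploymentType`, `MountSketch`, and `EndpointConfigSketch`, and all of their fields, are hypothetical and chosen only for illustration.

```python
from dataclasses import asdict, dataclass, field
from enum import Enum
from typing import Dict, List


class DeploymentType(str, Enum):
    # The four deployment types named in the commit message.
    VLLM = "vllm"
    NIM = "nim"
    SGLANG = "sglang"
    CUSTOM = "custom"


@dataclass
class MountSketch:
    # Hypothetical storage mount used for model caching.
    cache_path: str  # location on shared storage
    mount_path: str  # location inside the container


@dataclass
class EndpointConfigSketch:
    # Hypothetical stand-in for a task configuration class;
    # the real LeptonConfig fields may differ.
    endpoint_name: str
    deployment_type: DeploymentType
    image: str
    env_vars: Dict[str, str] = field(default_factory=dict)
    secret_names: Dict[str, str] = field(default_factory=dict)
    mounts: List[MountSketch] = field(default_factory=list)
    min_replicas: int = 1
    max_replicas: int = 1

    def __post_init__(self) -> None:
        # Reject auto-scaling bounds that cannot be satisfied.
        if not 1 <= self.min_replicas <= self.max_replicas:
            raise ValueError("require 1 <= min_replicas <= max_replicas")

    def to_dict(self) -> dict:
        # Flatten to plain data so the config could be serialized.
        data = asdict(self)
        data["deployment_type"] = self.deployment_type.value
        return data
```

Keeping the configuration as plain data with validation in `__post_init__` means a bad auto-scaling range fails when the task is defined rather than when the endpoint is created.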
8f0e7de to 3fab421
…ntegration
- Replace multiple factory functions with single create_lepton_endpoint_task()
- Eliminate CLI dependencies, use leptonai Python SDK exclusively

Signed-off-by: ansjindal <[email protected]>
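The refactoring described in this commit (one entry point instead of one factory per deployment type) can be sketched roughly as follows. This is an illustration of the pattern only: `create_endpoint_spec_sketch` and its parameters are hypothetical, and the actual signature of `create_lepton_endpoint_task()` is not shown in this conversation.

```python
from typing import Any, Dict

# Deployment types named in the PR; the string spellings are assumptions.
SUPPORTED_TYPES = ("vllm", "nim", "sglang", "custom")


def create_endpoint_spec_sketch(
    name: str, deployment_type: str, image: str, **options: Any
) -> Dict[str, Any]:
    """Single entry point that takes the deployment type as a parameter,
    instead of one factory function per type (hypothetical sketch)."""
    if deployment_type not in SUPPORTED_TYPES:
        raise ValueError(
            f"unsupported deployment type {deployment_type!r}; "
            f"expected one of {SUPPORTED_TYPES}"
        )
    return {
        "name": name,
        "deployment_type": deployment_type,
        "image": image,
        # Type-specific settings are passed through as keyword options.
        "options": dict(options),
    }
```

With a single function, adding a new deployment type means extending `SUPPORTED_TYPES` rather than adding and documenting another factory.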
3fab421 to e9dda25
- Add high-level abstractions (api_token, api_token_secret, autoscaling)
- Update credentials setup to include origin_url requirement.

Signed-off-by: ansjindal <[email protected]>
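One plausible reading of the `api_token` / `api_token_secret` pair is that a caller supplies either a literal token or the name of a secret that holds it. The sketch below illustrates that reading only; `resolve_api_token_sketch`, the "exactly one of the two" rule, and the environment-variable fallback are all assumptions, not the plugin's confirmed behavior.

```python
import os
from typing import Mapping, Optional


def resolve_api_token_sketch(
    api_token: Optional[str] = None,
    api_token_secret: Optional[str] = None,
    secrets: Optional[Mapping[str, str]] = None,
) -> str:
    """Return a token from either a literal value or a named secret.

    Hypothetical helper: the real plugin may resolve credentials differently.
    """
    # Assumption: exactly one of the two parameters must be provided.
    if (api_token is None) == (api_token_secret is None):
        raise ValueError("provide exactly one of api_token or api_token_secret")
    if api_token is not None:
        return api_token
    # Assumption: fall back to environment variables when no store is given.
    store = os.environ if secrets is None else secrets
    if api_token_secret not in store:
        raise KeyError(f"secret {api_token_secret!r} not found")
    return store[api_token_secret]
```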
ae99730 to cb41268
Review comment (outdated, resolved) on:
plugins/flytekit-dgxc-lepton/deployment/connector-deployment.yaml
d2fd700 to 2ab3cc1
…onnectors
- **Separate Connectors**: Split into LeptonEndpointDeploymentConnector and LeptonEndpointDeletionConnector

Closes: Plugin optimization and standardization
Signed-off-by: ansjindal <[email protected]>
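The split into a deployment connector and a deletion connector can be illustrated with a small in-memory sketch. None of the classes below come from the plugin or from flytekit: `InMemoryBackendSketch`, `DeploymentConnectorSketch`, `DeletionConnectorSketch`, and their `task_type` strings are hypothetical, and a real connector would call the Lepton SDK instead of a dictionary.

```python
from typing import Dict


class InMemoryBackendSketch:
    """Stand-in for the remote service (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.endpoints: Dict[str, dict] = {}


class DeploymentConnectorSketch:
    # Each connector handles exactly one task type (names are assumptions).
    task_type = "sketch_endpoint_deployment"

    def __init__(self, backend: InMemoryBackendSketch) -> None:
        self.backend = backend

    def create(self, name: str, spec: dict) -> str:
        if name in self.backend.endpoints:
            raise ValueError(f"endpoint {name!r} already exists")
        self.backend.endpoints[name] = {"spec": spec, "state": "ready"}
        return name


class DeletionConnectorSketch:
    task_type = "sketch_endpoint_deletion"

    def __init__(self, backend: InMemoryBackendSketch) -> None:
        self.backend = backend

    def delete(self, name: str) -> bool:
        # Returns True if an endpoint was removed, False if none existed.
        return self.backend.endpoints.pop(name, None) is not None
```

Giving each connector a single responsibility lets a workflow treat endpoint creation and teardown as separate tasks, for example creating an endpoint, running inference, and deleting it in a final step.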
2ab3cc1 to cd54401
The deployment configurations should be managed separately from the plugin code. This removes:
- connector-deployment.yaml
- connector-service.yaml
- lepton-connector-config.yaml

These files will be added to the appropriate deployment repository.

Signed-off-by: ansjindal <[email protected]>
Thank you, LGTM. This is awesome.
…ytekit-lepton-plugin
Congrats on merging your first pull request! 🎉
…g#3328)
Signed-off-by: ansjindal <[email protected]>
Co-authored-by: Kevin Su <[email protected]>
Signed-off-by: Atharva <[email protected]>
How was this patch tested?
Setup process
Run Flyte sandbox
Agent build
Update the image tag in the agent-deployment.yaml file, along with the secrets for the workspace and token.
Screenshots
Basic: [image]
Advanced: [image]
The corresponding Lepton endpoints were created successfully.
Summary by Bito
This pull request introduces the Flytekit-Lepton plugin, enhancing Flyte workflows by integrating Lepton AI for managing AI inference endpoints. Key features include support for multiple deployment types, dynamic endpoint creation, and comprehensive user guidance examples, along with new deployment configurations and a configuration map for dynamic endpoint management. The updates significantly improve resource management and configuration for AI models, ensuring robust functionality through an updated test suite.