ansjindal
Contributor

@ansjindal ansjindal commented Sep 18, 2025

This commit introduces a new flytekit plugin for seamless integration with Lepton AI for deploying and managing AI inference endpoints within Flyte workflows.

Key Features:

  • Support for multiple deployment types (vLLM, NIM, SGLang, custom containers)
  • Dynamic endpoint creation and lifecycle management
  • Storage mount configuration for model caching
  • Environment variable and secret management
  • Auto-scaling configuration support
  • Comprehensive example usage patterns

Main Components:

  • LeptonConfig: Task configuration class for Lepton deployments
  • LeptonEndpointTask: Flyte task for creating inference endpoints
  • LeptonConnector: Backend connector for managing Lepton resources
  • Examples: Basic, advanced, and vLLM inference patterns

How was this patch tested?

pyflyte run --remote examples/basic_inference.py basic_inference_workflow
pyflyte run --remote examples/vllm_example.py vllm_inference_workflow
pyflyte run --remote examples/advanced_usage.py multi_model_comparison_workflow

Setup process

  1. Run Flyte sandbox

  2. Agent build

cd flytekit/plugins/flytekit-dgxc-lepton
docker build -f Dockerfile.agent -t localhost:30000/lepton-agent:latest .
docker push localhost:30000/lepton-agent:latest
  3. Deployment

Update the image tag in agent-deployment.yaml, along with the secrets for the workspace and token.

kubectl apply -f deployment/agent-deployment.yaml
kubectl get pods -n flyte -l app=lepton-agent
kubectl logs -n flyte deployment/lepton-agent --follow
  4. Update the Flyte config:
# deployment/lepton-agent-config.yaml
tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - k8s-array
      - lepton_endpoint_task
    default-for-task-types:
      - container: container
      - container_array: k8s-array
      - lepton_endpoint_task: lepton-agent-service.flyte:8000
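A quick way to sanity-check the config above before applying it is to confirm that the new task type is both enabled and routed to a default handler. The following is a minimal sketch, not part of the plugin: `check_task_plugin` is a hypothetical helper, and the config is written as a Python dict that mirrors the YAML so the example has no dependencies.

```python
from typing import Any


def check_task_plugin(config: dict[str, Any], task_type: str) -> list[str]:
    """Return a list of problems found for `task_type` (empty if none).

    Hypothetical helper for illustration; it is not part of flytekit.
    """
    plugins = config.get("tasks", {}).get("task-plugins", {})
    problems = []

    if task_type not in plugins.get("enabled-plugins", []):
        problems.append(f"{task_type} is not listed under enabled-plugins")

    # default-for-task-types is a list of single-key mappings in the YAML above,
    # so merge the entries into one dict before looking the task type up.
    defaults: dict[str, str] = {}
    for entry in plugins.get("default-for-task-types", []):
        defaults.update(entry)
    if task_type not in defaults:
        problems.append(f"{task_type} has no entry under default-for-task-types")

    return problems


# Same structure as deployment/lepton-agent-config.yaml, expressed as a dict.
config = {
    "tasks": {
        "task-plugins": {
            "enabled-plugins": [
                "container",
                "sidecar",
                "k8s-array",
                "lepton_endpoint_task",
            ],
            "default-for-task-types": [
                {"container": "container"},
                {"container_array": "k8s-array"},
                {"lepton_endpoint_task": "lepton-agent-service.flyte:8000"},
            ],
        }
    }
}

for problem in check_task_plugin(config, "lepton_endpoint_task"):
    print(problem)
```

If the YAML file is loaded with a YAML parser instead, the resulting dict can be passed to the same helper unchanged.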

Screenshots

Basic:
image

Advanced:
image

The corresponding Lepton endpoints were created successfully.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Summary by Bito

This pull request introduces the Flytekit-Lepton plugin, enhancing Flyte workflows by integrating Lepton AI for managing AI inference endpoints. Key features include support for multiple deployment types, dynamic endpoint creation, and comprehensive user guidance examples, along with new deployment configurations and a configuration map for dynamic endpoint management. The updates significantly improve resource management and configuration for AI models, ensuring robust functionality through an updated test suite.

welcome bot commented Sep 18, 2025

Thank you for opening this pull request! 🙌

These tips will help get your PR across the finish line:

  • Most of the repos have a PR template; if not, fill it out to the best of your knowledge.
  • Sign off your commits (Reference: DCO Guide).

This commit introduces a new flytekit plugin for seamless integration with
Lepton AI for deploying and managing AI inference endpoints within Flyte workflows.

Key Features:
- Support for multiple deployment types (vLLM, NIM, SGLang, custom containers)
- Dynamic endpoint creation and lifecycle management
- Storage mount configuration for model caching
- Environment variable and secret management
- Auto-scaling configuration support
- Comprehensive example usage patterns

Main Components:
- LeptonConfig: Task configuration class for Lepton deployments
- LeptonEndpointTask: Flyte task for creating inference endpoints
- LeptonConnector: Backend connector for managing Lepton resources
- Examples: Basic, advanced, and vLLM inference patterns

Signed-off-by: ansjindal <[email protected]>
@ansjindal ansjindal force-pushed the feature/flytekit-lepton-plugin branch 2 times, most recently from 8f0e7de to 3fab421 Compare September 24, 2025 06:50
…ntegration

- Replace multiple factory functions with single create_lepton_endpoint_task()
- Eliminate CLI dependencies, use leptonai Python SDK exclusively

Signed-off-by: ansjindal <[email protected]>
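
The consolidation described in this commit (one factory instead of one per deployment type) can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: `EndpointSpec`, `make_endpoint_spec`, and their parameters are hypothetical names, not the signature of the plugin's `create_lepton_endpoint_task()`, which this conversation does not show. The deployment types are the ones listed in the PR description.

```python
from dataclasses import dataclass, field

# Deployment types named in the PR description.
SUPPORTED_TYPES = ("vllm", "nim", "sglang", "custom")


@dataclass(frozen=True)
class EndpointSpec:
    """Hypothetical description of an endpoint; not the plugin's real type."""

    name: str
    deployment_type: str
    image: str
    env: dict = field(default_factory=dict)


def make_endpoint_spec(name, deployment_type, image, env=None):
    """Single entry point standing in for one factory per deployment type.

    The deployment type becomes an argument that is validated once, rather
    than being encoded in the name of a separate factory function.
    """
    kind = deployment_type.lower()
    if kind not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported deployment type: {deployment_type!r}")
    # Copy env so later changes to the caller's dict do not affect the spec.
    return EndpointSpec(name=name, deployment_type=kind, image=image, env=dict(env or {}))


# The image reference below is a placeholder, not a real image.
spec = make_endpoint_spec("demo", "vLLM", "registry.example/vllm:latest")
print(spec.deployment_type)
```

With a single function, adding a deployment type means extending one tuple and one validation path instead of adding another factory.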
@ansjindal ansjindal force-pushed the feature/flytekit-lepton-plugin branch from 3fab421 to e9dda25 Compare September 24, 2025 06:53
- Add high-level abstractions (api_token, api_token_secret, autoscaling)

- Update credentials setup to include origin_url requirement.

Signed-off-by: ansjindal <[email protected]>
@ansjindal ansjindal force-pushed the feature/flytekit-lepton-plugin branch from ae99730 to cb41268 Compare September 24, 2025 11:54
@ansjindal ansjindal force-pushed the feature/flytekit-lepton-plugin branch 2 times, most recently from d2fd700 to 2ab3cc1 Compare September 25, 2025 19:26
…onnectors

- **Separate Connectors**: Split into LeptonEndpointDeploymentConnector and
  LeptonEndpointDeletionConnector

Closes: Plugin optimization and standardization
Signed-off-by: ansjindal <[email protected]>
@ansjindal ansjindal force-pushed the feature/flytekit-lepton-plugin branch from 2ab3cc1 to cd54401 Compare September 25, 2025 23:59
The deployment configurations should be managed separately from the plugin code.
This removes:
- connector-deployment.yaml
- connector-service.yaml
- lepton-connector-config.yaml

These files will be added to the appropriate deployment repository.

Signed-off-by: ansjindal <[email protected]>
Member

@pingsutw pingsutw left a comment

Thank you, LGTM. This is awesome.

@pingsutw pingsutw requested a review from machichima as a code owner October 3, 2025 19:47
@pingsutw pingsutw enabled auto-merge (squash) October 3, 2025 19:47
@pingsutw pingsutw merged commit ff4c79c into flyteorg:master Oct 3, 2025
117 checks passed

welcome bot commented Oct 3, 2025

Congrats on merging your first pull request! 🎉

@flyte-bot
Contributor

Bito Review Skipped - No Changes Detected

Bito didn't review this pull request because we did not detect any changes in the pull request to review.

Atharva1723 pushed a commit to Atharva1723/flytekit that referenced this pull request Oct 5, 2025