Wave3 #8

Merged
merged 30 commits into from
Jul 15, 2025

smarunich
Owner

@smarunich smarunich commented Jul 15, 2025

This pull request introduces a comprehensive model publishing workflow for external access, along with updates to authentication, infrastructure, and documentation. The changes include defining requirements, implementing the publishing workflow, enhancing authentication mechanisms, and removing deprecated configurations.

Model Publishing Workflow:

  • .kiro/specs/model-publishing-workflow/requirements.md: Added a detailed requirements document outlining user stories, acceptance criteria, and features for publishing inference models securely, including tenant-specific access controls, API key generation, and rate limiting.
  • .kiro/specs/model-publishing-workflow/tasks.md: Defined an extensive implementation plan with tasks for setting up publishing infrastructure, extending Kubernetes client operations, creating API endpoints, and adding monitoring and UI components.

Authentication Enhancements:

  • management/auth.go: Introduced EnhancedAuthMiddleware to validate both JWT tokens and API keys, added API key validation logic, and implemented metadata search across namespaces for tenant-specific access.
  • management/main.go: Updated service initialization to include PublishingService and integrated it with the HTTP server setup.
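
To make the dual-authentication idea concrete, here is a minimal sketch of a middleware that accepts either a bearer token or an X-API-Key header. It uses net/http rather than the gin router this PR uses, and the names dualAuth, Validator, and statusFor, together with the demo credentials and the use of the Authorization header, are assumptions for illustration; the actual EnhancedAuthMiddleware in management/auth.go may be structured differently.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// Validator reports whether a credential is acceptable.
// The validators used below are hypothetical stand-ins; real checks (JWT
// signature verification, Kubernetes-backed API key lookup) are not shown.
type Validator func(credential string) bool

// dualAuth lets a request through if it carries either a valid bearer token
// or a valid X-API-Key header, and answers 401 otherwise.
func dualAuth(validJWT, validKey Validator, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		auth := r.Header.Get("Authorization")
		if strings.HasPrefix(auth, "Bearer ") && validJWT(strings.TrimPrefix(auth, "Bearer ")) {
			next.ServeHTTP(w, r)
			return
		}
		if key := r.Header.Get("X-API-Key"); key != "" && validKey(key) {
			next.ServeHTTP(w, r)
			return
		}
		http.Error(w, "unauthorized", http.StatusUnauthorized)
	})
}

// statusFor sends one request with the given headers through the middleware
// and returns the HTTP status code it produced.
func statusFor(headers map[string]string) int {
	protected := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	})
	h := dualAuth(
		func(token string) bool { return token == "demo-jwt" },
		func(key string) bool { return key == "demo-key" },
		protected,
	)
	req := httptest.NewRequest(http.MethodGet, "/models", nil)
	for name, value := range headers {
		req.Header.Set(name, value)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(statusFor(map[string]string{"Authorization": "Bearer demo-jwt"})) // 200
	fmt.Println(statusFor(map[string]string{"X-API-Key": "demo-key"}))            // 200
	fmt.Println(statusFor(nil))                                                   // 401
}
```

Checking the bearer token first and the API key second is a design choice of this sketch only; either order works as long as a request with no valid credential is rejected.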

Infrastructure Updates:

Documentation Improvements:

  • docs/getting-started.md: Updated the repository URL in the cloning instructions and added minor formatting improvements to the setup script description.

Minor Code Changes:

@smarunich smarunich requested a review from Copilot July 15, 2025 16:49
Copilot

This comment was marked as outdated.

smarunich added 2 commits July 15, 2025 16:56
@smarunich smarunich requested a review from Copilot July 15, 2025 17:05
Copilot

This comment was marked as outdated.

smarunich added 2 commits July 15, 2025 17:10
@smarunich smarunich requested a review from Copilot July 15, 2025 17:14
@smarunich
Owner Author

This pull request introduces significant enhancements to the model publishing workflow, including a detailed requirements document, an implementation plan, and updates to the authentication system. It also removes outdated configurations and adds new dependencies for better functionality. Below is a summary of the most important changes grouped by theme.

Enhancements to Model Publishing Workflow:

  • Added a comprehensive requirements document (.kiro/specs/model-publishing-workflow/requirements.md) outlining user stories and acceptance criteria for publishing models, tenant access, rate limiting, lifecycle management, documentation, and monitoring.
  • Introduced a detailed implementation plan (.kiro/specs/model-publishing-workflow/tasks.md) with tasks for infrastructure setup, API development, monitoring, UI components, and testing.

Authentication System Updates:

  • Enhanced AuthService to include Kubernetes client integration and support for validating API keys alongside JWT tokens (management/auth.go). Added EnhancedAuthMiddleware for dual authentication and ValidateAPIKey for API key validation.
  • Updated main.go to initialize PublishingService and integrate it with the HTTP server (management/main.go).

Removal of Outdated Configurations:

  • Removed deprecated Kubernetes configurations for the management service (configs/management/management.yaml.bak).

Minor Changes:

  • Updated repository URL in the "Getting Started" guide (docs/getting-started.md).
  • Added logging dependency to management/admin.go.

These changes collectively improve the functionality, security, and scalability of the model publishing workflow while simplifying the codebase and documentation.

Copilot

This comment was marked as outdated.

smarunich and others added 2 commits July 15, 2025 13:21
Added 'configmaps' to the list of resources and enabled the 'watch' verb for core resources in the management.yaml RBAC configuration. This allows for broader monitoring and management capabilities.
@smarunich smarunich requested a review from Copilot July 15, 2025 18:26
@smarunich
Owner Author

This pull request introduces significant updates to the model publishing workflow, including new requirements documentation, implementation plans, and enhancements to authentication mechanisms. It also includes smaller changes to configuration files and documentation. Below is a summary of the most important changes:

Model Publishing Workflow Enhancements

Requirements Documentation:

  • Added a comprehensive requirements document detailing user stories, acceptance criteria, and system capabilities for the model publishing workflow. This includes tenant-specific access control, API key generation, gateway routing, rate limiting, and lifecycle management. (.kiro/specs/model-publishing-workflow/requirements.md)

Implementation Plan:

  • Outlined detailed tasks for implementing the model publishing workflow, including API endpoints, gateway configurations, rate limiting policies, and monitoring/auditing features. (.kiro/specs/model-publishing-workflow/tasks.md)

Authentication Enhancements

  • Introduced an EnhancedAuthMiddleware in AuthService to support both JWT token and API key authentication. Added methods for validating API keys and retrieving metadata from Kubernetes secrets. (management/auth.go)
  • Updated AuthService to include a k8sClient for API key management operations. (management/auth.go)

Configuration Updates

  • Added configmaps to the list of Kubernetes resources with permissions for get, list, and watch in the management service configuration. (configs/management/management.yaml)
  • Removed the backup configuration file management.yaml.bak, which contained outdated service account and RBAC definitions. (configs/management/management.yaml.bak)

Documentation Updates

  • Updated the repository clone URL in the "Getting Started" guide to reflect the correct organization. (docs/getting-started.md)
  • Added a missing section to clarify the installation steps for dependencies in the "Getting Started" guide. (docs/getting-started.md)

Minor Code Changes

  • Added a log package import for improved logging capabilities. (management/admin.go)
  • Removed the --backend=huggingface argument from the HuggingFace T5 model configuration. (configs/kserve/models/huggingface-t5.yaml)

Copilot

This comment was marked as outdated.

@smarunich smarunich requested a review from Copilot July 15, 2025 18:54
Copilot

This comment was marked as outdated.

@smarunich smarunich requested a review from Copilot July 15, 2025 20:00
Copilot

This comment was marked as outdated.

@smarunich smarunich requested a review from Copilot July 15, 2025 21:09
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a full end-to-end model publishing workflow, enhances authentication with API keys, removes deprecated management configs, and updates documentation.

  • Adds detailed publishing specs, tasks, and UI components for guided and quick publish flows
  • Introduces PublishingService with routes for publish, unpublish, key rotation, and validation
  • Cleans up old Kubernetes manifests and updates management-utils, server registration, and documentation

Reviewed Changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 5 comments.

File | Description
--- | ---
scripts/retest.sh | New test script to redeploy and port-forward the management service
scripts/build-and-push-images.sh | Removed deprecated manifest update helper
management/utils.go | Simplified command execution and YAML conversion, switched to yaml.v2
management/ui/src/index.css | Added CSS for publishing UI components and workflow indicators
management/ui/src/contexts/ApiContext.js | Exposed model publishing and API key management endpoints
management/types.go | Defined new types for publish configs, published models, and docs
management/server.go | Registered publishing routes and initialized PublishingService
management/publishing.go | Core implementation of the publishing workflow

Comments suppressed due to low confidence (1)

management/publishing.go:44

  • New core publishing logic is substantial and currently lacks unit or integration tests. Consider adding tests covering error paths, successful publish, and rollback scenarios.
func (s *PublishingService) PublishModel(c *gin.Context) {

const params = namespace ? { namespace } : {};
return api.post(`/models/${modelName}/publish/rotate-key`, {}, { params });
},
validateAPIKey: (apiKey) => api.post('/validate-api-key', { apiKey }),

Copilot AI Jul 15, 2025

The client is sending the API key in the request body but the server handler reads it from headers. Either adjust the client to set the X-API-Key header or update the server to parse the JSON body.

Suggested change
- validateAPIKey: (apiKey) => api.post('/validate-api-key', { apiKey }),
+ validateAPIKey: (apiKey) => api.post('/validate-api-key', {}, { headers: { 'X-API-Key': apiKey } }),
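
A third option is for the server to accept both forms. The sketch below is an assumption about how that could look, not the handler in this PR: extractAPIKey and keyFrom are hypothetical names, and it uses net/http instead of gin. It prefers the X-API-Key header and falls back to the JSON body the client currently sends.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// extractAPIKey returns the key from the X-API-Key header if present,
// otherwise from a JSON body of the form {"apiKey": "..."}.
// It returns an empty string when neither source yields a key.
func extractAPIKey(r *http.Request) string {
	if key := r.Header.Get("X-API-Key"); key != "" {
		return key
	}
	if r.Body == nil {
		return ""
	}
	var body struct {
		APIKey string `json:"apiKey"`
	}
	if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
		return "" // empty or malformed body
	}
	return body.APIKey
}

// keyFrom builds a POST request with the given header value and JSON body
// and reports which key extractAPIKey finds in it.
func keyFrom(header, jsonBody string) string {
	req := httptest.NewRequest(http.MethodPost, "/validate-api-key", strings.NewReader(jsonBody))
	if header != "" {
		req.Header.Set("X-API-Key", header)
	}
	return extractAPIKey(req)
}

func main() {
	fmt.Println(keyFrom("from-header", ""))              // from-header
	fmt.Println(keyFrom("", `{"apiKey":"from-body"}`))   // from-body
	fmt.Println(keyFrom("", "") == "")                   // true
}
```

Sending the key in a header keeps it out of request-body logs, which is one reason to prefer the suggested client change over a body-only server fix.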

},
},
"limit": map[string]interface{}{
"requests": rateLimiting.RequestsPerMinute,

Copilot AI Jul 15, 2025

The rate-limiting policy only enforces requests per minute; the RequestsPerHour and BurstLimit fields are not applied. You should add additional rules for per-hour limits and burst handling.
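
One way to cover the remaining fields is to emit one rule per configured limit. The following is a sketch under stated assumptions: RateLimitConfig mirrors the field names mentioned in this review but may not match the PR's struct, the "unit" key and its values are assumed (only "requests" is visible in the diff), and approximating BurstLimit as a per-second rule is an illustration that the target gateway may express differently.

```go
package main

import "fmt"

// RateLimitConfig uses the field names from the review comment; the actual
// type in management/types.go may differ (assumption).
type RateLimitConfig struct {
	RequestsPerMinute int
	RequestsPerHour   int
	BurstLimit        int
}

// newRule builds one rule entry. The "unit" key is an assumed companion to
// the "requests" key that appears in the diff.
func newRule(requests int, unit string) map[string]interface{} {
	return map[string]interface{}{
		"limit": map[string]interface{}{
			"requests": requests,
			"unit":     unit,
		},
	}
}

// buildRules returns one rule for every limit that is set (greater than zero),
// so per-hour and burst limits are no longer silently ignored.
func buildRules(cfg RateLimitConfig) []interface{} {
	rules := []interface{}{}
	if cfg.RequestsPerMinute > 0 {
		rules = append(rules, newRule(cfg.RequestsPerMinute, "Minute"))
	}
	if cfg.RequestsPerHour > 0 {
		rules = append(rules, newRule(cfg.RequestsPerHour, "Hour"))
	}
	if cfg.BurstLimit > 0 {
		// Assumption: burst is approximated as a per-second ceiling.
		rules = append(rules, newRule(cfg.BurstLimit, "Second"))
	}
	return rules
}

func main() {
	cfg := RateLimitConfig{RequestsPerMinute: 60, RequestsPerHour: 1000, BurstLimit: 10}
	for _, rule := range buildRules(cfg) {
		fmt.Println(rule)
	}
}
```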

Comment on lines +944 to +947
if rateLimiting.TokensPerHour > 0 {
rules := policy["spec"].(map[string]interface{})["rateLimit"].(map[string]interface{})["global"].(map[string]interface{})["rules"].([]interface{})

// Add token-based rate limiting

Copilot AI Jul 15, 2025

This section handles token limits but doesn’t enforce the configured RequestsPerHour or BurstLimit. Consider appending an additional rule for those values.

Suggested change
- if rateLimiting.TokensPerHour > 0 {
- rules := policy["spec"].(map[string]interface{})["rateLimit"].(map[string]interface{})["global"].(map[string]interface{})["rules"].([]interface{})
- // Add token-based rate limiting
+ rules := policy["spec"].(map[string]interface{})["rateLimit"].(map[string]interface{})["global"].(map[string]interface{})["rules"].([]interface{})
+ // Add token-based rate limiting
+ if rateLimiting.TokensPerHour > 0 {
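
Wherever the lookup ends up, two Go details are worth keeping in mind: a chain of unchecked type assertions panics if any level is missing or has an unexpected type, and a slice extended with append has to be stored back into the map to take effect. The sketch below illustrates both; nested, appendRule, and ruleCount are hypothetical helpers, not functions from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// nested walks a chain of map keys with comma-ok assertions and returns
// false instead of panicking when a key is missing or not a map.
func nested(m map[string]interface{}, keys ...string) (map[string]interface{}, bool) {
	cur := m
	for _, k := range keys {
		next, ok := cur[k].(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

// appendRule adds rule to spec.rateLimit.global.rules and writes the grown
// slice back into the map. append may allocate a new backing array, so
// skipping the write-back would leave the policy unchanged.
func appendRule(policy map[string]interface{}, rule interface{}) error {
	global, ok := nested(policy, "spec", "rateLimit", "global")
	if !ok {
		return errors.New("policy has no spec.rateLimit.global section")
	}
	rules, _ := global["rules"].([]interface{}) // nil slice if absent
	global["rules"] = append(rules, rule)
	return nil
}

// ruleCount reports how many rules the policy holds, or -1 if the
// expected structure is not present.
func ruleCount(policy map[string]interface{}) int {
	global, ok := nested(policy, "spec", "rateLimit", "global")
	if !ok {
		return -1
	}
	rules, _ := global["rules"].([]interface{})
	return len(rules)
}

func main() {
	policy := map[string]interface{}{
		"spec": map[string]interface{}{
			"rateLimit": map[string]interface{}{
				"global": map[string]interface{}{
					"rules": []interface{}{"per-minute"},
				},
			},
		},
	}
	fmt.Println(appendRule(policy, "per-hour"), ruleCount(policy))
	fmt.Println(appendRule(map[string]interface{}{}, "per-hour"))
}
```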

color: #2563eb;
}

.form-section {

Copilot AI Jul 15, 2025

[nitpick] The .form-section selector is defined twice (around line 101 and here). Consolidate duplicate rules to reduce CSS bloat and avoid conflicting styles.

};

const renderStepIndicator = () => (
<div style={{

Copilot AI Jul 15, 2025

[nitpick] There are large inline style blocks for the step indicator; consider moving these into CSS classes to improve readability and reuse.

@smarunich smarunich merged commit 7860b29 into main Jul 15, 2025
2 of 5 checks passed