-
Notifications
You must be signed in to change notification settings - Fork 6
refactor: move endpoint deletion logic to proper abstraction layers #124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implements comprehensive error handling for missing API keys with actionable guidance for users. When RUNPOD_API_KEY is missing, users now receive helpful error messages that include: - Documentation URL for obtaining API keys - Three setup methods (env var, .env file, shell profile) - Context about which operation requires the key - Troubleshooting guidance for .env file location Implementation: - Created RunpodAPIKeyError with helpful default message - Added validate_api_key() helper functions - Updated API clients to use custom exception - Added resource deployment error context - Enhanced flash init with API key documentation link - Added 15 comprehensive tests (all passing) Users previously saw generic "Runpod API key is required" errors. Now they get clear, actionable steps to resolve the issue.
Simplify validation logic and extract duplicated error handling: - Simplify API key validation condition in validation.py Changed `api_key.strip() == ""` to `not api_key.strip()` for clarity - Extract duplicated error handling in resource_manager.py Created `_deploy_with_error_context` helper method to handle RunpodAPIKeyError with resource context, eliminating code duplication All tests pass (231 unit tests).
Implements comprehensive undeploy command with multiple interaction modes: - List all tracked endpoints with status indicators - Delete specific endpoint by name - Delete all endpoints with double confirmation - Interactive checkbox selection for batch deletion Changes: - Add undeploy.py command with Rich/questionary UI - Extend ResourceManager with list_all_resources() and find_resources_by_name() - Integrate undeploy command into CLI - Add comprehensive unit tests with mocking - Fix duplicate import in resource_manager.py Safety features include confirmation prompts, keyboard interrupt handling, and detailed error reporting per endpoint.
Fixes two critical bugs preventing flash undeploy from working correctly:
1. GraphQL deleteEndpoint success detection
- RunPod API returns {"deleteEndpoint": null} on success
- Changed from checking null value to checking key presence
- Added detailed comments explaining GraphQL response pattern
2. Unclosed aiohttp session
- Wrapped RunpodGraphQLClient in async context manager
- Ensures proper session cleanup after deletion
- Eliminates "Unclosed client session" warnings
3. Updated test mocks to support async context manager usage
Tested with real endpoint deletion - successfully removes endpoint
from both RunPod and local .tetra_resources.pkl tracking file.
Adds ability to clean up stale endpoint entries from .tetra_resources.pkl when endpoints have been deleted externally (via RunPod UI/API). Changes: - Added --cleanup-stale flag to flash undeploy command - New _cleanup_stale_endpoints() function that: - Checks all tracked endpoints for inactive status - Lists inactive endpoints for user review - Prompts for confirmation before removal - Removes only from tracking (endpoints already deleted remotely) - Added imports: Dict, DeployableResource, Confirm Use case: When users delete endpoints via RunPod UI/API instead of flash CLI, the tracking file (.tetra_resources.pkl) becomes stale. This flag identifies and removes those orphaned tracking entries. Usage: flash undeploy --cleanup-stale Note: The "Status" column in `flash undeploy list` makes a health check API call for each endpoint to determine Active/Inactive state. While this adds latency (6 endpoints = 6 API calls), it's valuable for identifying stale entries that need cleanup.
- Remove debugging print statements from test_undeploy.py - Use Tuple from typing module instead of lowercase tuple for Python 3.9 consistency - Update type hints in undeploy.py and resource_manager.py - Add Tuple to imports in both files All 289 tests pass, quality checks pass.
Add abstract undeploy() method to establish polymorphic contract for resource lifecycle management. Mirrors existing deploy() pattern for symmetry. Returns bool indicating success/failure of undeploy operation.
…rkVolume ServerlessResource: - Full implementation using RunpodGraphQLClient - Handles missing endpoint ID gracefully - Logs success/failure for debugging NetworkVolume: - NotImplementedError stub with helpful message - Guides users to manual deletion via RunPod UI/API - Ready for future implementation Both implementations fulfill the DeployableResource contract.
…ndeploy_resource() Refactor ResourceManager API to clarify public vs internal interfaces: Protected methods (internal use): - add_resource() → _add_resource() - remove_resource() → _remove_resource() New public method: - undeploy_resource(resource_id, resource_name) -> Dict[str, Any] - Orchestrates polymorphic resource.undeploy() - Removes from tracking on success - Handles NotImplementedError gracefully - Returns detailed status dict for CLI consumption Public API surface: - get_or_deploy_resource() for deployment - undeploy_resource() for removal - list_all_resources() for querying - find_resources_by_name() for searching This establishes clear separation between internal state management and public resource lifecycle operations.
…oy_resource() Major architectural improvement moving deletion logic from CLI to proper abstraction layers: undeploy.py: - Remove _delete_endpoint() helper function - Remove RunpodGraphQLClient import - Use manager.undeploy_resource() in all undeploy modes: - _undeploy_by_name() - _undeploy_all() - _interactive_undeploy() - Use manager._remove_resource() in _cleanup_stale_endpoints() resource.py: - Update to use manager._remove_resource() for tracking cleanup - Clarify comment: removal is tracking-only, no remote deletion CLI now acts as thin presentation layer focused on user interaction, while ResourceManager handles orchestration and resources handle their own lifecycle. This addresses PR review feedback about separation of concerns.
Update all tests to use new protected method names and public API: test_resource_manager.py: - Use _add_resource() instead of add_resource() - Use _remove_resource() instead of remove_resource() test_undeploy.py: - Remove obsolete TestDeleteEndpoint class (_delete_endpoint no longer exists) - Update mocking to use manager.undeploy_resource() - Remove AsyncMock import (no longer needed) - Mock asyncio.run to properly execute coroutines in tests test_concurrency_issues.py: - Add undeploy() implementation to MockDeployableResource - Use _add_resource() instead of add_resource() All 253 tests pass with 56.78% coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors the endpoint deletion logic to follow proper abstraction layers and polymorphic design patterns. The changes move the _delete_endpoint() function from the CLI layer into appropriate abstraction layers (resource classes and ResourceManager), improving separation of concerns and enabling reusability.
Key changes:
- Added abstract
undeploy()method to the DeployableResource base class - Implemented
undeploy()in ServerlessResource and NetworkVolume classes - Refactored ResourceManager to make internal methods protected and added public
undeploy_resource()method - Simplified CLI code to use the new ResourceManager API
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| src/tetra_rp/core/resources/base.py | Added abstract undeploy() method to establish polymorphic contract |
| src/tetra_rp/core/resources/serverless.py | Implemented undeploy() method for serverless endpoints |
| src/tetra_rp/core/resources/network_volume.py | Added undeploy() stub that raises NotImplementedError |
| src/tetra_rp/core/resources/resource_manager.py | Renamed public methods to protected (_add_resource, _remove_resource) and added public undeploy_resource() orchestration method |
| src/tetra_rp/cli/commands/undeploy.py | Removed _delete_endpoint() helper and updated to use manager.undeploy_resource() |
| src/tetra_rp/cli/commands/resource.py | Updated to use renamed _remove_resource() method |
| tests/unit/test_concurrency_issues.py | Added undeploy() implementation to MockDeployableResource and updated method calls |
| tests/unit/resources/test_resource_manager.py | Updated test calls to use renamed protected methods |
| tests/unit/cli/test_undeploy.py | Updated mocking strategy and removed obsolete TestDeleteEndpoint class |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Remove clean_command, generate_cleanup_preview, and unused imports (Progress, SpinnerColumn, TextColumn, questionary) as the command was never connected to the CLI (commented out in main.py). This removes 77 lines of dead code and 4 unused dependencies.
Address PR #124 feedback by extracting the duplicated run_coro helper function into a reusable pytest fixture. This reduces code duplication in test_undeploy_by_name_success and test_undeploy_all_flag methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Summary
Refactors the
flash undeploycommand architecture in response to PR #121 review feedback. Moves endpoint deletion logic from CLI to proper abstraction layers following polymorphic design patterns.Problem
The reviewer (jhcipar) noted that
_delete_endpoint()logic was tightly coupled to the CLI layer, suggesting it should be self-contained in ResourceManager or another layer for better separation of concerns and reusability.Solution
Implemented a comprehensive architectural refactoring following object-oriented design principles:
1. Abstract Contract - DeployableResource Base Class
Added abstract
undeploy()method to establish polymorphic lifecycle contract:2. Concrete Implementations
3. Orchestration Layer - ResourceManager
Refactored API to clarify public vs internal interfaces:
Protected methods (internal state management):
add_resource()→_add_resource()remove_resource()→_remove_resource()New public method:
undeploy_resource(resource_id, resource_name)- Orchestrates polymorphic undeploy + tracking removalPublic API surface:
get_or_deploy_resource()- Deploy resourcesundeploy_resource()- Remove resourceslist_all_resources()- Query trackingfind_resources_by_name()- Search by name4. Presentation Layer - CLI
CLI now acts as thin presentation layer:
_delete_endpoint()helpermanager.undeploy_resource()throughoutBenefits
undeploy_resource()can be used programmatically outside CLIdeploy()/undeploy()APIChanges by Commit
Commit 1: Add abstract undeploy() method
Commit 2: Implement undeploy() in resource classes
Commit 3: Refactor ResourceManager API
_add_resource(),_remove_resource()undeploy_resource()Commit 4: Update CLI to use new architecture
_delete_endpoint()from undeploy.pymanager.undeploy_resource()in all undeploy modesCommit 5: Update all tests
Testing
Related
flash undeploycommand for endpoint management #121