refactor: move endpoint deletion logic to proper abstraction layers #124

deanq · 2025-11-25T08:07:08Z

Summary

Refactors the flash undeploy command architecture in response to PR #121 review feedback. Moves endpoint deletion logic from CLI to proper abstraction layers following polymorphic design patterns.

Problem

The reviewer (jhcipar) noted that _delete_endpoint() logic was tightly coupled to the CLI layer, suggesting it should be self-contained in ResourceManager or another layer for better separation of concerns and reusability.

Solution

Implemented a comprehensive architectural refactoring following object-oriented design principles:

1. Abstract Contract - DeployableResource Base Class

Added abstract undeploy() method to establish polymorphic lifecycle contract:

@abstractmethod
async def undeploy(self) -> bool:
    """Undeploy/delete the resource."""

2. Concrete Implementations

ServerlessResource: Full implementation using RunpodGraphQLClient
NetworkVolume: NotImplementedError stub with helpful message for future implementation

3. Orchestration Layer - ResourceManager

Refactored API to clarify public vs internal interfaces:

Protected methods (internal state management):

add_resource() → _add_resource()
remove_resource() → _remove_resource()

New public method:

undeploy_resource(resource_id, resource_name) - Orchestrates polymorphic undeploy + tracking removal

Public API surface:

get_or_deploy_resource() - Deploy resources
undeploy_resource() - Remove resources
list_all_resources() - Query tracking
find_resources_by_name() - Search by name

4. Presentation Layer - CLI

CLI now acts as thin presentation layer:

Removed _delete_endpoint() helper
Uses manager.undeploy_resource() throughout
Focuses on user interaction, confirmation prompts, and output formatting

Benefits

Separation of Concerns: CLI handles UI, ResourceManager orchestrates, resources manage their own lifecycle
Polymorphism: Each resource type knows how to undeploy itself
Reusability: undeploy_resource() can be used programmatically outside CLI
Extensibility: Easy to add undeploy support for new resource types
Testability: Clean boundaries make mocking and testing easier
Consistency: Symmetric deploy()/undeploy() API

Changes by Commit

Commit 1: Add abstract undeploy() method

Establishes polymorphic contract in DeployableResource base class
Mirrors existing deploy() pattern

Commit 2: Implement undeploy() in resource classes

ServerlessResource: Full implementation with error handling
NetworkVolume: NotImplementedError stub for future work

Commit 3: Refactor ResourceManager API

Protected internal methods: _add_resource(), _remove_resource()
Public orchestration: undeploy_resource()
Clear separation of concerns

Commit 4: Update CLI to use new architecture

Remove _delete_endpoint() from undeploy.py
Use manager.undeploy_resource() in all undeploy modes
CLI becomes thin presentation layer

Commit 5: Update all tests

253 tests passing with 56.78% coverage
Updated mocking to use new API
Removed obsolete TestDeleteEndpoint class

Testing

✅ All 253 unit tests passing
✅ Code coverage: 56.78% (exceeds 35% requirement)
✅ Linting: Clean (ruff check)
✅ Formatting: Clean (ruff format)
✅ All undeploy modes tested: list, by-name, --all, --interactive, --cleanup-stale

Implements comprehensive error handling for missing API keys with actionable guidance for users. When RUNPOD_API_KEY is missing, users now receive helpful error messages that include: - Documentation URL for obtaining API keys - Three setup methods (env var, .env file, shell profile) - Context about which operation requires the key - Troubleshooting guidance for .env file location Implementation: - Created RunpodAPIKeyError with helpful default message - Added validate_api_key() helper functions - Updated API clients to use custom exception - Added resource deployment error context - Enhanced flash init with API key documentation link - Added 15 comprehensive tests (all passing) Users previously saw generic "Runpod API key is required" errors. Now they get clear, actionable steps to resolve the issue.

Simplify validation logic and extract duplicated error handling: - Simplify API key validation condition in validation.py Changed `api_key.strip() == ""` to `not api_key.strip()` for clarity - Extract duplicated error handling in resource_manager.py Created `_deploy_with_error_context` helper method to handle RunpodAPIKeyError with resource context, eliminating code duplication All tests pass (231 unit tests).

Implements comprehensive undeploy command with multiple interaction modes: - List all tracked endpoints with status indicators - Delete specific endpoint by name - Delete all endpoints with double confirmation - Interactive checkbox selection for batch deletion Changes: - Add undeploy.py command with Rich/questionary UI - Extend ResourceManager with list_all_resources() and find_resources_by_name() - Integrate undeploy command into CLI - Add comprehensive unit tests with mocking - Fix duplicate import in resource_manager.py Safety features include confirmation prompts, keyboard interrupt handling, and detailed error reporting per endpoint.

Fixes two critical bugs preventing flash undeploy from working correctly: 1. GraphQL deleteEndpoint success detection - RunPod API returns {"deleteEndpoint": null} on success - Changed from checking null value to checking key presence - Added detailed comments explaining GraphQL response pattern 2. Unclosed aiohttp session - Wrapped RunpodGraphQLClient in async context manager - Ensures proper session cleanup after deletion - Eliminates "Unclosed client session" warnings 3. Updated test mocks to support async context manager usage Tested with real endpoint deletion - successfully removes endpoint from both RunPod and local .tetra_resources.pkl tracking file.

Adds ability to clean up stale endpoint entries from .tetra_resources.pkl when endpoints have been deleted externally (via RunPod UI/API). Changes: - Added --cleanup-stale flag to flash undeploy command - New _cleanup_stale_endpoints() function that: - Checks all tracked endpoints for inactive status - Lists inactive endpoints for user review - Prompts for confirmation before removal - Removes only from tracking (endpoints already deleted remotely) - Added imports: Dict, DeployableResource, Confirm Use case: When users delete endpoints via RunPod UI/API instead of flash CLI, the tracking file (.tetra_resources.pkl) becomes stale. This flag identifies and removes those orphaned tracking entries. Usage: flash undeploy --cleanup-stale Note: The "Status" column in `flash undeploy list` makes a health check API call for each endpoint to determine Active/Inactive state. While this adds latency (6 endpoints = 6 API calls), it's valuable for identifying stale entries that need cleanup.

- Remove debugging print statements from test_undeploy.py - Use Tuple from typing module instead of lowercase tuple for Python 3.9 consistency - Update type hints in undeploy.py and resource_manager.py - Add Tuple to imports in both files All 289 tests pass, quality checks pass.

Add abstract undeploy() method to establish polymorphic contract for resource lifecycle management. Mirrors existing deploy() pattern for symmetry. Returns bool indicating success/failure of undeploy operation.

…rkVolume ServerlessResource: - Full implementation using RunpodGraphQLClient - Handles missing endpoint ID gracefully - Logs success/failure for debugging NetworkVolume: - NotImplementedError stub with helpful message - Guides users to manual deletion via RunPod UI/API - Ready for future implementation Both implementations fulfill the DeployableResource contract.

…ndeploy_resource() Refactor ResourceManager API to clarify public vs internal interfaces: Protected methods (internal use): - add_resource() → _add_resource() - remove_resource() → _remove_resource() New public method: - undeploy_resource(resource_id, resource_name) -> Dict[str, Any] - Orchestrates polymorphic resource.undeploy() - Removes from tracking on success - Handles NotImplementedError gracefully - Returns detailed status dict for CLI consumption Public API surface: - get_or_deploy_resource() for deployment - undeploy_resource() for removal - list_all_resources() for querying - find_resources_by_name() for searching This establishes clear separation between internal state management and public resource lifecycle operations.

…oy_resource() Major architectural improvement moving deletion logic from CLI to proper abstraction layers: undeploy.py: - Remove _delete_endpoint() helper function - Remove RunpodGraphQLClient import - Use manager.undeploy_resource() in all undeploy modes: - _undeploy_by_name() - _undeploy_all() - _interactive_undeploy() - Use manager._remove_resource() in _cleanup_stale_endpoints() resource.py: - Update to use manager._remove_resource() for tracking cleanup - Clarify comment: removal is tracking-only, no remote deletion CLI now acts as thin presentation layer focused on user interaction, while ResourceManager handles orchestration and resources handle their own lifecycle. This addresses PR review feedback about separation of concerns.

Update all tests to use new protected method names and public API: test_resource_manager.py: - Use _add_resource() instead of add_resource() - Use _remove_resource() instead of remove_resource() test_undeploy.py: - Remove obsolete TestDeleteEndpoint class (_delete_endpoint no longer exists) - Update mocking to use manager.undeploy_resource() - Remove AsyncMock import (no longer needed) - Mock asyncio.run to properly execute coroutines in tests test_concurrency_issues.py: - Add undeploy() implementation to MockDeployableResource - Use _add_resource() instead of add_resource() All 253 tests pass with 56.78% coverage.

Copilot

Pull request overview

This PR refactors the endpoint deletion logic to follow proper abstraction layers and polymorphic design patterns. The changes move the _delete_endpoint() function from the CLI layer into appropriate abstraction layers (resource classes and ResourceManager), improving separation of concerns and enabling reusability.

Key changes:

Added abstract undeploy() method to the DeployableResource base class
Implemented undeploy() in ServerlessResource and NetworkVolume classes
Refactored ResourceManager to make internal methods protected and added public undeploy_resource() method
Simplified CLI code to use the new ResourceManager API

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
src/tetra_rp/core/resources/base.py	Added abstract undeploy() method to establish polymorphic contract
src/tetra_rp/core/resources/serverless.py	Implemented undeploy() method for serverless endpoints
src/tetra_rp/core/resources/network_volume.py	Added undeploy() stub that raises NotImplementedError
src/tetra_rp/core/resources/resource_manager.py	Renamed public methods to protected (_add_resource, _remove_resource) and added public undeploy_resource() orchestration method
src/tetra_rp/cli/commands/undeploy.py	Removed _delete_endpoint() helper and updated to use manager.undeploy_resource()
src/tetra_rp/cli/commands/resource.py	Updated to use renamed _remove_resource() method
tests/unit/test_concurrency_issues.py	Added undeploy() implementation to MockDeployableResource and updated method calls
tests/unit/resources/test_resource_manager.py	Updated test calls to use renamed protected methods
tests/unit/cli/test_undeploy.py	Updated mocking strategy and removed obsolete TestDeleteEndpoint class

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/tetra_rp/cli/commands/resource.py

tests/unit/cli/test_undeploy.py

Remove clean_command, generate_cleanup_preview, and unused imports (Progress, SpinnerColumn, TextColumn, questionary) as the command was never connected to the CLI (commented out in main.py). This removes 77 lines of dead code and 4 unused dependencies.

Address PR #124 feedback by extracting the duplicated run_coro helper function into a reusable pytest fixture. This reduces code duplication in test_undeploy_by_name_success and test_undeploy_all_flag methods.

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

src/tetra_rp/cli/commands/resource.py

deanq added 12 commits November 25, 2025 00:04

chore: quality check

5f9c609

feat(resources): add abstract undeploy() method to DeployableResource

143d17c

Add abstract undeploy() method to establish polymorphic contract for resource lifecycle management. Mirrors existing deploy() pattern for symmetry. Returns bool indicating success/failure of undeploy operation.

deanq requested review from Copilot and jhcipar November 25, 2025 08:07

Copilot AI reviewed Nov 25, 2025

View reviewed changes

src/tetra_rp/cli/commands/resource.py Outdated Show resolved Hide resolved

tests/unit/cli/test_undeploy.py Outdated Show resolved Hide resolved

deanq added 3 commits November 25, 2025 00:17

test(undeploy): extract run_coro helper to pytest fixture

8c3dab9

Address PR #124 feedback by extracting the duplicated run_coro helper function into a reusable pytest fixture. This reduces code duplication in test_undeploy_by_name_success and test_undeploy_all_flag methods.

chore: make format

2e8ff48

deanq requested a review from Copilot November 25, 2025 12:50

Copilot AI reviewed Nov 25, 2025

View reviewed changes

src/tetra_rp/cli/commands/resource.py Show resolved Hide resolved

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

refactor: move endpoint deletion logic to proper abstraction layers #124

refactor: move endpoint deletion logic to proper abstraction layers #124

Uh oh!

deanq commented Nov 25, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

refactor: move endpoint deletion logic to proper abstraction layers #124

Are you sure you want to change the base?

refactor: move endpoint deletion logic to proper abstraction layers #124

Uh oh!

Conversation

deanq commented Nov 25, 2025

Summary

Problem

Solution

1. Abstract Contract - DeployableResource Base Class

2. Concrete Implementations

3. Orchestration Layer - ResourceManager

4. Presentation Layer - CLI

Benefits

Changes by Commit

Commit 1: Add abstract undeploy() method

Commit 2: Implement undeploy() in resource classes

Commit 3: Refactor ResourceManager API

Commit 4: Update CLI to use new architecture

Commit 5: Update all tests

Testing

Related

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants