Commit 008bd90

chore: merge main
2 parents 2cfc025 + bbc2449 commit 008bd90

47 files changed: 3403 additions and 268 deletions

.github/workflows/ci.yml

Lines changed: 1 addition & 16 deletions
Original file line number | Diff line number | Diff line change
@@ -99,22 +99,7 @@ jobs:
9999
run: make setup
100100

101101
- name: Test local handler execution
102-
run: |
103-
echo "Testing handler with all test_*.json files..."
104-
passed=0
105-
total=0
106-
for test_file in test_*.json; do
107-
total=$((total + 1))
108-
echo "Testing with $test_file..."
109-
if timeout 30s env PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat "$test_file")" uv run python src/handler.py >/dev/null 2>&1; then
110-
echo "✓ $test_file: PASSED"
111-
passed=$((passed + 1))
112-
else
113-
echo "✗ $test_file: FAILED"
114-
exit 1
115-
fi
116-
done
117-
echo "All $passed/$total handler tests passed!"
102+
run: make test-handler
118103

119104
release:
120105
runs-on: ubuntu-latest
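
The contents of `test-handler.sh` are not shown in this diff, so the following is only a sketch of what the removed inline loop did, written in Python for illustration. The function name `run_handler_tests` and its parameters are hypothetical. The 30-second timeout, the `PYTHONPATH=src` setting, and the `RUNPOD_TEST_INPUT` variable are taken from the removed workflow step.

```python
import glob
import os
import subprocess
from typing import List, Sequence


def run_handler_tests(
    command: Sequence[str],
    pattern: str = "test_*.json",
    timeout_seconds: float = 30.0,
) -> List[str]:
    """Run `command` once per matching file and return the files that passed.

    Stops at the first failure, as the removed CI loop did with `exit 1`.
    """
    passed: List[str] = []
    for test_file in sorted(glob.glob(pattern)):
        with open(test_file, encoding="utf-8") as fh:
            payload = fh.read()
        # The removed step passed the JSON payload through RUNPOD_TEST_INPUT.
        env = dict(os.environ, PYTHONPATH="src", RUNPOD_TEST_INPUT=payload)
        try:
            result = subprocess.run(
                command,
                env=env,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_seconds,
            )
        except subprocess.TimeoutExpired:
            raise SystemExit(f"{test_file}: timed out")
        if result.returncode != 0:
            raise SystemExit(f"{test_file}: FAILED")
        passed.append(test_file)
    return passed
```

Calling `run_handler_tests(["uv", "run", "python", "src/handler.py"])` from the repository root would mirror the old CI step. Keeping the loop behind a single `make test-handler` target means CI and local runs cannot drift apart.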

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -1,5 +1,12 @@
11
# Changelog
22

3+
## [0.5.0](https://github.com/runpod-workers/worker-tetra/compare/v0.4.1...v0.5.0) (2025-08-27)
4+
5+
6+
### Features
7+
8+
* Add download acceleration for dependencies & Hugging Face ([#22](https://github.com/runpod-workers/worker-tetra/issues/22)) ([f17e013](https://github.com/runpod-workers/worker-tetra/commit/f17e013263605758f17360abe684fa3de8c2f89e))
9+
310
## [0.4.1](https://github.com/runpod-workers/worker-tetra/compare/v0.4.0...v0.4.1) (2025-08-06)
411

512

CLAUDE.md

Lines changed: 15 additions & 11 deletions
@@ -68,12 +68,8 @@ make build-cpu # Build CPU-only Docker image
6868

6969
### Local Testing
7070
```bash
71-
# Test handler locally with test_input.json
72-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_input.json)" uv run python src/handler.py
73-
74-
# Test with other test files
75-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_class_input.json)" uv run python src/handler.py
76-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_hf_input.json)" uv run python src/handler.py
71+
# Test handler locally with all test_*.json files
72+
make test-handler
7773
```
7874

7975
### Submodule Management
@@ -122,6 +118,14 @@ The handler automatically detects and utilizes `/runpod-volume` for persistent w
122118
- **Optimized Resource Usage**: Shared caches across multiple endpoints while maintaining isolation
123119
- **ML Model Efficiency**: Large HF models cached on volume prevent "No space left on device" errors
124120

121+
### HuggingFace Model Acceleration
122+
The system automatically leverages HuggingFace's native acceleration features:
123+
- **hf_transfer**: Accelerated downloads for large model files when available
124+
- **hf_xet**: Automatic chunk-level deduplication and incremental downloads (huggingface_hub>=0.32.0)
125+
- **Native Integration**: Uses HF Hub's `snapshot_download()` for optimal caching and acceleration
126+
- **Transparent Operation**: No code changes needed - acceleration is automatic when repositories support it
127+
- **Token Support**: Configured via `HF_TOKEN` environment variable for private repositories
128+
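
The bullets above can be sketched as follows. This is an illustration under stated assumptions, not the worker's actual implementation, which this part of the diff does not show. The helper names `build_snapshot_kwargs` and `download_model` are hypothetical. `snapshot_download()`, its `token` and `cache_dir` parameters, and the `HF_HUB_ENABLE_HF_TRANSFER` variable come from `huggingface_hub` releases that support `hf_transfer`.

```python
import os
from typing import Any, Dict, Optional


def build_snapshot_kwargs(
    repo_id: str,
    token: Optional[str] = None,
    cache_dir: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for huggingface_hub.snapshot_download()."""
    kwargs: Dict[str, Any] = {"repo_id": repo_id}
    if token:
        kwargs["token"] = token  # only needed for private or gated repositories
    if cache_dir:
        kwargs["cache_dir"] = cache_dir
    return kwargs


def download_model(repo_id: str, cache_dir: Optional[str] = None) -> str:
    """Download a repository snapshot and return its local path."""
    # hf_transfer is opt-in. huggingface_hub reads this variable at import
    # time, so it has to be set before the library is first imported.
    os.environ.setdefault("HF_HUB_ENABLE_HF_TRANSFER", "1")
    from huggingface_hub import snapshot_download

    kwargs = build_snapshot_kwargs(repo_id, os.environ.get("HF_TOKEN"), cache_dir)
    return snapshot_download(**kwargs)
```

Enabling `HF_HUB_ENABLE_HF_TRANSFER` without the `hf_transfer` package installed makes downloads fail in those releases, which is presumably why this commit adds `hf_transfer` to the dependencies.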
125129
## Configuration
126130

127131
### Environment Variables
@@ -160,11 +164,6 @@ make test-integration # Run integration tests only
160164
make test-coverage # Run tests with coverage report
161165
make test-fast # Run tests with fail-fast mode
162166
make test-handler # Test handler locally with all test_*.json files (same as CI)
163-
164-
# Test handler locally with specific test files
165-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_input.json)" uv run python src/handler.py
166-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_class_input.json)" uv run python src/handler.py
167-
PYTHONPATH=src RUNPOD_TEST_INPUT="$(cat test_hf_input.json)" uv run python src/handler.py
168167
```
169168

170169
### Testing Framework
@@ -261,3 +260,8 @@ Configure these in GitHub repository settings:
261260

262261
### Docker Guidelines
263262
- Docker container should never refer to src/
263+
264+
- Always run `make quality-check` before declaring your work finished
265+
- Always use `git mv` when moving existing files around
266+
267+
- Run `make test-handler` to check all test files. Do not run them one by one, as in `Bash(env RUNPOD_TEST_INPUT="$(cat test_input.json)" PYTHONPATH=. uv run python handler.py)`

Dockerfile

Lines changed: 5 additions & 4 deletions
@@ -10,7 +10,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
1010
&& chmod +x /usr/local/bin/uv
1111

1212
# Copy app code and install dependencies
13-
COPY README.md src/* pyproject.toml uv.lock test_*.json test-handler.sh ./
13+
COPY README.md src/* pyproject.toml uv.lock ./
1414
RUN uv sync
1515

1616

@@ -19,11 +19,12 @@ FROM pytorch/pytorch:2.2.0-cuda12.1-cudnn8-runtime
1919

2020
WORKDIR /app
2121

22+
# Install nala for system package acceleration in runtime stage
23+
RUN apt-get update && apt-get install -y --no-install-recommends nala \
24+
&& rm -rf /var/lib/apt/lists/*
25+
2226
# Copy app and uv binary from builder
2327
COPY --from=builder /app /app
2428
COPY --from=builder /usr/local/bin/uv /usr/local/bin/uv
2529

26-
# Clean up any unnecessary system tools
27-
RUN rm -rf /var/lib/apt/lists/*
28-
2930
CMD ["uv", "run", "handler.py"]

Dockerfile-cpu

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ RUN apt-get update && apt-get install -y --no-install-recommends \
1111
&& chmod +x /usr/local/bin/uv
1212

1313
# Copy app files and install deps
14-
COPY README.md src/* pyproject.toml uv.lock test_*.json test-handler.sh ./
14+
COPY README.md src/* pyproject.toml uv.lock ./
1515
RUN uv sync
1616

1717
# Stage 2: Runtime stage
@@ -21,7 +21,7 @@ WORKDIR /app
2121

2222
# Install runtime dependencies
2323
RUN apt-get update && apt-get install -y --no-install-recommends \
24-
curl ca-certificates \
24+
curl ca-certificates nala \
2525
&& apt-get clean \
2626
&& rm -rf /var/lib/apt/lists/*
2727
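
Both images now ship `nala`, but the code that calls it is outside this part of the diff. A minimal sketch of how a runtime installer might prefer `nala` and fall back to `apt-get` is given below. The function `build_install_command` and its injectable `which` parameter are hypothetical and exist only to keep the example testable.

```python
import shutil
from typing import Callable, List, Optional, Sequence


def build_install_command(
    packages: Sequence[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return an install command that prefers nala and falls back to apt-get."""
    if not packages:
        raise ValueError("no packages given")
    if which("nala"):
        # nala downloads packages in parallel, which is the acceleration
        # the Dockerfile comment refers to.
        return ["nala", "install", "-y", *packages]
    return ["apt-get", "install", "-y", "--no-install-recommends", *packages]
```

The returned list can be handed to `subprocess.run(..., check=True)`. Probing with `shutil.which` serves the same purpose as the `["which", "nala"]` command defined in `src/constants.py`, without spawning a process.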

Makefile

Lines changed: 3 additions & 3 deletions
@@ -78,7 +78,7 @@ test-fast: # Run tests with fast-fail mode
7878
uv run pytest tests/ -v -x --tb=short
7979

8080
test-handler: # Test handler locally with all test_*.json files
81-
./test-handler.sh
81+
cd src && ./test-handler.sh
8282

8383
test-runtime-two: build-runtime-two # Test Runtime Two container locally
8484
docker run --rm -p 8000:8000 $(FULL_IMAGE_RUNTIME_TWO)
@@ -110,7 +110,7 @@ format-check: # Check code formatting
110110

111111
# Type checking
112112
typecheck: # Check types with mypy
113-
uv run mypy .
113+
uv run mypy src/
114114

115115
# Quality gates (used in CI)
116-
quality-check: format-check lint typecheck test-coverage
116+
quality-check: format-check lint typecheck test-coverage test-handler

pyproject.toml

Lines changed: 28 additions & 24 deletions
@@ -1,16 +1,22 @@
11
[project]
22
name = "worker-tetra"
3-
version = "0.4.1"
3+
version = "0.5.0"
44
description = "Dynamic GPU provisioning for ML workloads with transparent execution"
55
readme = "README.md"
66
requires-python = ">=3.9,<3.13"
77
dependencies = [
88
"cloudpickle>=3.1.1",
99
"pydantic>=2.11.4",
10+
"requests>=2.25.0",
1011
"runpod",
12+
<<<<<<< HEAD
1113
"fastapi>=0.104.0",
1214
"uvicorn[standard]>=0.24.0",
1315
"aiohttp>=3.9.0",
16+
=======
17+
"hf_transfer>=0.1.0",
18+
"huggingface_hub>=0.32.0",
19+
>>>>>>> main
1420
]
1521

1622
[dependency-groups]
@@ -21,6 +27,7 @@ dev = [
2127
"pytest-asyncio>=0.24.0",
2228
"ruff>=0.8.0",
2329
"mypy>=1.11.0",
30+
"types-requests>=2.25.0",
2431
]
2532

2633
[tool.pytest.ini_options]
@@ -51,40 +58,37 @@ filterwarnings = [
5158
"ignore::pytest.PytestUnknownMarkWarning"
5259
]
5360

54-
[tool.ruff]
55-
# Exclude tetra-rp directory since it's a separate repository
56-
exclude = [
57-
"tetra-rp/",
58-
]
59-
6061
[tool.mypy]
61-
# Basic configuration
6262
python_version = "3.9"
63-
warn_return_any = true
64-
warn_unused_configs = true
65-
disallow_untyped_defs = false # Start lenient, can be stricter later
66-
disallow_incomplete_defs = false
67-
check_untyped_defs = true
68-
69-
# Import discovery
70-
mypy_path = "src"
63+
mypy_path = ["src"]
64+
explicit_package_bases = true
7165
namespace_packages = true
72-
73-
# Error output
66+
check_untyped_defs = true
67+
disallow_any_generics = true
68+
disallow_untyped_defs = false
69+
warn_redundant_casts = true
70+
warn_unused_ignores = true
71+
warn_return_any = true
72+
strict_optional = true
7473
show_error_codes = true
7574
show_column_numbers = true
7675
pretty = true
77-
78-
# Exclude directories
7976
exclude = [
8077
"tetra-rp/",
81-
"tests/", # Start by excluding tests, can add later
8278
]
8379

84-
# Per-module options
8580
[[tool.mypy.overrides]]
8681
module = [
87-
"runpod.*",
88-
"cloudpickle.*",
82+
"cloudpickle",
83+
"runpod",
84+
"transformers",
85+
"hf_transfer",
86+
"huggingface_hub",
8987
]
9088
ignore_missing_imports = true
89+
90+
[tool.ruff]
91+
# Exclude tetra-rp directory since it's a separate repository
92+
exclude = [
93+
"tetra-rp/",
94+
]

src/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
1+
"""Worker Tetra package."""

src/class_executor.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ def __init__(self, workspace_manager):
1818
super().__init__(workspace_manager)
1919
# Instance registry for persistent class instances
2020
self.class_instances: Dict[str, Any] = {}
21-
self.instance_metadata: Dict[str, Dict] = {}
21+
self.instance_metadata: Dict[str, Dict[str, Any]] = {}
2222

2323
def execute(self, request: FunctionRequest) -> FunctionResponse:
2424
"""Execute class method - required by BaseExecutor interface."""

src/constants.py

Lines changed: 72 additions & 0 deletions
@@ -20,3 +20,75 @@
2020

2121
RUNTIMES_DIR_NAME = "runtimes"
2222
"""Name of the runtimes directory containing per-endpoint workspaces."""
23+
24+
# Download Acceleration Settings
25+
MIN_SIZE_FOR_ACCELERATION_MB = 10
26+
"""Minimum file size in MB to trigger download acceleration."""
27+
28+
DOWNLOAD_TIMEOUT_SECONDS = 600
29+
"""Default timeout for download operations in seconds."""
30+
31+
# New download accelerator settings
32+
HF_TRANSFER_ENABLED = True
33+
"""Enable hf_transfer for fresh HuggingFace downloads."""
34+
35+
36+
# Size Conversion Constants
37+
BYTES_PER_MB = 1024 * 1024
38+
"""Number of bytes in a megabyte."""
39+
40+
MB_SIZE_THRESHOLD = 1 * BYTES_PER_MB
41+
"""Minimum file size threshold for considering acceleration (1MB)."""
42+
43+
# HuggingFace Model Patterns
44+
LARGE_HF_MODEL_PATTERNS = [
45+
"albert-large",
46+
"albert-xlarge",
47+
"bart-large",
48+
"bert-large",
49+
"bert-base",
50+
"codegen",
51+
"diffusion",
52+
"distilbert-base",
53+
"falcon",
54+
"gpt",
55+
"hubert",
56+
"llama",
57+
"mistral",
58+
"mpt",
59+
"pegasus",
60+
"roberta-large",
61+
"roberta-base",
62+
"santacoder",
63+
"stable-diffusion",
64+
"t5",
65+
"vae",
66+
"wav2vec2",
67+
"whisper",
68+
"xlm-roberta",
69+
"xlnet",
70+
]
71+
"""List of HuggingFace model patterns that benefit from download acceleration."""
72+
73+
# System Package Acceleration with Nala
74+
LARGE_SYSTEM_PACKAGES = [
75+
"build-essential",
76+
"cmake",
77+
"cuda-toolkit",
78+
"curl",
79+
"g++",
80+
"gcc",
81+
"git",
82+
"libssl-dev",
83+
"nvidia-cuda-dev",
84+
"python3-dev",
85+
"wget",
86+
]
87+
"""List of system packages that benefit from nala's accelerated installation."""
88+
89+
NALA_CHECK_CMD = ["which", "nala"]
90+
"""Command to check if nala is available."""
91+
92+
# Logging Configuration
93+
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(name)s - %(message)s"
94+
"""Standard log format string used across the application."""

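How these constants are consumed is not visible in this part of the diff. The sketch below shows one plausible use and should be read as an assumption. The helpers `is_large_enough` and `matches_large_model` and the logger name are hypothetical, and the pattern list is shortened for the example.

```python
import logging

# Values copied from the constants added in this commit.
BYTES_PER_MB = 1024 * 1024
MIN_SIZE_FOR_ACCELERATION_MB = 10
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(name)s - %(message)s"
# Shortened copy of LARGE_HF_MODEL_PATTERNS for the example.
LARGE_HF_MODEL_PATTERNS = ["bert-large", "llama", "stable-diffusion", "whisper"]


def is_large_enough(size_bytes: int) -> bool:
    """True when a file meets the MIN_SIZE_FOR_ACCELERATION_MB threshold."""
    return size_bytes >= MIN_SIZE_FOR_ACCELERATION_MB * BYTES_PER_MB


def matches_large_model(model_id: str) -> bool:
    """Case-insensitive substring match against the model patterns."""
    lowered = model_id.lower()
    return any(pattern in lowered for pattern in LARGE_HF_MODEL_PATTERNS)


logging.basicConfig(format=LOG_FORMAT, level=logging.INFO)
log = logging.getLogger("download_accelerator")  # hypothetical logger name
log.info("accelerate=%s", is_large_enough(25 * BYTES_PER_MB))
```

Substring matching is deliberately loose. With the full list, short patterns such as `gpt` or `t5` match any model id that contains them, so a size check is a more reliable signal where the file size is known.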