Commit 771d3b6

feat: added HF_TOKEN into hub.json
1 parent 0d45c78 commit 771d3b6

2 files changed: 113 additions, 0 deletions

.runpod/hub.json

Lines changed: 10 additions & 0 deletions
@@ -20,6 +20,16 @@
         "required": true
       }
     },
+    {
+      "key": "HF_TOKEN",
+      "input": {
+        "name": "HF_TOKEN",
+        "type": "password",
+        "description": "Hugging Face access token for gated & private models",
+        "default": "",
+        "required": false
+      }
+    },
     {
       "key": "TOKENIZER_PATH",
       "input": {

docs/planning/002_hf_token.md

Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
# User Story: Add HuggingFace Token Support for Gated/Private Models

## Overview

Add HuggingFace token support to the SGLang worker to enable access to gated and private models from the HuggingFace Hub. This includes adding a secure password input field in the RunPod Hub configuration and ensuring the token is properly passed to the SGLang engine.

## Current State

- **hub.json**: Contains comprehensive model configuration options but lacks HuggingFace token support
- **engine.py**: SGLang engine initialization without HF_TOKEN environment variable
- **Dockerfile**: Container setup without HuggingFace token environment variable handling

## Goal

Enable the SGLang worker to access gated and private models from HuggingFace Hub by implementing secure token handling throughout the worker infrastructure.

## Acceptance Criteria

### RunPod Hub Configuration

- [ ] Add `HF_TOKEN` environment variable configuration to `hub.json`:
  - Use `password` type for secure input handling
  - Set as non-required (optional) field
  - Include descriptive text about gated/private model access
  - Position prominently alongside MODEL_PATH (not in advanced section)

### Environment Variable Integration

- [ ] Ensure `HF_TOKEN` environment variable is properly exported in Docker container
- [ ] Verify SGLang engine can access `HF_TOKEN` from environment
- [ ] Confirm token is passed to HuggingFace transformers library for model downloads

### Security & Best Practices

- [ ] Use `password` input type to mask token in RunPod Hub UI
- [ ] Ensure token is not logged or exposed in error messages
- [ ] Follow HuggingFace token security best practices
- [ ] Maintain compatibility with existing non-gated model workflows

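One way to approach the "not logged or exposed in error messages" criterion is a logging filter that rewrites records before they are emitted. The sketch below uses only the Python standard library; `TokenRedactingFilter` and the `[REDACTED]` placeholder are hypothetical names chosen for illustration and are not part of the worker or of SGLang.

```python
import logging


class TokenRedactingFilter(logging.Filter):
    """Hypothetical helper: replaces a known secret with a placeholder in log records."""

    def __init__(self, secret: str, placeholder: str = "[REDACTED]") -> None:
        super().__init__()
        self.secret = secret
        self.placeholder = placeholder

    def filter(self, record: logging.LogRecord) -> bool:
        if self.secret:
            # Render the final message first so secrets passed via %-style args are caught.
            message = record.getMessage()
            if self.secret in message:
                record.msg = message.replace(self.secret, self.placeholder)
                record.args = None
        # Always keep the record; this filter only rewrites, it never drops.
        return True
```

Attaching the filter to a handler (`handler.addFilter(TokenRedactingFilter(token))`) rather than to a single logger means it also sees records propagated from child loggers. It only rewrites the message text, so a token embedded in a formatted traceback would still need separate handling.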
### Validation

- [ ] Test worker with gated model (e.g., Llama models requiring approval)
- [ ] Test worker with private model (requires HF token)
- [ ] Verify worker continues to work with public models (without token)
- [ ] Confirm token is properly masked in RunPod Hub interface
- [ ] Validate Docker build succeeds with new configuration

### Documentation

- [ ] Update any relevant documentation about using gated/private models
- [ ] Ensure token field has clear description for users

## Implementation Details

### hub.json Changes

Add new environment variable configuration:

```json
{
  "key": "HF_TOKEN",
  "input": {
    "name": "HuggingFace Token",
    "type": "password",
    "description": "HuggingFace access token for gated and private models",
    "default": "",
    "required": false
  }
}
```

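The acceptance criteria for this entry (password type, optional, non-empty description) can be checked mechanically. The following is a minimal sketch, assuming the entries are already available as a list of `{"key": ..., "input": {...}}` objects; where that list sits inside `hub.json` is not shown in this plan, so the function takes the list directly. `check_hf_token_entry` is a hypothetical helper, not an existing RunPod tool.

```python
import json
from typing import Any


def check_hf_token_entry(entries: list[dict[str, Any]]) -> list[str]:
    """Return a list of problems with the HF_TOKEN entry (an empty list means none found)."""
    matches = [entry for entry in entries if entry.get("key") == "HF_TOKEN"]
    if len(matches) != 1:
        return [f"expected exactly one HF_TOKEN entry, found {len(matches)}"]

    spec = matches[0].get("input", {})
    problems = []
    if spec.get("type") != "password":
        problems.append("type should be 'password' so the UI masks the value")
    if spec.get("required") is not False:
        problems.append("required should be false so public models work without a token")
    if not spec.get("description"):
        problems.append("description should not be empty")
    return problems


# Sample input mirroring the entry proposed above.
SAMPLE = json.loads("""
[
  {
    "key": "HF_TOKEN",
    "input": {
      "name": "HuggingFace Token",
      "type": "password",
      "description": "HuggingFace access token for gated and private models",
      "default": "",
      "required": false
    }
  }
]
""")

print(check_hf_token_entry(SAMPLE))  # prints [] for the sample above
```

A check like this could run in CI next to the Docker build validation, so a later edit that changes the field to a plain text input or marks it required is caught before release.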
### Engine Integration

- Verify `HF_TOKEN` environment variable is available to SGLang engine
- Ensure the HuggingFace libraries (`transformers`, via `huggingface_hub`) automatically use the token when `HF_TOKEN` is set
- No explicit token passing required if environment variable is properly set

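Because the `hub.json` entry defaults to `""`, the container may see `HF_TOKEN` as an empty string rather than as an unset variable. Whether RunPod exports an empty default this way is an assumption here and is worth confirming during testing. A minimal sketch of normalizing that case in the worker, with hypothetical helper names (`resolve_hf_token`, `describe_token`) that do not exist in `engine.py`:

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the token from HF_TOKEN, treating empty or whitespace-only values as unset."""
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def describe_token(token: Optional[str]) -> str:
    """Build a log-safe status line that never includes the token itself."""
    if token:
        return "HF_TOKEN is set"
    return "HF_TOKEN is not set; only public models are reachable"


if __name__ == "__main__":
    print(describe_token(resolve_hf_token()))
```

Logging only the presence of the token at startup gives a quick signal for the public, gated, and private test scenarios without exposing the value.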
### Docker Environment

- Confirm Docker container inherits `HF_TOKEN` environment variable from RunPod
- Test that SGLang can access HuggingFace Hub with the provided token

## Files to Modify

- `.runpod/hub.json` - Add HF_TOKEN configuration
- `engine.py` - Verify token usage (may not require code changes)
- `Dockerfile` - Validate environment variable handling (may not require changes)

## Success Metrics

- User can input HuggingFace token through RunPod Hub interface
- Token input is masked (password type) for security
- Worker successfully downloads and runs gated models when token is provided
- Worker continues to work with public models when no token is provided
- No token information is exposed in logs or error messages

## Testing Scenarios

1. **Public Model**: Test without HF_TOKEN (existing functionality)
2. **Gated Model**: Test with valid HF_TOKEN for gated model
3. **Private Model**: Test with valid HF_TOKEN for private model
4. **Invalid Token**: Test with invalid HF_TOKEN (should fail gracefully)
5. **UI Security**: Verify token is masked in RunPod Hub interface
