Commit a3aaa8f

fix: the mount on runpod must be "/workspace/data"
1 parent 1fc3fd7 commit a3aaa8f

2 files changed: 32 additions, 2 deletions

docker-compose.yml

Lines changed: 4 additions & 2 deletions
@@ -18,11 +18,12 @@ services:
       - "8888:8888" # Jupyter Lab (from base image)
       - "2222:22" # SSH access (from base image)
 
-    # Environment variables for training configuration
+
+    # Environment variables for training configuration
     environment:
       # Required credentials
       - HF_TOKEN=${HF_TOKEN}
-      # - WANDB_API_KEY=${WANDB_API_KEY}
+      - WANDB_API_KEY=${WANDB_API_KEY}
 
       # Training configuration (examples - customize as needed)
       - AXOLOTL_BASE_MODEL=TinyLlama/TinyLlama_v1.1
@@ -44,6 +45,7 @@ services:
       # - PUBLIC_KEY=${PUBLIC_KEY}
 
     # Volume mounts for persistent data
+    # IMPORTANT: Mount to /workspace/data, NOT /workspace (which overwrites everything!)
     volumes:
       - ./outputs:/workspace/data/axolotl-artifacts
       - ./configs:/workspace/fine-tuning/configs

docs/conventions.md

Lines changed: 28 additions & 0 deletions
@@ -314,6 +314,34 @@ export AXOLOTL_GRADIENT_ACCUMULATION_STEPS="16"
 
 ## 🔧 Troubleshooting
 
+### Critical RunPod Issue
+
+#### ⚠️ **NEVER Mount Volumes to `/workspace`**
+
+```bash
+# ❌ WRONG - This will overwrite the entire Docker image structure
+volumes:
+  - ./data:/workspace
+
+# ✅ CORRECT - Mount to subdirectories only
+volumes:
+  - ./data:/workspace/data/axolotl-artifacts
+```
+
+**Why**: RunPod volume mounts to `/workspace` will completely overwrite:
+
+- `/workspace/axolotl/` (Axolotl installation)
+- `/workspace/fine-tuning/` (Your scripts)
+- All symlinks and directory structure
+
+**Symptoms**:
+
+- `ln: failed to create symbolic link '/workspace/axolotl/outputs': No such file or directory`
+- `/root/cloud-entrypoint.sh: line 93: /workspace/fine-tuning/autorun.sh: No such file or directory`
+- Infinite restart loop
+
+**Solution**: Always mount to `/workspace/data/` or other subdirectories, never `/workspace` directly.
+
 ### Common Issues
 
 #### 1. **Environment Variables Not Applied**
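
Reviewer note: the mount rule added to docs/conventions.md could also be checked automatically before a deploy. The following is a minimal sketch and is not part of this commit. The names `PROTECTED_ROOT`, `container_path`, and `unsafe_mounts` are hypothetical. It assumes the short `host:container[:mode]` volume syntax used in docker-compose.yml and POSIX-style host paths (a Windows drive letter would confuse the split on `:`).

```python
from pathlib import PurePosixPath

# Assumption: the image keeps its own files directly under this directory,
# so a volume mounted exactly here hides them.
PROTECTED_ROOT = PurePosixPath("/workspace")


def container_path(volume_spec: str) -> PurePosixPath:
    """Return the container-side path of a short-syntax volume entry."""
    parts = volume_spec.split(":")
    # A single path with no colon is an anonymous volume: it is the container path.
    target = parts[0] if len(parts) == 1 else parts[1]
    return PurePosixPath(target)


def unsafe_mounts(volume_specs: list[str]) -> list[str]:
    """Return the entries that mount directly onto the protected root."""
    return [spec for spec in volume_specs if container_path(spec) == PROTECTED_ROOT]


if __name__ == "__main__":
    # Entries taken from the diff above, plus the documented wrong example.
    specs = [
        "./data:/workspace",
        "./outputs:/workspace/data/axolotl-artifacts",
        "./configs:/workspace/fine-tuning/configs",
    ]
    for spec in unsafe_mounts(specs):
        print(f"unsafe mount: {spec}")
```

Comparing parsed paths rather than raw strings means a trailing slash (`/workspace/`) is caught as well, while subdirectories such as `/workspace/data/axolotl-artifacts` pass.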
