Commit 0d7c580

Initial commit
1 parent 6608f30 commit 0d7c580

83 files changed, +4417 -789 lines changed

Lines changed: 161 additions & 0 deletions
@@ -0,0 +1,161 @@
# Restore Checkpoint Chain Action

A reusable GitHub Actions composite action for restoring build outputs from checkpoint tarballs using progressive restoration.

## Purpose

When using checkpoint-based builds with GitHub Actions cache:

1. **On first run**: Build creates checkpoint tarballs in `build/{mode}/checkpoints/` and `build/shared/checkpoints/`
2. **On cache hit**: Build may be skipped to save time if latest checkpoint exists
3. **Progressive restoration**: Walks backward through checkpoint chain to find latest valid checkpoint
4. **Resumable builds**: Restores from any checkpoint and resumes building remaining checkpoints

## Usage

```yaml
- name: Restore build output from checkpoint chain
  id: restore-checkpoint
  uses: ./.github/actions/restore-checkpoint
  with:
    package-name: 'onnxruntime-builder'
    build-mode: ${{ steps.build-mode.outputs.mode }}
    checkpoint-chain: 'finalized,wasm-synced,wasm-released,wasm-compiled,source-cloned'
    cache-hit: ${{ steps.checkpoint-cache.outputs.cache-hit }}
    cache-valid: ${{ steps.validate-cache.outputs.cache_valid }}

- name: Build (if needed)
  if: steps.restore-checkpoint.outputs.needs-build == 'true'
  run: pnpm --filter onnxruntime-builder build --prod
```

## Inputs

| Input | Required | Description | Example |
|-------|----------|-------------|---------|
| `package-name` | Yes | Package name in `packages/` directory | `onnxruntime-builder` |
| `build-mode` | Yes | Build mode (dev or prod) | `prod` |
| `checkpoint-chain` | Yes | Comma-separated list of checkpoints (newest to oldest) | `finalized,wasm-synced,wasm-compiled` |
| `cache-hit` | Yes | Whether checkpoint cache was hit (`true`/`false`) | `${{ steps.cache.outputs.cache-hit }}` |
| `cache-valid` | Yes | Whether checkpoint validation passed (`true`/`false`) | `${{ steps.validate.outputs.cache_valid }}` |

## Outputs

| Output | Description | Values |
|--------|-------------|--------|
| `restored` | Whether any checkpoint was successfully restored | `true` or `false` |
| `checkpoint-restored` | Name of the checkpoint that was restored | Checkpoint name or empty |
| `checkpoint-index` | Index of restored checkpoint in chain | `0` (newest) to `N-1` (oldest), or `-1` if none |
| `needs-build` | Whether build needs to run to complete remaining checkpoints | `true` or `false` |
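
As a rough illustration of how the four outputs relate to the restored index, the sketch below derives them from a chain and an index, then appends them as `name=value` lines, the format GitHub Actions reads from the file named by `GITHUB_OUTPUT`. The function names `restoration_outputs` and `write_outputs` are hypothetical, and the action's real implementation may differ.

```python
def restoration_outputs(chain: list[str], index: int) -> dict[str, str]:
    """Map a restored index (-1 when nothing was restored) to the documented outputs."""
    restored = index >= 0
    return {
        "restored": str(restored).lower(),
        "checkpoint-restored": chain[index] if restored else "",
        "checkpoint-index": str(index),
        # Only index 0 (the newest checkpoint) lets the build be skipped.
        "needs-build": str(index != 0).lower(),
    }


def write_outputs(outputs: dict[str, str], path: str) -> None:
    """Append name=value lines to the given file (hypothetical helper)."""
    with open(path, "a", encoding="utf-8") as fh:
        for name, value in outputs.items():
            fh.write(f"{name}={value}\n")
```

In a workflow step, `path` would typically be the value of the `GITHUB_OUTPUT` environment variable.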

## How It Works

### Progressive Restoration Algorithm

1. **Parse checkpoint chain**: Splits comma-separated list into array
2. **Walk the chain** from newest to oldest:
   - Check if checkpoint exists
   - Verify tarball integrity
   - If valid, select that checkpoint and stop searching
3. **Extract checkpoint** to output directory
4. **Determine if build needed**:
   - Index 0 (newest): Build can be skipped
   - Index > 0 (older): Build must run to complete remaining checkpoints
   - No checkpoint found: Build runs from scratch
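
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the action's actual code: it assumes each checkpoint is stored as `<name>.tar.gz`, treats "integrity" as being able to read through every archive member, and uses hypothetical names (`is_valid_tarball`, `find_latest_checkpoint`, `restore`).

```python
import tarfile
import zlib
from pathlib import Path
from typing import Optional


def is_valid_tarball(path: Path) -> bool:
    """Light integrity check: the file exists and all member headers can be read."""
    if not path.is_file():
        return False
    try:
        with tarfile.open(path, "r:gz") as tar:
            for _ in tar:  # a truncated or corrupt archive typically raises here
                pass
        return True
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False


def find_latest_checkpoint(chain_csv: str, dirs: list[Path]) -> tuple[int, Optional[Path]]:
    """Walk the chain newest to oldest and return the first valid tarball."""
    chain = [name.strip() for name in chain_csv.split(",") if name.strip()]
    for index, name in enumerate(chain):
        for directory in dirs:
            candidate = directory / f"{name}.tar.gz"  # assumed naming scheme
            if is_valid_tarball(candidate):
                return index, candidate
    return -1, None


def restore(chain_csv: str, dirs: list[Path], output_dir: Path) -> tuple[int, bool]:
    """Extract the newest valid checkpoint; return (index, needs_build)."""
    index, tarball = find_latest_checkpoint(chain_csv, dirs)
    if tarball is not None:
        output_dir.mkdir(parents=True, exist_ok=True)
        # Use the safer "data" extraction filter where this Python provides it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        with tarfile.open(tarball, "r:gz") as tar:
            tar.extractall(output_dir, **kwargs)
    return index, index != 0  # only index 0 allows skipping the build
```

Passing both the mode-specific and the shared checkpoint directories in `dirs` lets a single walk cover every checkpoint in the chain.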

### Checkpoint Locations

- **Shared checkpoints**: `build/shared/checkpoints/` (e.g., `source-cloned`)
- **Mode-specific checkpoints**: `build/{mode}/checkpoints/` (e.g., `finalized`, `wasm-compiled`)

Currently only `source-cloned` is shared across dev/prod modes.
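
A small sketch of how a checkpoint name might be resolved to a tarball path under this layout. The `.tar.gz` suffix, the `SHARED_CHECKPOINTS` set, and the `checkpoint_path` helper are illustrative assumptions rather than the action's confirmed code.

```python
from pathlib import Path

# Assumption: the only shared checkpoint, per the note above.
SHARED_CHECKPOINTS = {"source-cloned"}


def checkpoint_path(package_dir: Path, mode: str, name: str) -> Path:
    """Return the expected tarball path for a checkpoint (hypothetical helper)."""
    scope = "shared" if name in SHARED_CHECKPOINTS else mode
    return package_dir / "build" / scope / "checkpoints" / f"{name}.tar.gz"
```

For example, `checkpoint_path(Path("packages/onnxruntime-builder"), "prod", "source-cloned")` resolves under `build/shared/checkpoints/`, while `finalized` resolves under `build/prod/checkpoints/`.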

## Example Scenarios

### Scenario 1: Complete Cache (finalized found)
```
Checkpoint chain: finalized,wasm-synced,wasm-compiled,source-cloned
Found: finalized (index 0)
Result: restored=true, needs-build=false
Action: Skip build entirely
```

### Scenario 2: Partial Cache (wasm-compiled found)
```
Checkpoint chain: finalized,wasm-synced,wasm-compiled,source-cloned
Found: wasm-compiled (index 2)
Result: restored=true, needs-build=true
Action: Build runs to create wasm-synced → finalized
```

### Scenario 3: Early Cache (source-cloned found)
```
Checkpoint chain: finalized,wasm-synced,wasm-compiled,source-cloned
Found: source-cloned (index 3)
Result: restored=true, needs-build=true
Action: Build runs to create wasm-compiled → wasm-synced → finalized
```

### Scenario 4: No Cache
```
Checkpoint chain: finalized,wasm-synced,wasm-compiled,source-cloned
Found: none
Result: restored=false, needs-build=true
Action: Build runs from scratch
```
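
The "Action" line in each scenario follows directly from the restored index: every checkpoint newer than the restored one still has to be built, oldest first. A minimal sketch, with `remaining_checkpoints` as a hypothetical helper name:

```python
def remaining_checkpoints(chain: list[str], found_index: int) -> list[str]:
    """Checkpoints still to be built, in build order (oldest first).

    `chain` is ordered newest to oldest; `found_index` is -1 when nothing was restored.
    """
    newer = chain if found_index < 0 else chain[:found_index]
    return list(reversed(newer))


chain = ["finalized", "wasm-synced", "wasm-compiled", "source-cloned"]
print(remaining_checkpoints(chain, 2))  # Scenario 2: ['wasm-synced', 'finalized']
```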

## Package-Specific Checkpoint Chains

| Package | Checkpoint Chain |
|---------|------------------|
| **ONNX Runtime** (dev) | `finalized,wasm-synced,wasm-released,wasm-compiled,source-cloned` |
| **ONNX Runtime** (prod) | `finalized,wasm-synced,wasm-optimized,wasm-released,wasm-compiled,source-cloned` |
| **Yoga Layout** (dev) | `finalized,wasm-synced,wasm-released,wasm-compiled,source-configured,source-cloned` |
| **Yoga Layout** (prod) | `finalized,wasm-synced,wasm-optimized,wasm-released,wasm-compiled,source-configured,source-cloned` |
| **Models** | `finalized,quantized,converted,downloaded` |
| **Node.js Smol** | `finalized,binary-compressed,binary-stripped,binary-released,source-patched,source-cloned` |

Note: ONNX Runtime and Yoga Layout include `wasm-optimized` only in prod mode.
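
In the table, the prod chains for the two WASM packages differ from the dev chains only by a `wasm-optimized` entry placed just before `wasm-released`. One way to avoid maintaining both strings by hand is sketched below; the dictionary keys and the `chain_for` helper are hypothetical and not part of the action.

```python
# Dev chains copied from the table above; the keys are illustrative identifiers.
DEV_CHAINS = {
    "onnxruntime": "finalized,wasm-synced,wasm-released,wasm-compiled,source-cloned",
    "yoga-layout": "finalized,wasm-synced,wasm-released,wasm-compiled,source-configured,source-cloned",
}


def chain_for(package: str, mode: str) -> str:
    """Return the checkpoint chain for a WASM package in the given build mode."""
    chain = DEV_CHAINS[package].split(",")
    if mode == "prod":
        # The chain is newest-first, so inserting before wasm-released places
        # wasm-optimized between wasm-synced and wasm-released.
        chain.insert(chain.index("wasm-released"), "wasm-optimized")
    return ",".join(chain)
```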

## Expected Checkpoint Structure

Checkpoints should contain a `Final/` directory with build outputs:

```
finalized.tar.gz
└── Final/
    ├── output.wasm
    ├── output.mjs
    └── output.js
```
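
A quick way to check that a tarball has this shape before relying on it is to look for a top-level `Final/` entry among its members. The helper below is a sketch using Python's standard `tarfile` module; `has_final_dir` is a hypothetical name, and the action itself may validate differently.

```python
import tarfile
from pathlib import Path


def has_final_dir(tarball: Path) -> bool:
    """Return True if the archive contains a top-level Final/ directory or files under it."""
    with tarfile.open(tarball, "r:gz") as tar:
        for member in tar:
            name = member.name.removeprefix("./")  # tolerate "./Final/..." style names
            if name == "Final" or name.startswith("Final/"):
                return True
    return False
```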

## Error Handling

The action will fail with detailed error messages if:
- No valid checkpoints found in chain
- All tarballs are corrupted
- Extraction fails
- Output directory is invalid

## Benefits

### Progressive Restoration
- **Partial cache hits are useful**: Don't waste intermediate checkpoints
- **Resumable builds**: Continue from any point in the pipeline
- **Faster iterations**: Skip completed phases even if final checkpoint is missing

### Consistency
- **Single restoration logic**: Shared across all workflows
- **Maintainability**: Update in one place
- **Debugging**: Detailed logging shows which checkpoint was used

### Efficiency
- **Maximize cache utilization**: Use any valid checkpoint, not just the final one
- **Reduce build times**: Skip unnecessary rebuild of early phases
- **CI cost savings**: Less compute time = lower costs

## Migration Notes

This action replaced the older single-checkpoint restoration pattern. All packages now use progressive restoration with standardized checkpoint naming:

- All final checkpoints are named **`finalized`** (previously `wasm-finalized`, `quantized`, etc.)
- All restoration happens through checkpoint chains
- No separate Final output caches (checkpoint-only caching)
