Add Appendix C: Practical Issues #238
Merged
- Add new appendix covering practical post-training considerations
- Include OLMo 3 excerpt on compute costs for SFT/DPO/RL stages
- Add evaluation variance table (high variance vs stable benchmarks)
- Stub sections for multiple seeds, bad training jobs, model merging
- Fix appendix nav styling: use letter-only list (A, B, C) without bullets
- Update appendix-b-style.md nav links to point to new appendix

Co-Authored-By: Claude Opus 4.5 <[email protected]>
The repo tooling was forked from wikiti/pandoc-book-template (MIT), and LICENSE-CODE is MIT, so adaptations should be MIT. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Adds caching for /usr/local/texlive to skip the slow TeX Live installation (~1989 packages) on subsequent runs. Cache key is based on the workflow file hash, so it invalidates when TeX packages are changed. Expected speedup: ~5-8 minutes on cache hit. Co-Authored-By: Claude Opus 4.5 <[email protected]>
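The cache-key idea in the commit above (the key changes whenever the workflow file changes) can be sketched in a few lines. This is a minimal illustration only: the function name `cache_key` and the `texlive` prefix are hypothetical, and it hashes the file bytes directly rather than reproducing how GitHub Actions computes its own file hashes.

```python
import hashlib
from pathlib import Path


def cache_key(workflow_path: Path, prefix: str = "texlive") -> str:
    """Build a cache key from the SHA-256 of the workflow file's bytes.

    `prefix` is a hypothetical label; any edit to the file yields a new key,
    which is what forces a fresh install after the package list changes.
    """
    digest = hashlib.sha256(workflow_path.read_bytes()).hexdigest()
    return f"{prefix}-{digest}"
```

Keying on the whole workflow file is coarse: unrelated edits to the workflow also invalidate the cache, which trades some unnecessary reinstalls for never serving a stale package set.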
🚀 Deployed on https://6982782587d62a7c1b71ca17--rlhfbook.netlify.app
The cache action cannot restore to /usr/local/texlive without sudo. Create the directory and chown it to the runner user before cache restore so it has write permissions. Co-Authored-By: Claude Opus 4.5 <[email protected]>
When skipping basictex install on cache hit, /Library/TeX/texbin symlinks don't exist. Find and add the actual binary path from the cached /usr/local/texlive directory. Co-Authored-By: Claude Opus 4.5 <[email protected]>
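The lookup described in the commit above (locate the real binary directory inside the restored cache, then put it on the PATH) might look roughly like the sketch below. The helper names, the choice of `pdflatex` as the probe binary, and the requirement that the match sit under a `bin` directory are all assumptions made for illustration, not the workflow's actual steps.

```python
import os
from pathlib import Path
from typing import Optional


def find_tex_bin(root: Path, binary: str = "pdflatex") -> Optional[Path]:
    """Return the first directory under `root` holding `binary` inside a bin/ tree."""
    for candidate in sorted(root.rglob(binary)):
        # Skip directories that happen to share the binary's name and
        # anything that is not located somewhere below a "bin" directory.
        if candidate.is_file() and "bin" in candidate.parts:
            return candidate.parent
    return None


def prepend_to_path(directory: Path, env: dict) -> None:
    """Put `directory` at the front of env["PATH"], creating PATH if missing."""
    current = env.get("PATH", "")
    env["PATH"] = f"{directory}{os.pathsep}{current}" if current else str(directory)
```

Inside a GitHub Actions step the same effect is usually achieved by appending the directory to the file named by the `GITHUB_PATH` environment variable, so that later steps see it.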
- Add gemini-feedback skill for diagram review via Gemini API
- Add push-to-pr skill for creating/updating PRs
- Add update-pr-body skill for editing PR descriptions
- Update .gitignore to track skills while ignoring local settings

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Use GEMINI_API_KEY environment variable instead. Co-Authored-By: Claude Opus 4.5 <[email protected]>
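Reading the key from `GEMINI_API_KEY`, as the commit above describes, could be done along these lines. Only the variable name comes from the commit; the helper `load_gemini_api_key` and its error message are hypothetical, and nothing here calls the Gemini API itself.

```python
import os
from typing import Mapping, Optional


def load_gemini_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GEMINI_API_KEY from the environment, failing loudly if it is unset."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("GEMINI_API_KEY is not set; export it before running")
    return key
```

Failing at startup with a clear message tends to be easier to debug than letting an empty key reach the first API request.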
… variance
- Add context on recipe development vs final training costs
- Expand evaluation variance section with reasoning model context
- Add table caption and reference tag for eval variance table
- Include LiveCodeBench example of variance mitigation

Co-Authored-By: Claude Opus 4.5 <[email protected]>
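One common way to quantify run-to-run evaluation variance is to repeat an evaluation and report the mean and sample standard deviation. The sketch below assumes that approach; it is not necessarily the mitigation the appendix describes for LiveCodeBench. The function names and the `threshold` of 1.0 points are hypothetical, chosen only to show how a "high variance" versus "stable" split could be made.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def summarize_runs(scores: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) for repeated runs of one benchmark."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variance")
    return mean(scores), stdev(scores)


def label_stability(scores: Sequence[float], threshold: float = 1.0) -> str:
    """Label a benchmark by whether its run-to-run spread exceeds `threshold`.

    `threshold` is an illustrative cutoff in score points, not a value
    taken from the appendix.
    """
    _, spread = summarize_runs(scores)
    return "high variance" if spread > threshold else "stable"
```

Reporting the spread alongside the mean makes it easier to tell whether a difference between two checkpoints is larger than the noise of the benchmark itself.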
- Add refs div file to control bibliography placement before appendices
- Add LaTeX-only \appendix command to switch to letter numbering
- No impact on website HTML (each chapter builds independently)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Chapters were reordered in January 2026 but bib.bib still used the old numbering. Sections now match chapters 1-17, appendices A-B, with chapter names in headers. All 388 entries preserved. Co-Authored-By: Claude Opus 4.5 <[email protected]>
Summary
- License corrected in code/README.md (MIT, not Apache 2.0)

Appendix C Content
- Compute Costs: Context on post-training compute with OLMo 3 paper excerpt (SFT sweeps, DPO runs, RL training timeline)
- Evaluation Variance: Table of benchmark stability categorized by variance level (high variance vs stable)
- Training vs Deployment System Prompts: Practical lesson on identity/persona via system prompts vs weights
- Stub sections (to be filled in): Running multiple seeds, Identifying bad training jobs, Model merging

Build & Infrastructure
- appendix-00-references.md with {#refs} div + \appendix places bibliography before appendices in PDF ToC
- TeX Live cached via actions/cache@v4, cutting build time (expected ~5-8 minutes saved on a cache hit)
- code/README.md now correctly lists MIT for adaptations (upstream is MIT)
- .claude/commands/ moved to .claude/skills/ directory format

Test plan
🤖 Generated with Claude Code