Releases: Red-Hat-AI-Innovation-Team/training_hub
v0.5.0 - Pre-training Support and LoRA Memory Estimation
We're excited to announce v0.5.0 of Training Hub! This release introduces pre-training capabilities for SFT and OSFT algorithms, expanded memory estimation for LoRA/QLoRA workloads, and improved documentation.
Highlights
Pre-training Support
- New pre-training mode for SFT and OSFT algorithms
- Supports document-based training with configurable block sizes
- Enables continued pre-training workflows on custom datasets
LoRA Memory Estimation
- Extended memory profiler now supports LoRA and QLoRA fine-tuning
- Persistent model metadata caching for faster estimates
- OSFT/LoRA/QLoRA-aware memory calculations
Documentation & Examples
- New runnable LoRA/QLoRA Jupyter notebook example
- Expanded multi-GPU training guidance for LoRA
- Runtime estimates guide with wall-clock measurements across models
Usage
Pre-training Mode
Enable pre-training mode for document-based training:
from training_hub import sft, osft

# SFT pre-training
sft(
    model_path="your-model",
    data_path="/path/to/your/data",
    ...
    is_pretraining=True,
    block_size=4096,
    # optional
    document_column_name="document"
)
# OSFT pre-training
osft(
    model_path="your-model",
    data_path="/path/to/your/data",
    ...
    is_pretraining=True,
    block_size=4096,
    # optional
    document_column_name="document"
)
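Memory Estimation
The extended profiler can also estimate LoRA and QLoRA memory; see the LoRA/QLoRA notebook and the profiling module (src/training_hub/profiling) for the actual API. As a purely illustrative back-of-envelope of what such an estimate accounts for (frozen base weights, a small trainable adapter, and optimizer state only for the adapter parameters), the sketch below uses assumed sizes and is not the library's formula:
# Illustrative LoRA/QLoRA memory sketch (assumptions only, not training_hub's calculation)
def rough_lora_memory_gb(base_params_b, trainable_frac=0.01, base_bytes=2, activations_gb=8.0):
    # frozen base weights: bf16 -> 2 bytes/param; roughly 0.5 bytes/param for a 4-bit QLoRA base
    base = base_params_b * base_bytes
    adapter = base_params_b * trainable_frac * 2    # bf16 adapter weights
    grads = adapter                                 # gradients exist only for the adapter
    optimizer = base_params_b * trainable_frac * 8  # two fp32 Adam moments for the adapter
    total = base + adapter + grads + optimizer + activations_gb
    return {"low": round(0.9 * total, 1), "expected": round(total, 1), "high": round(1.2 * total, 1)}

print(rough_lora_memory_gb(8))  # e.g. an 8B-parameter model with a bf16 base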
Changes
Features & Enhancements
- Expose pre-training APIs in SFT and OSFT by @RobotSail in #26
- Memory Estimation for LoRA by @mazam-lab in #29
Documentation
- Adding Runnable LoRA Notebook and Doc Updates by @Maxusmusti in #30
Infrastructure
- Update dependencies for pretraining support w SFT/OSFT by @Maxusmusti in #31
Contributors
We'd like to thank all the contributors who made this release possible:
- @RobotSail
- @mazam-lab
- @Maxusmusti
Full Changelog: v0.4.0...v0.5.0
v0.4.0 - LoRA, QLoRA, and Unsloth Backend
We're excited to announce v0.4.0 of Training Hub! This release brings new training algorithms, expanded community integrations, and significant improvements to our documentation.
Highlights
New Training Algorithms
- LoRA (Low-Rank Adaptation) SFT
- QLoRA (Quantized LoRA) SFT
Community Integrations
- Unsloth integration for efficient LoRA/QLoRA training
Documentation & Tooling
- New documentation site at ai-innovation.team/training_hub
- LoRA/QLoRA example scripts
- Granite SFT and OSFT training examples
- Checkpoint evaluation notebook
- Runtime estimation guide for various model configurations
- Out-of-the-box (OOTB) runnable example notebooks
Installation
This release introduces a new lora extras option for simplified installation:
pip install training_hub[lora]  # Includes unsloth and xformers
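Once the extra is installed, a LoRA run goes through the same sft entry point. The LoRA-specific argument names below (lora_r, lora_alpha, lora_dropout) are illustrative assumptions rather than the confirmed API; the LoRA/QLoRA example scripts show the actual arguments, and QLoRA additionally loads the base model in 4-bit precision.
# Hedged LoRA sketch: the LoRA argument names below are illustrative assumptions,
# not necessarily the exact training_hub API; see the LoRA/QLoRA example scripts.
from training_hub import sft

sft(
    model_path="your-model",
    data_path="/path/to/your/data",
    # ... other standard sft arguments ...
    lora_r=16,          # adapter rank (illustrative name)
    lora_alpha=32,      # scaling factor (illustrative name)
    lora_dropout=0.05,  # adapter dropout (illustrative name)
)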
Changes
Features & Enhancements
- Add Granite 4 SFT example by @mtake in #20
- Add granite training example by @mtake in #16
- Adding lightweight, ootb runnable notebooks by @Maxusmusti in #12
- Adding Unsloth for LoRA/QLoRA SFT Algorithm by @Maxusmusti in #23
- Add OSFT notebook for different batch sizes by @RobotSail in #5
- Adds notebook for running checkpoint evaluation by @RobotSail in #8
Documentation
- Adds docsify renderer for training-hub docs by @RobotSail in #17
- Eliminates cover page for docs by @RobotSail in #24
- Adding documentation showcasing estimated runtime for various models and training setups by @mazam-lab in #22
Infrastructure
- Update python build action for 3.14 compatibility by @Maxusmusti in #21
Contributors
We'd like to thank all the contributors who made this release possible:
- @mtake made their first contribution in #16
- @Maxusmusti
- @RobotSail
- @mazam-lab
Full Changelog: v0.3.0...v0.4.0
v0.3.0 - Granite 4, Mamba, Env var support, and Memory Estimation
This release introduces memory profiling capabilities, enhanced distributed training orchestration, and support for Granite 4 and Mamba models. Backend implementations have been updated to instructlab-training v0.12.1 and mini-trainer v0.3.0.
What's New
Memory Profiling API (Experimental)
- New memory estimation tool for fine-tuning workloads
- Reports per-GPU VRAM requirements (parameters, optimizer state, gradients, activations, outputs; see the illustrative breakdown after this list)
- Supports both SFT and OSFT algorithms
- Returns low/expected/high memory bounds for better resource planning
- Includes Liger-kernel-aware adjustments
- Example notebook and documentation included
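To make the component list above concrete, here is a purely illustrative back-of-envelope for full-parameter SFT of an 8B model in bf16 with Adam. The sizes are assumptions for illustration and this is not the formula the profiler uses:
# Illustrative per-GPU components for full-parameter SFT (assumptions only,
# not the profiler's actual calculation); sizes in GB for an 8B-parameter model.
N_BILLION = 8
params      = N_BILLION * 2   # bf16 weights
gradients   = N_BILLION * 2   # bf16 gradients
optimizer   = N_BILLION * 8   # two fp32 Adam moments
activations = 10              # depends heavily on batch size and sequence length; output logits omitted
print(params + gradients + optimizer + activations, "GB, roughly, before any sharding")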
Enhanced Distributed Training
- Automatic torchrun configuration from environment variables
- Full compatibility with Kubeflow and other orchestration systems
- Support for "auto" and "gpu" process count specifications
- Centralized launch parameter handling with hierarchical priority (see the sketch after this list)
- Improved validation with clear conflict warnings and error messages
- Flexible argument types (string or integer) for multi-node parameters
- Explicit master address and port configuration options
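The launch-parameter handling resolves each setting with a simple precedence: an explicit argument wins, then an orchestrator-provided environment variable, then a library default. A minimal sketch of that pattern follows; the environment-variable name is chosen for illustration and is not necessarily the one training_hub reads.
import os

# Minimal sketch of hierarchical launch-parameter resolution:
# explicit argument > environment variable > library default.
def resolve_nproc_per_node(explicit=None, default="auto"):
    if explicit is not None:                    # 1. explicit argument wins
        return explicit
    env_value = os.environ.get("NPROC_PER_NODE")
    if env_value is not None:                   # 2. orchestrator/torchrun-provided value
        return env_value                        #    may be an integer, "auto", or "gpu"
    return default                              # 3. fall back to the default

print(resolve_nproc_per_node())            # "auto"
print(resolve_nproc_per_node(explicit=8))  # 8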
Model Support Expansion
- Granite 4 support (transformers>=4.57.0)
- Mamba model support with optional CUDA acceleration (mamba-ssm[causal-conv1d]>=2.2.5)
- Enhanced compatibility through dependency updates
Infrastructure Improvements
- Uncapped NumPy for better forward compatibility
- Minimum Numba version raised to 0.62.0
- Liger kernel pinned to >=0.5.10 for stability
- Updated backend implementations (instructlab-training>=0.12.1, rhai-innovation-mini-trainer>=0.3.0)
What's Changed
- Pinning liger-kernal version by @Fiona-Waters in #9
- Adding min dependencies for Granite 4 / Mamba support by @Maxusmusti in #14
- uncap numpy and raise minimum numba version by @RobotSail in #15
- Adding basic API for memory profiling (src/training_hub/profiling) by @mazam-lab in #11
- feat(traininghub): Use torchrun environment variables for default configuration by @szaher in #13
- Update backend implementation dep versions in pyproject.toml by @Maxusmusti in #19
New Contributors
- @Fiona-Waters made their first contribution in #9
- @mazam-lab made their first contribution in #11
- @szaher made their first contribution in #13
Full Changelog: v0.2.0...v0.3.0
v0.2.0 - GPT-OSS Support
Both SFT and OSFT now support gpt-oss models, alongside new example scripts, documentation updates, and dependency version adjustments.
What's Changed
- Update dependencies, examples, and docs for GPT-OSS by @Maxusmusti in #6
Full Changelog: v0.1.0...v0.2.0
v0.1.0 - SFT, OSFT (Continual Learning), and Examples
This update includes new docs for OSFT, alongside minor bug fixes and doc amendments.
What's Changed
- Adds notebooks for OSFT by @RobotSail in #3
Full Changelog: v0.1.0a3...v0.1.0
v0.1.0 Alpha 3 - OSFT Param/README updates
What's Changed
- update main README to include OSFT by @RobotSail in #2
Full Changelog: v0.1.0a2...v0.1.0a3
v0.1.0 Alpha 2 - OSFT (Continual Learning) Functionality
What's Changed
- Add OSFT implementation through mini-trainer by @RobotSail in #1
New Contributors
- @RobotSail made their first contribution in #1
Full Changelog: v0.1.0a1...v0.1.0a2
v0.1.0 Alpha 1 - Initial Release for Basic SFT Functionality
Cutting the first Training Hub alpha release, available on PyPI!
pip install training-hub
pip install training-hub[cuda]
Full Changelog: https://github.com/Red-Hat-AI-Innovation-Team/training_hub/commits/v0.1.0a1