Releases: bghira/SimpleTuner
v4.0.4 - ramtorch quality improvements, better LTX-2 audio-only training and video validations
What's Changed
- Z-Image example (non-turbo) should use model_flavour=base by @bghira in #2519
- ramtorch: percentage-based offload fix for text encoder moving to CPU and back inadvertently causing device mismatch error by @bghira in #2525
- (#2504) add --gradient_checkpointing_backend=unsloth, default to torch by @bghira in #2521
- bugfix: checkpoint preview page missing validation samples by @bghira in #2522
- [UI] add dataset configuration missing options; allow configuring audio duration for standalone sets by @bghira in #2526
- (#2510) allow mask conditioning_type to work on edit models that require latent conditioning by @bghira in #2520
- adamw_bf16 compatibility with unsloth checkpointing by @bghira in #2530
- unsloth checkpointing: flux2, hv, kv5, ltx2, wan, zim by @bghira in #2533
- ramtorch should disable quantisation and device moving by @bghira in #2532
- enable full ramtorch mode by default for the transformer so flux2 RMSNorm gets offloaded by @bghira in #2531
- fix error when validation is not None by @bghira in #2535
- ltx2: audio-only mode should skip video layers, TREAD, CREPA, and aim for ideal LoRA targets by @bghira in #2534
- ramtorch: fix Gemma3 output corruption by @bghira in #2538
- bypass validation scheduler setup for special models by @bghira in #2539
- torchao: fix int8 weight only quant via pipeline by @bghira in #2540
- add --ramtorch_disable_extensions and --ramtorch_disable_sync_hooks to disable custom features by @bghira in #2541
- prevent double-encoding of captions for audio auto-split dataset by @bghira in #2543
- ui: add audio options for video datasets on models which support a+v by @bghira in #2545
- ui: save gradient checkpointing option by default by @bghira in #2544
- (#2523) validation epoch interval should calculate starting point the same as global step by @bghira in #2546
- (#2524) ui: reduce severity of non-fatal errors by @bghira in #2547
- ramtorch: disable extensions by default for speedup on most systems by @bghira in #2548
- bug: kill complete process tree using psutil, same to how the Stop command works, during Shutdown by @bghira in #2550
- (#2542) add video preview to model card by @bghira in #2551
- LTX-2: allow custom schedules, since validation issue is resolved by @bghira in #2553
- z-image: remove low memory flag which now seems to not be needed by @bghira in #2552
- fix validation error for models like z-image that do not use dynamic shift by @bghira in #2555
- set timestep index to 0 to avoid lookup in scheduler.step by @bghira in #2554
- (#2529) remove LongCat specific phrasing on block swap desc by @bghira in #2556
- (#2527) move checkpointing disk section stuff to advanced subsection by @bghira in #2557
- merge by @bghira in #2549
Full Changelog: v4.0.3...v4.0.4
v4.0.3 - LTX-2 IC-LoRA, Z-Image base flavour, end_step/end_epoch dataset scheduling and GPU health checks
What's Changed
- (#2484) fix use of spread operator on ES6 object with getters by @bghira in #2486
- environment creation wizard size constraints for smaller (1920x1080) viewports under 4k by @bghira in #2487
- (#2479) add TEXT_JSON field type for complex data types in a simple text field input by @bghira in #2488
- use TEXT_JSON field type for TREAD by @bghira in #2489
- (#2480) adjust num_frames automatically by limit instead of throwing error by @bghira in #2490
- (#2475) bypass batch size for eval dataset by @bghira in #2491
- add max_num_samples per-dataset by @bghira in #2492
- (#2477) GPU circuit breaker by @bghira in #2493
- (#2474) surface processing statistics in webui; store count of too_small etc image count in dataset metadata files by @bghira in #2494
- (#2483) validation epoch tracking should simulate dataset scheduling by @bghira in #2495
- (#2274) add end_step / end_epoch scheduling for datasets by @bghira in #2496
- (#2470) multi-aspect input conditioning for kontext, flux2 and qwen edit by @bghira in #2497
- (#1812) i2v validation using image datasets and documentation updates by @bghira in #2499
- ss_tag_frequency should contain only terms in more than 50% of all captions by @bghira in #2500
- mkDocs: move to Zensical instead, and fix the theme by @bghira in #2501
- GPU circuit-breaker should treat thermal events as warning only, and display GPU thermal throttling in UI by @bghira in #2502
- avoid reusing stale job pid by canceling local running jobs at startup by @bghira in #2503
- LTX-2: IC-LoRA training with reference videos by @bghira in #2498
- z image (base) by @bghira in #2505
- (#2509) end-to-end JSON field handling fix for CLI launched training job by @bghira in #2511
- (#2507) eval dataset should have effective_batch_size of 1 by @bghira in #2512
- (#2508) calculate and sum all epoch stats as we receive them instead of incorrectly only counting the prev by @bghira in #2513
- face detection fixes for TrainingSample with PIL fallback by @bghira in #2515
- webui/webhooks: error reporting refactor by @bghira in #2516
- UI event system should rely on SSE manager by @bghira in #2517
- merge by @bghira in #2518
Full Changelog: v4.0.2...v4.0.3
v4.0.2 - audio-only LTX-2 training, HeartMuLa, CUDA 13 for Blackwell
What's Changed
- (#2435) lycoris example for klein 9b by @bghira in #2437
- enhanced ipc event emissions for Accelerate subprocess failures by @bghira in #2436
- refactor model foundation methods into mixin classes by @bghira in #2440
- cleanup some skipped tests, hidden errors by @bghira in #2441
- emit lifecycle event progress to webhooks for extracting captions by @bghira in #2444
- HeartMuLa reimplementation by @bghira in #2442
- show error message when crash occurs due to config parser by @bghira in #2445
- automatically override flow shift instead of erroring when auto flow shift is enabled by @bghira in #2446
- add cuda13 install target by @bghira in #2447
- add cuda13 install instructions to docs, and recommend python3.13 instead of 3.12 by @bghira in #2448
- support flux2 validation preview streaming by @bghira in #2449
- add allow_empty to some fields that need to be unsettable by @bghira in #2450
- store pid when starting job via start_training_job by @bghira in #2451
- twinflow: adversarial loss, doc updates by @bghira in #2453
- suggest agents to use python3.13 by @bghira in #2456
- better validation ux for attention mechanism selection by @bghira in #2455
- cuda-stable, cuda-nightly, and cuda13-stable, cuda13-nightly install targets by @bghira in #2457
- clarify edit vs reference dataset names in qwen edit quickstart by @bghira in #2458
- low disk space detection and script execution action by @bghira in #2460
- Support aws_session_token for S3 backends by @bghira in #2462
- LTX-2: audio-only training by @bghira in #2461
- qwen_image: fixes for TREAD by @bghira in #2459
- killing orphaned child processes by @bghira in #2463
- skip comfyui format conversion for models which support natively by @bghira in #2464
- add --ramtorch_transformer_percent and --ramtorch_text_encoder_percent to treat it more like block swap by @bghira in #2465
- structured error reporting by @bghira in #2466
- resume training directly from s3 storage by @bghira in #2468
- merge by @bghira in #2471
Full Changelog: v4.0.1...v4.0.2
v4.0.1 - klein, scheduled CREPA, and disable_multiline_split for captions with newlines
This release introduces flux2 klein 4b and 9b, a disable_multiline_split option that disables multi-caption splitting on newlines, new options for customizing text encoder layers in FLUX.2 models, enhancements to model metadata, expanded validation strategies using datasets, and detailed CREPA regularization scheduling controls.
Data Loader Options:
- Added the disable_multiline_split option to the dataloader documentation in English (DATALOADER.md), Spanish (DATALOADER.es.md), Portuguese (DATALOADER.pt-BR.md), Hindi (DATALOADER.hi.md), Japanese (DATALOADER.ja.md), and Chinese (DATALOADER.zh.md). This option prevents captions from being split on newlines, which is useful for preserving intentional line breaks. Example configs were updated to include the option; a sketch follows below.
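A minimal dataloader entry illustrating the option; the surrounding keys are typical of a multidatabackend.json entry, and the id and paths are placeholders:

```json
[
  {
    "id": "poetry-captions",
    "type": "local",
    "instance_data_dir": "/data/poetry",
    "caption_strategy": "textfile",
    "disable_multiline_split": true
  }
]
```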
Model Training Options:
- Added the --custom_text_encoder_intermediary_layers option to the Spanish (OPTIONS.es.md) and Hindi (OPTIONS.hi.md) documentation, allowing users to override which hidden-state layers are extracted from the text encoder for FLUX.2 models. Includes format, defaults, usage notes, and warnings about cache invalidation.
- Added the --modelspec_comment option to the Spanish (OPTIONS.es.md) documentation, enabling custom comments to be embedded into model metadata where external viewers can display them. Supports environment-variable substitution and multiple lines. Updated the CLI usage and options reference; both options are sketched below.
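A hedged sketch of the two options in a config.json; the layer-list format for --custom_text_encoder_intermediary_layers is an assumption (see OPTIONS.md for the real format), and the comment text is a placeholder. Note the documented warning: changing the intermediary layers invalidates existing text embed caches.

```json
{
  "model_family": "flux2",
  "custom_text_encoder_intermediary_layers": "[9, 18, 27]",
  "modelspec_comment": "Trained on internal dataset v3\nContact: $MAINTAINER_EMAIL"
}
```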
Validation and Conditioning:
- Documented new validation strategies in Spanish (OPTIONS.es.md): --validation_using_datasets for img2img validation using training dataset images, and --eval_dataset_id for selecting a specific dataset for evaluation. Includes detailed explanations of conditioning modes, dataset types, and how these options interact.
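A brief sketch of pairing the two options, assuming both live in config.json; the dataset id is a placeholder that must match an entry in your dataloader config:

```json
{
  "validation_using_datasets": true,
  "eval_dataset_id": "my-eval-set"
}
```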
CREPA Regularization Scheduling:
- Expanded the documentation for CREPA regularization in Spanish (OPTIONS.es.md) with new options: --crepa_scheduler, --crepa_warmup_steps, --crepa_decay_steps, --crepa_lambda_end, --crepa_power, --crepa_cutoff_step, --crepa_similarity_threshold, --crepa_similarity_ema_decay, and --crepa_threshold_mode. Includes configuration examples and usage notes for advanced scheduling and stopping criteria.
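An illustrative scheduling block using a subset of those options; the numeric values are placeholders chosen only to show the shape of a warmup-then-decay schedule, and the scheduler name and threshold mode values are assumptions (OPTIONS.md lists the accepted values):

```json
{
  "crepa_scheduler": "cosine",
  "crepa_warmup_steps": 500,
  "crepa_decay_steps": 5000,
  "crepa_lambda_end": 0.0,
  "crepa_similarity_threshold": 0.92,
  "crepa_threshold_mode": "stop"
}
```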
v4.0.0 - multi-user, cloud training, webUI overhaul
SimpleTuner v4.0.0 Release Notes
Release Date: January 2026
This is a major release introducing enterprise-grade multi-user features, new model architectures, and significant infrastructure improvements. The diff comprises 354,291 lines across 1,199 files.
Highlights
- 2 New Model Architectures: LTX-Video 2 with audio generation and Wan S2V for speech-to-video
- Enterprise Multi-User Support with organizations, teams, RBAC, OIDC/LDAP SSO, and audit logging
- Job Queue System with priority scheduling, approval workflows, and quota management
- Remote Worker Orchestration for distributed GPU training
- 200+ New API Endpoints with comprehensive authentication
- Light Theme (Windows 98-inspired) and new admin UI
- Context Parallelism support across all transformer models
- 86 New Test Files with 1,000+ new test methods
Table of Contents
- Breaking Changes
- New Model Architectures
- Enterprise Features
- CLI Changes
- API Changes
- Training Improvements
- UI/UX Improvements
- Infrastructure Changes
- Test Coverage
- Migration Guide
Breaking Changes
CLI Entry Point
- Breaking: Main CLI entry point moved from `simpletuner.cli:main` to `st_cli:main`. Update any scripts referencing the old module path.
Docker Image
- Breaking: Base image upgraded to `nvidia/cuda:12.8.1-cudnn-devel-ubuntu24.04` (was 12.4.1 on Ubuntu 22.04)
- Breaking: Container now starts the SimpleTuner server instead of `sleep infinity`
- Breaking: Working directory changed from `/workspace` to `/app`
- New target architecture: `TORCH_CUDA_ARCH_LIST=8.9` (Ada Lovelace)
- SimpleTuner now installed from the git `release` branch instead of PyPI
API Authentication
- Breaking: All API endpoints now require authentication
- Previously open endpoints return `401 Unauthorized` without valid credentials
- Use `/api/auth/login` for session auth, or API keys via the `X-API-Key` header
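A hedged example of hitting an authenticated endpoint with an API key; the host, port, and endpoint path are placeholders for wherever your server listens:

```bash
# assumes an API key was already created via the admin UI or CLI
curl -H "X-API-Key: $SIMPLETUNER_API_KEY" http://localhost:8080/api/queue/stats
```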
Documentation System
- Breaking: Migrated from Sphinx to MkDocs
- Documentation URL changed to https://simpletuner.dev
New Model Architectures
LTX-Video 2 (LTX-2)
The first model in SimpleTuner with native audio-video generation.
- 19B Parameter Transformer (`LTX2VideoTransformer3DModel`)
- Audio Autoencoder (`AutoencoderKLLTX2Audio`) for audio latent processing
- Vocoder (`LTX2Vocoder`) for mel-spectrogram to waveform conversion
- Text Encoder: Gemma3 (12B) via `Gemma3ForConditionalGeneration`
- Latent Channels: 128
- Pipelines: Text-to-Video and Image-to-Video with audio
- Flavours: `dev`, `dev-fp4`, `dev-fp8`, `2.0`
- Block Swap: Up to 47 swappable transformer blocks for memory optimization
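A minimal sketch of selecting this model in a config.json, assuming `ltx2` is the family key; all other required training keys are omitted:

```json
{
  "model_family": "ltx2",
  "model_flavour": "dev"
}
```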
Wan S2V (Speech-to-Video)
Generate video from audio, text, and reference images.
- 14B Parameter Model (`WanS2VTransformer3DModel`)
- Audio Encoding: Wav2Vec2 (`facebook/wav2vec2-large-xlsr-53`)
- Motion Encoder: `WanS2VMotionEncoder` with causal convolutions
- VAE: AutoencoderKLWan (16 latent channels)
- Flavour: `s2v-14b-2.2`
Context Parallelism Support
All transformers now include `_cp_plan` definitions for distributed training:
- ACE-Step, AuraFlow, Chroma, Cosmos, Flux, HiDream
- HunyuanVideo, Kandinsky5Video, LongCat-Image/Video
- LTXVideo, LTX-2, Lumina2, OmniGen, PixArt
- Sana, SanaVideo, SD3, Wan, Z-Image, Z-Image Omni
Enterprise Features
Multi-User Authentication
- Local Authentication: Username/password with secure session management
- OIDC Integration: Connect to external identity providers (Google, Okta, Auth0, etc.)
- LDAP/Active Directory: Enterprise directory integration
- API Keys: Scoped API keys for automation
Role-Based Access Control (RBAC)
- 4 Default Levels: Admin, Lead, Researcher, Viewer
- 17+ Granular Permissions: `admin.approve`, `admin.audit`, `admin.users`, etc.
- Resource Rules: GPU limits, job limits, cost caps using glob patterns
Organizations & Teams
- Hierarchical Structure: Organization → Teams → Users
- Quota Inheritance: Ceiling model with org → team → user quotas
- Member Roles: admin, lead, member per team
Job Queue System
- 5 Priority Levels: Critical, High, Normal, Low, Background
- Fair-Share Scheduling: Optional equal distribution across teams
- Configurable Concurrency: Global, per-user, per-team limits
- Starvation Prevention: Priority boosting for long-waiting jobs
Approval Workflows
- Rule-Based Requirements: Trigger approvals by cost threshold, hardware type, provider
- Request Lifecycle: Pending → Approved/Rejected → Expired
- Bulk Operations: Approve/reject multiple requests at once
- Email Response Integration: Approve via email reply
Quota Management
- Quota Types: Monthly/daily cost, concurrent jobs, jobs per hour/day, local GPUs
- Actions: Block, warn, or require approval when exceeded
- Real-Time Status: Usage tracking with 80% warning threshold
Audit Logging
- Tamper-Evident: Cryptographic hash chaining (HMAC-SHA256)
- Append-Only: Immutable audit trail
- Chain Verification: Detect tampering via integrity checks
- SIEM Integration: Export to Elasticsearch/Splunk via webhooks
- Event Types: Auth, user management, jobs, quotas, security events
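A conceptual sketch of what HMAC-SHA256 hash chaining looks like, shown with openssl; this illustrates the technique only and is not SimpleTuner's actual implementation. The secret, entries, and genesis value are placeholders:

```bash
# each entry's hash covers the previous hash, so editing any earlier
# entry invalidates every hash that follows it
secret="audit-signing-key"
prev="genesis"
for entry in '{"event":"auth.login"}' '{"event":"job.submit"}'; do
  prev=$(printf '%s%s' "$prev" "$entry" \
    | openssl dgst -sha256 -hmac "$secret" -r | cut -d' ' -f1)
  echo "$prev  $entry"
done
```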
Worker Orchestration
- Remote GPU Workers: Register workers via token authentication
- SSE Job Dispatch: Real-time job assignment streaming
- Heartbeat Monitoring: Automatic offline detection
- Orphan Recovery: Retry failed jobs when workers disconnect
Notification System
- Channels: Email (SMTP), Slack, Webhooks
- Event Routing: Per-user preferences by event type
- IMAP Response Handling: Email-based approval workflow
- Delivery History: Track notification delivery status
Circuit Breaker Resilience
- Per-Provider Breakers: Prevent cascading failures
- States: Closed → Open → Half-Open
- Configurable Thresholds: Failure count, timeout, success count
State Backend Options
Pluggable backends for multi-node deployments:
- Redis: Optimal for production (native async)
- PostgreSQL: Row-level locking with connection pooling
- MySQL: aiomysql support
- SQLite: WAL mode for single-node
- Memory: For testing/development
CLI Changes
New Commands
| Command | Description |
|---|---|
| `simpletuner jobs` | Job management (submit, list, cancel, retry, logs, approval) |
| `simpletuner quota` | Quota management (list, create, delete, status) |
| `simpletuner notifications` | Notification channels and preferences |
| `simpletuner backup` | Database backup and restore |
| `simpletuner database` | Database operations (health, verify, vacuum, migrate) |
| `simpletuner metrics` | Monitoring (prometheus, costs, usage, circuit breakers) |
| `simpletuner webhooks` | Webhook management (create, test, history) |
| `simpletuner worker` | Run as worker agent for orchestration |
| `simpletuner auth` | Authentication and user management |
| `simpletuner cloud` | Cloud training management |
| `simpletuner shutdown` | Graceful server shutdown |
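A few illustrative invocations assembled from the subcommand names above; exact argument syntax may differ:

```bash
simpletuner jobs list          # list jobs in the queue
simpletuner quota status       # show current quota usage
simpletuner database health    # check database health
simpletuner backup create      # take a database backup
```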
Auth Subcommands
simpletuner auth setup # Bootstrap first admin
simpletuner auth users list # List users
simpletuner auth users create # Create user
simpletuner auth orgs list # List organizations
simpletuner auth orgs create # Create organization
simpletuner auth audit list # Query audit logs
simpletuner auth audit verify # Verify chain integrity
Server Enhancements
New flags for `simpletuner server`:
- `--host`, `--port`: Bind configuration
- `--ssl`, `--ssl-cert`, `--ssl-key`: SSL support
- `--reload`: Development auto-reload
- `--workers`: Multi-process workers
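For example, a TLS-enabled multi-worker launch; the certificate paths and port are placeholders, and combining the flags this way is an assumption:

```bash
simpletuner server --host 0.0.0.0 --port 8443 \
  --ssl --ssl-cert /etc/ssl/simpletuner.crt --ssl-key /etc/ssl/simpletuner.key \
  --workers 4
```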
Environment Variables
| Variable | Purpose |
|---|---|
| `SIMPLETUNER_SKIP_TORCH` | Fast CLI startup (skip torch imports) |
| `SIMPLETUNER_SSL_ENABLED` | Enable SSL |
| `SIMPLETUNER_API_KEY` | API key for authenticated requests |
| `SIMPLETUNER_ORCHESTRATOR_URL` | Worker orchestrator URL |
| `SIMPLETUNER_WORKER_TOKEN` | Worker authentication token |
API Changes
New Endpoint Categories
- Authentication: `/api/auth/*` (login, logout, API keys, OIDC, LDAP)
- Users: `/api/users/*` (CRUD, levels, permissions, credentials)
- Organizations: `/api/orgs/*` (orgs, teams, members, quotas)
- Approvals: `/api/approvals/*` (rules, requests, bulk operations)
- Queue: `/api/queue/*` (submit, cancel, priority, stats)
- Quotas: `/api/quotas/*` (types, limits, usage)
- Audit: `/api/audit/*` (logs, stats, verification, export)
- Metrics: `/api/metrics/*` (prometheus, health, circuit breakers)
- Backup: `/api/backup/*` (create, restore, delete)
- Database: `/api/database/*` (health, migrations, vacuum)
- Workers: `/api/workers/*` and `/api/admin/workers/*`
- Themes: `/api/themes/*` (list, assets, CSS)
- Webhooks: `/api/webhooks/*` (test, progress)
Authentication Methods
- Session Auth: POST `/api/auth/login` → session cookie
- API Key: `X-API-Key` header
- Worker Token: `X-Worker-Token` header (for workers only)
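A hedged sketch of the session flow; the host, port, credentials, and request body shape are assumptions:

```bash
# log in and store the session cookie (body shape is a guess)
curl -c cookies.txt -X POST http://localhost:8080/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "changeme"}'

# reuse the cookie against a protected endpoint
curl -b cookies.txt http://localhost:8080/api/users/
```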
Statistics
- 200+ new endpoints added
- 179 endpoints now require `get_current_user`
- 217 endpoints use `require_permission(...)`
Training Improvements
Memory Optimizations
- Lazy Optimizer Loading: Deferred imports for TorchAO, BitsAndBytes, Prodigy,...
v3.3.4
What's Changed
- ui: preserving changed value and formDirty states between tab changes by @bghira in #2252
- ui: remove annoying 2px layout shift by @bghira in #2253
- ui: mobile-friendly changes by @bghira in #2254
- ui: add webhook config builder by @bghira in #2256
- cog: stream logs via lightweight http listener by @bghira in #2257
- Implement frames slicing for CREPA video encoders by @kabachuha in #2258
- merge by @bghira in #2271
- Bump version from 3.3.3 to 3.3.4 by @bghira in #2273
Full Changelog: v3.3.3...v3.3.4
v3.3.3 - more memory optimisations
Features
- SDNQ quantisation engine for weights and optimisers
- Musubi block swap expanded to cover auraflow, chroma, longcat-image, lumina2, omnigen, hidream, sana, sd3, and z-image
- Kandinsky5 memory-efficient VAE now used instead of Diffusers' HunyuanVideo implementation (runs on consumer hardware)
- `resolution_frames` bucket strategy for video training, so that a multi-length dataset is possible with just a single config entry (see the sketch after this list)
- WebUI: Training configuration wizard now allows filling in the number of checkpoints to keep
- metadata will be written to the model / LoRA checkpoint for ComfyUI LoRA Auto Trigger Words node to make use of
- OmniGen & Lumina2: TREAD, TwinFlow, and LayerSync
- Qwen Image: experimental tiled attention support that avoids OOM during the attention calculation (disabled by default; for now it must be enabled in code)
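A hedged sketch of a video dataset entry using the new bucket strategy; the key name and value here are assumptions inferred from the feature description, so check DATALOADER.md for the actual schema:

```json
{
  "id": "mixed-length-videos",
  "type": "local",
  "dataset_type": "video",
  "instance_data_dir": "/data/videos",
  "resolution_type": "resolution_frames"
}
```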
Bugfixes
- RamTorch
- Now applies to text encoders properly (incl CLIP)
- Extended to support Conv2D and Embedding layers (eg. SDXL offload)
- Compatibility with Quanto (tested with int2, int4, int8-quanto)
- System memory use reduction by not calculating gradients when `requires_grad=False`
- Text encoder memory not unloading fixed for Qwen Image
- No more quantize_via pipeline error when no quantisation is enabled
- Qwen Image batch size > 1 training fixed (padded)
- ROCm: bypass PyTorch bug for building kernels, enabling full Quanto compatibility (int2, int4, int8, fp8)
What's Changed
- add metadata for ComfyUI-Lora-Auto-Trigger-Words node by @bghira in #2222
- auraflow: implement musubi block swap by @bghira in #2227
- chroma: implement musubi block swap by @bghira in #2228
- longcat image: implement musubi block swap by @bghira in #2230
- modernise lumina2 implementation with TREAD, block swapping, twinflow and layersync by @bghira in #2231
- modernise omnigen implementation with TREAD, block swapping, twinflow and layersync by @bghira in #2232
- pixart: implement musubi block swap by @bghira in #2233
- add qwen-edit-2511 support, and an edit-v2+ flavour which enables 2511 features on 2509 by @bghira in #2223
- hidream: implement musubi block swap by @bghira in #2234
- sana & sanavideo: implement musubi block swap by @bghira in #2235
- sd3: implement musubi block swap by @bghira in #2236
- z-image turbo & omni: implement musubi block swap by @bghira in #2237
- use kandinsky5 optimised VAE with added temporal roll and chunked conv3d by @bghira in #2229
- when preparing model with offload enabled, do not move to accelerator by @bghira in #2238
- docs: document SIMPLETUNER_JOB_ID env var for webhook job_id by @rafstahelin in #2239
- sdnq quant engine by @bghira in #2225
- fix error str vs int comparison by @bghira in #2241
- fix error when quantize_via=pipeline but no_change level was provided by @bghira in #2242
- ramtorch: when using it for text encoders, do not move to gpu by @bghira in #2244
- add resolution_frames bucket strategy for video datasets so that different lengths can exist in one dataset by @bghira in #2240
- add checkpoints total limit to wizard by @bghira in #2243
- qwen image: fix padding for text embeds by @bghira in #2246
- quanto: fix ROCm compiler error for int2-quanto; fix for RamTorch compatibility by @bghira in #2248
- qwen image: tiled attention fallback when we hit OOM by @bghira in #2249
- ramtorch: fix for gradient memory ballooning; fix text encoder application; extend for Conv2D and Embedding offload by @bghira in #2250
- merge by @bghira in #2251
New Contributors
- @rafstahelin made their first contribution in #2239
Full Changelog: v3.3.2...v3.3.3
v3.3.2 - easily optimise memory consumption
Features
- Better diffusion loss tracking when using LayerSync + CREPA
- WebUI easy memory optimisation config for light/medium/aggressive configs
- TUI: `simpletuner configure` is also able to apply optimisation presets to existing configs
Bugfixes
- ComfyUI will now automatically enable v-prediction and ztsnr for relevant checkpoints
- LongCat batched training now works correctly
- LongCat edit fixed
- ControlNet demo dataset repeats boosted
- Chroma indent issue fixed, now trains again
- Example configs fixed, populate in UI correctly
- Example configs no longer use constant LR scheduler with warmup steps incorrectly
- SDXL hidden state buffer arg removed
- TinyGemm device mismatch
- Examples no longer suggest `validation_torch_compile` or the Lion optimiser for video models (both degrade results)
What's Changed
- add pure diffusion loss term pre-augmentation when aux loss is enabled by @bghira in #2201
- switch video training example configs from Lion to AdamW BF16 by @bghira in #2206
- remove validation torch compile option from examples by @bghira in #2207
- (#2175) move scale_shift to _data device by @bghira in #2202
- when example uses lr warmup, use constant_with_warmup by @bghira in #2208
- Fixup crepa states extraction for K5 by @kabachuha in #2209
- fix: remove unsupported hidden_states_buffer from SDXL model_predict by @joeqzzuo in #2213
- fix config syntax by @bghira in #2214
- (#2211) fix Chroma indent issue and resolve validation and training noise by @bghira in #2215
- use repeats of 4 by default on demo CN datasets by @bghira in #2218
- add lycoris example for longcat edit by @bghira in #2217
- longcat image: fix text encoder padding on inputs and initialisation of text processor by @bghira in #2216
- (#1822) add --delete_model_after_load to remove files from disk after they're loaded into memory by @bghira in #2210
- comfyui: ztsnr and vpred compatibility by @bghira in #2220
- easy memory optimisation presets by @bghira in #2221
- merge by @bghira in #2219
New Contributors
- @kabachuha made their first contribution in #2209
- @joeqzzuo made their first contribution in #2213
Full Changelog: v3.3.1...v3.3.2
v3.3.1
What's Changed
- flux2: do not bypass the special model loader by @bghira in #2170
- (#2030) scheduled dataset sampling by @bghira in #2167
- GLANCE: better code example by @bghira in #2171
- TwinFlow: do not initialise neg time embed when disabled by @bghira in #2174
- UI (datasets): remove ControlNet conditioning option from selections when CN is disabled; select reference_strict by default otherwise by @bghira in #2177
- add missing LayerSync support to kandinsky5 video by @bghira in #2179
- qwen-edit: fix text embed cache generation with image context; disable image embeddings for multi-conditioning input by @bghira in #2176
- chroma 4d text embed fix by @bghira in #2181
- ensure edit-v2 either uses 1:1 or 0 image embeds by @bghira in #2186
- upload zip: preserve subdirs by @bghira in #2189
- allow `simpletuner server env=...` to auto-start training after webUI launches by @bghira in #2191
- add more indicators to dataset page when conditioning parameters are not set by @bghira in #2192
- Git-based configuration sync across SimpleTuner nodes (wip) by @bghira in #2172
- Z-Image-Omni with optional SigLIP conditioning support, TREAD, LayerSync, CFG layer skip, fp16 clamping, and TwinFlow by @bghira in #2183
- (#2182) add --peft_lora_target_modules for arbitrary layer definition by @bghira in #2193
- (#2190) add webUI onboarding config to "simpletuner configure" by @bghira in #2194
- merge by @bghira in #2196
- (#2173) remove early check for CREPA since we are using LayerSync features with certain configs by @bghira in #2195
- (#2187) better image resizing for validation inputs when validation resolution != training resolution by @bghira in #2197
- adjust default resolution on dataset page to equal --resolution, and ensure min/max/target down sample size are equal by @bghira in #2198
- merge by @bghira in #2199
Full Changelog: v3.3.0...v3.3.1
v3.3.0 - TwinFlow, LayerSync, and Flux.2 edit training
Features
- TwinFlow, a distillation method that works on most flow-matching architectures and converges in far less time than typical distillation
- LayerSync, a self-regularisation method for practically all transformer models supported in SimpleTuner
- CREPA can combine forces with LayerSync to self-regulate instead of using DINO features
- Flux.2 can now accept conditioning datasets
- Custom flow-matching timesteps can be provided for training, allowing configuration of "Glance" style training runs
- WebUI: better path handling for datasets, sensible defaults will be set instead of requiring the user to figure it out
- CLI: When configuring dataset cache directories, you can now use `{id}` and `{output_dir}` in addition to `{model_family}` to build dynamic paths that adjust automatically based on these attributes (see the sketch below)
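A short sketch of templated cache paths in a dataloader entry; `cache_dir_vae` is assumed here to be the key for the dataset's VAE cache path, and the directory layout is illustrative:

```json
{
  "id": "portraits",
  "type": "local",
  "instance_data_dir": "/data/portraits",
  "cache_dir_vae": "{output_dir}/cache/vae/{model_family}/{id}"
}
```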
Bugfixes
- WebUI: resolved a search box race condition that prevented items from highlighting or subsections from expanding
What's Changed
- TwinFlow self-directed distillation by @bghira in #2159
- (#2136) add --flow_custom_timesteps with Glance "distillation" example by @bghira in #2160
- flux2: adjust comfyUI lora export format to use their custom keys instead of generic LoRA layout by @bghira in #2162
- [webUI] refactoring validation and default paths for text embed and VAE caches by @bghira in #2163
- flux2: support conditioning datasets by @bghira in #2164
- fix search box race condition that prevented expanding subsection or highlighting results by @bghira in #2165
- LayerSync + CREPA adaptation by @bghira in #2161
- merge by @bghira in #2166
Full Changelog: v3.2.3...v3.3.0