TorchRL v0.11.0 Release Notes
Highlights
- Dreamer overhaul - Comprehensive improvements to Dreamer world model training: async collectors with profiling, RSSM fixes (scan mode, noise injection, explicit dimensions), torch.compile compatibility for value functions and TDLambda estimator, optimized DreamerEnv to avoid CUDA syncs, and updated sota-implementation with better configs. @vmoens
- Weight synchronization schemes - New modular weight sync infrastructure (`torchrl.weight_update`) with SharedMem, MultiProcess, and vLLM-specific (NCCL, double-buffer) schemes. Collectors now integrate seamlessly with weight sync schemes for distributed training. @vmoens
- Major collector refactor - The collector codebase has been completely restructured. The monolithic `collectors.py` is now split into focused modules (`_single.py`, `_multi_base.py`, `_multi_sync.py`, `_multi_async.py`, `_runner.py`, `base.py`), with cleaner separation of concerns. (#3233) @vmoens
- LLM objectives: DAPO & CISPO - New DAPO (Direct Advantage Policy Optimization) and CISPO (Clipped Importance Sampling Policy Optimization) algorithms for LLM training. @vmoens
- Trainer infrastructure - New SAC Trainer, configuration system for algorithms, timing utilities, and async collection support within trainers. @vmoens
- Tool services - New tool service infrastructure for LLM agents with Python executor, MCP tools, and web search capabilities. @vmoens
- Deprecated APIs removed - All deprecation warnings from v0.10 have been promoted to hard errors for v0.11. (#3369) @vmoens
- New environment backends - Added Procgen environments support with a new `ProcgenEnv` wrapper. (#3331) @ParamThakkar123
- Multi-env execution - GymEnv, BraxEnv, and DMControlEnv now support a `num_envs`/`num_workers` parameter to run multiple environments in a single call via `ParallelEnv`. (#3343, #3370, #3337) @ParamThakkar123
Installation
```
pip install torchrl==0.11.0
```
Breaking Changes
- [v0.11] Remove deprecated features and replace warnings with errors (#3369) @vmoens
  - Removes deprecated `KLRewardTransform` from `transforms/llm.py` (use `torchrl.envs.llm.KLRewardTransform`)
  - Removes `LogReward` and `Recorder` classes from trainers (use `LogScalar` and `LogValidationReward`)
  - Removes `unbatched_*_spec` properties from VmasWrapper/VmasEnv (use `full_*_spec_unbatched`)
  - Deletes deprecated `rlhf.py` modules (`data/rlhf.py`, `envs/transforms/rlhf.py`, `modules/models/rlhf.py`)
  - Removes `replay_buffer_chunk` parameter from MultiCollector
  - Replaces `minimum`/`maximum` deprecation warnings with `TypeError` in Bounded spec
  - Replaces `critic_coef`/`entropy_coef` deprecation warnings with `TypeError` in PPO and A2C losses
- [Major] Major refactoring of collectors (#3233) @vmoens
  - Splits the 5000+ line `collectors.py` into focused modules for single/multi sync/async collectors
  - Creates new `_constants.py`, `_runner.py`, `base.py` modules
  - Introduces cleaner weight synchronization scheme integration
  - Improves test coverage for multi-device and shared-device weight updates
  - Some internal APIs have changed; the external API remains compatible
Dreamer World Model Improvements
These changes significantly improve Dreamer training performance, torch.compile compatibility, and usability. @vmoens
- [Feature] Refactor Dreamer training with async collectors, profiling, and improved config (cc917ba)
  - Major overhaul of the Dreamer sota-implementation with async data collection
  - Adds profiling support for performance analysis
  - Improved configuration with better defaults and documentation
  - Updated README with detailed usage instructions
- [Refactor] Dreamer implementation updates (3ab4b30)
  - Refactors Dreamer training script for better maintainability
  - Updates config.yaml with improved hyperparameters
  - Enhances dreamer_utils.py with additional helper functions
- [Feature] Add noise argument and scan mode to RSSMRollout (8653b6e)
  - Adds a `noise` argument to control stochastic sampling during rollout
  - Implements scan mode for efficient sequential processing
  - 109 lines added to model_based.py for improved RSSM flexibility
- [Feature] Add explicit dimensions and device support to RSSM modules (f350fe0)
  - Adds explicit dimension handling for batch, time, and feature dims
  - Improves device placement for RSSM components
- [Feature] Logging & RSSM fixes (d082979)
  - Fixes RSSM module behavior and adds logging improvements
  - Updates batched_envs.py and the wandb logger
- [Refactor] Use compile-aware helpers in Dreamer objectives (d8c3887)
  - Updates Dreamer objectives to use torch.compile-compatible helpers
  - Improves performance when using torch.compile
- [BugFix] Optimize DreamerEnv to avoid CUDA sync in done checks (d57fdec)
  - Eliminates unnecessary CUDA synchronizations in done flag checking
  - Significant performance improvement for GPU-based Dreamer training
- [BugFix] Fix ModelBasedEnvBase for torch.compile compatibility (7ff663d)
  - Makes ModelBasedEnvBase compatible with torch.compile
- [Feature] Add allow_done_after_reset parameter to ModelBasedEnvBase (24ae042)
  - Adds flexibility for environments that may signal done immediately after reset
- [BugFix] Final polish for Dreamer utils and Collector tests (5aefdd5)
  - Final cleanup and polish for the Dreamer implementation
Weight Synchronization Schemes
New modular infrastructure for weight synchronization between training and inference workers. @vmoens
- [Feature] Weight Synchronization Schemes - Core Infrastructure (e8f6fa5)
  - New `torchrl.weight_update` module with 2400+ lines of weight sync infrastructure
  - `SharedMemWeightSyncScheme`: uses shared memory for fast intra-node sync
  - `MultiProcessWeightSyncScheme`: uses multiprocessing queues for cross-process sync
  - Comprehensive documentation and examples in docs/source/reference/collectors.rst
  - See the usage sketch at the end of this section
- [Feature] vLLM Weight Synchronization Schemes (d0c8b7e)
  - `VllmNCCLWeightSyncScheme`: NCCL-based weight sync for vLLM distributed inference
  - `VllmDoubleBufferWeightSyncScheme`: double-buffered async weight updates
  - 1876 lines of vLLM-specific weight sync code
- [Feature] Collectors - Weight Sync Scheme Integration (a7707ca)
  - Integrates weight sync schemes into collector infrastructure
  - Updates GRPO and expert-iteration implementations to use the new schemes
  - Adds examples for multi-weight update patterns
- [Refactor] Weight sync schemes refactor (ae0ae06)
  - Refines the weight sync API and adds additional schemes
  - Improves test coverage with 342+ new test lines
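As a rough illustration of how these pieces fit together, here is a minimal sketch pairing a shared-memory weight sync scheme with a multiprocess collector. The class names come from the notes above; the `weight_sync_schemes` keyword and its `{"policy": scheme}` mapping are assumptions rather than confirmed API, so consult the collectors documentation for the exact signature.

```python
# Hypothetical sketch only: the `weight_sync_schemes` keyword and its
# {"policy": scheme} mapping are assumptions, not confirmed API.
import torch.nn as nn
from tensordict.nn import TensorDictModule
from torchrl.collectors import MultiSyncDataCollector
from torchrl.envs import GymEnv
from torchrl.weight_update import SharedMemWeightSyncScheme

# Toy policy for Pendulum-v1 (3-dim observation, 1-dim action)
policy = TensorDictModule(nn.Linear(3, 1), in_keys=["observation"], out_keys=["action"])

collector = MultiSyncDataCollector(
    [lambda: GymEnv("Pendulum-v1")] * 2,
    policy,
    frames_per_batch=64,
    total_frames=1_024,
    weight_sync_schemes={"policy": SharedMemWeightSyncScheme()},  # assumed kwarg
)
for batch in collector:
    ...  # optimize the policy on `batch`
    collector.update_policy_weights_()  # push fresh weights to the workers
collector.shutdown()
```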
LLM Training: DAPO & CISPO
New policy optimization algorithms for LLM fine-tuning. @vmoens
- [Feature] DAPO (9d5c276)
  - Implements Direct Advantage Policy Optimization for LLM training
  - Adds DAPO-specific loss computation to torchrl/objectives/llm/grpo.py
- [Feature] CISPO (ed0d8dc)
  - Implements Clipped Importance Sampling Policy Optimization
  - Alternative to PPO/GRPO with a different clipping strategy
- [Refactor] Refactor GRPO as a separate class (2bc3cb7)
  - Separates the GRPO implementation for better modularity
Trainer Infrastructure
New trainer algorithms, configuration system, and utilities. @vmoens
- [Trainers] SAC Trainer and algorithms (02d4bfd)
  - New SAC Trainer implementation with 786 lines of code
  - Complete sota-implementation in sota-implementations/sac_trainer/
  - Trainer configuration via YAML files
- [Feature] Trainer Algorithms - Configuration System (6bc201a)
  - New configuration system in torchrl/trainers/algorithms/configs/
  - Configs for collectors, data, modules, objectives, transforms, and weight sync schemes
  - Enables hydra-style configuration composition
- [Feature] Trainer Infrastructure - Timing and Utilities (dc21523)
  - Adds timing utilities to the trainer infrastructure
  - 263 lines of enhanced trainer functionality
- [Feature] Async collection within trainers (5f1eb2c)
  - Enables asynchronous data collection during training
  - Improves training throughput
- [Feature] PPO Trainer Updates (129f3d5)
  - Updates the PPO trainer with new features
Tool Services for LLM Agents
New infrastructure for tool-augmented LLM agents. @vmoens
- [Feature] Tool services (9ca0e40)
  - New `torchrl/services/` module for tool execution
  - Python executor service for safe code execution
  - MCP (Model Context Protocol) tool integration
  - Web search tool example
  - 609 lines of documentation in docs/source/reference/services.rst
  - Comprehensive test coverage in test/test_services.py
- [Feature] Transform Module - ModuleTransform and Ray Service Refactor (7b85c71)
  - New `ModuleTransform` for applying nn.Modules as transforms
  - Refactored Ray service integration
  - Moves `ray_service.py` to torchrl/envs/transforms/
torch.compile Compatibility
Fixes to enable torch.compile with various TorchRL components. @vmoens
- [BugFix] Fix value functions for torch.compile compatibility (3bdc7b1)
  - Fixes value function implementations for compile compatibility
  - 295 lines of new tests in test/compile/test_value.py
  - See the sketch at the end of this section
- [BugFix] Fix TDLambdaEstimator for torch.compile compatibility (11e22ee)
  - Updates TDLambdaEstimator to work with torch.compile
- [Feature] Add compile-aware timing and profiling helpers (764d9f7)
  - New utilities for profiling compiled code
- [Refactor] Add compile-aware timing and profiling helpers (c82447e)
  - Performance utilities that work correctly under torch.compile
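For context, the kind of usage these fixes target is compiling a value estimator's forward pass end to end. The snippet below is an illustrative sketch with a toy value network, not code from the release; whether compilation pays off depends on your model and batch sizes.

```python
# Illustrative sketch: wrapping a TDLambdaEstimator with torch.compile.
# The toy value network and shapes are placeholders.
import torch
import torch.nn as nn
from tensordict.nn import TensorDictModule
from torchrl.objectives.value import TDLambdaEstimator

value_net = TensorDictModule(
    nn.Linear(4, 1), in_keys=["observation"], out_keys=["state_value"]
)
estimator = TDLambdaEstimator(gamma=0.99, lmbda=0.95, value_network=value_net)

# With the fixes above, the estimator should trace without graph breaks.
compiled_estimator = torch.compile(estimator)
```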
Features
- [Feature] Added EXP3 Scoring function (#3013) @ParamThakkar123
  - Implements the EXP3 (Exponential-weight algorithm for Exploration and Exploitation) scoring function for MCTS
  - Adds a `torchrl/data/map/` module with hash-based storage, query utilities, and tree structures for efficient state lookups
- [Feature] Add num_workers parameter to BraxEnv (#3370) @ParamThakkar123
  - Allows running multiple Brax environments in parallel via `ParallelEnv` by specifying `num_workers > 1`
  - Consistent API with GymEnv and DMControlEnv multi-env support
- [Feature] Added num_envs parameter in GymEnv (#3343) @ParamThakkar123
  - When `num_envs > 1`, GymEnv automatically returns a lazy `ParallelEnv` wrapping multiple environment instances
  - Simplifies multi-env setup without manually constructing ParallelEnv (see the sketch at the end of this section)
- [Environments] Added Procgen environments (#3331) @ParamThakkar123
  - New `ProcgenWrapper` and `ProcgenEnv` classes to wrap OpenAI Procgen environments
  - Converts Procgen observations to TorchRL-compliant TensorDict outputs
  - Supports all 16 Procgen game environments (coinrun, starpilot, etc.)
- [Feature] Loss make_value_estimator takes a ValueEstimatorBase class (#3336) @ParamThakkar123
  - Allows passing a `ValueEstimatorBase` class directly to `make_value_estimator()` for more flexible value estimation configuration (see the sketch at the end of this section)
  - Adds tests and documentation for the new API
- [Feature] Make custom_range public in ActionDiscretizer (#3333) @vmoens
  - Exposes the `custom_range` parameter in ActionDiscretizer for user-defined discretization ranges
- [Feature] Added num_envs parameter in DMControlEnv (#3337) @ParamThakkar123
  - Similar to GymEnv, allows running multiple DMControl environments via the `num_workers` parameter
  - Returns a lazy `ParallelEnv` when `num_workers > 1`
- [Feature] Auto-configure exploration module specs from environment (#3317) @bsprenger
  - Collectors now automatically configure exploration module specs (like `AdditiveGaussianModule`) from the environment's action spec
  - Adds a `set_exploration_modules_spec_from_env()` utility function
  - Reduces boilerplate when using exploration modules with delayed spec initialization
- [Feature] Add NPU Support for Single Agent (#3229) @lowdy1
  - Adds Huawei NPU (Ascend) device support for single-agent training
  - Updates device handling logic to recognize NPU devices
- [Feature] Add support for `trackio` (#3196) @Xmaster6y
  - Integrates with trackio for experiment tracking and logging
  - New `TrackioLogger` class for seamless integration
- [Feature] Add a new Trainer hook point `process_loss` (#3259) @Xmaster6y
  - Adds a new hook point in the Trainer that runs after loss computation but before the backward pass
  - Useful for loss modification, logging, or gradient accumulation strategies
- [Feature] Ensure MultiSyncDataCollectors returns data ordered by worker id (#3243) @LCarmi
  - Data from MultiSyncDataCollector is now consistently ordered by worker ID
  - Makes debugging and analysis easier when working with multi-worker collectors
- [Feature] Add TanhModuleConfig (#3255) @bsprenger
  - Adds a configuration class for TanhModule to support serialization and hydra-style configs
- [Feature] add TensorDictSequentialConfig (#3248) @bsprenger
  - Adds a configuration class for TensorDictSequential modules
- [Feature] Weight loss outputs when using prioritized sampler (#3235) @vmoens
  - Loss modules now properly weight outputs when using prioritized replay buffers
  - Adds an `importance_key` parameter to loss functions (DQN, DDPG, SAC, TD3, TD3+BC)
  - When importance weights are present in the sampled data, losses are automatically weighted
  - Adds comprehensive tests for prioritized replay buffer integration with all loss functions
- [Feature] Composite specs can create named tensors with 'zero' and 'rand' (#3214) @louisfaury
  - Composite specs now propagate dimension names when creating tensors via `zero()` and `rand()`
- [Feature] Enable storing rollouts on a different device (#3199) @Xmaster6y
  - Collectors can now store rollouts on a different device than the execution device
  - Useful for CPU storage while running on GPU
- [Feature] Named dims in Composite (#3174) @vmoens
  - Adds named dimension support to Composite specs via a `names` property and a `refine_names()` method (see the sketch at the end of this section)
  - Enables better integration with PyTorch's named tensors
  - Supports ellipsis notation for partial name specification
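A minimal sketch of the multi-env construction path for GymEnv described earlier in this list. Note that #3354 (under Refactors below) renames `num_envs` to `num_workers`, so the released keyword is assumed to be `num_workers`; the environment name is just an example.

```python
# Sketch: GymEnv with multiple workers returns a lazy ParallelEnv under the hood.
# The keyword is assumed to be `num_workers` after the rename in #3354.
from torchrl.envs import GymEnv

env = GymEnv("Pendulum-v1", num_workers=4)
td = env.rollout(10)
print(td.batch_size)  # expected: torch.Size([4, 10])
env.close()
```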
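A sketch of the more flexible `make_value_estimator()` call described earlier: instead of a `ValueEstimators` enum member, an estimator class can be passed directly. The toy Q-network is illustrative, and the assumption that hyperparameters are forwarded as keyword arguments is ours.

```python
# Sketch: passing a ValueEstimatorBase subclass directly to make_value_estimator().
# The toy Q-network and the forwarded hyperparameters are assumptions.
import torch.nn as nn
from torchrl.modules import QValueActor
from torchrl.objectives import DQNLoss
from torchrl.objectives.value import TDLambdaEstimator

qnet = QValueActor(nn.Linear(4, 2), in_keys=["observation"], action_space="categorical")
loss = DQNLoss(qnet, action_space="categorical")

# Previously: loss.make_value_estimator(ValueEstimators.TDLambda, gamma=0.99, lmbda=0.95)
loss.make_value_estimator(TDLambdaEstimator, gamma=0.99, lmbda=0.95)
```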
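A sketch of what the named-dims support in Composite might look like in practice, combining #3174 and #3214. The exact call pattern (naming the batch dimension through `refine_names()` and having `zero()` propagate the names) is our reading of the notes, not verified API.

```python
# Sketch of named batch dims on a Composite spec; the exact API is assumed.
from torchrl.data import Composite, Unbounded

spec = Composite(
    observation=Unbounded(shape=(8, 3)),
    shape=(8,),            # one batch dimension shared by all entries
)
spec = spec.refine_names("agents")  # name the leading batch dimension (per #3174)
td = spec.zero()                    # per #3214, names should propagate to the result
print(spec.names)
```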
Bug Fixes
- [BugFix] Fix TransformersWrapper ChatHistory.full not being set (#3375) @vmoens
- [BugFix] Fix AsyncEnvPool + LLMCollector with yield_completed_trajectories (#3373) @vmoens
  - Fixes interaction between AsyncEnvPool and LLMCollector when yielding completed trajectories
- [BugFix] Fix LLM test failures (#3360) @vmoens
  - Comprehensive fixes for LLM-related test failures across multiple modules
- [BugFix] Fixed MultiSyncCollector set_seed and split_trajs issue (#3352) @ParamThakkar123
  - Fixes seed propagation and trajectory splitting in MultiSyncDataCollector
- [BugFix] Fix VecNormV2 device gathering (#3368) @vmoens
  - Fixes device handling when gathering statistics in the VecNormV2 transform
- [BugFix] Fix AsyncEnv - LLMCollector integration (#3365) @vmoens
  - Resolves integration issues between async environments and LLM collectors
- [BugFix] Fix VecNormV2 GPU device handling for stateful mode (#3364) @BY571
  - Fixes GPU device handling in VecNormV2 when using stateful normalization mode
- [BugFix] Fix Ray collector iterator bug and process group cleanup (#3363) @vmoens
  - Fixes iterator behavior in Ray-based collectors and ensures proper cleanup of process groups
- [BugFix] Fix LLM CI by replacing uvx with uv package (#3356) @vmoens
- [Bugfix] Wrong minari download first element (#3106) @marcosgalleterobbva
  - Fixes incorrect first-element handling when downloading Minari datasets
- [BugFix] Fixed ParallelEnv + aSyncDataCollector / MultiSyncDataCollector not working if replay_buffer is given (#3341) @ParamThakkar123
  - Fixes compatibility issue when using ParallelEnv with collectors that have a replay buffer attached
- [BugFix] treat 1-D MultiDiscrete as MultiCategorical and accept flattened masks (#3342) @ParamThakkar123
  - Improves MultiDiscrete action space handling to accept flattened action masks
- [BugFix] Fixes register_save_hook bug (#3340) @ParamThakkar123
  - Fixes save hook registration in replay buffers
- [BugFix,Test] recompiles with string failure (#3338) @vmoens
- [BugFix] Fix Safe modules annotation and doc for losses (#3334) @vmoens
- [BugFix] Fix SACLoss target_entropy="auto" ignoring action space dimensionality (#3292) @vmoens
  - Fixes automatic entropy target computation to properly account for action space dimensions
- [BugFix] Fix agent_dim in multiagent nets & account for neg dims (#3290) @vmoens
  - Fixes agent dimension handling in multi-agent networks, including support for negative dimension indices
- [BugFix] Added a missing .to(device) call in _from_transformers_generate_history (#3289) @michalgregor
- [BugFix] Fix old pytorch dependencies (#3266) @vmoens
  - Updates compatibility shims for older PyTorch versions
- [BugFix] RSSMRollout not advancing state/belief across time steps in Dreamer (#3236) @cmdout
  - Fixes critical bug where the RSSM rollout was not properly advancing hidden states in the Dreamer world model
- [BugFix] use correct field names in InitTrackerConfig (#3245) @bsprenger
- [BugFix] Use torch.zeros for argument in torch.where (#3239) @sebimarkgraf
- [BugFix] Fix wrong assertion about collector and buffer (#3176) @vmoens
- [BugFix] AttributeError in accept_remote_rref_udf_invocation (#3168) @vmoens
Bug Fixes (ghstack commits without PR numbers)
- [BugFix] Final polish for Collector and Exploration (558396d) @vmoens
  - Final fixes for collector and exploration module interactions
- [BugFix] Collector robustness & Async fixes (cadf23b) @vmoens
  - Improves collector robustness and fixes async collection issues
- [BugFix] TensorStorage key filtering (0667a58) @vmoens
  - Fixes key filtering in TensorStorage
- [BugFix] Replay Buffer prefetch & SliceSampler (9d34dbe) @vmoens
  - Fixes prefetching behavior and SliceSampler issues
- [BugFix] Fix target_entropy computation for composite action specs (06960c6) @vmoens
  - Correctly computes target entropy for composite action spaces
- [BugFix] Fix device initialization in CrossQLoss.maybe_init_target_entropy (416e454) @vmoens
  - Fixes device placement for CrossQ loss entropy initialization
- [BugFix] Fix schemes and refactor collectors to make them readable (364e038) @vmoens
  - Fixes weight sync schemes and improves collector code readability
- [BugFix] Fix collector devices (888095f) @vmoens
  - Fixes device handling in collectors
- [BugFix] Handle Lock/RLock types in EnvCreator (6985ca2) @vmoens
  - Properly handles threading locks in EnvCreator serialization
- [BugFix] Defer filter_warnings import to avoid module load issues (8ea954c) @vmoens
- [BugFix] Fix CUDA sync in forked subprocess (4a2a274) @vmoens
  - Fixes CUDA synchronization issues in forked processes
- [BugFix] Add pybind11 check and Windows extension pattern fix (5e89e4e) @vmoens
Additional Features (ghstack commits without PR numbers)
- [Feature] Storage Shared Initialization for Multiprocessing (61c178e) @vmoens
  - Enables shared storage initialization for multiprocessing collectors
  - Updates distributed collector examples
- [Feature] Memmap storage cleanup (7f9ea74) @vmoens
  - Improves memmap storage cleanup and resource management
- [Feature] Collector Profiling (02ed47e) @vmoens
  - Adds profiling capabilities to collectors for performance analysis
- [Feature] auto_wrap_envs in PEnv (d781f9e) @vmoens
  - Automatic environment wrapping in ParallelEnv
- [Feature] Auto-wrap lambda functions with EnvCreator for spawn compatibility (d250c18) @vmoens
  - Automatically wraps lambda functions with EnvCreator for spawn multiprocessing compatibility
- [Feature] Collectors' getattr_policy and getattr_env (fbdbb61) @vmoens
  - Adds attribute access methods for policy and env in collectors
- [Feature] track_policy_version in collectors.py (a089cc4) @vmoens
  - Adds policy version tracking in collectors for distributed training
- [Feature] Aggregation strategies (0e6a356) @vmoens
  - New aggregation strategies for multi-worker data collection
- [Feature] kl_mask_threshold (7ab48a4) @vmoens
  - Adds a KL divergence masking threshold for LLM training
- [Feature] Add timing options (13434eb) @vmoens
  - Additional timing options for performance monitoring
- [Feature] float32 patch (01d2801) @vmoens
  - Float32 precision handling improvements
- [Feature] Support callable scale in IndependentNormal and TanhNormal distributions (d3aba7d) @vmoens
  - Allows the scale parameter to be a callable in distribution modules
Documentation
- [Doc] Huge doc refactoring (3d5dd1a) @vmoens
  - Major documentation restructuring with new sections: collectors_basics.rst, collectors_single.rst, collectors_distributed.rst, collectors_weightsync.rst, collectors_replay.rst, data_datasets.rst, data_replaybuffers.rst, data_samplers.rst
  - Adds a pre-commit hook for Sphinx section underline checking
- [Docs] Update LLM_TEST_ISSUES.md with fix status (#3374) @vmoens
- [Docs] Enable doc builds and tutorial runs (#3335) @vmoens
  - Re-enables Sphinx-gallery tutorial execution in documentation builds
  - Adds process cleanup between tutorials to prevent resource leaks
CI / Infrastructure
- [CI] Speed up slow tests in tests-gpu/tests-cpu (#3395) @vmoens
  - Optimizes slow test execution with better parallelization and test isolation
- [CI] Better release workflow (#3386) @vmoens
  - Comprehensive release workflow with sanity checks, wheel collection, docs updates, and PyPI publishing
  - Adds a dry-run mode for testing releases
  - Automatically updates the stable docs symlink and versions.html
- [CI] Auto-tag PRs (#3381) @vmoens
  - Adds automatic labeling for PRs based on changed files
- [CI] Add release agent prompt for LLM-assisted releases (#3380) @vmoens
  - Adds comprehensive release instructions for LLM-assisted release automation
- [CI] Add release workflow with PyPI trusted publishing (#3379) @vmoens
  - Implements PyPI trusted publishing via OIDC authentication
  - Eliminates the need for PyPI API tokens in secrets
- [CI] Add granular label support for environment and data workflow jobs (#3371) @vmoens
- [CI] Fix PettingZoo CI by updating Python 3.9 to 3.10 (#3362) @vmoens
- [CI] Fix GenDGRL CI by adding missing requests dependency (#3361) @vmoens
- [CI] Fix IsaacLab tests by using explicit conda Python path (#3358) @vmoens
- [CI] Fix Windows build for free-threaded Python (3.13t, 3.14t) (#3357) @vmoens
- [CI,BugFix] Fix Habitat CI by upgrading to Python 3.10 (#3346) @vmoens
  - Upgrades Habitat CI to Python 3.10 and builds habitat-sim from source
- [CI] Fix Jumanji CI by adding missing requests dependency (#3349) @vmoens
- [CI] Fix Chess CI by correcting test file path typo (#3353) @vmoens
- [CI] Skip Windows-incompatible tests in optional deps CI (#3348) @vmoens
- [CI] Use pip install (#3200) @vmoens
  - Migrates CI workflows from setup.py install to pip install
CI / Infrastructure (ghstack commits without PR numbers)
- [Setup] Python 3.14 in, python 3.9 out (ff86ab7) @vmoens
  - Adds Python 3.14 support and removes Python 3.9
- [CI] LLM tests integration (cccfaa6) @vmoens
  - Integrates LLM tests into the CI pipeline
- [CI] Fix SOTA runs (57000fcc) @vmoens
- [CI] Add --upgrade flag for torch installs and install ffmpeg/xvfb (ce99319) @vmoens
Tests
- [Tests] Check traj_ids shape with unbatched envs (#3393) @vmoens
- [Tests] Fix test isolation in test_set_gym_environments and related tests (#3382) @vmoens
  - Improves test isolation to prevent cross-test contamination
- [Test] Use mock Llama tokenizer instead of skipping gated model test (#3376) @vmoens
- [Test] Skip Llama test when tokenizer unavailable instead of xfail (#3372) @vmoens
- [Tests] Remove _utils_internal.py in tests (#3281) @vmoens
  - Removes deprecated internal test utilities and updates tests to use public APIs
- [Test,Benchmark] Move mocking classes file and bench for non-tensor env (#3257) @vmoens
- [Tests] Fix vmas seeding test (#3210) @matteobettini
- [Tests] Reintroduce VMAS football (#3178) @matteobettini
Tests (ghstack commits without PR numbers)
- [Test] Fix test_num_threads that instantiate the env in the main process (852dd61) @vmoens
- [Test] Use class references instead of lambdas in transform tests (7dd1e61) @vmoens
- [Test] Add test for lambda wrapping in ParallelEnv (4b2b227) @vmoens
- [Test] Ensure collector shutdown with try/finally (aa2b031) @vmoens
- [Test] Add filterwarnings for unclosed resources (b493ab3) @vmoens
- [Test] Add retry decorator to flaky vecnorm test (6d908b6) @vmoens
- [Test] Simplify num_threads test to check threads in env factory (295cc20) @vmoens
- [Test] Ignore script_method deprecation warning (eaaf97a) @vmoens
- [Test] Remove the forked decorator of the distributed checks (4db5966) @vmoens
Refactors / Maintenance
- [Refactor] Updated num_envs parameter to num_workers for consistency (#3354) @ParamThakkar123
  - Renames `num_envs` to `num_workers` for consistency across environment wrappers
- [Quality] replace reduce to reduction, better error message for invalid mask (#3179) @Kayzwer
Refactors (ghstack commits without PR numbers)
- [Refactor] Move WEIGHT_SYNC_TIMEOUT to collectors._constants (3ceb6b9) @vmoens
  - Centralizes weight sync timeout configuration
- [Refactor] Remove TensorSpec classes (7ea8d86) @vmoens
  - Removes legacy TensorSpec class aliases
- [Refactor] Rename collectors (8779f2a) @vmoens
  - Renames collector classes for consistency
- [Refactor] Non-daemonic processes in PEnv (aeb2e9b) @vmoens
  - Changes ParallelEnv to use non-daemonic processes
- [Refactor] Make env creator optional for Ray (b599d9b) @vmoens
  - Makes the environment creator optional in Ray collectors
- [Refactor] Move decorate_thread_sub_func to torchrl.testing.mp_helpers (e3e9e6a) @vmoens
  - Moves multiprocessing test helpers to their proper location
- [Refactor] Use non_blocking transfers in distribution modules (5c75777) @vmoens
  - Uses non-blocking GPU transfers in distribution modules for better performance
- [Refactor] Remove MUJOCO_EGL_DEVICE_ID auto-setting from torchrl init (8fac4c5) @vmoens
- [Refactor] Add WEIGHT_SYNC_TIMEOUT constant for collector weight synchronization (ab3768a) @vmoens
- [Refactor,Test] Move compile test to dedicated folder (253b8dd) @vmoens
Dependencies
- [Dependencies] Bump ray from 2.46.0 to 2.52.1 (#3258) @dependabot
- [Environment] Fix envpool wrapper (#3339) @vmoens
  - Updates the envpool wrapper for compatibility with the latest envpool versions
Typing
- [Typing] Edit wrongfully set str type annotations (e7583b3) @vmoens
  - Fixes incorrect string type annotations across the codebase
Other
- Revert "replace reduce to reduction, better error message for invalid mask" (#3182) @vmoens
- Fix (#3180) @matteobettini
Contributors
Thanks to all contributors to this release:
- @vmoens (Vincent Moens)
- @ParamThakkar123 (Param Thakkar)
- @bsprenger (Ben Sprenger)
- @Xmaster6y (Yoann Poupart)
- @BY571
- @LCarmi (Luca Carminati)
- @louisfaury (Faury Louis)
- @matteobettini (Matteo Bettini)
- @lowdy1
- @michalgregor (Michal Gregor)
- @cmdout
- @sebimarkgraf (Sebastian Moßburger)
- @Kayzwer
- @marcosgalleterobbva (Marcos Galletero Romero)
- @dependabot