TorchRL v0.11.0 Release Notes

Highlights

  • Dreamer overhaul - Comprehensive improvements to Dreamer world model training: async collectors with profiling, RSSM fixes (scan mode, noise injection, explicit dimensions), torch.compile compatibility for value functions and TDLambda estimator, optimized DreamerEnv to avoid CUDA syncs, and updated sota-implementation with better configs. @vmoens
  • Weight synchronization schemes - New modular weight sync infrastructure (torchrl.weight_update) with SharedMem, MultiProcess, and vLLM-specific (NCCL, double-buffer) schemes. Collectors now integrate seamlessly with weight sync schemes for distributed training. @vmoens
  • Major collector refactor - The collector codebase has been completely restructured. The monolithic collectors.py is now split into focused modules (_single.py, _multi_base.py, _multi_sync.py, _multi_async.py, _runner.py, base.py), with cleaner separation of concerns. (#3233) @vmoens
  • LLM objectives: DAPO & CISPO - New DAPO (Direct Advantage Policy Optimization) and CISPO (Clipped Importance Sampling Policy Optimization) algorithms for LLM training. @vmoens
  • Trainer infrastructure - New SAC Trainer, configuration system for algorithms, timing utilities, and async collection support within trainers. @vmoens
  • Tool services - New tool service infrastructure for LLM agents with Python executor, MCP tools, and web search capabilities. @vmoens
  • Deprecated APIs removed - All deprecation warnings from v0.10 have been promoted to hard errors for v0.11. (#3369) @vmoens
  • New environment backends - Added Procgen environments support with a new ProcgenEnv wrapper. (#3331) @ParamThakkar123
  • Multi-env execution - GymEnv, BraxEnv, and DMControlEnv now support a num_envs/num_workers parameter to run multiple environments in a single call via ParallelEnv. (#3343, #3370, #3337) @ParamThakkar123

Installation

pip install torchrl==0.11.0

Breaking Changes

  • [v0.11] Remove deprecated features and replace warnings with errors (#3369) @vmoens

    • Removes deprecated KLRewardTransform from transforms/llm.py (use torchrl.envs.llm.KLRewardTransform)
    • Removes LogReward and Recorder classes from trainers (use LogScalar and LogValidationReward)
    • Removes unbatched_*_spec properties from VmasWrapper/VmasEnv (use full_*_spec_unbatched)
    • Deletes deprecated rlhf.py modules (data/rlhf.py, envs/transforms/rlhf.py, modules/models/rlhf.py)
    • Removes replay_buffer_chunk parameter from MultiCollector
    • Replaces minimum/maximum deprecation warnings with TypeError in Bounded spec
    • Replaces critic_coef/entropy_coef deprecation warnings with TypeError in PPO and A2C losses (migration sketch after this list)
  • [Major] Major refactoring of collectors (#3233) @vmoens

    • Splits the 5000+ line collectors.py into focused modules for single/multi sync/async collectors
    • Creates new _constants.py, _runner.py, base.py modules
    • Introduces cleaner weight synchronization scheme integration
    • Improves test coverage for multi-device and shared-device weight updates
    • Some internal APIs have changed; the external API remains compatible
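
For the last two breaking-change items, a minimal migration sketch (the `*_coeff` spellings follow the earlier deprecation messages; the loss call is left in comments since it needs actor/critic networks):

```python
import torch
from torchrl.data import Bounded

# v0.10: Bounded(minimum=..., maximum=...) emitted a deprecation warning.
# v0.11: the old keywords raise a TypeError; use low/high instead.
spec = Bounded(low=-1.0, high=1.0, shape=(3,), dtype=torch.float32)

# Likewise for PPO/A2C losses, the old keyword names now raise a TypeError:
#   PPOLoss(actor, critic, entropy_coef=0.01)   -> TypeError in v0.11
#   PPOLoss(actor, critic, entropy_coeff=0.01)  -> OK
```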

Dreamer World Model Improvements

These changes significantly improve Dreamer training performance, torch.compile compatibility, and usability. @vmoens

  • [Feature] Refactor Dreamer training with async collectors, profiling, and improved config (cc917ba)

    • Major overhaul of the Dreamer sota-implementation with async data collection
    • Adds profiling support for performance analysis
    • Improved configuration with better defaults and documentation
    • Updated README with detailed usage instructions
  • [Refactor] Dreamer implementation updates (3ab4b30)

    • Refactors Dreamer training script for better maintainability
    • Updates config.yaml with improved hyperparameters
    • Enhances dreamer_utils.py with additional helper functions
  • [Feature] Add noise argument and scan mode to RSSMRollout (8653b6e)

    • Adds noise argument to control stochastic sampling during rollout
    • Implements scan mode for efficient sequential processing
    • 109 lines added to model_based.py for improved RSSM flexibility
  • [Feature] Add explicit dimensions and device support to RSSM modules (f350fe0)

    • Adds explicit dimension handling for batch, time, and feature dims
    • Improves device placement for RSSM components
  • [Feature] Logging & RSSM fixes (d082979)

    • Fixes RSSM module behavior and adds logging improvements
    • Updates batched_envs.py and wandb logger
  • [Refactor] Use compile-aware helpers in Dreamer objectives (d8c3887)

    • Updates Dreamer objectives to use torch.compile-compatible helpers
    • Improves performance when using torch.compile
  • [BugFix] Optimize DreamerEnv to avoid CUDA sync in done checks (d57fdec)

    • Eliminates unnecessary CUDA synchronizations in done flag checking
    • Significant performance improvement for GPU-based Dreamer training
  • [BugFix] Fix ModelBasedEnvBase for torch.compile compatibility (7ff663d)

    • Makes ModelBasedEnvBase compatible with torch.compile
  • [Feature] Add allow_done_after_reset parameter to ModelBasedEnvBase (24ae042)

    • Adds flexibility for environments that may signal done immediately after reset
  • [BugFix] Final polish for Dreamer utils and Collector tests (5aefdd5)

    • Final cleanup and polish for Dreamer implementation

Weight Synchronization Schemes

New modular infrastructure for weight synchronization between training and inference workers. @vmoens

  • [Feature] Weight Synchronization Schemes - Core Infrastructure (e8f6fa5)

    • New torchrl.weight_update module with 2400+ lines of weight sync infrastructure
    • SharedMemWeightSyncScheme: Uses shared memory for fast intra-node sync
    • MultiProcessWeightSyncScheme: Uses multiprocessing queues for cross-process sync
    • Comprehensive documentation and examples in docs/source/reference/collectors.rst
  • [Feature] vLLM Weight Synchronization Schemes (d0c8b7e)

    • VllmNCCLWeightSyncScheme: NCCL-based weight sync for vLLM distributed inference
    • VllmDoubleBufferWeightSyncScheme: Double-buffered async weight updates
    • 1876 lines of vLLM-specific weight sync code
  • [Feature] Collectors - Weight Sync Scheme Integration (a7707ca)

    • Integrates weight sync schemes into collector infrastructure
    • Updates GRPO and expert-iteration implementations to use new schemes
    • Adds examples for multi-weight update patterns
  • [Refactor] Weight sync schemes refactor (ae0ae06)

    • Refines weight sync API and adds additional schemes
    • Improves test coverage with 342+ new test lines
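
A minimal sketch of plugging a scheme into a multiprocess collector; the `weight_sync_schemes` argument name follows the new collectors documentation and should be treated as an assumption:

```python
import torch.nn as nn
from tensordict.nn import TensorDictModule
from torchrl.collectors import MultiSyncDataCollector
from torchrl.envs import GymEnv
from torchrl.weight_update import SharedMemWeightSyncScheme

# A trivial stand-in policy for Pendulum (3 obs dims -> 1 action dim).
policy = TensorDictModule(nn.Linear(3, 1), in_keys=["observation"], out_keys=["action"])

collector = MultiSyncDataCollector(
    [lambda: GymEnv("Pendulum-v1")] * 2,
    policy=policy,
    frames_per_batch=64,
    total_frames=256,
    # Assumed mapping of model id -> scheme, per the new collectors docs.
    weight_sync_schemes={"policy": SharedMemWeightSyncScheme()},
)
for data in collector:
    ...  # optimize the trainer-side policy here
    collector.update_policy_weights_()  # push fresh weights through the scheme
collector.shutdown()
```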

LLM Training: DAPO & CISPO

New policy optimization algorithms for LLM fine-tuning. @vmoens

  • [Feature] DAPO (9d5c276)

    • Implements Direct Advantage Policy Optimization for LLM training
    • Adds DAPO-specific loss computation to torchrl/objectives/llm/grpo.py
  • [Feature] CISPO (ed0d8dc)

    • Implements Clipped Importance Sampling Policy Optimization
    • Alternative to PPO/GRPO with different clipping strategy
  • [Refactor] Refactor GRPO as a separate class (2bc3cb7)

    • Separates GRPO implementation for better modularity
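
To illustrate the clipping strategy CISPO is named for, an illustrative sketch of the core computation (not the `torchrl.objectives.llm` API):

```python
import torch

def cispo_style_loss(logp, logp_old, advantage, eps_low=0.2, eps_high=0.2):
    # Token-level importance ratio between the current and behavior policies.
    ratio = torch.exp(logp - logp_old)
    # Clip the IS weight and detach it: unlike PPO's clipped surrogate, the
    # gradient flows through the log-probs, so clipped tokens still learn.
    weight = torch.clamp(ratio, 1.0 - eps_low, 1.0 + eps_high).detach()
    return -(weight * advantage * logp).mean()
```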

Trainer Infrastructure

New trainer algorithms, configuration system, and utilities. @vmoens

  • [Trainers] SAC Trainer and algorithms (02d4bfd)

    • New SAC Trainer implementation with 786 lines of code
    • Complete sota-implementation in sota-implementations/sac_trainer/
    • Trainer configuration via YAML files
  • [Feature] Trainer Algorithms - Configuration System (6bc201a)

    • New configuration system in torchrl/trainers/algorithms/configs/
    • Configs for collectors, data, modules, objectives, transforms, weight sync schemes
    • Enables hydra-style configuration composition (see the sketch after this list)
  • [Feature] Trainer Infrastructure - Timing and Utilities (dc21523)

    • Adds timing utilities to trainer infrastructure
    • 263 lines of enhanced trainer functionality
  • [Feature] Async collection within trainers (5f1eb2c)

    • Enables asynchronous data collection during training
    • Improves training throughput
  • [Feature] PPO Trainer Updates (129f3d5)

    • Updates to PPO trainer with new features
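
A hypothetical sketch of the hydra-style composition mentioned in the configuration-system item above; the config path, config name, and override key are illustrative, not verified against the shipped configs:

```python
from hydra import compose, initialize

# Assumed layout: a config.yaml next to the sota-implementations entry point.
with initialize(config_path="sota-implementations/sac_trainer", version_base=None):
    # The "optim.lr" override key is hypothetical.
    cfg = compose(config_name="config", overrides=["optim.lr=3e-4"])
# cfg is then handed to the trainer entry point.
```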

Tool Services for LLM Agents

New infrastructure for tool-augmented LLM agents. @vmoens

  • [Feature] Tool services (9ca0e40)

    • New torchrl/services/ module for tool execution
    • Python executor service for safe code execution
    • MCP (Model Context Protocol) tool integration
    • Web search tool example
    • 609 lines of documentation in docs/source/reference/services.rst
    • Comprehensive test coverage in test/test_services.py
  • [Feature] Transform Module - ModuleTransform and Ray Service Refactor (7b85c71)

    • New ModuleTransform for applying nn.Modules as transforms
    • Refactored Ray service integration
    • Moves ray_service.py to torchrl/envs/transforms/
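
A sketch of ModuleTransform in use; the keyword names (`module`, `in_keys`, `out_keys`) follow TorchRL transform conventions and are assumptions, not the verified signature:

```python
import torch.nn as nn
from torchrl.envs import GymEnv, TransformedEnv
from torchrl.envs.transforms import ModuleTransform

# Apply a (randomly initialized) nn.Module to the observation at each step.
env = TransformedEnv(
    GymEnv("Pendulum-v1"),
    ModuleTransform(module=nn.Linear(3, 8), in_keys=["observation"], out_keys=["embedding"]),
)
```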

torch.compile Compatibility

Fixes to enable torch.compile with various TorchRL components. @vmoens

  • [BugFix] Fix value functions for torch.compile compatibility (3bdc7b1)

    • 295 lines of new tests in test/compile/test_value.py
    • Fixes value function implementations for compile compatibility
  • [BugFix] Fix TDLambdaEstimator for torch.compile compatibility (11e22ee)

    • Updates TDLambdaEstimator to work with torch.compile
  • [Feature] Add compile-aware timing and profiling helpers (764d9f7)

    • New utilities for profiling compiled code
  • [Refactor] Add compile-aware timing and profiling helpers (c82447e)

    • Performance utilities that work correctly under torch.compile
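
A minimal sketch of compiling the now compile-friendly TD(λ) estimator (the value network is a stand-in):

```python
import torch
import torch.nn as nn
from tensordict.nn import TensorDictModule
from torchrl.objectives.value import TDLambdaEstimator

value_net = TensorDictModule(
    nn.Linear(3, 1), in_keys=["observation"], out_keys=["state_value"]
)
estimator = TDLambdaEstimator(gamma=0.99, lmbda=0.95, value_network=value_net)
compiled_estimator = torch.compile(estimator)  # previously prone to graph breaks
```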

Features

  • [Feature] Added EXP3 Scoring function (#3013) @ParamThakkar123

    • Implements the EXP3 (Exponential-weight algorithm for Exploration and Exploitation) scoring function for MCTS
    • Adds torchrl/data/map/ module with hash-based storage, query utilities, and tree structures for efficient state lookups
  • [Feature] Add num_workers parameter to BraxEnv (#3370) @ParamThakkar123

    • Allows running multiple Brax environments in parallel via ParallelEnv by specifying num_workers > 1
    • Consistent API with GymEnv and DMControlEnv multi-env support
  • [Feature] Added num_envs parameter in GymEnv (#3343) @ParamThakkar123

    • When num_envs > 1, GymEnv automatically returns a lazy ParallelEnv wrapping multiple environment instances
    • Simplifies multi-env setup without manually constructing ParallelEnv (see the sketch after this list)
  • [Environments] Added Procgen environments (#3331) @ParamThakkar123

    • New ProcgenWrapper and ProcgenEnv classes to wrap OpenAI Procgen environments
    • Converts Procgen observations to TorchRL-compliant TensorDict outputs
    • Supports all 16 Procgen game environments (coinrun, starpilot, etc.)
  • [Feature] Loss make_value_estimator takes a ValueEstimatorBase class (#3336) @ParamThakkar123

    • Allows passing a ValueEstimatorBase class directly to make_value_estimator() for more flexible value estimation configuration
    • Adds tests and documentation for the new API
  • [Feature] Make custom_range public in ActionDiscretizer (#3333) @vmoens

    • Exposes custom_range parameter in ActionDiscretizer for user-defined discretization ranges
  • [Feature] Added num_envs parameter in DMControlEnv (#3337) @ParamThakkar123

    • Similar to GymEnv, allows running multiple DMControl environments via num_workers parameter
    • Returns a lazy ParallelEnv when num_workers > 1
  • [Feature] Auto-configure exploration module specs from environment (#3317) @bsprenger

    • Collectors now automatically configure exploration module specs (like AdditiveGaussianModule) from the environment's action spec
    • Adds set_exploration_modules_spec_from_env() utility function
    • Reduces boilerplate when using exploration modules with delayed spec initialization
  • [Feature] Add NPU Support for Single Agent (#3229) @lowdy1

    • Adds Huawei NPU (Ascend) device support for single-agent training
    • Updates device handling logic to recognize NPU devices
  • [Feature] Add support for trackio (#3196) @Xmaster6y

    • Integrates with trackio for experiment tracking and logging
    • New TrackioLogger class for seamless integration
  • [Feature] Add a new Trainer hook point process_loss (#3259) @Xmaster6y

    • Adds a new hook point in the Trainer that runs after loss computation but before backward pass
    • Useful for loss modification, logging, or gradient accumulation strategies
  • [Feature] Ensure MultiSyncDataCollectors returns data ordered by worker id (#3243) @LCarmi

    • Data from MultiSyncDataCollectors is now consistently ordered by worker ID
    • Makes debugging and analysis easier when working with multi-worker collectors
  • [Feature] Add TanhModuleConfig (#3255) @bsprenger

    • Adds configuration class for TanhModule to support serialization and hydra-style configs
  • [Feature] add TensorDictSequentialConfig (#3248) @bsprenger

    • Adds configuration class for TensorDictSequential modules
  • [Feature] Weight loss outputs when using prioritized sampler (#3235) @vmoens

    • Loss modules now properly weight outputs when using prioritized replay buffers
    • Adds importance_key parameter to loss functions (DQN, DDPG, SAC, TD3, TD3+BC)
    • When importance weights are present in the sampled data, losses are automatically weighted
    • Adds comprehensive tests for prioritized replay buffer integration with all loss functions
  • [Feature] Composite specs can create named tensors with 'zero' and 'rand' (#3214) @louisfaury

    • Composite specs now propagate dimension names when creating tensors via zero() and rand()
  • [Feature] Enable storing rollouts on a different device (#3199) @Xmaster6y

    • Collectors can now store rollouts on a different device than the execution device
    • Useful for CPU storage while running on GPU
  • [Feature] Named dims in Composite (#3174) @vmoens

    • Adds named dimension support to Composite specs via names property and refine_names() method
    • Enables better integration with PyTorch's named tensors
    • Supports ellipsis notation for partial name specification
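
For example, the multi-env shortcut from the GymEnv item above (per these notes the parameter is num_envs for GymEnv and num_workers for BraxEnv/DMControlEnv):

```python
from torchrl.envs import GymEnv

# num_envs > 1 returns a lazily-built ParallelEnv instead of a single env.
env = GymEnv("CartPole-v1", num_envs=4)
rollout = env.rollout(10)  # expected batch shape: [4, 10]
env.close()
```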

Bug Fixes

  • [BugFix] Fix TransformersWrapper ChatHistory.full not being set (#3375) @vmoens

  • [BugFix] Fix AsyncEnvPool + LLMCollector with yield_completed_trajectories (#3373) @vmoens

    • Fixes interaction between AsyncEnvPool and LLMCollector when yielding completed trajectories
  • [BugFix] Fix LLM test failures (#3360) @vmoens

    • Comprehensive fixes for LLM-related test failures across multiple modules
  • [BugFix] Fixed MultiSyncCollector set_seed and split_trajs issue (#3352) @ParamThakkar123

    • Fixes seed propagation and trajectory splitting in MultiSyncDataCollector
  • [BugFix] Fix VecNormV2 device gathering (#3368) @vmoens

    • Fixes device handling when gathering statistics in VecNormV2 transform
  • [BugFix] Fix AsyncEnv - LLMCollector integration (#3365) @vmoens

    • Resolves integration issues between async environments and LLM collectors
  • [BugFix] Fix VecNormV2 GPU device handling for stateful mode (#3364) @BY571

    • Fixes GPU device handling in VecNormV2 when using stateful normalization mode
  • [BugFix] Fix Ray collector iterator bug and process group cleanup (#3363) @vmoens

    • Fixes iterator behavior in Ray-based collectors and ensures proper cleanup of process groups
  • [BugFix] Fix LLM CI by replacing uvx with uv package (#3356) @vmoens

  • [Bugfix] Wrong minari download first element (#3106) @marcosgalleterobbva

    • Fixes incorrect first element handling when downloading Minari datasets
  • [BugFix,Test] Fix envpool failing tests (#3345) @vmoens

  • [BugFix] Fixed ParallelEnv + aSyncDataCollector / MultiSyncDataCollector not working if replay_buffer is given (#3341) @ParamThakkar123

    • Fixes compatibility issue when using ParallelEnv with collectors that have a replay buffer attached
  • [BugFix] treat 1-D MultiDiscrete as MultiCategorical and accept flattened masks (#3342) @ParamThakkar123

    • Improves MultiDiscrete action space handling to accept flattened action masks
  • [BugFix] Fixes register_save_hook bug (#3340) @ParamThakkar123

    • Fixes save hook registration in replay buffers
  • [BugFix,Test] recompiles with string failure (#3338) @vmoens

  • [BugFix] Fix Safe modules annotation and doc for losses (#3334) @vmoens

  • [BugFix] Fix ray modules circular import (#3319) @vmoens

  • [BugFix] Fix SACLoss target_entropy="auto" ignoring action space dimensionality (#3292) @vmoens

    • Fixes automatic entropy target computation to properly account for action space dimensions
  • [BugFix] Fix agent_dim in multiagent nets & account for neg dims (#3290) @vmoens

    • Fixes agent dimension handling in multi-agent networks, including support for negative dimension indices
  • [BugFix] Added a missing .to(device) call in _from_transformers_generate_history (#3289) @michalgregor

  • [BugFix] Fix old pytorch dependencies (#3266) @vmoens

    • Updates compatibility shims for older PyTorch versions
  • [BugFix] RSSMRollout not advancing state/belief across time steps in Dreamer (#3236) @cmdout

    • Fixes critical bug where RSSM rollout was not properly advancing hidden states in Dreamer world model
  • [Doc,BugFix] Fix doc and collectors (#3250) @vmoens

  • [BugFix] use correct field names in InitTrackerConfig (#3245) @bsprenger

  • [BugFix] Use torch.zeros for argument in torch.where (#3239) @sebimarkgraf

  • [BugFix] Fix wrong assertion about collector and buffer (#3176) @vmoens

  • [BugFix] AttributeError in accept_remote_rref_udf_invocation (#3168) @vmoens

Bug Fixes (ghstack commits without PR numbers)

  • [BugFix] Final polish for Collector and Exploration (558396d) @vmoens

    • Final fixes for collector and exploration module interactions
  • [BugFix] Collector robustness & Async fixes (cadf23b) @vmoens

    • Improves collector robustness and fixes async collection issues
  • [BugFix] TensorStorage key filtering (0667a58) @vmoens

    • Fixes key filtering in TensorStorage
  • [BugFix] Replay Buffer prefetch & SliceSampler (9d34dbe) @vmoens

    • Fixes prefetching behavior and SliceSampler issues
  • [BugFix] Fix target_entropy computation for composite action specs (06960c6) @vmoens

    • Correctly computes target entropy for composite action spaces
  • [BugFix] Fix device initialization in CrossQLoss.maybe_init_target_entropy (416e454) @vmoens

    • Fixes device placement for CrossQ loss entropy initialization
  • [BugFix] Fix schemes and refactor collectors to make them readable (364e038) @vmoens

    • Fixes weight sync schemes and improves collector code readability
  • [BugFix] Fix collector devices (888095f) @vmoens

    • Fixes device handling in collectors
  • [BugFix] Fix tests (963fdd4) @vmoens

  • [BugFix] Fix GRPO tests and runs (47ad9d8) @vmoens

  • [BugFix] Handle Lock/RLock types in EnvCreator (6985ca2) @vmoens

    • Properly handles threading locks in EnvCreator serialization
  • [BugFix] Defer filter_warnings import to avoid module load issues (8ea954c) @vmoens

  • [BugFix] Fix CUDA sync in forked subprocess (4a2a274) @vmoens

    • Fixes CUDA synchronization issues in forked processes
  • [BugFix] Fix unique ref to lambda func (b6fe45e) @vmoens

  • [BugFix] Add pybind11 check and Windows extension pattern fix (5e89e4e) @vmoens


Additional Features (ghstack commits without PR numbers)

  • [Feature] Storage Shared Initialization for Multiprocessing (61c178e) @vmoens

    • Enables shared storage initialization for multiprocessing collectors
    • Updates distributed collector examples
  • [Feature] Memmap storage cleanup (7f9ea74) @vmoens

    • Improves memmap storage cleanup and resource management
  • [Feature] Collector Profiling (02ed47e) @vmoens

    • Adds profiling capabilities to collectors for performance analysis
  • [Feature] auto_wrap_envs in PEnv (d781f9e) @vmoens

    • Automatic environment wrapping in ParallelEnv
  • [Feature] Auto-wrap lambda functions with EnvCreator for spawn compatibility (d250c18) @vmoens

    • Automatically wraps lambda functions with EnvCreator for spawn multiprocessing compatibility (see the sketch after this list)
  • [Feature] Collectors' getattr_policy and getattr_env (fbdbb61) @vmoens

    • Adds attribute access methods for policy and env in collectors
  • [Feature] track_policy_version in collectors.py (a089cc4) @vmoens

    • Adds policy version tracking in collectors for distributed training
  • [Feature] Aggregation strategies (0e6a356) @vmoens

    • New aggregation strategies for multi-worker data collection
  • [Feature] kl_mask_threshold (7ab48a4) @vmoens

    • Adds KL divergence masking threshold for LLM training
  • [Feature] Add timing options (13434eb) @vmoens

    • Additional timing options for performance monitoring
  • [Feature] float32 patch (01d2801) @vmoens

    • Float32 precision handling improvements
  • [Feature] Support callable scale in IndependentNormal and TanhNormal distributions (d3aba7d) @vmoens

    • Allows scale parameter to be a callable in distribution modules
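
A sketch of the lambda auto-wrapping mentioned above:

```python
from torchrl.envs import GymEnv, ParallelEnv

# Before: spawn-based multiprocessing required an explicit EnvCreator:
#   ParallelEnv(2, EnvCreator(lambda: GymEnv("CartPole-v1")))
# Now the lambda is wrapped automatically:
env = ParallelEnv(2, lambda: GymEnv("CartPole-v1"))
env.close()
```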

Documentation

  • [Doc] Huge doc refactoring (3d5dd1a) @vmoens

    • Major documentation restructuring with new sections:
      • collectors_basics.rst, collectors_single.rst, collectors_distributed.rst
      • collectors_weightsync.rst, collectors_replay.rst
      • data_datasets.rst, data_replaybuffers.rst, data_samplers.rst
    • Adds pre-commit hook for Sphinx section underline checking
  • [Docs] Update LLM_TEST_ISSUES.md with fix status (#3374) @vmoens

  • [Docs] Enable doc builds and tutorial runs (#3335) @vmoens

    • Re-enables Sphinx-gallery tutorial execution in documentation builds
    • Adds process cleanup between tutorials to prevent resource leaks

CI / Infrastructure

  • [CI] Speed up slow tests in tests-gpu/tests-cpu (#3395) @vmoens

    • Optimizes slow test execution with better parallelization and test isolation
  • [CI] Bump version to 0.11.0 (#3392) @vmoens

  • [CI] Fix auto-label workflow for fork PRs (#3388) @vmoens

  • [CI] Better release workflow (#3386) @vmoens

    • Comprehensive release workflow with sanity checks, wheel collection, docs updates, and PyPI publishing
    • Adds dry-run mode for testing releases
    • Automatically updates stable docs symlink and versions.html
  • [CI] Fix auto-labelling (#3387) @vmoens

  • [CI] Auto-tag PRs (#3381) @vmoens

    • Adds automatic labeling for PRs based on changed files
  • [CI] Add release agent prompt for LLM-assisted releases (#3380) @vmoens

    • Adds comprehensive release instructions for LLM-assisted release automation
  • [CI] Add release workflow with PyPI trusted publishing (#3379) @vmoens

    • Implements PyPI trusted publishing via OIDC authentication
    • Eliminates need for PyPI API tokens in secrets
  • [CI] Add granular label support for environment and data workflow jobs (#3371) @vmoens

  • [CI] Fix PettingZoo CI by updating Python 3.9 to 3.10 (#3362) @vmoens

  • [CI] Fix GenDGRL CI by adding missing requests dependency (#3361) @vmoens

  • [CI] Fix IsaacLab tests by using explicit conda Python path (#3358) @vmoens

  • [CI] Fix M1 build pip command not found (#3359) @vmoens

  • [CI] Fix Windows build for free-threaded Python (3.13t, 3.14t) (#3357) @vmoens

  • [CI,BugFix] Fix Habitat CI by upgrading to Python 3.10 (#3346) @vmoens

    • Upgrades Habitat CI to Python 3.10 and builds habitat-sim from source
  • [CI] Fix Jumanji CI by adding missing requests dependency (#3349) @vmoens

  • [CI] Fix Chess CI by correcting test file path typo (#3353) @vmoens

  • [CI] Skip Windows-incompatible tests in optional deps CI (#3348) @vmoens

  • [CI] Fix GPU benchmark failures (#3347) @vmoens

  • [CI] Upgrade doc python version (#3222) @vmoens

  • [CI] Use pip install (#3200) @vmoens

    • Migrates CI workflows from setup.py install to pip install

Tests

  • [Tests] Check traj_ids shape with unbatched envs (#3393) @vmoens

  • [Tests] Fix test isolation in test_set_gym_environments and related tests (#3382) @vmoens

    • Improves test isolation to prevent cross-test contamination
  • [Test] Use mock Llama tokenizer instead of skipping gated model test (#3376) @vmoens

  • [Test] Skip Llama test when tokenizer unavailable instead of xfail (#3372) @vmoens

  • [Tests] Remove _utils_internal.py in tests (#3281) @vmoens

    • Removes deprecated internal test utilities and updates tests to use public APIs
  • [Test] Check LinearizeReward obs transform (#3241) @vmoens

  • [Test,Benchmark] Move mocking classes file and bench for non-tensor env (#3257) @vmoens

  • [Tests] Fix vmas seeding test (#3210) @matteobettini

  • [Tests] Reintroduce VMAS football (#3178) @matteobettini

Refactors / Maintenance

  • [Refactor] Updated num_envs parameter to num_workers for consistency (#3354) @ParamThakkar123

    • Renames num_envs to num_workers for consistency across environment wrappers
  • [Quality] replace reduce to reduction, better error message for invalid mask (#3179) @Kayzwer

Refactors (ghstack commits without PR numbers)

  • [Refactor] Move WEIGHT_SYNC_TIMEOUT to collectors._constants (3ceb6b9) @vmoens

    • Centralizes weight sync timeout configuration
  • [Refactor] Remove TensorSpec classes (7ea8d86) @vmoens

    • Removes legacy TensorSpec class aliases
  • [Refactor] Rename collectors (8779f2a) @vmoens

    • Renames collector classes for consistency
  • [Refactor] Non-daemonic processes in PEnv (aeb2e9b) @vmoens

    • Changes ParallelEnv to use non-daemonic processes
  • [Refactor] Make env creator optional for Ray (b599d9b) @vmoens

    • Makes environment creator optional in Ray collectors
  • [Refactor] Move decorate_thread_sub_func to torchrl.testing.mp_helpers (e3e9e6a) @vmoens

    • Moves multiprocessing test helpers to proper location
  • [Refactor] Use non_blocking transfers in distribution modules (5c75777) @vmoens

    • Uses non-blocking GPU transfers in distribution modules for better performance
  • [Refactor] Remove MUJOCO_EGL_DEVICE_ID auto-setting from torchrl init (8fac4c5) @vmoens

  • [Refactor] Add WEIGHT_SYNC_TIMEOUT constant for collector weight synchronization (ab3768a) @vmoens

  • [BugFix,Test,Refactor] Refactor tests (eb8a885) @vmoens

  • [Refactor,Test] Move compile test to dedicated folder (253b8dd) @vmoens


Dependencies

  • [Dependencies] Bump ray from 2.46.0 to 2.52.1 (#3258) @dependabot

  • [Environment] Fix envpool wrapper (#3339) @vmoens

    • Updates envpool wrapper for compatibility with latest envpool versions

Typing

  • [Typing] Edit wrongfully set str type annotations (e7583b3) @vmoens

    • Fixes incorrect string type annotations across the codebase

Contributors

Thanks to all contributors to this release: