Releases: pytorch/rl

v0.3.0: Data hub, universal env converter and more!

31 Jan 21:40

In this release, we focused on building a Data Hub for offline RL, providing a universal 2gym conversion tool (#1795) and improving the documentation.

TorchRL Data Hub

TorchRL now offers many offline datasets for robotics, control and gaming, all under a single data format (TED, for TorchRL Episode Data format). Every dataset is one step away from being downloaded: dataset = <Name>ExperienceReplay(dataset_id, root="/path/to/storage", download=True) is all you need to get started.
This means that you can now download OpenX #1751 or Roboset #1743 datasets and combine them in a single replay buffer #1768, or swap one for another in no time and with no extra code.
Many new sampling techniques are available, such as sampling slices of trajectories with or without repetition.
As always, you can append your favourite transforms to these datasets.
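
To make the idea of slice sampling concrete, here is a minimal pure-Python sketch. It is not TorchRL's sampler: the function name sample_slices and its arguments are hypothetical, and episodes are plain lists of transitions rather than TED-formatted data.

```python
import random


def sample_slices(episodes, slice_len, num_slices, replace=True, rng=None):
    """Sample contiguous slices of `slice_len` steps from a list of episodes.

    Hypothetical helper for illustration only. With replace=False, the same
    (episode, start) pair is never returned twice.
    """
    rng = rng or random.Random()
    # Enumerate every valid (episode index, start index) pair.
    # Episodes shorter than slice_len contribute no start index.
    starts = [
        (ep_idx, start)
        for ep_idx, episode in enumerate(episodes)
        for start in range(len(episode) - slice_len + 1)
    ]
    if replace:
        chosen = [rng.choice(starts) for _ in range(num_slices)]
    else:
        # Raises ValueError if fewer than num_slices distinct slices exist.
        chosen = rng.sample(starts, num_slices)
    return [episodes[e][s:s + slice_len] for e, s in chosen]


# Two toy episodes of different lengths; each slice stays within one episode.
episodes = [list(range(0, 5)), list(range(10, 13))]
slices = sample_slices(episodes, slice_len=3, num_slices=4, replace=False)
```

The key property is that a slice never crosses an episode boundary, which is what makes slices usable for sequence models.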

TorchRL2Gym universal converter

#1795 introduces a new universal converter for simulation libraries to gym.
As an RL practitioner, it can be difficult to accommodate the many different environment APIs that exist. TorchRL now provides a way of registering any env in gym(nasium). This allows users to build their datasets in TorchRL and integrate them into their code base with no effort if they are already using gym as a backend. It also makes it possible to convert DMControl or Brax envs (among others) to gym without an extra library.
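
The kind of API mismatch such a converter bridges can be sketched in plain Python. Everything below is hypothetical and for illustration only: CounterEnv stands in for a simulator whose step consumes and returns dictionaries, and GymStyleAdapter exposes it through the gymnasium-style reset/step signature. It does not reproduce the converter added in #1795.

```python
class CounterEnv:
    """Hypothetical dict-based env: a counter that terminates at `limit`."""

    def __init__(self, limit=3):
        self.limit = limit
        self.count = 0

    def reset(self):
        self.count = 0
        return {"observation": self.count}

    def step(self, data):
        self.count += data["action"]
        return {
            "observation": self.count,
            "reward": float(data["action"]),
            "terminated": self.count >= self.limit,
            "truncated": False,
        }


class GymStyleAdapter:
    """Expose a dict-based env through gymnasium-style reset/step calls."""

    def __init__(self, env):
        self.env = env

    def reset(self, seed=None):
        # The seed is ignored in this sketch; the toy env is deterministic.
        out = self.env.reset()
        return out["observation"], {}

    def step(self, action):
        out = self.env.step({"action": action})
        # gymnasium order: obs, reward, terminated, truncated, info
        return (out["observation"], out["reward"],
                out["terminated"], out["truncated"], {})


env = GymStyleAdapter(CounterEnv(limit=2))
obs, info = env.reset()
obs, reward, terminated, truncated, info = env.step(1)
```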

PPO and A2C compatibility with distributed models

Functional calls can now be turned off for PPO and A2C loss modules, allowing users to run RLHF training loops at scale! #1804

## TensorDict-free replay buffers

You can now use TorchRL's replay buffer with ANY tensor-based structure, whether it involves dicts, tuples or lists. In principle, storing data contiguously on disk given any gym environment is as simple as

from torchrl.data import LazyMemmapStorage, ReplayBuffer

# capacity and batch_size are placeholders to be set by the user
rb = ReplayBuffer(storage=LazyMemmapStorage(capacity))
obs_, reward, terminal, truncated, info = env.step(action)
rb.add((obs, obs_, reward, terminal, truncated, info, action))

# sampling returns a tuple with the same structure as the stored items
obs, obs_, reward, terminal, truncated, info, action = rb.sample(batch_size)

This is independent of TensorDict and it supports many components of our replay buffers as well as transforms. See the replay buffer documentation for details.

## Multiprocessed replay buffers

TorchRL's replay buffers can now be shared across processes. Multiprocessed RBs can not only be read from but also extended on different workers. #1724

SOTA checks

We introduce a list of scripts to check that our training scripts run correctly before each release: #1822

Throughput of Gym and DMControl

We removed many checks in GymLikeEnv when some basic conditions are met, which significantly improves throughput for simple envs. #1803

## Algorithms

We introduce discrete CQL #1666, discrete IQL #1793 and Impala #1506.

What's Changed: PR description

v0.2.1: Faster parallel envs, fixes in transforms and M1 wheel fix

25 Oct 17:24
1bb192e

What's Changed

New Contributors

Full Changelog: v0.2.0...v0.2.1

0.2.0: Faster collection, MARL compatibility and RLHF prototype

05 Oct 16:45
bf264e0

TorchRL 0.2.0

This release provides many new features and bug fixes.

TorchRL now publishes Apple Silicon compatible wheels.
We drop support for Python 3.7 and add support for 3.11.

New and updated algorithms

Most algorithms have been cleaned up and designed to reach (at least) SOTA results.

Compatibility with MARL settings has been drastically improved, and we provide a good number of MARL examples within the library.

A prototype RLHF training script is also provided (#1597).

A whole new category of offline RL algorithms has been integrated: Decision Transformers.

New features

One of the major new features of the library is the introduction of the terminated / truncated / done distinction, at no cost, within the library. All third-party and primary environments are now compatible with this, as well as losses and data collection primitives (collectors, etc.). This feature is also compatible with complex data structures, such as those found in MARL training pipelines.
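
Why the distinction matters can be shown with a one-step TD target. The sketch below follows the usual gymnasium convention (terminated: the MDP reached a final state; truncated: the episode was cut short, for example by a time limit). The function names are hypothetical and this is not TorchRL's value estimator code.

```python
def is_done(terminated, truncated):
    """An episode is done if it either terminated or was truncated."""
    return terminated or truncated


def td_target(reward, next_value, terminated, gamma=0.99):
    """One-step TD target.

    Bootstrapping from next_value is disabled only on true termination.
    A truncated transition still bootstraps, because the next state
    would have had value if the episode had not been cut short.
    """
    bootstrap = 0.0 if terminated else 1.0
    return reward + gamma * bootstrap * next_value


# Truncated at a time limit: done is True, but the target still bootstraps.
done = is_done(terminated=False, truncated=True)
target = td_target(reward=1.0, next_value=10.0, terminated=False, gamma=0.5)
```

Treating a truncated step as terminated would zero out the bootstrap term and bias value estimates downward near time limits.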

All losses are now compatible with tensordict-free inputs, for a more generic deployment.

New transforms

Atari games can now benefit from an EndOfLifeTransform that allows the end of a life to be used as a done state in the loss (#1605).
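
The underlying idea can be sketched as follows, assuming the emulator reports the number of remaining lives at each step (as ALE does through its info dictionary). The helper name is hypothetical and this is not the EndOfLifeTransform implementation.

```python
def end_of_life_flags(lives):
    """Flag each step (after the first) at which the life count dropped.

    `lives` is the sequence of remaining lives observed at consecutive steps.
    """
    return [curr < prev for prev, curr in zip(lives, lives[1:])]


# A life is lost at the 2nd and 4th transitions.
flags = end_of_life_flags([3, 3, 2, 2, 1])
```

Such a flag can then be used by the loss in place of the done signal, so that value targets stop bootstrapping across a lost life even though the game itself continues.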

We provide a KL transform to add a KL factor to the reward in RLHF settings.
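
A common way to add such a factor in RLHF is to subtract a scaled per-sample KL estimate between the trained policy and a frozen reference policy. The sketch below uses that common formulation; the function name and the coef default are assumptions, and the exact form used by the TorchRL transform may differ.

```python
def kl_penalized_reward(reward, log_prob, ref_log_prob, coef=0.1):
    """Subtract a per-sample KL estimate from the reward.

    log_prob:     log-probability of the action under the trained policy
    ref_log_prob: log-probability of the same action under the reference policy
    coef:         penalty strength (hypothetical default)
    """
    kl_estimate = log_prob - ref_log_prob
    return reward - coef * kl_estimate


# The trained policy assigns more mass to the action than the reference,
# so the reward is reduced.
shaped = kl_penalized_reward(reward=1.0, log_prob=-1.0, ref_log_prob=-2.0, coef=0.5)
```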

Action masking is made possible through the ActionMask transform (#1421)
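
In its simplest form, action masking restricts sampling to the actions the environment currently allows. The pure-Python sketch below illustrates that idea for a discrete action space; the function name is hypothetical and it does not use the ActionMask transform.

```python
import random


def sample_masked_action(mask, rng=None):
    """Sample uniformly among the action indices whose mask entry is True."""
    rng = rng or random.Random()
    allowed = [idx for idx, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("the mask allows no action")
    return rng.choice(allowed)


# Only actions 1 and 3 are legal in this state.
action = sample_masked_action([False, True, False, True])
```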

VC1 is also integrated for better image embedding.

  • [Feature] Allow sequential transforms to work offline by @vmoens in #1136
  • [Feature] ClipTransform + rename min/maximum -> low/high by @vmoens in #1500
  • [Feature] End-of-life transform by @vmoens in #1605
  • [Feature] KL Transform for RLHF by @vmoens in #1196
  • [Features] Conv3dNet and PermuteTransform by @xmaples in #1398
  • [Feature, Refactor] Scale in ToTensorImage based on the dtype and new from_int parameter by @hyerra in #1208
  • [Feature] CatFrames used as inverse by @BY571 in #1321
  • [Feature] Masking actions by @vmoens in #1421
  • [Feature] VC1 integration by @vmoens in #1211

New models

We provide GRU alongside LSTM for POMDP training.

MARL model coverage is now richer, with a MultiAgentMLP and a MultiAgentCNN! Other improvements for MARL include coverage for nested keys in most places of the library (losses, data collection, environments...).

Other features (misc)

New environments and third-party improvements

We now cover SMAC-v2, PettingZoo, IsaacGymEnvs (prototype) and RoboHive. The D4RL dataset can now be used without the eponymous library, which permits training with more recent or older versions of gym.

Performance improvements

We provide several speed improvements, in particular for data collection.

v0.1.1

06 May 21:34
6d030c9

What's Changed

v0.1.0 - Beta

16 Mar 20:31

First official beta release of the library!

What's Changed

Full Changelog: v0.0.5...v0.1.0

0.0.5

08 Mar 20:58

This release changes the env.step API; see #941 for more info.

What's Changed

New Contributors

Full Changelog: v0.0.4...v0.0.5

v0.0.4-beta

11 Feb 10:28
eec263f
Pre-release

What's Changed

  • [CI, Doc] Update functorch source installation command by @zou3519 in #446
  • [BugFix] TransformedEnv attributes inheritance by @vmoens in #467
  • [Feature] Cleanup mocking envs init and new by @vmoens in #469
  • [Tests] Adding tensordict __repr__ tests by @sladebot in #435
  • [Logging]: implement MLFlow logging integration by @rayanht in #432
  • [BugFix] MLFlow import fix by @vmoens in #473
  • [BugFix] Fixed pip install by @brandonsj in #475
  • [Features]: Changed _inplace_update cls parameter passing in __new__ by @nicolas-dufour in #464
  • [Feature]: ModelBased Envs by @nicolas-dufour in #333
  • [Feature] make ReplayBufferTrainer compatible with storing trajectories by @vmoens in #476
  • [Tutorial] DQN tutorial by @vmoens in #474
  • [Feature] reader hooks for GymLike by @vmoens in #478
  • [BugFix] TensorSpec.zero(None) failure fix by @vmoens in #483
  • [Feature]: Support for planners and CEM by @nicolas-dufour in #384
  • [Feature] Replaced device_safe() with device by @ordinskiy in #485
  • [Feature]: TensorDictPrimer transform by @nicolas-dufour in #456
  • [Feature]: erase() method for torchrl.timeit by @nicolas-dufour in #480
  • [Feature] Added support for single collector in sync_async_collector by @nicolas-dufour in #482
  • [BugFix] removing unwanted device_safe() by @vmoens in #486
  • [Refactoring] Refactored get_stats_random_rollout by @nicolas-dufour in #481
  • [Feature] VIP Integration by @JasonMa2016 in #487
  • [Refactoring] Minor tweaks to recorder and logger by @nicolas-dufour in #489
  • [Feature]: Deactivate typechecks in envs by @nicolas-dufour in #490
  • [BugFix] Vectorized td_lambda with gamma tensor does not match the serial version by @vmoens in #400
  • [BugFix] Fix TensorDictPrimer init by @vmoens in #491
  • [Feature] Optional auto-reset when done for collectors and batched envs by @vmoens in #492
  • [BugFix] Defaulting passing_devices to None by @himjohntang in #477
  • Revert "[BugFix] Defaulting passing_devices to None" by @vmoens in #494
  • [BugFix] Multi-agent fixes by @vmoens in #488
  • [BugFix] Defaulting passing_devices to None by @vmoens in #495
  • [Feature] Lazy initialization of CatTensors by @vmoens in #497
  • [Cleanup] Removing cuda 10.2 references by @vmoens in #498
  • [BugFix] Migration to pytorch org by @vmoens in #499
  • [Refactoring] Import at root to enable vmap monkey-patching by @vmoens in #500
  • [BugFix] python version for linting checks by @vmoens in #502
  • [Feature] Replay Buffers refactor by @bamaxw in #330
  • [Feature] Rename step_tensordict in step_mdp by @romainjln in #512
  • [Lint] re-instantiate F821 by @vmoens in #516
  • [BugFix] run_type_checks for TransformedEnvs by @vmoens in #513
  • [BugFix] making first_dim and last_dim negative in FlattenObservation when a parent is set by @vmoens in #511
  • [Feature] Add info dict key-spec pairs to observation_spec by @tcbegley in #504
  • [BugFix] Changing the dm_control import to fail if not installed by @zeenolife in #515
  • [CI] Add coverage with codecov by @silvestrebahi in #523
  • Revert "[CI] Add coverage with codecov" by @vmoens in #525
  • [Quality] Use relative imports for local c++ deps by @apbard in #526
  • [Feature] Nightly release by @vmoens in #519
  • [Feature] Add make_tensordict() function by @sicong-huang in #522
  • [Doc] Misc readme fixes by @GavinPHR in #532
  • [BugFix] Replacing inference_mode decorator with no_grad to fix state_dict loading error by @GavinPHR in #530
  • [BugFix] Transformed ParallelEnv meta data are broken when passing to device by @vmoens in #531
  • [Doc] Add coverage banner by @vmoens in #533
  • [BugFix] Fix colab link of coding_dqn.ipynb by @Benjamin-eecs in #543
  • [BugFix] Fix optional imports by @vmoens in #535
  • [BugFix] Restore missing keys in data collector output by @tcbegley in #521
  • [Lint] reorganize imports by @apbard in #545
  • [BugFix] Single-cpu compatibility by @vmoens in #548
  • [BugFix] vision install and other deps in optdeps by @vmoens in #552
  • [Feature] Implemented device argument for modules.models by @yushiyangk in #524
  • [BugFix] Fix ellipsis indexing of 2d TensorDicts by @vmoens in #559
  • [BugFix] Additive gaussian exploration spec fix by @vmoens in #560
  • [BugFix] Disabling video step for wandb by @vmoens in #561
  • [BugFix] Various device fix by @vmoens in #558
  • [Feature] Allow collectors to accept regular modules as policies by @tcbegley in #546
  • [BugFix] Fix push binary nightly action by @psolikov in #566
  • [BugFix] TensorDict comparison by @vmoens in #567
  • [BugFix] Fix SyncDataCollector reset by @jrobine in #571
  • [Doc] Banners on README.md by @vmoens in #572
  • [Feature] Log printing in alphabetical order when creating a replay buffer by @nikhlrao in #573
  • [BugFix] Add eps to reward normalization by @vmoens in #574
  • [BugFix] Fix argument for PPOLoss.get_entropy_bonus() by @vmoens in #578
  • [Feature] Restructure torchrl/objectives by @sgrigory in #580
  • [Docs] Documentation revamp by @vmoens in #581
  • [Doc] Publishing on pytorch.org by @vmoens in #582
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in #584
  • [Doc] Publishing on pytorch.org by @vmoens in #585
  • Revert "[Doc] Publishing on pytorch.org" by @vmoens in #586
  • [Doc] Publishing on pytorch.org by @vmoens in #587
  • [Feature] More restrictive tests on docstrings by @vmoens in #457
  • [BugFix] Wrong stack import in tests by @vmoens in #590
  • [Feature] Exclude "_" out_keys in tensordictmodel by @jlesuffleur in #589
  • [Feature]: Dreamer support by @nicolas-dufour in #341
  • [Doc] Missing doc for prototype RB by @vmoens in #595
  • [Feature] Update list of supported libraries by @vmoens in #594
  • [BugFix] Fix timeit count registration by @vmoens in #598
  • [Naming] Renaming ProbabilisticTensorDictModule keys by @vmoens in #603
  • [Feature] Categorical encoding for action space by @artkorenev in #593
  • [BugFix] ReplayBuffer's storage now signal back when changes happen by @paulomarciano in #614
  • [Doc] Typos in tensordict tutorial by @PaLeroy in #621
  • [Doc] Integrate knowledge base in docs by @hatala91 in #622
  • [Doc] Updating docs requirements by @vmoens in #624
  • [Feature] Make torchrl runnable without functorch and with gym==0.13 by @vmoens in #386
  • [Feature] Habitat integration by @vmoens in #514
  • [Feature] Checkpointing by @vmoens in #549
  • Add support for null dim argument in TensorDict.squeeze by @jgonik in #608
  • [Version] Updating to torch 1.13 by @vmoens in #627
  • [Feature] Sub-memmap tensors by @vmoens in #626
  • [BugFix] copy_ changes the index if the dest and source memmap tensors share the same file location by @vmoens in #631
  • [F...

v0.0.4

11 Feb 22:15
4a74149
Pre-release


v0.0.4-alpha

23 Jan 21:00
Pre-release


v0.0.3

21 Nov 21:33
Pre-release

The main changes introduced by this release are:

  • dependency on the standalone tensordict repo;
  • refactoring of the "next" API

What's Changed

  • [Versioning] MacOs versioning and release bugfix by @vmoens in #247
  • [Versioning] Setup metadata by @vmoens in #248
  • [BugFix] Fix setup instructions by @vmoens in #250
  • [BugFix] Fix a bug when segment_tree size is exactly 2^N by @xiaomengy in #251
  • [Feature] Added test for RewardRescale transform by @nicolas-dufour in #252
  • [Feature] Empty TensorDict population in loops by @vmoens in #253
  • [BugFix] Memmap del bugfix by @vmoens in #254
  • [Feature] Implement padding for tensordicts by @ajhinsvark in #257
  • [BugFix]: recursion error when calling permute(...).to_tensordict() by @vmoens in #260
  • [Feature] Differentiable PPOLoss for IRL by @vmoens in #240
  • [BugFix]: avoid deleting true in_keys in TensorDictSequence by @vmoens in #261
  • [Feature] Add issue and pull request template by @Benjamin-eecs in #263
  • [Feature] Nested tensordicts by @vmoens in #256
  • [Feature]: Index nested tensordicts using tuples by @vmoens in #262
  • [Feature]: flatten nested tensordicts by @vmoens in #264
  • [Test]: test nested CompositeSpec by @vmoens in #265
  • [Test]: test squeezed TensorDict by @vmoens in #269
  • [Doc] Added TensorDict tutorial by @nicolas-dufour in #255
  • [Test]: TensorDict: test tensordict created on cuda and sub-tensordict indexed along 2nd dimension by @vmoens in #268
  • Refactor the torch.stack with destination by @khmigor in #245
  • [Feature]: faster meta-tensor API for TensorDict by @vmoens in #272
  • [Feature]: Refactored logging to be able to support other loggers easily by @nicolas-dufour in #270
  • Small tweaks to make the replay buffer code more consistent by @shagunsodhani in #275
  • [BugFix]: Minor bugs in docstrings by @vmoens in #276
  • [Doc]: TorchRL demo by @vmoens in #284
  • [BugFix]: update wrong links in issue and pull request template by @Benjamin-eecs in #286
  • [BugFix]: quickfix: force gym 0.24 installation until issue with rendering is resolved by @vmoens in #283
  • [Doc]: remove pip install from CONTRIBUTING.md by @vmoens in #288
  • [Feature]: faster safetanh transform via C++ bindings by @vmoens in #289
  • [BugFix]: fix GLFW3 error when installing dm_control by @vmoens in #291
  • [BugFix]: Fix examples by @vmoens in #290
  • [Doc] Simplify PR template by @vmoens in #292
  • [BugFix]: Replay buffer bugfixes by @vmoens in #294
  • [Doc] MacOs M1 troubleshooting by @ramonmedel in #296
  • [Feature]: Improving training efficiency by @vmoens in #293
  • [Feature] Wandb logger by @nicolas-dufour in #274
  • [QuickFix]: update issue and pr template by @Benjamin-eecs in #303
  • [Test] tests for BinarizeReward by @srikanthmg85 in #302
  • [BugFix]: L2-priority for PRB by @vmoens in #305
  • [Feature] Transforms: Compose.insert and TransformedEnv.insert_transform by @rmartimov in #304
  • [BugFix] Fix flaky test by waiting for procs instead of sleep by @nairbv in #306
  • [BugFix] Fix a build warning, setuptools/distutils import order by @nairbv in #307
  • ufmt issue if imports in order requested by distutils by @nairbv in #308
  • [BugFix]: Conda to pip for circleci by @vmoens in #310
  • [BugFix] Support list-based boolean masks for TensorDict by @benoitdescamps in #299
  • [Feature] Truly invertible tensordict permutation of dimensions by @ramonmedel in #295
  • [Doc] Tensordictmodule tutorial by @nicolas-dufour in #267
  • [Feature] Rename _TensorDict into TensorDictBase by @yoavnavon in #316
  • [Release]: v0.0.1b versioning by @vmoens in #317
  • [Feature] Adding additional checks to TensorDict.view to remove unnecessary ViewedTensorDict object creation by @bamaxw in #319
  • [BugFix]: Safe state normalization when std=0 by @vmoens in #323
  • [BugFix]: gradient propagation in advantage estimates by @vmoens in #322
  • [BugFix]: make training example gracefully exit by @vmoens in #326
  • [Setup]: Exclude tutorials from wheels by @vmoens in #325
  • [BugFix]: Tensor map for subtensordict.set_ by @vmoens in #324
  • [Versioning]: Wheels v0.0.1c by @vmoens in #327
  • [BugFix] Fixed compose which ignored inv_transforms of child by @nicolas-dufour in #328
  • [BugFix] functorch installation in CircleCI by @vmoens in #336
  • [Refactor] VecNorm inference API by @vmoens in #337
  • [BugFix] TransformedEnv sets added Transforms into eval mode by @alexanderlobov in #331
  • [Refactor] make to_tensordict() create a copy of the content by @nicolas-dufour in #334
  • [CircleCI] Fix dm_control rendering by @vmoens in #339
  • [BugFix]: joining processes when they're done by @vmoens in #311
  • [Test] pass the OS error in case the file isn't closed by @tongbaojia in #344
  • [Feature] Make default rollout tensordict contiguous by @vmoens in #343
  • [BugFix] Clone memmap tensors on regular tensors and other replay buffer improvements by @vmoens in #340
  • [CI] Using latest gym by @vmoens in #346
  • [Doc] Coding your first DDPG tutorial by @vmoens in #345
  • [Doc] Minor: typos in DDPG by @vmoens in #354
  • [Feature] Register lambda and gamma in buffers by @vmoens in #353
  • [Feature] Implement eq for TensorSpec by @omikad in #358
  • [Doc] Multi-tasking tutorial by @vmoens in #352
  • [Feature] Env refactoring for model based RL by @nicolas-dufour in #315
  • [Feature]: Added support for TensorDictSequence module subsampling by @nicolas-dufour in #332
  • [BugFix] Add lock to vec norm transform by @jaschmid-fb in #356
  • [Perf]: Improve PPO training performance by @vmoens in #297
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in #361
  • Revert "[BugFix] Functorch-Tensordict bug fixes" by @vmoens in #362
  • [BugFix] Functorch-Tensordict bug fixes by @vmoens in #363
  • [Feature] CSVLogger (ABBANDONED) by @vmoens in #371
  • [Feature] Support tensor-based decay in TD-lambda by @tcbegley in #360
  • [Feature] CSVLogger by @vmoens in #372
  • [BugFix] Fewer env instantiations for better mujoco rendering by @vmoens in #378
  • [Feature] change imports of environment libraries (gym and dm_control) at lower levels by @guabao in #379
  • [BugFix] Representation of indexed nested tensordict by @vmoens in #370
  • [BugFix] In-place __setitem__ for SubTensorDict by @vmoens in #369
  • [Feature] Add ProbabilisticTensorDictModule dist key mapping support by @nicolas-dufour in #376
  • [Feature]: R3M integration by @vmoens in #321
  • [Feature] static_seed flag for envs, vectorized envs and collectors by @vmoens in #385
  • [Feature] AdditiveGaussian exploration strategy by @vmoens in #388
  • [Feature] Multi-images R3M by @vmoens in #389
  • [Feature] Flatten multi-images in R3M by @vmoens in #391
  • [Quality] Code cleanup for fbsync by @vmoens in #392
  • [Feature] In-house functional modules for TorchRL using TensorDict by @vmoens in https://github.com/pytorch...