API Change
- expand DI-engine's supported Python versions to Python 3.7 through Python 3.10
Env
- add pistonball MARL env and its unittest/example (#833)
- update trading env (#831)
- update PPO config for better discrete action space performance (#809)
- remove unused config fields in MuJoCo PPO
Algorithm
- add AWR algorithm (#828)
- add encoder in MAVAC (#823)
- add HPT model architecture (#841)
- fix multiple model wrappers reset bug (#846)
- add hybrid action space support to ActionNoiseWrapper (#829)
- fix MAPPO advantage computation bug (#812)
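
For readers unfamiliar with the AWR entry above: advantage-weighted regression, as commonly formulated, trains the policy by weighting each action's log-probability with the exponentiated advantage. The snippet below is a minimal, framework-free sketch of that weighting, not DI-engine's implementation from #828. The function names, the `beta` temperature, and the `max_weight` clip value are all assumptions chosen for illustration.

```python
import math


def awr_weights(advantages, beta=1.0, max_weight=20.0):
    """Return exp(A / beta) for each advantage, clipped at max_weight.

    `beta` and `max_weight` are illustrative defaults, not DI-engine settings.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    return [min(math.exp(a / beta), max_weight) for a in advantages]


def awr_policy_loss(log_probs, advantages, beta=1.0, max_weight=20.0):
    """Mean negative log-likelihood, weighted by the clipped AWR weights."""
    weights = awr_weights(advantages, beta, max_weight)
    weighted = [w * lp for w, lp in zip(weights, log_probs)]
    return -sum(weighted) / len(weighted)
```

Clipping the weights keeps a few very large advantages from dominating the update; the right clip value is problem-dependent.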
Enhancement
- add resume_training option so that envstep and train_iter resume seamlessly (#835)
- polish the DistributedDataParallel (DDP) implementation in the old and new pipelines (#842)
- adapt DingEnvWrapper to gymnasium (#817)
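
As background for the gymnasium entry above: gymnasium's `step` returns five values (`obs, reward, terminated, truncated, info`), whereas the legacy gym API returned four with a single `done` flag. The sketch below shows one way such a result could be folded back into the four-value form. It is an assumption-laden illustration, not the code from #817: `to_legacy_step` and the `"truncated"` info key are hypothetical names.

```python
def to_legacy_step(step_result):
    """Convert a 5-tuple gymnasium-style step result into a 4-tuple.

    Hypothetical helper: the episode is considered done when it is either
    terminated or truncated, and the truncation flag is preserved in a copy
    of `info` under an illustrative "truncated" key.
    """
    obs, reward, terminated, truncated, info = step_result
    done = bool(terminated or truncated)
    info = dict(info)  # copy so the caller's dict is not mutated
    info["truncated"] = bool(truncated)
    return obs, reward, done, info
```

Keeping the truncation flag matters for value bootstrapping, since a time-limit cutoff is not a true terminal state.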
Fix
- fix priority buffer delete bug (#844)
- fix middleware collector env reset bug (#845)
- fix many unittest bugs
Style
- downgrade pyecharts log level to warning and polish installation doc (#838)
- polish necessary requirements
- polish api doc details
- polish DI-engine citation authors
- upgrade CI macOS version from 12 to 13
News
- CleanS2S: High-quality and streaming Speech-to-Speech interactive agent in a single file.
- GenerativeRL: Revisiting Generative Policies: A Simpler Reinforcement Learning Algorithmic Perspective
- PRG: Pretrained Reversible Generation as Unsupervised Visual Representation Learning
Full Changelog: v0.5.2...v0.5.3
Contributors: @PaParaZz1 @puyuan1996 @kxzxvbk @YinminZhang @zjowowen @luodi-7 @MarkHolmstrom @TairanMK