Author: Ran Zhu @ Southeast University, China
Email: [email protected]
This repository is still under construction. So far, the following algorithms are included:
Category | Algorithm | Environment |
---|---|---|
Tabular Value-based | Sarsa | FrozenLake (Non-gym-official) |
Tabular Value-based | Q-learning | FrozenLake (Non-gym-official) |
Value Approximation | DQN (Nature 2015) | CartPole |
Value Approximation | Double DQN | CartPole |
Value Approximation | DDPG | Pendulum |
Policy Optimization | REINFORCE | CartPole |
Policy Optimization | QAC | CartPole |
Policy Optimization | A2C | CartPole |
Policy Optimization | PPO | CartPole |
Policy Optimization | SAC | Pendulum |
Imitation Learning | Behavior-Cloning | CartPole |
Imitation Learning | GAIL | CartPole |
Imitation Learning | DAgger | CartPole |
Offline Learning | BCQ | CartPole |
Offline Learning | CQL | CartPole |
Besides, I created this repository to unwind from the pressure of my PhD study (and to learn something interesting along the way). I will update it sporadically. If you have any questions, please contact me via email.
In the future, I will add more algorithms, such as additional offline RL methods, as well as applications of RL in power distribution systems.
This repository is a playground for beginners learning reinforcement learning: a collection of simple environments and agents to get you started. Each algorithm is implemented in a single .ipynb file with fewer than 500 lines of code, written in PyTorch and based on the latest Gymnasium. The goal of this project is to provide a simple and clear implementation of each algorithm.
If you already know the basic concepts and mathematical notation of reinforcement learning but have no idea how to implement them, this repository is for you. If you are new to reinforcement learning, I recommend learning the basic concepts and notation first.
- Smoothly from Maths to Code: the names of variables and functions are consistent with the core spirit of each RL algorithm, and the main loop of each algorithm follows the pseudocode in the textbook.
- No Programming Tricks, Just Algorithms: there are no programming tricks (at least I think so) in the code. Everything is simplified as much as possible to help you understand the core idea of each algorithm.
- Clear Implementation: the coding style and core architecture follow the pfrl library, which has a clear structure. I think the "pfrl style" is a good choice for beginners learning reinforcement learning.
- Modular, Unified Architecture: each code file has a similar structure, so you can easily modify the code to implement your own algorithm.
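As a taste of the maths-to-code style described above, here is a minimal, self-contained sketch of tabular Q-learning. It is not code from this repository: the five-state corridor environment and all names (`step`, `N_STATES`, `alpha`, etc.) are hypothetical, chosen only so that the main loop mirrors the textbook update Q(s,a) ← Q(s,a) + α(r + γ max_a' Q(s',a') − Q(s,a)):

```python
import numpy as np

# Hypothetical 1-D corridor: five states, start in the middle,
# +1 reward for reaching the right end; both ends are terminal.
N_STATES, N_ACTIONS = 5, 2          # actions: 0 = left, 1 = right

def step(s, a):
    s_next = s - 1 if a == 0 else s + 1
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    done = s_next in (0, N_STATES - 1)
    return s_next, reward, done

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))      # the Q(s, a) table
alpha, gamma, epsilon = 0.1, 0.9, 0.2    # step size, discount, exploration rate

for episode in range(2000):
    s, done = N_STATES // 2, False
    while not done:
        # epsilon-greedy behaviour policy
        a = int(rng.integers(N_ACTIONS)) if rng.random() < epsilon else int(Q[s].argmax())
        s_next, r, done = step(s, a)
        # textbook update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        target = r + gamma * (0.0 if done else Q[s_next].max())
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
```

After training, the greedy policy `Q[s].argmax()` points toward the rewarding end of the corridor; the notebooks in this repository follow the same pattern of keeping each main loop line-for-line close to the pseudocode.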
All dependencies are current stable versions, so you don't need to worry about compatibility. All the code is tested in the following environment:
- python == 3.8
- pytorch == 2.0
- gymnasium == 0.28
- numpy == 1.24
- matplotlib == 3.2
- pandas == 2.0
- tqdm
I would like to thank Dr. Zhao for his generous sharing of course materials (https://www.bilibili.com/video/BV1sd4y167NS/). I would also like to thank the authors of the following repositories:
- [A clear RL tutorial for beginners using pure PyTorch](https://github.com/rexrex9/reinforcement_torch_pfrl)
- [A powerful DRL library based on PyTorch](https://github.com/pfnet/pfrl)
- [Code for David Silver's RL course](https://github.com/dennybritz/reinforcement-learning)