Combine LLMs and RL: The LLM reasons about the agent's behavior to solve subtasks and generates higher-level actions, improving RL's sample efficiency.
- Resource link: https://drive.google.com/file/d/1IdBEGZJh1r4MOn4vMDOZ3CfKn9YAvOzZ/view?usp=sharing
- Official documentation of the MineDojo environment: https://docs.minedojo.org/sections/getting_started/install.html#prerequisites
- Create a Python 3.9 environment in Anaconda.
- Install JDK 8u171; otherwise you may see errors with Malmo. The package `jdk-8u171-linux-x64.tar.gz` is in the resource link.
  - `sudo tar -xzvf jdk-8u171-linux-x64.tar.gz -C /usr/local`
  - `export JAVA_HOME=/usr/local/jdk1.8.0_171`
- Install dependencies: `sudo apt install xvfb xserver-xephyr python-opengl ffmpeg`
  - CentOS: `sudo yum install xorg-x11-server-Xvfb xorg-x11-server-Xephyr ffmpeg`
- Install OpenGL (CentOS): `sudo yum install mesa*` and `sudo yum install freeglut*`
- Download our repo https://github.com/PKU-RL/MCEnv and run `python setup.py install`.
- For different tasks, carefully check our `fast_reset` option.
- If the installation succeeded, you can run `MINEDOJO_HEADLESS=1 python validate_install.py`.
- Install MineCLIP: `pip install git+https://github.com/MineDojo/MineCLIP`, or use the package in the resource link.
- Use PyTorch>=1.8.1. x-transformers==0.27.1 is required; otherwise the CLIP model cannot be loaded.
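
The version requirements listed above can be checked before training starts. The following is a minimal sketch and not part of the repository: the helper names (`parse_version`, `installed_version`, `find_problems`) are hypothetical, and matching `JAVA_HOME` against the substring `jdk1.8.0_171` is an assumption based on the install path used above.

```python
import os
import re
import sys
from importlib import metadata


def parse_version(text):
    """Turn '1.8.1+cu111' into (1, 8, 1); non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_problems(python_version, java_home, torch_version, xt_version):
    """Compare the given values with the requirements listed in this README."""
    problems = []
    if tuple(python_version[:2]) != (3, 9):
        problems.append("Python 3.9 is expected")
    # Assumption: the JDK was unpacked to a path containing 'jdk1.8.0_171'.
    if not java_home or "jdk1.8.0_171" not in java_home:
        problems.append("JAVA_HOME should point to jdk1.8.0_171")
    if torch_version is None or parse_version(torch_version) < (1, 8, 1):
        problems.append("PyTorch>=1.8.1 is required")
    if xt_version is None or parse_version(xt_version) != (0, 27, 1):
        problems.append("x-transformers==0.27.1 is required")
    return problems


if __name__ == "__main__":
    issues = find_problems(
        sys.version_info,
        os.environ.get("JAVA_HOME"),
        installed_version("torch"),
        installed_version("x-transformers"),
    )
    for issue in issues:
        print("WARNING:", issue)
    if not issues:
        print("Environment matches the listed requirements.")
```

Running the script inside the Anaconda environment prints one warning per mismatch. It does not replace `validate_install.py`, which exercises the simulator itself.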
- Check the arguments in `train.py`. Download the pretrained MineCLIP model `adjust.pth` from the resource link.
- For PPO, run `MINEDOJO_HEADLESS=1 python train.py`.
  - `--task`: the programmatic task name.
  - `--exp-name`: the directory name prefix for saved logs and models.
  - `--save-path`: the directory in which models and gifs are saved.
  - Models, gif videos, and experience are saved in `checkpoint/`. Training configs and logs are saved in `data/`.
- Draw training curves: find the training log file `progress.txt` in `data/`, move `vis.py` into its directory, and run it.
- For milk and wool, the `--task` values are `harvest_milk_with_empty_bucket_and_cow` and `harvest_wool_with_shears_and_sheep`. `fig/` shows our training results.
- For other tasks, you may refer to the paper and modify the environment `minecraft.py` to specify the simulation and reward function.
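
The training command and flags described above can also be assembled from Python, for example to queue the milk and wool runs one after another. This is a hedged sketch rather than a script shipped with the repository: `build_train_command` and `run_training` are hypothetical helpers, and it assumes it is started from the repository root, where `train.py` lives.

```python
import os
import subprocess
import sys

# Task names quoted in this README for the milk and wool experiments.
TASKS = {
    "milk": "harvest_milk_with_empty_bucket_and_cow",
    "wool": "harvest_wool_with_shears_and_sheep",
}


def build_train_command(task_key, exp_name, save_path):
    """Return (argv, env) for one headless PPO training run."""
    argv = [
        sys.executable, "train.py",
        "--task", TASKS[task_key],
        "--exp-name", exp_name,
        "--save-path", save_path,
    ]
    # MINEDOJO_HEADLESS=1 is set in the child environment only.
    env = dict(os.environ, MINEDOJO_HEADLESS="1")
    return argv, env


def run_training(task_key, exp_name, save_path):
    """Launch train.py and wait for it; raises if the run exits with an error."""
    argv, env = build_train_command(task_key, exp_name, save_path)
    return subprocess.run(argv, env=env, check=True)
```

A call such as `run_training("milk", "ppo_milk", "checkpoint")` would then start one run; the experiment name `ppo_milk` is only an illustrative value.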
[Figures: training results for the milk and wool tasks; see `fig/`.]
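
Curves like these are drawn from `progress.txt`. If you want to inspect the log without `vis.py`, the sketch below reads it into columns and smooths one of them. It rests on loud assumptions: that `progress.txt` is tab-separated with a header row, and the column names `Epoch` and `AverageEpRet` are made up for the demo, so check the header of your own file. It is not the repository's `vis.py`.

```python
import csv
import io


def read_progress(stream):
    """Read a tab-separated log with a header row into {column: [floats]}."""
    reader = csv.DictReader(stream, delimiter="\t")
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            try:
                columns[name].append(float(row[name]))
            except (TypeError, ValueError):
                # Missing or non-numeric cells are kept as NaN placeholders.
                columns[name].append(float("nan"))
    return columns


def moving_average(values, window):
    """Smooth a curve with a trailing window of at most `window` points."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Tiny in-memory stand-in for a real log; the column names are hypothetical.
sample = io.StringIO("Epoch\tAverageEpRet\n0\t1.0\n1\t3.0\n2\t5.0\n")
log = read_progress(sample)
print(moving_average(log["AverageEpRet"], window=2))
```

For a real run, replace the in-memory sample with `open("progress.txt", newline="")` and pass the smoothed values to the plotting library of your choice.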
- Open-source RL training framework
- Fix the environmental issues of different systems
- The temporal abstraction technique
- More applications
@article{liu2024rlgpt,
title={{RL-GPT}: Integrating Reinforcement Learning and Code-as-policy},
author={Liu, Shaoteng and Yuan, Haoqi and Hu, Minda and Li, Yanwei and Chen, Yukang and Liu, Shu and Lu, Zongqing and Jia, Jiaya},
journal={arXiv preprint arXiv:2402.19299},
year={2024},
}
- A multi-task agent in Minecraft: Plan4MC.
- The first LLM-powered lifelong learning agent in Minecraft: Voyager.
- Many practical prompts and tools in DEPS.
This codebase is released under the MIT License.