
RL-GPT: Open-Sourced RL Training

RL-GPT combines LLMs and RL: the LLM reasons about the agent's behavior to solve subtasks and generates higher-level actions, improving RL's sample efficiency.

Contents

  • Install
  • PPO-Training
  • Results
  • TODO
  • Citation
  • Acknowledgement
  • License

Install

  • Resource link: https://drive.google.com/file/d/1IdBEGZJh1r4MOn4vMDOZ3CfKn9YAvOzZ/view?usp=sharing

  • The official documentation of the MineDojo environment: https://docs.minedojo.org/sections/getting_started/install.html#prerequisites

  • Create a Python 3.9 environment in Anaconda.

  • Install JDK 8u171, otherwise you may see errors with Malmo. The package jdk-8u171-linux-x64.tar.gz is in the resource link.

    • sudo tar -xzvf jdk-8u171-linux-x64.tar.gz -C /usr/local
    • export JAVA_HOME=/usr/local/jdk1.8.0_171
  • Install dependencies

    • Ubuntu: sudo apt install xvfb xserver-xephyr python-opengl ffmpeg
    • CentOS: sudo yum install xorg-x11-server-Xvfb xorg-x11-server-Xephyr ffmpeg
  • Install OpenGL (CentOS)

    • sudo yum install mesa*
    • sudo yum install freeglut*
  • Download our repo https://github.com/PKU-RL/MCEnv and run python setup.py install.

  • For different tasks, carefully check our fast_reset option.

  • To verify the installation, run MINEDOJO_HEADLESS=1 python validate_install.py. A minimal smoke-test sketch is also given after this list.

  • Install MineCLIP: pip install git+https://github.com/MineDojo/MineCLIP, or use the package in the resource link.

  • Use PyTorch>=1.8.1. x-transformers==0.27.1 is required, otherwise the CLIP model cannot be loaded.

  • Check the arguments in train.py. Download the pretrained MineCLIP model adjust.pth from the resource link.
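If the environment is set up correctly, a short Python smoke test along these lines should also run. This is a minimal sketch assuming the standard MineDojo API; the task id is simply the milk task used later in this README, and validate_install.py remains the official check.

    # smoke_test.py -- illustrative environment check only (not part of the repo).
    # Run with: MINEDOJO_HEADLESS=1 python smoke_test.py
    import minedojo

    # Build a programmatic harvesting task; adjust task_id if your setup expects a different name.
    env = minedojo.make(
        task_id="harvest_milk_with_empty_bucket_and_cow",
        image_size=(160, 256),
    )
    obs = env.reset()
    for _ in range(20):
        # Step the environment with a no-op action just to confirm the simulator runs.
        obs, reward, done, info = env.step(env.action_space.no_op())
        if done:
            obs = env.reset()
    env.close()
    print("MineDojo environment stepped successfully.")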

PPO-Training

  • For PPO, run MINEDOJO_HEADLESS=1 python train.py.
    --task: the programmatic task name.

    --exp-name: prefix of the directory name for saved logs and models.

    --save-path: the log directory where models and GIFs are saved.

    Models, GIF videos, and experience are saved in checkpoint/. Training configs and logs are saved in data/.

  • To draw training curves, find the training log file progress.txt in data/, move vis.py into the same directory, and run it (a minimal plotting sketch follows below).
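As a starting point, a minimal plotting script in the spirit of vis.py could look like the following. The separator and column names of progress.txt are assumptions; check the file's header and adjust accordingly.

    # plot_progress.py -- sketch of a training-curve plot from progress.txt.
    import pandas as pd
    import matplotlib.pyplot as plt

    # progress.txt is assumed to be tab-separated; change sep if it is comma-separated.
    df = pd.read_csv("progress.txt", sep="\t")

    # Column names below are guesses; print(df.columns) to see the real ones.
    x = df["total_timesteps"] if "total_timesteps" in df.columns else df.index
    y_col = "episode_reward_mean" if "episode_reward_mean" in df.columns else df.columns[-1]

    plt.plot(x, df[y_col])
    plt.xlabel("environment steps")
    plt.ylabel(y_col)
    plt.title("PPO training curve")
    plt.savefig("training_curve.png", dpi=150)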

Results

  • For the milk and wool tasks, --task is harvest_milk_with_empty_bucket_and_cow and harvest_wool_with_shears_and_sheep, respectively. fig/ shows our training results.

  • For other tasks, refer to the paper and modify the environment file minecraft.py to specify the simulation setup and the reward function (a hypothetical reward-shaping sketch is given after the figure below).

Training curves for the milk and wool tasks (images in fig/).
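The sketch below illustrates one common way to add a shaped reward on top of a sparse task reward. It is a hypothetical gym-style wrapper: the info key, coefficient, and class name are placeholders for illustration, not the repository's actual API in minecraft.py.

    # Hypothetical reward-shaping wrapper (illustration only; the real hooks live in minecraft.py).
    import gym


    class ShapedRewardWrapper(gym.Wrapper):
        """Adds a small dense bonus for approaching the target entity."""

        def __init__(self, env, coef=0.01):
            super().__init__(env)
            self.coef = coef
            self.prev_dist = None

        def reset(self, **kwargs):
            self.prev_dist = None
            return self.env.reset(**kwargs)

        def step(self, action):
            obs, reward, done, info = self.env.step(action)
            dist = info.get("distance_to_target")  # placeholder key for illustration
            if dist is not None:
                if self.prev_dist is not None:
                    # Bonus proportional to the distance reduced on this step.
                    reward += self.coef * max(0.0, self.prev_dist - dist)
                self.prev_dist = dist
            return obs, reward, done, info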

TODO

  • Open-source RL training framework

  • Fix environment issues across different systems

  • The temporal abstraction technique

  • More applications

Citation

@article{liu2024rlgpt,
  title={{RL-GPT}: Integrating Reinforcement Learning and Code-as-policy},
  author={Liu, Shaoteng and Yuan, Haoqi and Hu, Minda and Li, Yanwei and Chen, Yukang and Liu, Shu and Lu, Zongqing and Jia, Jiaya},
  journal={arXiv preprint arXiv:2402.19299},
  year={2024}
}

Acknowledgement

  • Plan4MC: a multi-task agent in Minecraft.
  • Voyager: the first LLM-powered lifelong learning agent in Minecraft.
  • DEPS: many practical prompts and tools.

License

This codebase is released under the MIT License.
