Commit f06c9b1

add snake environment

Merge commit, 2 parents: 6300cc6 + 401569b

36 files changed: +2931 −82 lines

Gallery.md

+15-14
@@ -54,18 +54,19 @@ Users are also welcome to contribute their own training examples and demos to th

The gallery table is rewritten with one new row for the Snake environment; the other rows are unchanged apart from column re-alignment. The updated table:

<div align="center">

| Environment/Demo | Tags | Refs |
|:---:|:---:|:---:|
| [MuJoCo](https://github.com/deepmind/mujoco)<br> <img width="300px" height="auto" src="./docs/images/mujoco.png"> | ![continuous](https://img.shields.io/badge/-continuous-green) | [code](./examples/mujoco/) |
| [CartPole](https://gymnasium.farama.org/environments/classic_control/cart_pole/)<br> <img width="300px" height="auto" src="./docs/images/cartpole.png"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) | [code](./examples/cartpole/) |
| [MPE: Simple Spread](https://pettingzoo.farama.org/environments/mpe/simple_spread/)<br> <img width="300px" height="auto" src="./docs/images/simple_spread_trained.gif"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![MARL](https://img.shields.io/badge/-MARL-yellow) | [code](./examples/mpe/) |
| [StarCraft II](https://github.com/oxwhirl/smac)<br> <img width="300px" height="auto" src="./docs/images/smac.png"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![MARL](https://img.shields.io/badge/-MARL-yellow) | [code](./examples/smac/) |
| [Chat Bot](https://openrl-docs.readthedocs.io/en/latest/quick_start/train_nlp.html)<br> <img width="300px" height="auto" src="./docs/images/chat.gif"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![NLP](https://img.shields.io/badge/-NLP-green) ![Transformer](https://img.shields.io/badge/-Transformer-blue) | [code](./examples/nlp/) |
| [Atari Pong](https://gymnasium.farama.org/environments/atari/pong/)<br> <img width="300px" height="auto" src="./docs/images/pong.png"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![image](https://img.shields.io/badge/-image-red) | [code](./examples/atari/) |
| [PettingZoo: Tic-Tac-Toe](https://pettingzoo.farama.org/environments/classic/tictactoe/)<br> <img width="300px" height="auto" src="./docs/images/tic-tac-toe.jpeg"> | ![selfplay](https://img.shields.io/badge/-selfplay-blue) ![discrete](https://img.shields.io/badge/-discrete-brightgreen) | [code](./examples/selfplay/) |
| [DeepMind Control](https://shimmy.farama.org/environments/dm_control/)<br> <img width="300px" height="auto" src="https://shimmy.farama.org/_images/dm_locomotion.png"> | ![continuous](https://img.shields.io/badge/-continuous-green) | [code](./examples/dm_control/) |
| [Omniverse Isaac Gym](https://github.com/NVIDIA-Omniverse/OmniIsaacGymEnvs)<br> <img width="300px" height="auto" src="https://user-images.githubusercontent.com/34286328/171454189-6afafbff-bb61-4aac-b518-24646007cb9f.gif"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) | [code](./examples/isaac/) |
| [Snake](http://www.jidiai.cn/env_detail?envid=1)<br> <img width="300px" height="auto" src="./docs/images/snakes_1v1.gif"> | ![selfplay](https://img.shields.io/badge/-selfplay-blue) ![discrete](https://img.shields.io/badge/-discrete-brightgreen) | [code](./examples/snake/) |
| [GridWorld](./examples/gridworld/)<br> <img width="300px" height="auto" src="./docs/images/gridworld.jpg"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) | [code](./examples/gridworld/) |
| [Super Mario Bros](https://github.com/Kautenja/gym-super-mario-bros)<br> <img width="300px" height="auto" src="https://user-images.githubusercontent.com/2184469/40948820-3d15e5c2-6830-11e8-81d4-ecfaffee0a14.png"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![image](https://img.shields.io/badge/-image-red) | [code](./examples/super_mario/) |
| [Gym Retro](https://github.com/openai/retro)<br> <img width="300px" height="auto" src="./docs/images/gym-retro.jpg"> | ![discrete](https://img.shields.io/badge/-discrete-brightgreen) ![image](https://img.shields.io/badge/-image-red) | [code](./examples/retro/) |

</div>

README.md

+2-1
@@ -104,7 +104,8 @@ Environments currently supported by OpenRL (for more details, please refer to [Gallery](Gallery.md)):

- [Atari](https://gymnasium.farama.org/environments/atari/)
- [StarCraft II](https://github.com/oxwhirl/smac)
- [Omniverse Isaac Gym](https://github.com/NVIDIA-Omniverse/OmniIsaacGymEnvs)
- [DeepMind Control](https://shimmy.farama.org/environments/dm_control/)
- [Snake](http://www.jidiai.cn/env_detail?envid=1) (added)
- [GridWorld](./examples/gridworld/)
- [Super Mario Bros](https://github.com/Kautenja/gym-super-mario-bros)
- [Gym Retro](https://github.com/openai/retro)

README_zh.md

+2-1
@@ -86,7 +86,8 @@ Environments currently supported by OpenRL (for more details, please refer to [Gallery](Gallery.md)):

- [Atari](https://gymnasium.farama.org/environments/atari/)
- [StarCraft II](https://github.com/oxwhirl/smac)
- [Omniverse Isaac Gym](https://github.com/NVIDIA-Omniverse/OmniIsaacGymEnvs)
- [DeepMind Control](https://shimmy.farama.org/environments/dm_control/)
- [Snake](http://www.jidiai.cn/env_detail?envid=1) (added)
- [GridWorld](./examples/gridworld/)
- [Super Mario Bros](https://github.com/Kautenja/gym-super-mario-bros)
- [Gym Retro](https://github.com/openai/retro)

docs/images/snakes_1v1.gif

108 KB

examples/dm_control/train_ppo.py

+1-2
The second wrapper import is folded into the first:

    @@ -4,10 +4,9 @@
     from openrl.configs.config import create_config_parser
     from openrl.envs.common import make
     from openrl.envs.wrappers.base_wrapper import BaseWrapper
    -from openrl.envs.wrappers.extra_wrappers import GIFWrapper
    +from openrl.envs.wrappers.extra_wrappers import FrameSkip, GIFWrapper
     from openrl.modules.common import PPONet as Net
     from openrl.runners.common import PPOAgent as Agent
    -from openrl.envs.wrappers.extra_wrappers import FrameSkip

     env_name = "dm_control/cartpole-balance-v0"
     # env_name = "dm_control/walker-walk-v0"
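The `FrameSkip` wrapper imported here repeats each chosen action for a fixed number of environment frames, a common trick to speed up training. OpenRL's actual implementation is not shown in this diff; the following is a minimal self-contained sketch of the idea, where the dummy environment and the reward-summing behavior are assumptions for illustration:

```python
class DummyEnv:
    """Toy stand-in for a real environment: reward 1.0 per step, done after 10 steps."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 10, {}


class FrameSkip:
    """Repeat each action `num_skip` times, accumulating reward (typical frame-skip behavior)."""

    def __init__(self, env, num_skip=4):
        self.env = env
        self.num_skip = num_skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total_reward, obs, done, info = 0.0, None, False, {}
        for _ in range(self.num_skip):
            obs, reward, done, info = self.env.step(action)
            total_reward += reward
            if done:  # stop early if the episode ends mid-skip
                break
        return obs, total_reward, done, info


env = FrameSkip(DummyEnv(), num_skip=4)
env.reset()
obs, reward, done, info = env.step(0)
print(obs, reward, done)  # 4 4.0 False
```

One agent step now advances the underlying environment four frames and returns the summed reward for that span.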

examples/smac/README.md

+4-1
@@ -11,4 +11,7 @@ Installation guide for Linux:

Train SMAC with the [MAPPO](https://arxiv.org/abs/2103.01955) algorithm:

`python train_ppo.py --config smac_ppo.yaml`

## Render replay on Mac
examples/snake/README.md

+17
@@ -0,0 +1,17 @@

This is the example for the Snake game.

## Usage

```bash
python train_selfplay.py
```

## Submit to JiDi

Submission site: http://www.jidiai.cn/env_detail?envid=1.

Snake scenarios: [here](https://github.com/jidiai/ai_lib/blob/7a6986f0cb543994277103dbf605e9575d59edd6/env/config.json#L94)
Original Snake environment: [here](https://github.com/jidiai/ai_lib/blob/master/env/snakes.py)
examples/snake/selfplay.yaml

+3
@@ -0,0 +1,3 @@

    seed: 0
    callbacks:
      - id: "ProgressBarCallback"
@@ -0,0 +1,29 @@

    # -*- coding:utf-8 -*-
    def sample_single_dim(action_space_list_each, is_act_continuous):
        if is_act_continuous:
            each = action_space_list_each.sample()
        else:
            # Assumes the space is Discrete or MultiDiscreteParticle;
            # `each` is unbound for any other discrete space type.
            if action_space_list_each.__class__.__name__ == "Discrete":
                # One-hot encode a sampled discrete action.
                each = [0] * action_space_list_each.n
                idx = action_space_list_each.sample()
                each[idx] = 1
            elif action_space_list_each.__class__.__name__ == "MultiDiscreteParticle":
                # Concatenate one one-hot vector per sub-dimension.
                each = []
                nvec = action_space_list_each.high - action_space_list_each.low + 1
                sample_indexes = action_space_list_each.sample()

                for i in range(len(nvec)):
                    dim = nvec[i]
                    new_action = [0] * dim
                    index = sample_indexes[i]
                    new_action[index] = 1
                    each.extend(new_action)
        return each


    def my_controller(observation, action_space, is_act_continuous):
        joint_action = []
        for i in range(len(action_space)):
            player = sample_single_dim(action_space[i], is_act_continuous)
            joint_action.append(player)
        return joint_action
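The `Discrete` branch above one-hot encodes a sampled action index. A quick self-contained check of that behavior, using a stub space class (the stub is an illustrative assumption, not gym's `Discrete`; it only needs the `n` attribute and a `sample()` method the code above relies on):

```python
import random


class Discrete:
    """Minimal stand-in for a gym-style Discrete space (illustrative only)."""

    def __init__(self, n):
        self.n = n

    def sample(self):
        return random.randrange(self.n)


def sample_single_dim(space, is_act_continuous):
    # Same logic as the snippet above, restricted to the Discrete branch.
    if is_act_continuous:
        return space.sample()
    each = [0] * space.n
    each[space.sample()] = 1
    return each


action = sample_single_dim(Discrete(4), is_act_continuous=False)
print(sum(action))  # 1
```

Whatever index is sampled, the result is a length-`n` list with exactly one `1`, which matches the joint-action format `my_controller` assembles for each player.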

examples/snake/test_env.py

+107
@@ -0,0 +1,107 @@

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Copyright 2023 The OpenRL Authors.
    #
    # Licensed under the Apache License, Version 2.0 (the "License");
    # you may not use this file except in compliance with the License.
    # You may obtain a copy of the License at
    #
    #     https://www.apache.org/licenses/LICENSE-2.0
    #
    # Unless required by applicable law or agreed to in writing, software
    # distributed under the License is distributed on an "AS IS" BASIS,
    # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    # See the License for the specific language governing permissions and
    # limitations under the License.

    """Smoke tests for the Snake environment."""
    import time

    import numpy as np
    from wrappers import ConvertObs

    from openrl.envs.snake.snake import SnakeEatBeans
    from openrl.envs.snake.snake_pettingzoo import SnakeEatBeansAECEnv
    from openrl.selfplay.wrappers.random_opponent_wrapper import RandomOpponentWrapper


    def test_raw_env():
        env = SnakeEatBeans()
        obs, info = env.reset()
        done = False
        while not np.any(done):
            # Actions are one-hot vectors over the four move directions.
            a1 = np.zeros(4)
            a1[env.action_space.sample()] = 1
            a2 = np.zeros(4)
            a2[env.action_space.sample()] = 1
            obs, reward, done, info = env.step([a1, a2])
            print("obs:", obs)
            print("reward:", reward)
            print("done:", done)
            print("info:", info)


    def test_aec_env():
        from PIL import Image

        img_list = []
        env = SnakeEatBeansAECEnv(render_mode="rgb_array")
        env.reset(seed=0)
        img = env.render()
        img_list.append(img)
        step = 0
        for player_name in env.agent_iter():
            if step > 20:
                break
            observation, reward, termination, truncation, info = env.last()
            if termination or truncation:
                break
            action = env.action_space(player_name).sample()
            env.step(action)
            img = env.render()
            # Record one frame per full round (after player_0 moves).
            if player_name == "player_0":
                img_list.append(img)
            step += 1
        print("Total steps: {}".format(step))

        save_path = "test.gif"
        img_list = [Image.fromarray(img) for img in img_list]
        img_list[0].save(
            save_path, save_all=True, append_images=img_list[1:], duration=500
        )


    def test_vec_env():
        from openrl.envs.common import make

        env = make(
            "snakes_1v1",
            opponent_wrappers=[RandomOpponentWrapper],
            env_wrappers=[ConvertObs],
            render_mode="group_human",
            env_num=2,
        )
        obs, info = env.reset()
        step = 0
        done = False
        while not np.any(done):
            action = env.random_action()
            obs, reward, done, info = env.step(action)
            time.sleep(0.3)
            step += 1
        print("Total steps: {}".format(step))


    if __name__ == "__main__":
        test_vec_env()
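`test_aec_env` follows the PettingZoo AEC (agent-environment cycle) pattern: iterate over agents with `agent_iter()`, read the current agent's view with `last()`, then `step()` with that agent's action. A minimal self-contained imitation of the loop, using a toy two-player environment (the `ToyAECEnv` class is an assumption for illustration, not OpenRL's `SnakeEatBeansAECEnv`):

```python
import random


class ToyAECEnv:
    """Two players alternate turns; the episode ends after max_steps total moves."""

    def __init__(self, max_steps=6):
        self.agents = ["player_0", "player_1"]
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0

    def agent_iter(self):
        # Yield the agent whose turn it is until the episode terminates.
        while self.t < self.max_steps:
            yield self.agents[self.t % 2]

    def last(self):
        # (observation, reward, termination, truncation, info) for the current agent.
        return self.t, 0.0, self.t >= self.max_steps, False, {}

    def step(self, action):
        self.t += 1


env = ToyAECEnv()
env.reset()
history = []
for player_name in env.agent_iter():
    obs, reward, termination, truncation, info = env.last()
    if termination or truncation:
        break
    history.append(player_name)
    env.step(random.randrange(4))

print(history)
# ['player_0', 'player_1', 'player_0', 'player_1', 'player_0', 'player_1']
```

The real test adds rendering on top of this skeleton, capturing a frame after each of `player_0`'s moves so the saved GIF shows one image per full round.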
