
If you need any help, or just want to chat, join us on Discord.


GridRL

Customizable engine for minimalist 2D grid-based games, oriented towards Reinforcement Learning. The project provides gymnasium-compatible environments, with convenient hooks, for exploration games of a complexity similar to NES/GB titles. The environment state is small and contains only relevant data, making it a good playground for training and testing policies at high speed, even on weaker devices.

Getting Started

The project is currently tested on Python 3.10 and 3.11; more versions will be checked and supported in the future. Minimal experience with the gymnasium environment structure and Reinforcement Learning is required. An example using the Stable-Baselines3 PPO algorithm is currently provided, and a Pufferlib-CleanRL implementation is on the roadmap.

Install GridRL as:

pip install git+https://github.com/GridRL/GridRL

Import the relevant environment generation functions:

from gridrl.envs import make_env, ExplorationWorld1Env

config = {}
env = make_env(ExplorationWorld1Env, config)
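
Since the environments are gymnasium-compatible, the standard reset/step loop works on the env created above. The snippet below is only a minimal sketch with random actions, not taken from the bundled examples:

obs, info = env.reset()
for _ in range(100):
    action = env.action_space.sample()                       ## random action, for illustration only
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

The provided Stable-Baselines3 PPO example follows the same pattern, handing the environment to the PPO constructor instead of stepping it manually.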

The CLI version is suited for benchmarks, info and debug play. It can be invoked from the command line or from a script:

gridrl exploration_world1 --mode info
gridrl creatures_world1 --mode benchmark

from gridrl.cli import main_cli
main_cli(["exploration_world1","--mode","play"])

The game string can also fall back to the game_id, and you can extract the example scripts into your current working directory using the CLI:

gridrl 0 --mode examples

Game structure

Games follow a minimalist 2D grid-based approach: the agent moves across multiple maps and must navigate to (unknown) checkpoints, which eventually let it modify other NPC states or unlock new powers in order to reach new areas. Multiple small maps are provided and assembled into a bigger global world, which can also include warps. Relevant events required for any progression of the game are given a dedicated flag, so it's easy to track the agent's progress or start at custom points. A hook system provides relevant bindings to the coder, who can implement custom calls before or after something occurs, like stepping on a warp (a sketch follows below).

The action space and game complexity can be partially altered by configuration, changing the way the agent moves, the input/menu abstraction, or the amount of filler events. Some games can override certain fields. The hidden game state is kept as small as possible, and most of the work needed for a smarter/faster policy will be done by feature and reward engineering on the spatial data that the agent can collect.
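
As an illustration of the hook idea, a custom game could subclass one of the provided environments and attach logic around such an event. The hook name used here (hook_after_warp) is a hypothetical placeholder; the real binding names are defined by the engine (check the examples and game sources):

from gridrl.envs import ExplorationWorld1Env

class WarpCountingEnv(ExplorationWorld1Env):
    ## NOTE: "hook_after_warp" is a placeholder, not the engine's actual hook name.
    def hook_after_warp(self, *args, **kwargs):
        self.warp_count = getattr(self, "warp_count", 0) + 1
        ## Delegate to the default behaviour if the base class defines this hook.
        parent = getattr(super(), "hook_after_warp", None)
        return parent(*args, **kwargs) if parent is not None else None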

Environment configurations

Environment customization isn't documented for now, but it's not hard to understand; check examples/example_environment.py. The configuration dictionary is optional, but suggested for whatever editing you desire. The relevant fields are:

config=dict(
### Most important fields
## Size of directional action space. Default: 4
## 4:Arrows - 3:Forward, turn left+forward, turn right+forward - 2:Forward, turn right+forward
    movement_max_actions= 4,
## Complexity of the action space and game abstractions. Default: -1
## -1: Infer. Generally it's the highest implemented setting
## 0: Only directions, bypass powerups actions. Agent must step over checkpoints only
## 1: Adds an interaction button (for checkpoints) and powerups
## 2: Adds NPC
## 3: Adds menu
    action_complexity= 3,
## Format of the screen observation space. Default: -1
## -1: Infer. Generally it's the highest implemented setting
## 0: No screen - 1: Tile matrix - 2: OneHot - 3: Monochromatic-RGB - 4: Assets-RGB
    screen_observation_type= 3,
## Maximum number of steps until the environment is done. Default: 2**31
    max_steps= 2**16,
## Converts the RGB screen observation space to grayscale. Default: False
    grayscale_screen= False,
## Removes the channel axis from grayscale images. Default: True
#    grayscale_single_channel= True,
## Downscale factor of the screen, only with screen_observation_type>=4
#    screen_downscale= 1,          
## Automatically adds the screen to the observations. Default: True
## The screen can always be collected via env.get_screen_observation() following screen_observation_type config
#    auto_screen_obs= True,
## Automatically flattens the observation data to a single-dimension vector. Default: False
#    flat_obs= False,
## Adds a dummy entry in the action space that does nothing. Default: False
#    action_nop= False,
### Save-state related
## The event used as starting point once the game is reset. Most of the game state will be inferred. Default: ""
## See data/events.json file of the game, or call cli with args "game_name --mode info"
    starting_event= "",
## A list of event names that will be marked as completed on reset. Default []
    starting_collected_flags= [],
## Granularity of the automatic save state routine. Default: 0 (disabled)
## Use env.rewind_state(saved_steps=1) to reload the previous queued state.
#    rewind_steps= 0,
### Screen-debug only
## Saves screen frames for debugging purposes, execution will be slower. Default: False
#    log_screen= False,
## Integer multiplier of the window size exposed to the agent. Default: 1
#    screen_view_mult= 1,
## Custom height of the window size exposed to the agent. Default: 9
#    screen_y= 9,
## Custom width of the window size exposed to the agent. Default: 10
#    screen_x= 10,
)
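
As an example, the sketch below combines a few of the fields above with the rewind feature described in the comments; the field values are arbitrary and the exact rewind semantics should be checked against the examples:

from gridrl.envs import make_env, ExplorationWorld1Env

config = dict(
    movement_max_actions=4,        ## plain arrow movement
    action_complexity=1,           ## directions plus interaction button and powerups
    screen_observation_type=1,     ## tile-matrix screen
    max_steps=2**14,
    rewind_steps=8,                ## keep a queue of recent states
)
env = make_env(ExplorationWorld1Env, config)
env.reset()
for _ in range(32):
    env.step(env.action_space.sample())
env.rewind_state(saved_steps=1)    ## reload the previous queued state, as documented above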

More details on the code and settings will be provided in future releases.

Deterministic agents

Even if RL agents are supposed to train their policies autonomously, I still added support for deterministic rule-based bots. Current progress is not mature enough for smart agents, but some behaviours can already be expressed, even if there is no certainty they will lead to good actions. In the future, one could implement a hybrid algorithm that uses both a learned policy and deterministic rules (possibly for some pretraining, or for tricky spots). See examples/example_run_deterministic_agent.py to see in action an agent that automatically enters a visible warp.
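
A rule-based bot can be viewed as a function from the current observation (or game state) to an action, and the hybrid idea could simply give hand-written rules priority over a learned policy. The sketch below is purely illustrative and does not use the actual agent classes from the examples:

def hybrid_policy(obs, learned_policy, rules):
    ## rules: callables returning an action, or None when the rule doesn't apply
    for rule in rules:
        action = rule(obs)
        if action is not None:
            return action              ## deterministic rule takes priority
    return learned_policy(obs)         ## otherwise fall back to the learned policy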

Next implementations

Progress will be very erratic and won't follow a strict order. The plan is:

Compatibility

  • Support more Python versions

Games

  • Complete exploration_world1 main progress (currently at 60%)
  • Complete creatures_world1 main progress (main-story simplified but done, completion: 60%)
  • Add abstractions for menu handling
  • Add abstractions and better data structure for creatures_world1 battle system
  • Menu rendering
  • Set Teleport powerup to be accessible only in menu mode (too disruptive otherwise)
  • Set an extra frame for warp transitions
  • More game archetypes

Core speedup

  • Pufferlib example model
  • Optimize scripting and NPC handling
  • Port relevant game sections to Cython

Environment

  • Multi-action wrappers (partially implemented, untested)
  • Partial state randomization or perturbation
  • Improve environment validation for custom games
  • Scripting for custom NPCs that suggest informative data like coordinates or text
  • Random dummy NPC generation
  • Direct text interactions. Data provided as raw string and partially encoded on the screen
  • Boilerplate routines for dataset generation, used for offline-learning

Deterministic agents

  • Generic pathfinder boilerplate (can be improved)
  • Build more fixed algorithms on a per-game basis

Customization

  • UI helpers for game editing

Bugfixes

  • Fingers crossed!

Special thanks

I want to thank the programmers whose work has been really inspiring.
