maazkhalil/Neural-Racer

Speed Racer RL

A reinforcement learning project built around a custom 2D top-down racing simulator written in Python.
The environment uses LIDAR-style raycasts for perception and trains a Deep Q-Network (DQN) agent using PyTorch.
Training runs headless, while trained models can be replayed visually using Pygame.

Note: This project was originally written in C++ (using Raylib + libtorch) and has been fully ported to Python for ease of setup and experimentation.

Overview

The project consists of the following scripts:

  • Training (python/train.py)
    Headless reinforcement learning using a Double DQN agent.

  • Fine-Tuning (python/fine_tune.py)
    Loads a pre-trained model and continues training with conservative hyperparameters (lower learning rate, lower epsilon).

  • Replay (python/replay.py)
    Loads a trained model and visualizes behavior in real time with Pygame.

The environment is fully custom, including physics, collision handling, checkpoint logic, and reward shaping.
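The Double DQN update used by train.py decouples action selection from action evaluation: the online network picks the next action, while the target network scores it. A minimal sketch of the bootstrap target, using plain Python lists in place of tensors (function and argument names are hypothetical; see python/dqn.py for the actual implementation):

```python
def double_dqn_target(reward, gamma, q_online_next, q_target_next, done):
    """Double DQN bootstrap target.

    q_online_next / q_target_next: Q-value estimates for the next state
    from the online and target networks. The online network selects the
    action; the target network evaluates it, reducing overestimation bias.
    """
    if done:
        return reward
    best_a = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_a]
```

This is the key difference from vanilla DQN, which would take `max(q_target_next)` directly and tends to overestimate Q-values.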

Environment

  • 2D pixel-based racing track
  • Custom physics (speed, friction/drag, steering, wall/grass collisions)
  • 8 checkpoints + 3-lap race system
  • Deterministic step-based simulation

Tracks and assets live in assets/.
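A deterministic step-based physics update of the kind described above can be sketched as follows. All coefficients here (`accel`, `drag`, `turn_rate`) are illustrative placeholders, not the values used in python/environment.py:

```python
import math

def physics_step(x, y, heading, speed, throttle, steer,
                 accel=0.2, drag=0.02, turn_rate=0.05, dt=1.0):
    """One deterministic simulation step (illustrative coefficients).

    throttle: +1 forward, -1 reverse, 0 coast.
    steer:    -1 left, +1 right, 0 straight.
    """
    speed += throttle * accel * dt
    speed *= (1.0 - drag)  # friction/drag decays speed every step
    # Steering authority scales with speed so the car cannot spin in place.
    heading += steer * turn_rate * dt * (speed / (abs(speed) + 1.0))
    x += math.cos(heading) * speed * dt
    y += math.sin(heading) * speed * dt
    return x, y, heading, speed
```

Because the step is a pure function of (state, action), replays of a trained policy are reproducible.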

Observation Space

The agent observes a 23-dimensional state vector:

  • Normalized speed
  • sin(heading), cos(heading)
  • Normalized position (x, y)
  • 13 short-range LIDAR raycasts (danger sensing, −90° to +90°)
  • 5 long-range LIDAR raycasts (anticipation, forward-facing)
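The 23 dimensions break down as 1 + 2 + 2 + 13 + 5. A sketch of assembling the state vector (argument names are hypothetical; the real code lives in python/environment.py):

```python
import math

def build_observation(speed_n, heading, pos_xn, pos_yn, short_rays, long_rays):
    """Assemble the 23-dim state: 1 (speed) + 2 (heading as sin/cos)
    + 2 (normalized position) + 13 (short-range) + 5 (long-range)."""
    obs = [speed_n, math.sin(heading), math.cos(heading), pos_xn, pos_yn]
    obs += list(short_rays) + list(long_rays)
    assert len(obs) == 23, "observation must be 23-dimensional"
    return obs
```

Encoding heading as (sin, cos) avoids the discontinuity a raw angle would have at the 0/2π wraparound.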

Raycasts return a normalized danger value:

danger = 1 / ((distance / reference_distance) + 0.1)

Values are clamped to [0, 1].
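The formula above can be written directly in Python. The `reference_distance` value here is a hypothetical scale; only the formula itself and the [0, 1] clamp come from this README:

```python
def lidar_danger(distance, reference_distance=100.0):
    """Convert a raycast hit distance to a normalized danger value.

    danger = 1 / ((distance / reference_distance) + 0.1), clamped to [0, 1].
    At distance 0 the raw value is 10.0, so anything closer than about
    0.9 * reference_distance saturates to full danger.
    """
    danger = 1.0 / ((distance / reference_distance) + 0.1)
    return max(0.0, min(1.0, danger))
```

The inverse form makes the signal sharpest near walls, where the agent most needs resolution, and near-flat at long range.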

Action Space

Discrete action space with 7 actions:

  1. Accelerate forward
  2. Reverse
  3. Steer left
  4. Steer right
  5. Forward + left
  6. Forward + right
  7. No input
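The seven actions map naturally onto (throttle, steering) pairs. A hypothetical lookup table (the real mapping is in python/environment.py):

```python
# Hypothetical action table: index -> (throttle, steering).
# throttle: +1 forward, -1 reverse; steering: -1 left, +1 right.
ACTIONS = {
    0: (1, 0),    # accelerate forward
    1: (-1, 0),   # reverse
    2: (0, -1),   # steer left
    3: (0, 1),    # steer right
    4: (1, -1),   # forward + left
    5: (1, 1),    # forward + right
    6: (0, 0),    # no input
}
```

Including the combined forward+steer actions lets the agent hold speed through corners with a single action index.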

Reward Structure

The reward function is shaped using:

  • Progress toward next checkpoint
  • Small speed incentive (scaled conservatively)
  • Checkpoint reward
  • Lap completion reward
  • Race finish reward
  • Time penalty
  • Wall collision penalty
  • Grass penalty
  • Anti-idle penalty

Episodes may terminate early if the vehicle becomes stuck or stops making progress.
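The terms above can be combined into a single shaped reward. Every coefficient below is a hypothetical placeholder; the actual weights are defined in python/environment.py:

```python
def shaped_reward(progress_delta, speed, hit_wall, on_grass,
                  reached_checkpoint, finished_lap, finished_race,
                  idle_steps):
    """Illustrative reward combining the terms listed above
    (all coefficients are made up for this sketch)."""
    reward = 10.0 * progress_delta      # progress toward next checkpoint
    reward += 0.01 * speed              # small, conservatively scaled speed incentive
    reward -= 0.1                       # per-step time penalty
    if reached_checkpoint:
        reward += 5.0
    if finished_lap:
        reward += 20.0
    if finished_race:
        reward += 100.0
    if hit_wall:
        reward -= 10.0
    if on_grass:
        reward -= 1.0
    if idle_steps > 50:                 # anti-idle penalty
        reward -= 2.0
    return reward
```

The dominant dense term is checkpoint progress; the sparse bonuses (checkpoint, lap, finish) anchor the long-horizon objective.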

Training Behavior

Observed training characteristics:

  • Early models (~100 episodes): learn basic movement and wall avoidance
  • Mid training (~300-500 episodes): start completing laps but still hit walls on corners
  • Fine-tuned models (~560 training + 50 fine-tuning episodes): stable multi-lap behavior with minimal wall contact

Repository Structure

.
├── assets/                          # Track images and car texture
│   └── raceTrackFullyWalled.png
├── python/
│   ├── environment.py               # Racing environment (physics, LIDAR, checkpoints, rewards)
│   ├── dqn.py                       # DQN network, agent, and replay buffer
│   ├── train.py                     # Headless training script
│   ├── fine_tune.py                 # Fine-tuning script for trained models
│   ├── replay.py                    # Visual Pygame replay
│   ├── requirements.txt             # Python dependencies
│   └── models/                      # Saved model checkpoints
├── LICENSE
└── README.md

Setup

Requirements

  • Python 3.12+
  • PyTorch
  • Pygame
  • Pillow
  • NumPy

Install Dependencies

cd python
pip install -r requirements.txt

Training

To train a new agent from scratch:

cd python
python train.py

Training runs headless and saves model checkpoints every 50 episodes to python/models/. Press Ctrl+C to save the final model and exit gracefully.
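The periodic-checkpoint plus graceful Ctrl+C exit described above follows a standard pattern; here is a sketch where `train_one_episode` and `save_model` are hypothetical stand-ins for the real functions in train.py:

```python
def training_loop(train_one_episode, save_model, save_every=50):
    """Run episodes until interrupted; checkpoint every save_every episodes.

    train_one_episode and save_model are placeholders for the real
    training and serialization code.
    """
    episode = 0
    try:
        while True:
            train_one_episode()
            episode += 1
            if episode % save_every == 0:
                save_model(f"models/model_episode_{episode}.pt")
    except KeyboardInterrupt:
        # Ctrl+C lands here: save the final model before exiting.
        save_model(f"models/model_final_{episode}.pt")
```

Catching KeyboardInterrupt at the loop boundary guarantees the final save happens between episodes, never mid-update.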

Fine-Tuning

To improve an existing model with conservative hyperparameters:

cd python
python fine_tune.py models/model_episode_560.pt

Fine-tuning uses a lower learning rate (1e-4) and lower epsilon (0.1) to refine learned behavior without losing stability.
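The effect of the lower epsilon is that the fine-tuned agent mostly exploits its learned policy and only rarely explores. A minimal epsilon-greedy sketch using the 0.1 value from above (the function name is hypothetical; the real selection logic lives in python/dqn.py):

```python
import random

FINE_TUNE_EPSILON = 0.1  # exploration rate used during fine-tuning

def select_action(q_values, epsilon=FINE_TUNE_EPSILON, rng=random):
    """Epsilon-greedy selection over the discrete actions.

    With probability epsilon, pick a uniformly random action (explore);
    otherwise pick the action with the highest Q-value (exploit).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

During initial training, epsilon typically starts near 1.0 and decays; fine-tuning pins it low so the refined behavior is not disrupted by heavy exploration.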

Running Replay

To visualize a trained model:

cd python
python replay.py models/ft_episode_50.pt

Replay Controls

  • SPACE: Restart race
  • L: Toggle LIDAR visualization
  • ESC: Exit

The replay shows the track, car, LIDAR rays (orange = short-range, blue = long-range), checkpoints, and a HUD with speed/lap/time info.

About

Train an AI to race — custom physics, LIDAR sensing, and Deep Q-Learning
