A modern Ludo rules engine wrapped in a Gymnasium-compatible reinforcement learning (RL) environment. The project couples a differentiable feature extractor with the MaskablePPO agent from Stable Baselines3 Contrib (sb3-contrib) to explore self-play training for Ludo King.
- Full Ludo simulator with movement validation, captures, blockades, and reward shaping.
- `LudoEnv` Gymnasium environment exposing rich observations and mandatory action masking.
- Custom 1D CNN feature extractor (`LudoCnnExtractor`) tailored to the stacked board representation.
- Ready-to-run PPO training script that handles vectorised environments, checkpoints, and TensorBoard logging.
- Configuration-first design (`ludo_rl/ludo_king/config.py`, `reward.py`) to tweak board, network, and reward parameters in one place.
```
ludo_rl
├─ __init__.py → loads .env so simulator/env can read opponent strategy settings
├─ ludo_env.LudoEnv (Gymnasium Env)
│ ├─ wraps ludo_king.simulator.Simulator to expose observation dict & masked Discrete(4) actions
│ ├─ handles invalid moves, turn limit, reward shaping, and rendering snapshots
│ └─ converts Game.board into a 10‑channel board tensor + dice_roll token
├─ ludo_king.simulator.Simulator
│ ├─ owns Game, tracks agent_index
│ ├─ runs opponents’ turns (respecting extra rolls)
│ └─ can be driven by environment strategies (registry) or random fallback
├─ ludo_king.game.Game
│ ├─ instantiates 2 or 4 Player objects and provides dice + rule enforcement
│ ├─ enforces entry, home column, safe squares, captures, blockades, extra turns
│ └─ builds per‑agent board tensors via Board.build_tensor
├─ ludo_king.board.Board
│ ├─ absolute↔relative mapping, safe squares and channel construction
│ └─ counts pieces per player/channel for tensor features
├─ ludo_king.player.Player
│ ├─ keeps Piece state, win detection
│ ├─ chooses moves via strategies (lazy instantiation by name)
│ └─ falls back to random legal move if requested heuristic is unknown
├─ strategy package
│ ├─ features.build_move_options turns env observation into StrategyContext
│ ├─ BaseStrategy + concrete heuristics (defensive, killer, etc.) score MoveOption
│ └─ registry.create/available expose factories to simulator & players
├─ extractor.LudoCnnExtractor / LudoTransformerExtractor
│ ├─ convert observation dict into feature vectors for MaskablePPO
│ └─ fuse CNN/Transformer encodings with per‑piece embeddings and dice token
├─ tools (arguments, scheduler, evaluate, tournaments, imitation)
│ ├─ evaluate.py — supports 2‑player (opposite seats) and 4‑player lineups
│ ├─ tournament.py — strategy league for 2 or 4 players
│ └─ llm_vs_models.py — LLM/RL/Static mixed matches for 2 or 4 players
└─ train.py
   ├─ parses CLI args, configures MaskablePPO w/ custom extractor
   └─ runs vectorized envs, callbacks (checkpoints, entropy annealing, profiler)
```
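The layout above implies the standard Gymnasium loop. Below is a minimal interaction sketch; it assumes `LudoEnv` takes no constructor arguments and exposes the `action_masks()` hook that sb3-contrib expects, so adjust it to the actual API in `ludo_rl/ludo_env.py` if that differs.

```python
import numpy as np

from ludo_rl.ludo_env import LudoEnv  # import path inferred from the layout above

env = LudoEnv()
obs, info = env.reset(seed=0)
terminated = truncated = False
while not (terminated or truncated):
    mask = env.action_masks()          # assumed hook: boolean mask over Discrete(4)
    legal = np.flatnonzero(mask)       # indices of pieces that may legally move
    action = int(np.random.choice(legal))
    obs, reward, terminated, truncated, info = env.step(action)
env.close()
```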
- `LudoEnv` mediates RL interaction: builds masked actions, applies reward shaping, loops until the agent or its opponents advance the game, and emits 10-channel observations.
- `Simulator` orchestrates turns: applies the agent's move (in the env), simulates opponents with heuristic strategies, and enforces extra-turn logic.
- Core rules live in the `ludo_king` `Game`, `Board`, and `Piece`/`Player` classes; `reward.compute_move_rewards` produces shaped returns for PPO.
- The strategy module supplies configurable heuristics; `features.build_move_options` transforms env data into a `StrategyContext` so `Player.choose` can score moves consistently.
- `extractor.py` houses the CNN/Transformer feature pipelines that embed board channels, per-piece context, and the dice roll before feeding MaskablePPO during training (`train.py`); see the sketch below.
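To make the extractor wiring concrete, here is a sketch of how `LudoCnnExtractor` would plug into MaskablePPO via `policy_kwargs`; the extractor's constructor keywords are assumptions, so check `ludo_rl/extractor.py` for the real signature.

```python
from sb3_contrib import MaskablePPO

from ludo_rl.extractor import LudoCnnExtractor  # path per the layout above
from ludo_rl.ludo_env import LudoEnv

policy_kwargs = dict(
    features_extractor_class=LudoCnnExtractor,
    # features_extractor_kwargs=dict(features_dim=256),  # hypothetical knob
)

model = MaskablePPO(
    "MultiInputPolicy",   # the env emits a dict observation
    LudoEnv(),
    policy_kwargs=policy_kwargs,
    verbose=1,
)
```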
- Python 3.11+
- A virtual environment is recommended (`python -m venv .venv && source .venv/bin/activate`).
Install the package and dependencies in editable mode:

```bash
pip install -e .
```

Alternatively, install the raw dependencies:

```bash
pip install -r requirements.txt
```

The `train.py` script configures MaskablePPO with the custom feature extractor and launches multi-process self-play training.
```bash
python train.py
```

What the script does:
- Creates timestamped subdirectories under `training/ludo_logs/` and `training/ludo_models/`.
- Spawns `SubprocVecEnv` workers (half the available CPU cores) and wraps them with `VecMonitor`.
- Sets up checkpointing every 10k steps and periodic evaluation (20k-step cadence).
- Trains for 1,000,000 timesteps, saves the initial and final policies, and performs a short interactive rollout (see the sketch after this list).
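For orientation, the following is a rough sketch of the vectorised setup described above, not the script itself; hyperparameter values and the run directory name are illustrative, and `LudoEnv` is again assumed to need no constructor arguments.

```python
import os

from sb3_contrib import MaskablePPO
from stable_baselines3.common.callbacks import CheckpointCallback
from stable_baselines3.common.vec_env import SubprocVecEnv, VecMonitor

from ludo_rl.ludo_env import LudoEnv  # import path inferred from the layout above

if __name__ == "__main__":
    n_envs = max(1, (os.cpu_count() or 2) // 2)   # half the available cores
    venv = VecMonitor(SubprocVecEnv([LudoEnv for _ in range(n_envs)]))

    checkpoint_cb = CheckpointCallback(
        save_freq=max(10_000 // n_envs, 1),        # fires per vec step, so ~10k env steps
        save_path="training/ludo_models/example_run",  # run directory name is illustrative
    )

    model = MaskablePPO(
        "MultiInputPolicy",                        # dict observations -> multi-input policy
        venv,
        tensorboard_log="training/ludo_logs",
        verbose=1,
    )
    model.learn(total_timesteps=1_000_000, callback=checkpoint_cb)
    model.save("training/ludo_models/example_run/final_model")
```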
TensorBoard logs end up in the run-specific `training/ludo_logs/<run_id>/` directory:

```bash
tensorboard --logdir training/ludo_logs
```

- Rewards: Adjust per-event incentives in `ludo_rl/ludo_king/reward.py` (illustrated below).
- Network: Tune convolution and MLP widths in `ludo_rl/ludo_king/config.py` (`NetworkConfig`).
- Environment: Modify truncation length (`MAX_TURNS`) or add render hooks in `ludo_rl/ludo_env.py`.
- Training Hyperparameters: Tweak PPO arguments and callback intervals in `train.py`.
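As a rough illustration of what per-event incentives can look like, here is a hypothetical weights table; the actual names and values in `reward.py` will differ.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RewardWeights:
    capture: float = 1.0         # hypothetical: capturing an opponent piece
    captured: float = -1.0       # hypothetical: losing one of your own pieces
    piece_finished: float = 2.0  # hypothetical: a piece reaching home
    step_penalty: float = -0.01  # hypothetical: small per-move cost
```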
- ✅ Environment, simulator, and training loop are in place.
- ✅ Evaluation tooling (scripted benchmarks, head-to-head matches); see the example below.
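A minimal way to sanity-check a saved policy outside the full tooling, assuming sb3-contrib's `get_action_masks` helper works against `LudoEnv` (i.e., the env exposes `action_masks()`); the model path placeholder must be filled in with a real run id, and `tools/evaluate.py` remains the authoritative evaluation entry point.

```python
from sb3_contrib import MaskablePPO
from sb3_contrib.common.maskable.utils import get_action_masks

from ludo_rl.ludo_env import LudoEnv

env = LudoEnv()
# Path placeholder: substitute the timestamped run id created by train.py.
model = MaskablePPO.load("training/ludo_models/<run_id>/final_model")

obs, info = env.reset()
terminated = truncated = False
total_reward = 0.0
while not (terminated or truncated):
    masks = get_action_masks(env)                 # reads env.action_masks()
    action, _ = model.predict(obs, action_masks=masks, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(int(action))
    total_reward += reward
print(f"episode return: {total_reward:.2f}")
```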
Static analysis and linting use tox:

```bash
tox
```

The default configuration runs formatting checks (via Ruff/Black, if installed) and will run unit tests once they are introduced.
Released under the Apache License. See LICENSE for details.