mcts-odin

Generic, optimized Monte Carlo Tree Search for Odin

CI License: MIT Version Odin

AlphaZero-style PUCT with Dirichlet root noise and First-Play Urgency. Optional fast rollouts, leaf-parallel batched playouts with virtual loss, and OS-thread parallelism. Games plug in via a small Game vtable — the core knows nothing about Go, chess, or any specific game.

Ships with eleven demo games (tic-tac-toe, Connect Four, Reversi, Hex, Breakthrough, Gomoku, Dots and Boxes, Amazons, Quoridor, Nine Men's Morris, Go) and a 137-test suite that runs leak-free under Odin's memory tracker.

Quick start

package main

import "mcts"
import ttt "games/tictactoe"

main :: proc() {
    g     := ttt.game()              // mcts.Game vtable
    state := ttt.new_state()         // tree takes ownership

    cfg := mcts.default_config()
    tree: mcts.Tree
    mcts.init(&tree, &g, state, cfg, seed = 42)
    defer mcts.destroy(&tree)

    mcts.run_simulations(&tree, 1000, my_evaluator, &g)
    action := mcts.select_action(&tree, temperature = 0.0)
}

The evaluator is your value/policy function. See tictactoe_selfplay for a complete runnable example and nn_evaluator_skeleton for the policy/value plumbing pattern a real NN-backed evaluator needs.

Throughput

9×9 Go · 1600 sims/move × 32 moves · uniform-policy evaluator · single-thread · -o:speed -no-bounds-check

Implementationsims/srelative
mcts-odin v0.6.0~108,000
autogodin (C++)8,47012.8× slower
autogodin (Odin)2,85937.8× slower

Not strictly comparable — mcts-odin's bench runs the evaluator inline; autogodin's marshals through a Python callback. The cumulative gap reflects the do/undo lift, packed slot storage, SoA hot fields, linear-space priors, per-Tree scratch arena, subtree reuse, branchless argmax, FPU (broader/shallower tree), and a clone-free is_legal_flat that probes positional superko via incremental Zobrist. Full details →

Eleven demo games

Architecture

┌──────────────────────── Tree ───────────────────────┐
│                                                     │
│  working_state ───┐                                 │
│  (owned by tree)  │   single state mutated in       │
│                   │   place via do_move / undo_move │
│                   ▼                                 │
│           ┌─── nodes[] ────┐                        │
│           │ Node 0 (root)  │       Hot fields in    │
│           │  ├─ actions[]  │       parallel SoA on  │
│           │  ├─ priors[]   │       the Tree:        │
│           │  └─ child[]    │         node_N[]       │
│           ├────────────────┤         node_N_virt[]  │
│           │ Node 1, 2, …   │         node_Q[]       │
│           │ (packed slots) │                        │
│           └────────────────┘                        │
│                                                     │
│  arena            permanent: nodes, slot arrays     │
│  scratch_arena    per-run: descent paths, deltas    │
│                                                     │
└─────────────────────────────────────────────────────┘

Three deliberate choices drive the throughput:

Three drivers, one tree

// Sequential — one evaluator call per leaf. Fine for fast in-process
// value functions and uniform priors.
mcts.run_simulations(&tree, 1600, my_evaluator, &g)

// Batched — leaf-parallel with virtual loss; the evaluator gets a slice
// of cloned leaf states per call. For GPU NN forward passes.
mcts.run_simulations_batched(&tree, 1600, batch_size = 16,
                              my_batched_evaluator, &g)

// OS-thread parallel — N workers run descent / eval / backup concurrently.
// Atomics on N / N_virt / Q + a coarse expand mutex; virtual loss decouples
// the descents.
mcts.run_simulations_threaded(&tree, 1600, n_threads = 8,
                               my_evaluator, &g)

Near-linear scaling on slow evaluators (50 µs/call benchmark, 9×9 Go): n=2: 1.93×, n=4: 3.81×, n=8: 7.15×. For cheap evaluators (microseconds per call) the mutex and CAS-loop contention can erase the speedup — use the sequential or batched paths there.

Latest release

v0.6.0 — Nine Men's Morris demo game. First demo with explicit game phases (placement → sliding → flying) that reshape the action space mid-game, and the first where one MCTS action atomically contains a follow-up sub-action by the same player (mill formation → opponent-piece removal). Eleven demo games / 137 tests.

Full changelog →