Tiny Recursion Model (TRM) is a compact recursive reasoning network that reaches 45% on ARC-AGI-1 and 8% on ARC-AGI-2 with only ~7M parameters. This repository contains the training pipeline, dataset builders, and evaluation utilities that power the work described in the paper *Less is More: Recursive Reasoning with Tiny Networks* (arXiv).
TRM distills recursive reasoning into a single streamlined module and omits every hierarchy-specific component that defined the Hierarchical Reasoning Model (HRM).
- Philosophy — TRM keeps the focus on minimal recursion without relying on brain analogies, fixed-point theorems, or explicit hierarchies.
- Architecture — HRM maintained dedicated H-level and L-level reasoning stacks; TRM reuses the L-level module for all updates and ignores any H-level configuration.
- Forward Iteration — HRM alternated between separate H and L modules during a step. TRM applies the same reasoning module when updating both latent states, producing a flatter recursive loop.
- Configuration — Default TRM configs set `H_layers=0` with multiple `L_layers`, alongside deeper L-level cycle counts (`H_cycles=3`, `L_cycles=6`), reflecting the absence of an H-level pathway.
- Adaptive Computation Time — TRM adds the `no_ACT_continue` option to simplify halting, using only the halt-signal sigmoid instead of comparing halt/continue logits.
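The flat recursive loop can be illustrated with a toy sketch. All names and the scalar arithmetic below are illustrative stand-ins, not the repository's actual API: a single shared module refines the latent state for several inner cycles, then refines the answer state, repeated for the outer cycle count.

```python
def trm_forward(f, output_head, x, y, z, H_cycles=3, L_cycles=6):
    """Toy sketch of TRM's flat recursion: ONE shared module `f` updates
    both the latent state z and the answer state y (no separate H module)."""
    for _ in range(H_cycles):
        for _ in range(L_cycles):
            z = f(x + y + z)      # inner cycles: refine latent state z
        y = f(y + z)              # outer step: refine answer state y
    return output_head(y)

# Scalar stand-ins for the network and the output head.
f = lambda s: 0.5 * s
output_head = lambda y: y

out = trm_forward(f, output_head, x=1.0, y=0.0, z=0.0)
```

The point of the sketch is the control flow: both state updates call the same `f`, which is what makes the loop "flatter" than HRM's alternation between H and L modules.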
All HRM code paths, configs, and documentation have been removed from this repository to keep the focus squarely on TRM and TRM-Text.
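The `no_ACT_continue` halting simplification mentioned above can be sketched as follows. This is a toy illustration assuming a 0.5 sigmoid threshold; only the option name comes from the configs, everything else is a stand-in.

```python
import math

def should_halt(q_halt_logit, q_continue_logit=0.0, no_ACT_continue=True):
    """Toy halting rule: with no_ACT_continue, halt when sigmoid(q_halt) > 0.5;
    otherwise compare halt and continue logits (the ACT-style rule)."""
    if no_ACT_continue:
        return 1.0 / (1.0 + math.exp(-q_halt_logit)) > 0.5
    return q_halt_logit > q_continue_logit

should_halt(1.2)                                # sigmoid(1.2) ≈ 0.77 → True
should_halt(-0.3, 0.1, no_ACT_continue=False)   # -0.3 > 0.1 → False
```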
```bash
# 1. Set up a virtual environment
uv sync
source .venv/bin/activate

# 2. (Optional) authenticate TrackIO for experiment tracking
trackio login YOUR-LOGIN
```

ℹ️ Offline runs: TrackIO defaults to offline logging when no credentials are present—you can still inspect metrics locally.
| Path | Purpose |
|---|---|
| `pretrain.py` | Hydra-driven training entry point shared across ARC and text tasks |
| `config/` | Hydra configuration tree (architectures, datasets, tasks) |
| `dataset/` | Dataset builders and shared metadata helpers |
| `datasets/` | Runtime dataset adapters (e.g., `TextDataset`) |
| `models/` | Model components and loss heads, including the text transformer |
| `evaluators/` | ARC scorer and TrackIO-friendly text evaluator |
| `utils/` | Misc utilities (model loading, tokenisation helpers, etc.) |
| `scripts/` | Automation helpers like the TinyStories smoke-run script |
| `tests/` | Pytest suite covering builders, tokeniser, datasets, models, configs, and smoke CLI |
```bash
# ARC-AGI-1
python -m dataset.build_arc_dataset \
  --input-file-prefix kaggle/combined/arc-agi \
  --output-dir data/arc1concept-aug-1000 \
  --subsets training evaluation concept \
  --test-set-name evaluation

# ARC-AGI-2
python -m dataset.build_arc_dataset \
  --input-file-prefix kaggle/combined/arc-agi \
  --output-dir data/arc2concept-aug-1000 \
  --subsets training2 evaluation2 concept \
  --test-set-name evaluation2
```

Note: ARC-AGI-2's training split overlaps ARC-AGI-1 evaluation data—do not train on both simultaneously if you plan to evaluate on ARC-AGI-1.
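For reference, each raw ARC task follows the public ARC-AGI JSON format: `train` and `test` lists of `input`/`output` grid pairs, where a grid is a list of rows of integers 0–9. A minimal shape check, independent of this repository's builder internals:

```python
def check_arc_task(task):
    """Validate the basic shape of a raw ARC task dict: train/test lists
    of {"input", "output"} grids containing integers 0-9."""
    for split in ("train", "test"):
        for pair in task[split]:
            for key in ("input", "output"):
                grid = pair[key]
                assert grid and all(isinstance(v, int) and 0 <= v <= 9
                                    for row in grid for v in row)
    return True

task = {
    "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
    "test":  [{"input": [[2, 2]], "output": [[3, 3]]}],
}
check_arc_task(task)  # → True
```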
- Assemble raw JSON/JSONL files (mixing `text` or `messages` records).
- Run the builder:

```bash
python -m dataset.build_tiny_text_dataset \
  --input-paths data/raw/tinystories.jsonl \
  --output-dir data/tiny-text-processed \
  --max-sequence-length 128 \
  --lowercase true
```

The processed directory includes `train/` and `validation/` splits with padded NumPy tensors, alongside `dataset.json` metadata and identifier mappings.
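The padding the builder applies can be sketched in a few lines; this mirrors the idea of the padded tensors described above, but `pad_id=0` and the function itself are assumptions, not the repository's actual helper.

```python
def pad_batch(token_ids, max_len=128, pad_id=0):
    """Right-pad (and truncate) variable-length token-id lists to max_len,
    mirroring the padded arrays the builder writes (pad_id=0 is assumed)."""
    return [
        (seq[:max_len] + [pad_id] * (max_len - len(seq)))[:max_len]
        for seq in token_ids
    ]

batch = pad_batch([[5, 6, 7], [8, 9]], max_len=4)
# → [[5, 6, 7, 0], [8, 9, 0, 0]]
```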
- Smoke test (recommended) — ensures configs, dataset wiring, and TrackIO logging all resolve:

```bash
./scripts/run_tiny_text_smoke.sh --dry-run data/tiny-text-processed smoke_demo  # view command
./scripts/run_tiny_text_smoke.sh data/tiny-text-processed smoke_demo            # execute (~minutes on CPU)
```

- Custom training run — override any Hydra field inline; example single-GPU run:
```bash
python pretrain.py \
  tasks=text/tinystories \
  data_paths="[data/tiny-text-processed]" \
  global_batch_size=16 \
  epochs=5 \
  eval_interval=1 \
  run_name="tinystories_run01"
```

The Hydra configs `config/tasks/text/tinystories.yaml` and `config/tasks/text/tinychat.yaml` define sensible defaults for architecture depth, sequence length, and logging.
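To make the override syntax concrete, here is a toy re-implementation of Hydra-style `key=value` overrides on a nested dict. Real Hydra does far more (type coercion, `+key` additions, interpolation, config composition), so treat this purely as an illustration of how dotted keys map into the config tree.

```python
def apply_overrides(cfg, overrides):
    """Toy illustration of Hydra-style key=value overrides: dotted keys
    walk into nested dicts. Values stay strings here; Hydra coerces types."""
    for ov in overrides:
        key, _, value = ov.partition("=")
        node = cfg
        *parents, leaf = key.split(".")
        for p in parents:
            node = node.setdefault(p, {})
        node[leaf] = value
    return cfg

cfg = apply_overrides({}, ["global_batch_size=16", "arch.L_cycles=4"])
# → {'global_batch_size': '16', 'arch': {'L_cycles': '4'}}
```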
```bash
torchrun --nproc-per-node 4 pretrain.py \
  arch=trm \
  data_paths="[data/arc1concept-aug-1000]" \
  arch.H_cycles=3 arch.L_cycles=4 \
  global_batch_size=768 \
  +run_name=pretrain_att_arc1 ema=True
```

Expect ~3 days on 4× H100 GPUs for the full ARC-AGI-1 run.
```bash
pytest                                                  # full test suite
pytest tests/utils/test_text_tokenizer.py
pytest tests/integration/test_smoke_text_training.py -k smoke
```

The integration test constructs a miniature TinyStories shard and checks that `run_tiny_text_smoke.sh` produces a valid command in `--dry-run` mode.
- Authenticate once via `trackio login`.
- Runs default to the project name configured via the Hydra field `project_name`.
- Code snapshots are archived automatically when `checkpoint_path` is provided.
- To operate fully offline: omit credentials or pass `trackio mode=offline` overrides when invoking `pretrain.py`.
If you build on this work, please cite the original paper:
```bibtex
@article{trm2025,
  title   = {Less is More: Recursive Reasoning with Tiny Networks},
  author  = {Jolicoeur-Martineau, Alexia},
  journal = {arXiv preprint arXiv:2510.04871},
  year    = {2025}
}
```
For assistance or bug reports, open an issue or consult AGENTS.md for contributor guidelines. Happy reasoning!