This document provides a map of mruby's internals for developers who want to understand, debug, or contribute to the codebase.
mruby's execution pipeline:
Ruby source → Parser → AST → Code Generator → Bytecode (irep) → VM → Result
The design priority is memory > performance > readability.
All heap-allocated Ruby objects share a common header (MRB_OBJECT_HEADER):
struct RBasic (8 bytes on 64-bit)
┌──────────────┬─────┬──────────┬────────┬───────┐
│ RClass *c    │ tt  │ gc_color │ frozen │ flags │
│ (class ptr)  │ 8b  │ 3b       │ 1b     │ 20b   │
└──────────────┴─────┴──────────┴────────┴───────┘
All object structs embed this header via MRB_OBJECT_HEADER:
| Struct | Ruby Type | Extra Fields |
|---|---|---|
| `RObject` | Object instances | iv (instance variables) |
| `RClass` | Class/Module | iv, mt (method table), super |
| `RString` | String | embedded or heap buffer, length |
| `RArray` | Array | embedded or heap buffer, length |
| `RHash` | Hash | hash table or k-v array |
| `RProc` | Proc/Lambda | irep or C function, environment |
| `RData` | C data wrapper | void *data, mrb_data_type |
| `RFiber` | Fiber | mrb_context |
| `RException` | Exception | iv |
Immediate values (Integer, Symbol, true, false, nil) are encoded
directly in mrb_value without heap allocation. The encoding depends on
the boxing mode (see boxing.md).
Objects must fit within 5 words (mrb_static_assert_object_size).
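As a rough illustration of how a shared header macro and a size limit fit together, the sketch below packs the bit fields from the diagram into one 32-bit group and checks the object size at compile time. Every `toy_` name is a hypothetical stand-in rather than mruby's own definition, and the real header may differ in field types or contain fields the diagram omits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the class structure. */
struct toy_class;

/* Field widths follow the diagram: 8 + 3 + 1 + 20 = 32 bits. */
#define TOY_OBJECT_HEADER    \
  struct toy_class *c;       \
  unsigned int tt : 8;       \
  unsigned int gc_color : 3; \
  unsigned int frozen : 1;   \
  unsigned int flags : 20

/* The bare header. */
struct toy_basic { TOY_OBJECT_HEADER; };

/* An object type embeds the header first, then adds its own fields. */
struct toy_object {
  TOY_OBJECT_HEADER;
  void *iv; /* instance-variable table (opaque in this sketch) */
};

/* Compile-time check in the spirit of the 5-word limit. */
_Static_assert(sizeof(struct toy_object) <= 5 * sizeof(void *),
               "object does not fit within 5 words");
```

Because every struct starts with the same fields, a pointer to any object can be read through the header type to find its class and type tag.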
The VM is register-based, using two stacks: a value stack for
registers (locals, temporaries, arguments) and a call info stack
for tracking method/block call frames. Each method call pushes a
mrb_callinfo frame with the method symbol, proc, PC, and argument
counts.
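The relationship between the two stacks can be sketched with a toy model. The types below are hypothetical simplifications (integers instead of `mrb_value`, a string instead of a symbol), and the convention that a callee's register 0 aliases one of the caller's registers is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_STACK_MAX = 64, TOY_CI_MAX = 16 };

/* Hypothetical call frame: a simplified stand-in for mrb_callinfo. */
typedef struct {
  const char *mid; /* method name (a symbol in a real VM) */
  int pc;          /* where the caller resumes after return */
  int argc;        /* argument count */
  int stack_base;  /* index of this frame's register 0 */
} toy_callinfo;

typedef struct {
  int stack[TOY_STACK_MAX];    /* value stack: registers of all frames */
  toy_callinfo ci[TOY_CI_MAX]; /* call info stack */
  int ci_top;                  /* index of the current frame */
} toy_vm;

void toy_vm_init(toy_vm *vm) {
  vm->ci_top = 0;
  vm->ci[0] = (toy_callinfo){ "main", 0, 0, 0 };
}

/* Push a frame whose register window starts at the caller's register a. */
toy_callinfo *toy_cipush(toy_vm *vm, const char *mid, int pc, int a, int argc) {
  assert(vm->ci_top + 1 < TOY_CI_MAX);
  int base = vm->ci[vm->ci_top].stack_base + a;
  assert(base + argc < TOY_STACK_MAX);
  vm->ci[++vm->ci_top] = (toy_callinfo){ mid, pc, argc, base };
  return &vm->ci[vm->ci_top];
}

/* Pop the current frame and return the PC at which the caller resumes. */
int toy_cipop(toy_vm *vm) {
  assert(vm->ci_top > 0);
  return vm->ci[vm->ci_top--].pc;
}

/* Address of register n in the current frame. */
int *toy_reg(toy_vm *vm, int n) {
  return &vm->stack[vm->ci[vm->ci_top].stack_base + n];
}
```

Overlapping the windows means arguments written by the caller are already in place when the callee starts, so no copying is needed in this model.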
The dispatch loop in mrb_vm_run() decodes opcodes and operates on
registers. Method dispatch looks up the receiver's class method table
(with a per-state method cache), then either calls a C function
directly or pushes a new call frame for Ruby methods.
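A decode-and-dispatch loop over registers can be sketched as follows. The opcode names and the fixed three-byte instruction format are hypothetical and much simpler than mruby's actual encoding; method dispatch and the method cache are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes for this sketch only. */
enum toy_op { TOY_LOADI, TOY_MOVE, TOY_ADD, TOY_RETURN };

/* One instruction: an opcode and up to two operands. */
typedef struct { uint8_t op, a, b; } toy_insn;

/* Execute code against the current frame's register window. */
int toy_vm_run(const toy_insn *code, int *regs) {
  for (const toy_insn *pc = code;; pc++) {
    switch (pc->op) {
    case TOY_LOADI: /* R[a] = b */
      regs[pc->a] = pc->b;
      break;
    case TOY_MOVE: /* R[a] = R[b] */
      regs[pc->a] = regs[pc->b];
      break;
    case TOY_ADD: /* R[a] = R[a] + R[a+1] */
      regs[pc->a] += regs[pc->a + 1];
      break;
    case TOY_RETURN: /* return R[a] */
      return regs[pc->a];
    default:
      assert(!"unknown opcode");
      return -1;
    }
  }
}
```

Operands name registers rather than implicit stack slots, which is what distinguishes this style from a stack-based VM.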
Exception handling uses setjmp/longjmp (or C++ exceptions if
configured). Rescue/ensure handler tables are stored in each irep
and searched during stack unwinding.
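The two ingredients, a non-local jump and a per-irep handler table, can be sketched separately. The handler layout (a PC range plus a target) and the rule that later entries take precedence are assumptions of this toy, and the `toy_` names are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical handler entry: covers PCs in [begin, end), jumps to target. */
typedef struct { int begin, end, target; } toy_handler;

/* Return the target of the handler covering pc, or -1 if this frame has
   none and unwinding should continue in the caller. */
int toy_find_handler(const toy_handler *tab, size_t n, int pc) {
  /* Search backwards so that later (inner) entries win. */
  for (size_t i = n; i-- > 0;) {
    if (tab[i].begin <= pc && pc < tab[i].end) return tab[i].target;
  }
  return -1;
}

/* Innermost active jump buffer. */
static jmp_buf *toy_jmp;

/* Raise: jump back to the innermost toy_protect call. */
void toy_raise(void) {
  assert(toy_jmp != NULL);
  longjmp(*toy_jmp, 1);
}

/* Run fn under protection; returns 1 if it raised, 0 otherwise. */
int toy_protect(void (*fn)(void)) {
  jmp_buf buf;
  jmp_buf *prev = toy_jmp;
  int raised = 0;
  toy_jmp = &buf;
  if (setjmp(buf) == 0) {
    fn();
  } else {
    raised = 1;
  }
  toy_jmp = prev; /* restore the outer buffer on both paths */
  return raised;
}
```

Saving and restoring the previous buffer lets protected regions nest, so a raise always lands in the nearest enclosing one.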
See vm.md for detailed VM internals and opcode.md for the full instruction set.
The GC uses tri-color incremental mark-and-sweep with an optional generational mode. Objects are colored white (unmarked), gray (marked, children pending), black (fully marked), or red (static/ROM).
The three-phase cycle (root scan, incremental marking, sweep) runs
in small steps between VM instructions to avoid long pauses. Write
barriers (mrb_field_write_barrier, mrb_write_barrier) maintain
correctness during incremental marking.
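To see why a barrier is needed, consider a toy collector in which each object holds a single reference. This is a sketch of one common barrier strategy (graying a value stored into a black object); it is not a description of what either mruby barrier function does, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_WHITE, TOY_GRAY, TOY_BLACK } toy_color;

typedef struct toy_obj {
  toy_color color;
  struct toy_obj *child;     /* single reference, to keep the toy small */
  struct toy_obj *gray_next; /* link in the gray list */
} toy_obj;

typedef struct { toy_obj *gray_list; } toy_gc;

/* White -> gray: reachable, children not yet scanned. */
void toy_mark(toy_gc *gc, toy_obj *o) {
  if (o == NULL || o->color != TOY_WHITE) return;
  o->color = TOY_GRAY;
  o->gray_next = gc->gray_list;
  gc->gray_list = o;
}

/* One incremental step: blacken one gray object and gray its child.
   Returns 0 once no gray objects remain. */
int toy_mark_step(toy_gc *gc) {
  toy_obj *o = gc->gray_list;
  if (o == NULL) return 0;
  gc->gray_list = o->gray_next;
  o->color = TOY_BLACK;
  toy_mark(gc, o->child);
  return 1;
}

/* Store with a write barrier: a black parent is never rescanned, so a
   white value stored into it must be grayed or it would be swept. */
void toy_field_write(toy_gc *gc, toy_obj *parent, toy_obj *value) {
  parent->child = value;
  if (parent->color == TOY_BLACK) toy_mark(gc, value);
}
```

Without the check in `toy_field_write`, a store made between two marking steps could leave a live object white.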
The GC arena protects newly created objects in C code. Heap regions
(mrb_gc_add_region) support embedded systems with fixed memory banks.
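The arena can be pictured as a small stack of protected object pointers with a save and restore index. The model below is a hypothetical simplification with an arbitrary capacity; it mirrors the save/restore pattern that C extensions use but is not mruby's implementation.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_ARENA_MAX = 100 }; /* arbitrary capacity for this sketch */

typedef struct {
  const void *arena[TOY_ARENA_MAX]; /* objects the collector must keep */
  int arena_idx;                    /* number of protected objects */
} toy_gc_arena;

/* Every newly created object is registered here. */
void toy_arena_protect(toy_gc_arena *a, const void *obj) {
  assert(a->arena_idx < TOY_ARENA_MAX); /* arena overflow in this toy */
  a->arena[a->arena_idx++] = obj;
}

int toy_arena_save(const toy_gc_arena *a) { return a->arena_idx; }

void toy_arena_restore(toy_gc_arena *a, int idx) { a->arena_idx = idx; }

/* Create n temporaries, then release their protection before returning.
   Returns the peak arena usage reached inside the function. */
int toy_make_temporaries(toy_gc_arena *a, int n) {
  static const int dummy = 0; /* stands in for a freshly allocated object */
  int ai = toy_arena_save(a);
  for (int i = 0; i < n; i++) toy_arena_protect(a, &dummy);
  int peak = a->arena_idx;
  toy_arena_restore(a, ai);
  return peak;
}
```

Restoring the saved index inside loops keeps long-running C functions from filling the arena with temporaries that are no longer needed.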
See gc.md for detailed GC internals, ../guides/gc-arena-howto.md for arena usage patterns, and ../guides/memory.md for memory management.
The compiler transforms Ruby source code through three stages:
- Parser (`parse.y`): Lrama/Bison grammar produces an AST of `mrb_ast_node` structures, tracking lexer state and local scopes.
- Code Generator (`codegen.c`): walks the AST and emits bytecode into `mrb_irep` structures (instruction sequence, literal pool, symbol table, child ireps).
- Execution: the irep is wrapped in an `RProc` and executed by the VM, or serialized to `.mrb` binary format.
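The code generation stage can be illustrated with a toy AST walker that emits register-based instructions for integer addition. The node types, opcodes, and `toy_irep` layout are hypothetical and far simpler than `mrb_ast_node` and `mrb_irep`; the sketch only shows the general shape of a tree walk that allocates registers as it descends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { TOY_NODE_INT, TOY_NODE_ADD } toy_node_type;

typedef struct toy_node {
  toy_node_type type;
  int value;                        /* used by TOY_NODE_INT (0..255 here) */
  const struct toy_node *lhs, *rhs; /* used by TOY_NODE_ADD */
} toy_node;

enum { TOY_OP_LOADI, TOY_OP_ADD, TOY_OP_RETURN };

typedef struct { uint8_t op, a, b; } toy_code;

/* Hypothetical, fixed-size stand-in for an instruction sequence. */
typedef struct {
  toy_code iseq[32];
  size_t ilen;
} toy_irep;

static void toy_emit(toy_irep *irep, uint8_t op, uint8_t a, uint8_t b) {
  assert(irep->ilen < 32);
  irep->iseq[irep->ilen++] = (toy_code){ op, a, b };
}

/* Emit code that leaves the value of n in register dst. Registers
   above dst are free for temporaries. */
void toy_codegen(toy_irep *irep, const toy_node *n, uint8_t dst) {
  switch (n->type) {
  case TOY_NODE_INT:
    toy_emit(irep, TOY_OP_LOADI, dst, (uint8_t)n->value);
    break;
  case TOY_NODE_ADD:
    toy_codegen(irep, n->lhs, dst);
    toy_codegen(irep, n->rhs, (uint8_t)(dst + 1));
    toy_emit(irep, TOY_OP_ADD, dst, 0); /* R[dst] = R[dst] + R[dst+1] */
    break;
  }
}
```

For the tree `1 + 2` with destination register 1, the walk produces two loads into registers 1 and 2 followed by a single add on register 1.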
Alternative loading paths include mrb_load_string() (compile and
run), mrb_load_irep() (load precompiled bytecode), and mrbc
(ahead-of-time compilation).
See compiler.md for detailed compiler internals and opcode.md for the instruction set.
| File | Responsibility |
|---|---|
| `vm.c` | Bytecode dispatch loop, method invocation |
| `state.c` | mrb_state init/close, irep management |
| `gc.c` | Garbage collector (mark-sweep, incremental) |
| `class.c` | Class/module definition, method tables |
| `object.c` | Core object operations |
| `variable.c` | Instance/class/global variables, object shapes |
| `proc.c` | Proc/Lambda/closure handling |
| `array.c` | Array implementation |
| `string.c` | String implementation (embedded, shared, heap) |
| `hash.c` | Hash implementation (open addressing) |
| `numeric.c` | Integer/Float arithmetic |
| `symbol.c` | Symbol table and interning |
| `range.c` | Range implementation |
| `error.c` | Exception creation, raise, backtrace |
| `kernel.c` | Kernel module methods |
| `load.c` | .mrb bytecode loading |
| `dump.c` | Bytecode serialization (write .mrb) |
| `print.c` | Print/puts/p output |
| `backtrace.c` | Stack trace generation |
| File | Responsibility |
|---|---|
| `parse.y` | Yacc grammar → AST |
| `y.tab.c` | Generated parser (from parse.y) |
| `codegen.c` | AST → bytecode (irep) |
| `node.h` | AST node type definitions |
| Header | Contents |
|---|---|
| `mruby.h` | mrb_state, core API declarations |
| `value.h` | mrb_value, type enums, value macros |
| `object.h` | RBasic, RObject, object header |
| `class.h` | RClass, method table types |
| `string.h` | RString, string macros |
| `array.h` | RArray, array macros |
| `hash.h` | RHash, hash API |
| `data.h` | RData, C data wrapping |
| `irep.h` | mrb_irep, bytecode structures |
| `compile.h` | Compiler context, mrb_load_string |
| `boxing_*.h` | Value boxing implementations |
Gems are the module system for mruby. Each gem lives in
mrbgems/mruby-*/ and contains:
mruby-example/
├── mrbgem.rake    gem specification (name, deps, bins)
├── src/           C source files
├── mrblib/        Ruby source files (compiled to bytecode)
├── include/       C headers
├── test/          mrbtest test files
└── bintest/       binary test files (CRuby)
At build time, gem Ruby files are compiled with mrbc and linked into
libmruby.a. Gem initialization runs in dependency order via
gem_init.c (auto-generated).
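Dependency order means that a gem's initializer runs only after the initializers of every gem it depends on. How the build computes that order is not covered here; one generic way to obtain such an order is a depth-first topological sort, sketched below with hypothetical types and an assumption that the dependency graph has no cycles.

```c
#include <assert.h>
#include <stddef.h>

enum { TOY_MAX_DEPS = 4 };

typedef struct {
  const char *name;
  int deps[TOY_MAX_DEPS]; /* indices of required gems, terminated by -1 */
  int visited;
} toy_gem;

/* Depth-first visit: dependencies are appended before the gem itself. */
static void toy_visit(toy_gem *gems, int i, int *order, size_t *n) {
  if (gems[i].visited) return;
  gems[i].visited = 1;
  for (int d = 0; d < TOY_MAX_DEPS && gems[i].deps[d] >= 0; d++) {
    toy_visit(gems, gems[i].deps[d], order, n);
  }
  order[(*n)++] = i;
}

/* Fill order with gem indices in initialization order and return the
   number of entries written. Assumes no dependency cycles. */
size_t toy_gem_init_order(toy_gem *gems, size_t count, int *order) {
  size_t n = 0;
  for (size_t i = 0; i < count; i++) toy_visit(gems, (int)i, order, &n);
  return n;
}
```

With three gems where the first depends on the other two and the second depends on the third, the resulting order is third, second, first.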
GemBoxes (mrbgems/*.gembox) define named collections of gems
(e.g., default.gembox includes stdlib, stdlib-ext, stdlib-io,
math, metaprog, and binary tools).