This document describes mruby's virtual machine for developers
working on src/vm.c and related code.
Read this if you are: debugging method dispatch or call frame issues, working on exception handling, implementing new opcodes, modifying fiber/coroutine behavior, or optimizing the dispatch loop.
For the instruction set, see opcode.md. For the compiler that generates bytecode, see compiler.md.
mruby uses a register-based VM. Local variables and temporaries occupy fixed register slots determined at compile time. Each method call gets its own register window on a shared value stack.
The VM state is stored in mrb_context:
mrb_context
+-- stbase..stend value stack (mrb_value[])
+-- cibase..ciend call info stack (mrb_callinfo[])
+-- ci current call frame pointer
+-- status fiber state
+-- prev previous context (fiber chain)
+-- vmexec VM execution state flag
The value stack and call info stack grow independently. Each fiber
has its own mrb_context.
- Initial value stack: 128 entries (
STACK_INIT_SIZE) - Initial call info stack: 32 entries (
CALLINFO_INIT_SIZE) - Growth factor: 1.5x (or 2x with
MRB_STACK_EXTEND_DOUBLING) - Minimum growth: 128 entries (
MRB_STACK_GROWTH) - Max stack depth:
MRB_STACK_MAX(0x40000 - 128) - Max call depth:
MRB_CALL_LEVEL_MAX(512, or 128 with ASAN)
Exceeding either limit raises SystemStackError.
When the value stack is reallocated, all REnv objects and
mrb_callinfo stack pointers are adjusted by the delta
(envadjust function).
Each method or block call pushes a mrb_callinfo frame:
mrb_callinfo
+-- n:4 positional argument count (0-14, 15 = varargs)
+-- nk:4 keyword argument count (0-14, 15 = varargs)
+-- cci call context info (NONE, DIRECT, SKIP, RESUMED)
+-- vis visibility flags (public/private/protected)
+-- mid method symbol
+-- proc current RProc
+-- blk block argument (RProc*)
+-- stack pointer into value stack
+-- pc program counter (bytecode position)
+-- u.env closure environment (REnv*)
+-- u.target_class receiver's class
ci->stack:
[0] self (receiver)
[1..n] positional arguments
[n+1..] keyword argument pairs (key, value, key, value, ...)
[bidx] block argument
[bidx+1..] local variables and temporaries
The n and nk fields are 4 bits each (0-15). When n == 15,
positional arguments are packed into a single Array in register 1.
When nk == 15, keyword arguments are packed into a single Hash.
The block index is calculated by mrb_bidx(n, nk):
if n == 15: n = 1 (array)
if nk == 15: n += 1 (hash)
else: n += nk * 2 (key-value pairs)
return n + 1 (skip self)
| Value | Name | Meaning |
|---|---|---|
| 0 | CINFO_NONE |
Normal VM-to-VM call |
| 1 | CINFO_DIRECT |
Explicit VM call (block, lambda.call) |
| 2 | CINFO_SKIP |
Skip frame in stack traces |
| 3 | CINFO_RESUMED |
Fiber resumed (stop execution) |
The main loop in mrb_vm_run() decodes and dispatches opcodes.
Two dispatch strategies are available:
- Computed goto (default on GCC/Clang): a jump table of label
addresses (
optable[]) for direct dispatch. Faster due to better branch prediction. - Switch-based (
MRB_USE_VM_SWITCH_DISPATCH): a standardswitch(insn)statement. Default on MSVC and other compilers.
The dispatch loop is wrapped in MRB_TRY/MRB_CATCH for exception
handling (see Exception Handling).
When OP_SEND (or OP_SSEND, OP_SUPER) executes:
Determine argument layout. If argument count < 15, the fast path uses inline registers. Otherwise, arguments are packed into an Array (varargs mode).
ci = cipush(mrb, a, CINFO_DIRECT, NULL, NULL, blk, mid, argc);The new frame's stack starts at the previous frame's stack + a
(the receiver's register index).
The lookup sequence:
- Method cache check: hash table lookup by
(class, mid). Default cache size:MRB_METHOD_CACHE_SIZE(256 entries). - Method table walk: if cache misses, search the receiver's
class method table (
mt), then walk the superclass chain. - Cache store: on successful lookup, store in the cache.
The method cache is invalidated when classes are modified
(mrb_mc_clear_by_class).
- Ruby method (irep-based): extend the stack to
irep->nregs, setci->pctoirep->iseq, and jump to the new bytecode. - C function: call
func(mrb, recv)directly, then pop the call frame and store the return value.
Private methods are only callable without an explicit receiver.
Protected methods are callable from the same class hierarchy.
Violations raise NoMethodError.
By default, mruby uses setjmp/longjmp for exception control
flow:
MRB_TRY(&c_jmp) {
mrb->jmp = &c_jmp;
/* dispatch loop */
}
MRB_CATCH(&c_jmp) {
/* handle exception */
}
MRB_END_EXC(&c_jmp);With MRB_USE_CXX_EXCEPTION, C++ try/catch is used instead.
Each irep contains a catch handler table (appended after iseq in
memory) with entries for rescue and ensure blocks:
mrb_irep_catch_handler
+-- type RESCUE (0) or ENSURE (1)
+-- begin[4] start PC of protected range
+-- end[4] end PC of protected range
+-- target[4] jump target when handler matches
When an exception occurs:
- Search the current irep's catch handler table (reverse order) for a handler covering the current PC
- If an
ensurehandler is found: execute it (may re-raise) - If a
rescuehandler is found: jump to handler code - If no handler: pop the call frame (
cipop) and repeat with the parent frame CINFO_DIRECTframes are destroyed during propagation
Closures capture their enclosing scope's variables through REnv:
REnv
+-- stack pointer to captured variable values
+-- cxt owning context (NULL if detached from stack)
+-- mid method symbol
+-- flags length, block index, visibility
While the defining scope is active, REnv::stack points directly
into the VM value stack (shared). This avoids copying.
When a closure outlives its defining scope, mrb_env_unshare()
copies the captured variables from the stack to a heap-allocated
buffer:
mrb_env_unshare(mrb, env, noraise);After unsharing, MRB_ENV_CLOSE(env) sets cxt = NULL to indicate
the environment is detached. A write barrier is issued for GC
correctness.
| Flag | Meaning |
|---|---|
MRB_PROC_CFUNC_FL |
C function (not irep-based) |
MRB_PROC_STRICT |
Lambda (strict argument check) |
MRB_PROC_ORPHAN |
No environment attachment |
MRB_PROC_ENVSET |
Has captured environment |
MRB_PROC_SCOPE |
Defines a new variable scope |
Fibers are lightweight coroutines. Each fiber has its own
mrb_context with separate value and call info stacks.
CREATED --> RUNNING --> SUSPENDED --> TERMINATED
| ^
+-----------+
(yield/resume)
TRANSFERRED (via Fiber#transfer)
On Fiber#resume:
- Save current context state
- Set
mrb->cto the fiber's context - Push arguments onto the fiber's stack
- Continue execution in the fiber
On Fiber.yield:
- Save fiber context
- Restore the parent context (
mrb->c = c->prev) - Return yield values to the parent
When a fiber completes (fiber_terminate):
- Unshare any environments that reference the fiber's stack
- Set status to
TERMINATED - Free the fiber's stacks
- Switch to the previous context
Fibers cannot yield across C function boundaries. You cannot call
Fiber.yield from within a C-implemented method (except via
mrb_fiber_yield at return). This is because C call frames cannot
be suspended and resumed.
The VM saves the arena index at the start of the dispatch loop:
int ai = mrb_gc_arena_save(mrb);After each C function call, the arena is shrunk back:
mrb_gc_arena_shrink(mrb, ai);This prevents temporary objects created by C functions from accumulating in the arena.
Write barriers are issued when environments are detached or closed, ensuring the incremental GC correctly tracks live references.
| File | Contents |
|---|---|
src/vm.c |
Dispatch loop, method invocation (~1900 lines) |
include/mruby.h |
mrb_state, mrb_callinfo, mrb_context |
include/mruby/proc.h |
RProc, REnv structures |
include/mruby/throw.h |
MRB_TRY/MRB_CATCH macros |