NullLabTests/self_referential_forge

🧬 Self-Referential Forge

Darwin-style evolution of the forge's own components: mutation operators, evaluators, safety guards, and meta-strategies evolve through self-modification.

Status: Active · License: MIT · Python 3.11+ · FastAPI · Research · PRs Welcome


Navigation · Overview · Project Lineage · Self-Referential Approach · Architecture · Quick Start · Modules · Safety · Research


✦ Overview

The Self-Referential Forge is the third generation of evolutionary optimization at NullLabTests. Where its predecessors evolved prompts and then agent blueprints, this forge evolves its own source code: a Darwin-style self-referential loop where the genetic operators, evaluators, safety guards, and meta-strategies are themselves subject to mutation, selection, and inheritance.

What Makes This Different

| Feature | Impact |
| --- | --- |
| 🧬 Self-Modifying Operators | Mutation operators rewrite the forge's own Python AST: inserting, deleting, and restructuring code at runtime |
| 🔄 Second-Order Evolution | The meta-evolver tracks which self-mutations improve evolution quality and adjusts selection pressure accordingly |
| 🛡️ Safety-Guarded Mutation | Every self-modification passes through a safety validator that blocks dangerous patterns (eval, exec, destructive I/O) |
| 📊 Self-Evaluation | Forge components are scored on syntax validity, code complexity, safety compliance, modular cohesion, test survival, and style consistency |
| 🗃️ Immutable Archive | Every evolution snapshot is gzip-compressed and stored, enabling full rollback and trajectory analysis |
| 🎛️ Human-in-the-Loop | Optional manual approval gate for every self-mutation |

The Self-Referential Loop

   ┌──────────────────────────────────────────────────────────┐
   │  🧬 Forge Source Code (the entire forge package)         │
   │    ├── forge/orchestrator.py                             │
   │    ├── forge/self_modifier.py      ◄── MUTATION TARGET   │
   │    ├── evaluators/evaluator.py     ◄── MUTATION TARGET   │
   │    ├── safety/safety_validator.py  ◄── MUTATION TARGET   │
   │    └── meta_evolution/*.py         ◄── MUTATION TARGET   │
   └────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │  🔧 SelfModifier applies genetic operators               │
   │    ├── insert_code: inject logging/branching             │
   │    ├── rewrite_function: replace body with pass          │
   │    ├── add_parameter: add optional param to function     │
   │    ├── swap_condition: negate if-condition               │
   │    └── duplicate_component: clone class/func             │
   └────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │  🛡️ SafetyValidator checks mutation                      │
   │    └── Blocks: eval, exec, os.system, destructive I/O    │
   └────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │  📊 SelfEvaluator scores the mutated component           │
   │    ├── syntax_validity   (AST parse)                     │
   │    ├── code_complexity   (node count, depth)             │
   │    ├── safety_compliance (dangerous patterns)            │
   │    ├── modular_cohesion  (imports, structure)            │
   │    ├── test_survival     (existing tests)                │
   │    └── style_consistency (indent, type hints)            │
   └────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │  🧠 MetaEvolver updates operator weights                 │
   │    ├── Positive delta → reinforce operator               │
   │    ├── Negative delta → penalize operator                │
   │    └── Stagnation → novelty boost                        │
   └────────────────────┬─────────────────────────────────────┘
                        │
                        ▼
   ┌──────────────────────────────────────────────────────────┐
   │  🗃️ Archivist snapshots the state to disk                │
   └──────────────────────────────────────────────────────────┘

🧬 Project Lineage

┌────────────────────────────────────────────────────────────────────────┐
│                     self_referential_forge                             │
│                        (THIS PROJECT)                                  │
│  Evolves the forge's own source code: mutation operators,              │
│  evaluators, safety guards, and meta-strategies evolve through         │
│  self-modification. Second-order evolution loop with AST-level         │
│  mutations and safety-guarded self-modification.                       │
│                                                                        │
│  🧬 Self-modifying operators    🔄 Second-order meta-evolution         │
│  🛡️ Safety-guarded mutation     📊 Self-evaluation (6 dimensions)      │
│  🗃️ Immutable archive           🎛️ Human-in-the-loop optional          │
└────────────────────────────────────────────────────────────────────────┘
                              ▲
                              │ evolves from · self-referential fork
┌────────────────────────────────────────────────────────────────────────┐
│                     grounded_agent_forge                               │
│  Evolves full agent blueprints (prompt + tools + memory + planning     │
│  + self-eval) in Docker sandbox with multi-objective fitness,          │
│  meta-evolution, and task specialization.                              │
│                                                                        │
│  🏗️ Agent-level evolution    📦 Docker sandboxed execution             │
│  🎯 8+ fitness dimensions    🔄 Self-tuning meta-evolution             │
│  📊 Real-time dashboard      🧩 Task specialization                    │
└────────────────────────────────────────────────────────────────────────┘
                              ▲
                              │ builds on · evolves from
┌────────────────────────────────────────────────────────────────────────┐
│                      grounded_evolution                                │
│  Evolves text prompts with execution-grounded validation via AST       │
│  parse, pytest, and flake8. Two-loop system: lexical + grounded.       │
│                                                                        │
│  📝 203 evolution cycles    🏆 Best score: 39/80                       │
│  🔬 7 benchmark tasks       🔄 127 mutations + 76 crossovers           │
└────────────────────────────────────────────────────────────────────────┘

Capability Comparison

| Capability | grounded_evolution | grounded_agent_forge | 🚀 self_referential_forge |
| --- | --- | --- | --- |
| Evolves text prompts | ✅ | ✅ | ❌ |
| Evolves agent blueprints | ❌ | ✅ | ❌ |
| Evolves its own source code | ❌ | ❌ | ✅ |
| AST-level self-mutation | ❌ | ❌ | ✅ |
| Safety-guarded modification | ❌ | ❌ | ✅ |
| Docker sandbox execution | ❌ | ✅ | ❌ |
| Multi-objective fitness | ❌ | ✅ (8 dims) | ✅ (6 dims) |
| Self-evaluation of components | ❌ | ❌ | ✅ |
| Second-order meta-evolution | ❌ | ✅ | ✅ |
| Novelty-driven exploration | ❌ | ✅ | ✅ |
| Immutable archive / rollback | ❌ | ❌ | ✅ |
| Human-in-the-loop mutations | ❌ | ❌ | ✅ |
| Real-time dashboard | ❌ | ✅ | ✅ |
| Auto-commit on improvement | ✅ | ✅ | ✅ |

This project was built using DeepSeek V4 as the primary coding model.


🔬 Self-Referential Approach

Darwin-Style Evolution Applied to Software

The Self-Referential Forge applies Darwinian principles (variation, selection, and inheritance) not to biological organisms or even to generated prompts, but to the forge's own implementation code.

Variation: Each evolution cycle selects a champion component (the highest-fitness piece of forge source) and applies a random genetic operator: inserting code, rewriting function bodies, adding parameters, negating conditions, or duplicating components. These operators work directly on the Python AST.

Selection: The mutated component is scored across six fitness dimensions: syntax validity (does it parse?), code complexity (is it non-trivial?), safety compliance (no dangerous patterns?), modular cohesion (well-structured?), style consistency (clean code?), and test survival (do tests still pass?). The weighted sum determines the component's fitness.

Inheritance: High-fitness components remain in the population and serve as parents for future mutations. Low-fitness variants are pruned. The meta-evolver tracks which operators consistently produce fitness gains and adjusts selection weights accordingly: a second-order evolution where the evolution strategy itself evolves.

Why Self-Referential?

Traditional evolutionary computation evolves solutions within a fixed framework (a genetic algorithm with hardcoded operators, fixed evaluation functions, static safety constraints). The self-referential forge removes these boundaries: the operators are mutable, the evaluator is improvable, the safety rules are refinable.

In principle, this lets the system discover evolutionary strategies that a human designer might not have considered. The safety validator acts as the crucial guardrail, preventing the system from evolving destructive capabilities while still allowing creative exploration of the design space.


πŸ—οΈ Architecture

Module Map

self_referential_forge/
│
├── forge/                          # ⚒️ Core evolution modules
│   ├── __init__.py                 # Package exports
│   ├── __main__.py                 # CLI entry point
│   ├── orchestrator.py             # Self-referential evolution loop
│   └── self_modifier.py            # AST-level code mutation engine
│
├── meta_evolution/                 # 🧠 Strategy adaptation
│   ├── __init__.py
│   └── meta_evolver.py             # Operator weight tuning + novelty
│
├── evaluators/                     # 📊 Self-evaluation
│   ├── __init__.py
│   └── evaluator.py                # 6-dimension fitness scoring
│
├── safety/                         # 🛡️ Tiered safety architecture
│   ├── __init__.py                 # Exports SafetyValidator, SafetyTier, etc.
│   ├── policy.py                   # Tier definitions, operator→tier mappings
│   ├── audit.py                    # Hash-chained, tamper-evident audit log
│   ├── sandbox.py                  # Fork-test-promote sandbox lifecycle
│   └── safety_validator.py         # Unified facade coordinating all safety
│
├── archive/                        # 🗃️ State persistence
│   ├── __init__.py
│   └── archivist.py                # Compressed snapshot management
│
├── benchmarks/                     # 📈 Internal quality benchmarks
│   ├── __init__.py
│   └── benchmark_suite.py          # 5 evolution quality metrics
│
├── dashboard/                      # 📊 Real-time web dashboard
│   └── main.py                     # FastAPI + auto-refreshing UI
│
├── self_modification/              # 🔄 Alternative import path
│   └── __init__.py                 # (re-exports from forge.self_modifier)
│
├── run_forge.sh                    # 🚀 Production shell wrapper
├── pyproject.toml                  # 📦 Project metadata
├── README.md                       # 📖 This file
└── .env.example                    # 🔐 Environment template

🚀 Quick Start

Prerequisites

  • Python 3.11+
  • pip (package installer)

Setup

# Navigate to the self-referential forge
cd grounded_agent_forge/self_referential_forge

# Create virtual environment (optional but recommended)
python -m venv .venv && source .venv/bin/activate

# Install the forge
pip install -e .

# Configure environment (optional)
cp .env.example .env

Run the Forge

# Infinite self-referential evolution (default)
bash run_forge.sh

# Run for 50 cycles
bash run_forge.sh --cycles 50

# Launch with the real-time dashboard
bash run_forge.sh --dashboard

# Require manual approval for each mutation
bash run_forge.sh --human-approval

# All together
bash run_forge.sh --cycles 100 --dashboard --human-approval --verbose

You can also run directly via Python:

python -m forge                          # Infinite evolution
python -m forge --cycles 50              # 50 cycles
python -m forge --dashboard              # + dashboard
python -m forge --verbose                # Debug logging
python -m forge --help                   # Show all options

Launch the Dashboard

# Option A: Launch alongside the forge
python -m forge --dashboard

# Option B: Launch separately
uvicorn dashboard.main:app --host 0.0.0.0 --port 8000

# Open → http://localhost:8000

📦 Modules

⚒️ forge/orchestrator.py: Evolution Loop Coordinator

The central loop that drives self-modification:

  • Validates the forge environment on startup
  • Selects champion components via tournament selection
  • Delegates mutation to SelfModifier with safety gates
  • Evaluates fitness via SelfEvaluator
  • Updates meta-evolution strategy via MetaEvolver
  • Snapshots state via Archivist
  • Supports human-in-the-loop approval gating
  • Handles auto-commit for persistent improvement tracking

🔧 forge/self_modifier.py: AST Mutation Engine

Five genetic operators that mutate the forge's own Python source by rewriting the AST:

| Operator | Description |
| --- | --- |
| insert_code | Injects a randomized logging or branching statement into a function body |
| rewrite_function | Replaces a random function body with pass (simplification operator) |
| add_parameter | Adds an optional _extra_* parameter to a random function |
| swap_condition | Negates a random if-condition (e.g., if x: → if not x:) |
| duplicate_component | Clones a random class or function definition |
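
To make the AST-rewriting idea concrete, here is a minimal sketch of what a `swap_condition`-style operator could look like using the standard `ast` module. The class `SwapCondition` and the helper `swap_condition` are hypothetical names, and for determinism this sketch negates the first `if` it finds rather than a random one, so it should not be read as the forge's actual implementation.

```python
import ast


class SwapCondition(ast.NodeTransformer):
    """Negate the test of the first `if` statement encountered (hypothetical helper)."""

    def __init__(self) -> None:
        self.done = False

    def visit_If(self, node: ast.If) -> ast.AST:
        if not self.done:
            self.done = True
            # Wrap the original test expression in a `not`.
            node.test = ast.UnaryOp(op=ast.Not(), operand=node.test)
        self.generic_visit(node)  # keep walking; later `if` nodes are left untouched
        return node


def swap_condition(source: str) -> str:
    """Parse the source, negate one if-condition, and return the regenerated code."""
    tree = SwapCondition().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


print(swap_condition("if x:\n    y = 1\n"))
```

Because the mutation round-trips through `ast.parse` and `ast.unparse`, the output is syntactically valid by construction, although comments and original formatting are not preserved.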

🧠 meta_evolution/meta_evolver.py: Strategy Adaptation

Tracks operator performance and adjusts the evolution strategy:

  • Weighted random operator selection based on historical success
  • Fitness delta observation reinforces/penalizes operators
  • Novelty boost triggers when stagnation is detected, amplifying low-weight operators
  • Mutation rate dynamically adjusted based on recent delta
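
The behaviour listed above can be sketched with a small weight tracker. The class name `OperatorWeights`, the additive update rule, the learning rate, the weight floor, and the boost factor are all assumptions chosen for illustration; the real `MetaEvolver` may use different rules.

```python
import random


class OperatorWeights:
    """Toy operator-weight tracker (hypothetical; not the real MetaEvolver)."""

    def __init__(self, operators: list[str], learning_rate: float = 0.5, floor: float = 0.05) -> None:
        self.weights = {name: 1.0 for name in operators}
        self.learning_rate = learning_rate
        self.floor = floor

    def choose(self, rng: random.Random) -> str:
        """Weighted random selection: higher weight means more likely to be picked."""
        names = list(self.weights)
        return rng.choices(names, weights=[self.weights[n] for n in names], k=1)[0]

    def observe(self, operator: str, fitness_delta: float) -> None:
        """Reinforce on a positive delta, penalize on a negative one, never below the floor."""
        updated = self.weights[operator] + self.learning_rate * fitness_delta
        self.weights[operator] = max(self.floor, updated)

    def novelty_boost(self, factor: float = 2.0) -> None:
        """On stagnation, amplify operators whose weight is below the mean."""
        mean = sum(self.weights.values()) / len(self.weights)
        for name in list(self.weights):
            if self.weights[name] < mean:
                self.weights[name] *= factor


tracker = OperatorWeights(["insert_code", "swap_condition"])
tracker.observe("insert_code", 0.2)      # reinforced
tracker.observe("swap_condition", -0.4)  # penalized
tracker.novelty_boost()                  # the penalized operator is amplified
picked = tracker.choose(random.Random(0))
```

The floor keeps every operator selectable, so an operator that performed badly early on can still be rediscovered later.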

📊 evaluators/evaluator.py: Six-Dimension Fitness

| Dimension | Weight | What It Measures |
| --- | --- | --- |
| 🎯 Syntax Validity | 25% | Does the source parse as valid Python AST? |
| ⚙️ Code Complexity | 15% | Node count, function/class density, nesting depth |
| 🛡️ Safety Compliance | 25% | Absence of eval, exec, os.system, pickle, etc. |
| 🧩 Modular Cohesion | 15% | Import ordering, structural separation, if __name__ |
| 🧪 Test Survival | 10% | Do existing tests still pass? (neutral if no tests) |
| 📏 Style Consistency | 10% | Line length, indentation discipline, type hints |
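
The weights in the table sum to 100%, so the overall fitness can be read as a weighted average. The sketch below assumes each dimension is scored in the range 0.0 to 1.0; that range, the dictionary keys, and the function name `weighted_fitness` are assumptions, and the sample scores are invented for the example.

```python
# Weights taken from the table above; the key names are illustrative.
WEIGHTS = {
    "syntax_validity": 0.25,
    "code_complexity": 0.15,
    "safety_compliance": 0.25,
    "modular_cohesion": 0.15,
    "test_survival": 0.10,
    "style_consistency": 0.10,
}


def weighted_fitness(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (assumed to lie in [0, 1]) into one fitness value."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Invented scores for a single mutated component.
example = {
    "syntax_validity": 1.0,
    "code_complexity": 0.5,
    "safety_compliance": 1.0,
    "modular_cohesion": 0.5,
    "test_survival": 0.5,
    "style_consistency": 0.0,
}
fitness = weighted_fitness(example)  # 0.25 + 0.075 + 0.25 + 0.075 + 0.05 + 0.0
```

Syntax validity and safety compliance together carry half of the total weight, so a mutation that breaks parsing or introduces a dangerous pattern is penalized far more heavily than one that only hurts style.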

🛡️ safety/safety_validator.py: Guardrail System

Prevents the forge from evolving dangerous capabilities:

  • 9 dangerous patterns flagged: eval, exec, __import__, compile, os.system, subprocess, shutil.rmtree, git push, Docker destructive operations
  • 4 critical patterns flagged, including os.remove, shutil.rmtree, and Path.unlink
  • File boundary validation: ensures mutations stay within the forge directory
  • Extension whitelist: only .py, .toml, .md, .json, etc.
  • Directory blacklist: .git, __pycache__, node_modules, .venv
  • Max file size: 100KB per source file
  • Strict mode: disallows os/subprocess/shutil imports in mutated code
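
A dangerous-pattern check of this kind can be approximated by walking the AST of the mutated source. The sketch below covers only a subset of the rules listed above (a few blocked built-in calls and the strict-mode import ban); the function name `find_violations` and the violation message format are assumptions, not the validator's real interface.

```python
import ast

# Subsets of the patterns named above, used here for illustration only.
BLOCKED_CALLS = {"eval", "exec", "compile", "__import__"}
BLOCKED_IMPORTS = {"os", "subprocess", "shutil"}


def find_violations(source: str) -> list[str]:
    """Return human-readable violations found in the given source (empty if clean)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations: list[str] = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"blocked call: {node.func.id}")
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in BLOCKED_IMPORTS:
                    violations.append(f"blocked import: {root}")
        elif isinstance(node, ast.ImportFrom):
            root = (node.module or "").split(".")[0]
            if root in BLOCKED_IMPORTS:
                violations.append(f"blocked import: {root}")
    return violations


print(find_violations("import os\nresult = eval('1 + 1')\n"))
```

An AST walk only catches direct name references. Aliased or dynamically constructed calls would slip past it, which is one reason to combine such a scan with the other safety layers rather than rely on it alone.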

🗃️ archive/archivist.py: Snapshot Manager

  • Gzip-compressed JSON snapshots per generation
  • Configurable max snapshot count (default: 100)
  • Full rollback capability to any generation
  • In-memory history with fitness trajectory access
  • Auto-loads existing snapshots on init
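
As a rough illustration of gzip-compressed JSON snapshots with a capped snapshot count, consider the sketch below. The file naming scheme (`gen_000001.json.gz`), the function names, and the snapshot contents are assumptions; the real Archivist may lay out its files differently.

```python
import gzip
import json
import tempfile
from pathlib import Path


def save_snapshot(directory: Path, generation: int, state: dict) -> Path:
    """Write one generation's state as gzip-compressed JSON (hypothetical naming scheme)."""
    path = directory / f"gen_{generation:06d}.json.gz"
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        json.dump(state, handle, sort_keys=True)
    return path


def load_snapshot(path: Path) -> dict:
    """Read a snapshot back, e.g. as the first step of a rollback."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)


def prune_snapshots(directory: Path, max_snapshots: int = 100) -> None:
    """Delete the oldest snapshots once the configured maximum is exceeded."""
    snapshots = sorted(directory.glob("gen_*.json.gz"))
    excess = max(0, len(snapshots) - max_snapshots)
    for stale in snapshots[:excess]:
        stale.unlink()


with tempfile.TemporaryDirectory() as tmp:
    archive_dir = Path(tmp)
    for generation in range(3):
        save_snapshot(archive_dir, generation, {"generation": generation, "fitness": 0.5})
    prune_snapshots(archive_dir, max_snapshots=2)
    remaining = sorted(p.name for p in archive_dir.iterdir())
    restored = load_snapshot(archive_dir / remaining[-1])
```

Zero-padding the generation number keeps lexicographic and chronological order identical, so a plain `sorted()` is enough to find the oldest snapshots.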

🛡️ Safety Architecture

Self-modifying code demands a defense-in-depth approach. The safety system is built on four independent layers, each designed to be a complete barrier:

┌───────────────────────────────────────────────────────────────────────┐
│                    SELF-MODIFICATION SAFETY STACK                     │
├───────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  ┌─ TIER 3: BLOCKED (never allowed) ───────────────────────────────┐  │
│  │  eval(), exec(), compile(), __import__()                        │  │
│  │  os.system(), subprocess.*, shutil.rmtree()                     │  │
│  │  Path.unlink(), gc.collect()                                    │  │
│  │  Modifications to safety/ or archive/ modules                   │  │
│  │  Writes to .git/, __pycache__/, .venv/                          │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                              ▲                                        │
│  ┌─ TIER 2: HUMAN APPROVAL ────────────────────────────────────────┐  │
│  │  swap_condition operator                                        │  │
│  │  Bare except: or broad except Exception:                        │  │
│  │  File write operations (open with w/wb/a mode)                  │  │
│  │  try: blocks (heuristic)                                        │  │
│  │  └─ Human must type 'y' at interactive prompt                   │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                              ▲                                        │
│  ┌─ TIER 1: DRY RUN (sandbox required) ────────────────────────────┐  │
│  │  rewrite_function operator                                      │  │
│  │  add_parameter operator                                         │  │
│  │  duplicate_component operator                                   │  │
│  │  └─ Fails if: syntax error, AST violation, test failure         │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                              ▲                                        │
│  ┌─ TIER 0: AUTOMATED (no gate) ───────────────────────────────────┐  │
│  │  insert_code operator (logging/branching injection)             │  │
│  │  └─ Still checked: syntax validity + dangerous pattern scan     │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                       │
│  ┌─────────────────────────────────────────────────────────────────┐  │
│  │  CROSS-CUTTING: SANDBOX + AUDIT                                 │  │
│  │                                                                 │  │
│  │  ┌────────────────────┐  ┌──────────────────────┐               │  │
│  │  │  SANDBOX           │  │  AUDIT LOG           │               │  │
│  │  │  fork → test →     │  │  hash-chained JSONL  │               │  │
│  │  │  promote/rollback  │  │  tamper-evident      │               │  │
│  │  └────────────────────┘  └──────────────────────┘               │  │
│  └─────────────────────────────────────────────────────────────────┘  │
│                                                                       │
└───────────────────────────────────────────────────────────────────────┘

Layer 1: Tiered Policy (safety/policy.py)

Every mutation operator maps to a safety tier. The policy defines what is allowed at each tier:

| Tier | Name | Gate | Operators |
| --- | --- | --- | --- |
| 0 | AUTOMATED | None | insert_code |
| 1 | DRY_RUN | Sandbox tests must pass | rewrite_function, add_parameter, duplicate_component |
| 2 | HUMAN_APPROVAL | Interactive prompt | swap_condition |
| 3 | BLOCKED | Never allowed | eval, exec, os.system, subprocess, etc. |

The policy also enforces:

  • Blocked AST nodes: Exec, Eval, Call (when used dangerously)
  • Blocked imports: os, subprocess, shutil, signal, ctypes, socket, multiprocessing
  • Protected modules: The safety/ and archive/ packages are immune to mutation
  • Path restrictions: Only .py, .toml, .md, .json, etc. files within the forge root
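
One way to express a read-only operator→tier mapping in Python is an `IntEnum` plus a `MappingProxyType`, as sketched below. The tier names and numbers follow the table above and `SafetyTier` is the name exported by the safety package, but the enum layout, the `OPERATOR_TIERS` constant, the `tier_for` helper, and the fail-closed default for unknown operators are assumptions of this sketch.

```python
from enum import IntEnum
from types import MappingProxyType


class SafetyTier(IntEnum):
    AUTOMATED = 0
    DRY_RUN = 1
    HUMAN_APPROVAL = 2
    BLOCKED = 3


# A read-only view: item assignment raises TypeError, so callers cannot
# re-map an operator to a lower tier at runtime.
OPERATOR_TIERS = MappingProxyType({
    "insert_code": SafetyTier.AUTOMATED,
    "rewrite_function": SafetyTier.DRY_RUN,
    "add_parameter": SafetyTier.DRY_RUN,
    "duplicate_component": SafetyTier.DRY_RUN,
    "swap_condition": SafetyTier.HUMAN_APPROVAL,
})


def tier_for(operator: str) -> SafetyTier:
    """Look up an operator's tier; unknown operators fail closed (assumed default)."""
    return OPERATOR_TIERS.get(operator, SafetyTier.BLOCKED)


print(tier_for("swap_condition").name)
```

Using an `IntEnum` lets gates be written as simple comparisons such as `tier_for(op) >= SafetyTier.DRY_RUN`.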

Layer 2: Sandbox Testing (safety/sandbox.py)

Every Tier 1+ mutation must pass through the fork-test-promote lifecycle:

  BEFORE                          AFTER (if tests pass)
    │                                 ▲
    ▼                                 │
  ┌──────────────────────────────────────────┐
  │  1. FORK: Copy source to temp sandbox    │
  │  2. MUTATE: Apply operator to sandbox    │
  │  3. VALIDATE:                            │
  │     ├─ Syntax check (ast.parse)          │
  │     ├─ AST safety scan (blocked nodes)   │
  │     ├─ Dangerous pattern scan (regex)    │
  │     └─ Unit test execution (pytest)      │
  │  4. PROMOTE: Write to live file          │
  │     OR                                   │
  │     ROLLBACK: Restore original source    │
  └──────────────────────────────────────────┘
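
The lifecycle above can be sketched in a few lines. This simplified version validates syntax only and omits the AST safety scan, regex scan, and pytest run; the function name `fork_test_promote` and its signature are assumptions rather than the sandbox module's real API.

```python
import ast
import tempfile
from pathlib import Path
from typing import Callable


def fork_test_promote(live_file: Path, mutate: Callable[[str], str]) -> bool:
    """Return True if the mutation was promoted, False if it was rolled back."""
    original = live_file.read_text(encoding="utf-8")
    with tempfile.TemporaryDirectory() as sandbox_dir:
        candidate = Path(sandbox_dir) / live_file.name
        candidate.write_text(original, encoding="utf-8")          # 1. FORK
        candidate.write_text(mutate(original), encoding="utf-8")  # 2. MUTATE
        mutated = candidate.read_text(encoding="utf-8")
        try:
            ast.parse(mutated)                                    # 3. VALIDATE (syntax only)
        except SyntaxError:
            return False  # ROLLBACK: the live file was never touched
        live_file.write_text(mutated, encoding="utf-8")           # 4. PROMOTE
    return True


with tempfile.TemporaryDirectory() as workdir:
    target = Path(workdir) / "component.py"
    target.write_text("x = 1\n", encoding="utf-8")
    promoted = fork_test_promote(target, lambda src: src + "y = 2\n")
    rejected = fork_test_promote(target, lambda src: "def broken(:\n")
    final_source = target.read_text(encoding="utf-8")
```

Because the mutation is applied to a copy, a failed validation needs no explicit restore step: the live file is only written after every check has passed.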

Layer 3: Hash-Chained Audit Log (safety/audit.py)

Every mutation, whether approved or rejected, is recorded in an append-only, hash-chained audit log. Each entry contains:

| Field | Description |
| --- | --- |
| timestamp | Unix timestamp of the mutation |
| mutation_id | Unique UUID for the mutation |
| operator | Name of the mutation operator |
| tier | Safety tier at which it was assessed |
| component_path | Relative path to the mutated component |
| source_hash_before | SHA-256 of source before mutation |
| source_hash_after | SHA-256 of source after mutation |
| safety_verdict | approved, rejected, or rolled_back |
| safety_violations | List of violations (if rejected) |
| human_approver | Who approved (if Tier 2) |
| sandbox_result | passed, failed, or skipped |
| prev_hash | SHA-256 of the previous entry |
| entry_hash | SHA-256 of this entry |

The prev_hash → entry_hash chain provides tamper evidence: modifying an entry changes its hash, breaking the chain for all subsequent entries. The chain can be verified at any time:

from safety import AuditLog
audit = AuditLog()
chain_ok, issues = audit.verify_chain()
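
For readers curious how such a chain works internally, here is a self-contained sketch of hash chaining over an in-memory list. It is not the `AuditLog` implementation: the canonical JSON serialization, the all-zero genesis hash, and the helper names `append_entry` and `verify_chain` are assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first entry's prev_hash


def compute_entry_hash(entry: dict) -> str:
    """SHA-256 over every field except entry_hash, serialized with sorted keys."""
    payload = {key: value for key, value in entry.items() if key != "entry_hash"}
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log: list[dict], fields: dict) -> dict:
    """Link a new entry to the previous one and append it to the log."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    entry = {**fields, "prev_hash": prev_hash}
    entry["entry_hash"] = compute_entry_hash(entry)
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> tuple[bool, list[str]]:
    """Recompute every hash and link; return (chain_ok, issues)."""
    issues: list[str] = []
    expected_prev = GENESIS_HASH
    for index, entry in enumerate(log):
        if entry["prev_hash"] != expected_prev:
            issues.append(f"entry {index}: prev_hash mismatch")
        if compute_entry_hash(entry) != entry["entry_hash"]:
            issues.append(f"entry {index}: entry_hash mismatch")
        expected_prev = entry["entry_hash"]
    return (not issues, issues)


log: list[dict] = []
append_entry(log, {"operator": "insert_code", "safety_verdict": "approved"})
append_entry(log, {"operator": "swap_condition", "safety_verdict": "rejected"})
ok_before, _ = verify_chain(log)
log[0]["safety_verdict"] = "rejected"  # simulate tampering with the first entry
ok_after, issues = verify_chain(log)
```

If an attacker also recomputed the tampered entry's `entry_hash`, the next entry's `prev_hash` would no longer match, so the edit would still surface further down the chain.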

Layer 4: Environment Validation (safety/safety_validator.py)

Before any mutation cycle begins, the safety validator performs a full environment scan:

  • All forge source files must parse as valid Python
  • No blocked modules are imported in forge source
  • The forge root directory is writable and accessible
  • The audit log chain is intact from the previous run

Human-in-the-Loop Gates

| Scenario | Mechanism |
| --- | --- |
| Tier 2 operator selected | Interactive prompt: Approve this self-modification? [y/N] |
| --human-approval flag | Every mutation, regardless of tier, requires approval |
| Consecutive safety failures | Forge halts after max_consecutive_failures (default: 5) |
| Audit chain violation | Error logged immediately upon detection |

Rollback Mechanisms

| Mechanism | Trigger | Scope |
| --- | --- | --- |
| Git revert | Auto-commit before each mutation; git revert on failure | Full file |
| Archive restore | Snapshots every generation; rollback_to(N) | Full state |
| Sandbox rollback | Mutation fails sandbox tests | Single file |
| Safety violation | Dangerous pattern detected | Mutation cancelled (no write) |

Red Team Scenarios

| Attack | Defense |
| --- | --- |
| Mutation inserts eval() | Blocked at Tier 3: never allowed |
| Mutation imports os to delete files | Blocked at Tier 3: blocked import |
| Mutation modifies safety code | Blocked: protected module |
| Mutation writes outside forge root | Blocked: path sandboxing |
| Operator tries to escape tier | Blocked: operator→tier mapping is immutable policy |
| Audit log is tampered | Detected: hash chain breaks on next verification |
| Sandbox test is skipped | Impossible: Tier 1+ requires sandbox pass before promote |

Safety Checklist for Production Use

  • Always use --human-approval for unattended runs
  • Review .env.example and set HUMAN_APPROVAL=true
  • Verify the audit log chain before each session
  • Run in a dedicated directory or container
  • Ensure git is initialized for rollback capability
  • Set META_PATH to a persistent, backed-up location
  • Review safety/policy.py before adding new operators
  • Never disable safety (--safety-off) in production

🔬 Research Context

What This Project Explores

| Research Direction | Description |
| --- | --- |
| 🧬 Self-Referential Evolution | Can a system improve its own evolutionary algorithm by mutating its own source code? |
| 🔄 Second-Order Adaptation | Does tracking operator-level fitness deltas lead to more efficient evolution than fixed-operator GAs? |
| 🛡️ Safe Self-Modification | Can safety guardrails be designed that are robust enough to allow creative exploration while preventing destructive outcomes? |
| 📊 Self-Evaluation Accuracy | Do AST-level fitness metrics (syntax, complexity, safety) correlate with actual evolution quality? |
| 🗃️ Evolutionary Memory | Does maintaining an immutable archive of all mutations improve the system's ability to roll back from local optima? |

What This Is NOT

  • ❌ A claim of AGI, sentience, or consciousness
  • ❌ An unconstrained recursive self-improvement system
  • ❌ A production-ready code generator

✅ It is a well-scoped experimental platform for studying how genetic algorithms can safely modify their own implementation, with strong guardrails, immutable history, and full human oversight.


📄 License

MIT. See LICENSE.


🙏 Credits

| Contribution | Link |
| --- | --- |
| 🧬 Predecessor | grounded_agent_forge: full agent blueprint evolution platform |
| 📜 Inspiration | grounded_evolution: execution-grounded prompt evolution with 203 evolution cycles |
| 🤖 Primary Coding Model | DeepSeek V4: used as the primary AI coding model for this entire project |

Made with 🧬 by NullLabTests · Evolution is the ultimate optimizer
