This document covers LLM configuration, pipeline internals, and how to extend the system.
LLM integration is handled by `utils/call_llm.py`, which supports multiple providers with automatic detection.
| Provider | Detection | Default Model |
|---|---|---|
| Gemini | `GEMINI_API_KEY` or `GEMINI_PROJECT_ID` | `gemini-2.5-pro-exp-03-25` |
| Anthropic | `ANTHROPIC_API_KEY` | `claude-3-5-sonnet-20241022` |
| OpenAI-compatible | `LLM_PROVIDER` + `{PROVIDER}_BASE_URL` | Configurable |
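
The detection checks environment variables in priority order, with an explicit `LLM_PROVIDER` override winning. A condensed sketch of what this might look like (the real `get_llm_provider()` in `utils/call_llm.py` handles more cases):

```python
import os

def get_llm_provider() -> str | None:
    provider = os.getenv("LLM_PROVIDER")  # explicit override wins
    if not provider and (os.getenv("GEMINI_API_KEY") or os.getenv("GEMINI_PROJECT_ID")):
        provider = "GEMINI"
    if not provider and os.getenv("ANTHROPIC_API_KEY"):
        provider = "ANTHROPIC"
    return provider
```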
For Gemini 2.5 models, the `thinking_budget` setting controls the trade-off between reasoning depth and speed:
| Setting | Speed | Quality | Use Case |
|---|---|---|---|
| 0 | Fastest | Lower | Quick tests, simple tasks |
| 1024 (default) | Fast | Good | Production (recommended) |
| 8192 | Slow | Best | Deep analysis, complex code |
| -1 | Variable | Dynamic | Model auto-adjusts |
```bash
# Environment variable (default: 1024)
export GEMINI_THINKING_BUDGET=1024

# For fastest processing (disable thinking)
export GEMINI_THINKING_BUDGET=0

# For best quality (full thinking)
export GEMINI_THINKING_BUDGET=8192
```

Impact: with `thinking_budget=1024`, the Pro model processes large files in ~4 minutes, versus 24-80+ minutes at the model's own default of 8192.
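
Internally, the budget is passed through to the model call. A sketch of how this might look with the `google-genai` SDK (the exact wiring in `utils/call_llm.py` may differ):

```python
import os
from google import genai
from google.genai import types

def call_gemini(prompt: str) -> str:
    client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
    response = client.models.generate_content(
        model=os.getenv("GEMINI_MODEL", "gemini-2.5-pro"),
        contents=prompt,
        config=types.GenerateContentConfig(
            thinking_config=types.ThinkingConfig(
                # Budget read from the environment, defaulting to 1024
                thinking_budget=int(os.getenv("GEMINI_THINKING_BUDGET", "1024"))
            )
        ),
    )
    return response.text
```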
```bash
# Google Gemini (recommended)
export GEMINI_API_KEY=your_key
export GEMINI_MODEL=gemini-2.5-pro

# Anthropic Claude
export ANTHROPIC_API_KEY=your_key
export ANTHROPIC_MODEL=claude-3-5-sonnet-20241022

# Ollama (local)
export LLM_PROVIDER=OLLAMA
export OLLAMA_MODEL=llama2
export OLLAMA_BASE_URL=http://localhost:11434

# OpenRouter
export LLM_PROVIDER=OPENROUTER
export OPENROUTER_MODEL=anthropic/claude-3-opus
export OPENROUTER_BASE_URL=https://openrouter.ai/api
export OPENROUTER_API_KEY=your_key
```

LLM responses are cached to `llm_cache.json`:
- Enable: default behavior, or explicit `use_cache=True`
- Disable: `--no-cache` CLI flag or `use_cache=False`
- Cache key: the exact prompt text
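
A minimal sketch of prompt-keyed caching under these rules (the `_call_provider` dispatch helper is hypothetical; see `utils/call_llm.py` for the real implementation):

```python
import json
import os

CACHE_FILE = "llm_cache.json"

def cached_call(prompt: str, use_cache: bool = True) -> str:
    cache = {}
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE) as f:
            cache = json.load(f)
    if use_cache and prompt in cache:  # cache key is the exact prompt text
        return cache[prompt]
    response = _call_provider(prompt)  # hypothetical provider dispatch
    cache[prompt] = response
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f, indent=2)
    return response
```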
Each node follows a three-phase lifecycle:
```
prep(shared) → exec(prep_result) → post(shared, prep_result, exec_result)
```

- `prep(shared)`: extract data from shared state
- `exec(prep_result)`: execute the main logic (LLM calls, I/O)
- `post(shared, prep_result, exec_result)`: store results back
```mermaid
flowchart LR
    A[FetchRepo] --> B[IdentifyAbstractions]
    B --> C[AnalyzeRelationships]
    C --> D[OrderChapters]
    D --> E[WriteChapters]
    E --> F[CombineTutorial]
```
| Node | Purpose | LLM? |
|---|---|---|
| FetchRepo | Download code from GitHub or local dir | No |
| IdentifyAbstractions | Find 5-10 core concepts | Yes |
| AnalyzeRelationships | Map concept interactions | Yes |
| OrderChapters | Determine presentation order | Yes |
| WriteChapters | Generate chapter content | Yes |
| CombineTutorial | Assemble final output | No |
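
The wiring in `flow.py` mirrors this chart. A minimal sketch, assuming PocketFlow's `>>` chaining and `Flow(start=...)` constructor (retry parameters are illustrative; the real `create_tutorial_flow()` may configure nodes differently):

```python
# flow.py - sketch of the pipeline wiring
from pocketflow import Flow
from nodes import (FetchRepo, IdentifyAbstractions, AnalyzeRelationships,
                   OrderChapters, WriteChapters, CombineTutorial)

def create_tutorial_flow():
    fetch_repo = FetchRepo()
    identify = IdentifyAbstractions(max_retries=5, wait=20)
    analyze = AnalyzeRelationships(max_retries=5, wait=20)
    order = OrderChapters(max_retries=5, wait=20)
    write = WriteChapters(max_retries=5, wait=20)
    combine = CombineTutorial()

    fetch_repo >> identify >> analyze >> order >> write >> combine
    return Flow(start=fetch_repo)
```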
See KNOWLEDGE_EXTRACTION.md for the alternative bottom-up analysis mode.
LLM nodes use automatic retries:
```python
identify_abstractions = IdentifyAbstractions(max_retries=5, wait=20)
```

- Cache is disabled on retries, so each attempt gets a fresh response
- `exec()` raises `ValueError` on validation failure, which triggers a retry
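
For example, a validation check inside `exec()` might look like this (a sketch assuming YAML-formatted LLM output; the real parsing in `nodes.py` is more involved):

```python
import yaml
from pocketflow import Node
from utils.call_llm import call_llm

class IdentifyAbstractions(Node):
    def exec(self, prep_res):
        prompt = f"Identify the core abstractions in:\n{prep_res}"  # illustrative
        response = call_llm(prompt)
        data = yaml.safe_load(response)
        if not isinstance(data, list):
            # Raising ValueError makes the node retry; retries skip the cache
            raise ValueError("Expected a YAML list of abstractions")
        return data
```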
```python
shared = {
    # Input configuration
    "repo_url": str | None,
    "local_dir": str | None,
    "project_name": str,
    "output_dir": str,
    "language": str,
    "use_cache": bool,

    # Populated by pipeline
    "files": [(path, content), ...],
    "abstractions": [{"name", "description", "files": [indices]}],
    "relationships": {"summary", "details": [{"from", "to", "label"}]},
    "chapter_order": [abstraction_indices],
    "chapters": [markdown_content],
    "final_output_dir": str
}
```

Index relationships:

- `files[i]` → `(filepath, content)`
- `abstractions[j]["files"]` → list of file indices
- `relationships["details"][k]` → `{"from": idx, "to": idx}`
- `chapter_order[n]` → abstraction index
- `chapters[n]` → content for `abstractions[chapter_order[n]]`
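
These parallel indices resolve against each other. For example, using the keys above:

```python
# Which files back the abstraction covered by the first chapter?
first_idx = shared["chapter_order"][0]
abstraction = shared["abstractions"][first_idx]
for file_idx in abstraction["files"]:
    path, content = shared["files"][file_idx]
    print(f"{abstraction['name']} <- {path}")
```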
Location: `main.py`

```python
# 1. Add to parser
parser.add_argument("--new-option", type=str, default="default")

# 2. Add to shared dict
shared = {
    ...
    "new_option": args.new_option,
}

# 3. Access in nodes
def prep(self, shared):
    new_option = shared.get("new_option", "default")
```

Step 1: Create the node in `nodes.py`
```python
class NewAnalysisNode(Node):
    def prep(self, shared):
        return shared["input_data"]

    def exec(self, prep_res):
        return process(prep_res)

    def post(self, shared, prep_res, exec_res):
        shared["new_output"] = exec_res
```

Step 2: Wire it into `flow.py`

```python
from nodes import NewAnalysisNode

def create_tutorial_flow():
    new_analysis = NewAnalysisNode(max_retries=3, wait=10)
    fetch_repo >> identify >> new_analysis >> analyze
```

Each LLM node builds its prompt in a local variable inside `exec()`:
```python
# nodes.py - IdentifyAbstractions.exec()
prompt = f"""
For the project `{project_name}`:
...
# Modify prompt here
...
"""
```

Location: `utils/call_llm.py`
```python
# 1. Add a provider function
def _call_llm_newprovider(prompt: str) -> str:
    api_key = os.getenv("NEWPROVIDER_API_KEY")
    client = newprovider.Client(api_key)
    return client.complete(prompt).text

# 2. Update get_llm_provider()
def get_llm_provider():
    ...
    if not provider and os.getenv("NEWPROVIDER_API_KEY"):
        provider = "NEWPROVIDER"
    return provider

# 3. Update call_llm() routing
if provider == "NEWPROVIDER":
    response_text = _call_llm_newprovider(prompt)
```

Location: `nodes.py` - `CombineTutorial` class
```python
def exec(self, prep_res):
    # Add JSON output alongside Markdown
    json_output = {
        "project_name": prep_res["project_name"],
        "chapters": [...]
    }
    with open(os.path.join(output_path, "tutorial.json"), "w") as f:
        json.dump(json_output, f, indent=2)
```

```bash
# Small repo test
python main.py --repo https://github.com/small/repo --max-abstractions 3

# Local directory
python main.py --dir ./project --no-cache

# Specific language
python main.py --dir ./project --language chinese

# Knowledge mode
python main.py --dir ./project --mode knowledge
```

- Check logs: `logs/llm_calls_YYYYMMDD.log`
- Disable cache: `--no-cache`
- Add debug prints in the node's `exec()` method:
print(f"DEBUG - Prompt length: {len(prompt)}")
response = call_llm(prompt)
print(f"DEBUG - Response preview: {response[:200]}")| File | Purpose | Modify When |
|---|---|---|
| `main.py` | CLI & configuration | Adding CLI arguments |
| `flow.py` | Pipeline orchestration | Adding/reordering nodes |
| `nodes.py` | Core processing | Changing analysis, prompts |
| `utils/call_llm.py` | LLM abstraction | Adding providers, caching |
| `utils/crawl_*.py` | File fetching | Changing file discovery |
- ARCHITECTURE.md - System architecture overview
- KNOWLEDGE_EXTRACTION.md - Knowledge extraction mode