A Python template to quickstart any project with a production-ready workflow, quality tooling, and AI-assisted development.
Features flow through 5 steps with a WIP limit of 1 feature at a time. The filesystem enforces WIP:
- docs/features/backlog/<feature-name>.feature — features waiting to be worked on
- docs/features/in-progress/<feature-name>.feature — exactly one feature being built right now
- docs/features/completed/<feature-name>.feature — accepted and shipped features
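As a rough illustration of how the folder layout can enforce the WIP limit, the sketch below counts the `.feature` files in `in-progress/` and refuses to start another feature while one is already there. The helper names `count_in_progress` and `start_feature` are hypothetical and not part of the template; only the folder names come from the layout above.

```python
import shutil
from pathlib import Path

WIP_LIMIT = 1  # from the workflow: one feature in progress at a time


def count_in_progress(features_dir: Path) -> int:
    """Count the .feature files currently in in-progress/."""
    in_progress = features_dir / "in-progress"
    if not in_progress.is_dir():
        return 0
    return sum(1 for _ in in_progress.glob("*.feature"))


def start_feature(features_dir: Path, name: str) -> Path:
    """Move backlog/<name>.feature to in-progress/, respecting the WIP limit."""
    if count_in_progress(features_dir) >= WIP_LIMIT:
        raise RuntimeError("WIP limit reached: finish the in-progress feature first")
    source = features_dir / "backlog" / f"{name}.feature"
    target_dir = features_dir / "in-progress"
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.move returns the destination path it was given
    return Path(shutil.move(str(source), str(target_dir / source.name)))
```

Calling `start_feature(Path("docs/features"), "some-feature")` a second time before the first feature is moved to `completed/` would raise, which is the behavior the WIP limit describes.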
STEP 1: SCOPE (product-owner) → discovery + Gherkin stories + criteria
STEP 2: ARCH (software-engineer) → read all features + existing package files, write domain stubs (signatures only, no bodies); ADRs to docs/architecture/
STEP 3: TDD LOOP (software-engineer) → RED → GREEN → REFACTOR, one @id at a time
STEP 4: VERIFY (reviewer) → run all commands, review code
STEP 5: ACCEPT (product-owner) → demo, validate, move the .feature file to completed/
PO picks the next feature from backlog. Software-engineer never self-selects.
Verification is adversarial. The reviewer's job is to try to break the feature, not to confirm it works. The default hypothesis is "it might be broken despite green checks; prove otherwise."
- Product Owner (PO) — AI agent. Interviews the stakeholder, writes discovery docs, Gherkin features, and acceptance criteria. Accepts or rejects deliveries.
- Stakeholder — Human. Answers PO's questions, provides domain knowledge, approves PO syntheses to confirm discovery is complete.
- Software Engineer — AI agent. Architecture, test bodies, implementation, git. Never edits .feature files. Escalates spec gaps to PO.
- Reviewer — AI agent. Adversarial verification. Reports spec gaps to PO.
- product-owner — defines scope (4 phases), picks features, accepts deliveries
- software-engineer — architecture, tests, code, git, releases (Steps 2-3 + release)
- reviewer — runs commands and reviews code at Step 4, produces APPROVED/REJECTED report
- setup-project — one-time setup to initialize a new project from this template
| Skill | Used By | Step |
|---|---|---|
| session-workflow | all agents | every session |
| feature-selection | product-owner | between features (idle state) |
| scope | product-owner | 1 |
| implementation | software-engineer | 2, 3 |
| design-patterns | software-engineer | 2, 3 (on-demand, when GoF pattern needed) |
| refactor | software-engineer | 3 (REFACTOR phase + preparatory refactoring) |
| verify | reviewer | 4 |
| code-quality | software-engineer | pre-handoff (redirects to verify) |
| pr-management | software-engineer | 5 |
| git-release | software-engineer | 5 |
| create-skill | software-engineer | meta |
| create-agent | human-user | meta |
Session protocol: Every agent loads skill session-workflow at session start. Load additional skills as needed for the current step.
PO creates docs/features/discovery.md using the 3-session template. Skip Phase 1 entirely if discovery.md Status is BASELINED. To add features to an existing project: append new questions to Session 1 and re-fill from there.
- Session 1 — Individual scope elicitation: 5Ws + Success + Failure + Out-of-scope. Gap-finding per answer using CIT, Laddering, and CI Perspective Change. PO writes synthesis; stakeholder confirms or corrects. PO runs silent pre-mortem on confirmed synthesis. Template §1 must be confirmed before Session 2.
- Session 2 — Behavior groups / big picture: questions target behavior groups and cross-cutting concerns. Gap-finding per group. Level 2 synthesis when transitioning between groups. Template §2 must be complete before Session 3.
- Session 3 — Synthesis approval + feature derivation: PO produces full synthesis of all behavior groups; stakeholder approves or corrects (PO refines until approved). Domain analysis: nouns/verbs → subject areas → FDD "Action object" feature names. Create backlog/<name>.feature stubs. Write Status: BASELINED to discovery.md.
Each .feature file has its own 3-session discovery template in its description. Sessions are enforced by the template: each section must be filled before proceeding to the next.
- Session 1 — Individual entity elicitation: populate Entities table from project discovery; generate questions from entity gaps using CIT, Laddering, CI Perspective Change. PO writes synthesis; stakeholder confirms. Silent pre-mortem on confirmed synthesis.
- Session 2 — Behavior groups / big picture: questions target behavior groups within this feature. Gap-finding per group. Level 2 group transition summaries.
- Session 3 — Feature synthesis approval + story derivation: PO produces synthesis of feature scope and behavior groups; stakeholder approves or corrects (PO refines until approved). Story candidates become candidate user stories (Rules). Write Status: BASELINED to the .feature discovery section.
Decomposition check: after Session 3, does this feature span >2 distinct concerns OR have >8 candidate Examples? YES → split into separate .feature files, re-run Phase 2. NO → proceed.
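The decomposition check is a simple two-threshold rule, and it can be sketched as a predicate. The function name `needs_split` and the idea of passing the two counts in as integers are assumptions for illustration; how concerns and candidate Examples are counted is left to the PO.

```python
MAX_CONCERNS = 2  # split when the feature spans more than 2 distinct concerns
MAX_EXAMPLES = 8  # split when there are more than 8 candidate Examples


def needs_split(concern_count: int, example_count: int) -> bool:
    """Return True when the feature should be split into separate .feature files."""
    return concern_count > MAX_CONCERNS or example_count > MAX_EXAMPLES
```

Both limits are strict: a feature with exactly 2 concerns and exactly 8 candidate Examples proceeds without a split.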
Story candidates from Phase 2 Session 2 → one Rule: block per user story. Each Rule: has the user story header (As a / I want / So that) as its description — no Example: blocks yet. INVEST gate: all 6 letters must pass. Commit: feat(stories): write user stories for <name>
Pre-mortem per Rule (all Rules must be checked before writing Examples). Write Example: blocks — declarative Given/When/Then, MoSCoW triage (Must/Should/Could) per Example. Review checklist (4.3). Commit: feat(criteria): write acceptance criteria for <name>
Criteria are frozen: no Example: block changes after the commit. To change a criterion, add a new Example with a new @id; it replaces the old one.
docs/features/
    discovery.md ← project-level (Status + Questions only)
    backlog/<feature-name>.feature ← one per feature; discovery + Rules + Examples
    in-progress/<feature-name>.feature ← file moves here at Step 2
    completed/<feature-name>.feature ← file moves here at Step 5
docs/architecture/
    STEP2-ARCH.md ← Step 2 reference diagram (canonical)
    adr-NNN-<title>.md ← one per significant architectural decision
tests/
    features/<feature-name>/
        <rule_slug>_test.py ← one per Rule: block, software-engineer-written
    unit/
        <anything>_test.py ← software-engineer-authored extras (no @id traceability)
Tests in tests/unit/ are software-engineer-authored extras not covered by any @id criterion. Any test style is valid — plain assert or Hypothesis @given. Use Hypothesis when the test covers a property that holds across many inputs (mathematical invariants, parsing contracts, value object constraints). Use plain pytest for specific behaviors or single edge cases discovered during refactoring.
- @pytest.mark.slow is mandatory on every @given-decorated test (Hypothesis is genuinely slow)
- @example(...) is optional but encouraged when using @given to document known corner cases
- No @id tags — tests with @id belong in tests/features/, written by software-engineer
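A minimal sketch of what such a `tests/unit/` property test might look like follows. The function `clamp` is a hypothetical stand-in for real package code, and the sketch assumes the `slow` marker is registered in the project's pytest configuration.

```python
import pytest
from hypothesis import example, given
from hypothesis import strategies as st


def clamp(value: int, low: int, high: int) -> int:
    """Hypothetical function under test: restrict value to [low, high]."""
    return max(low, min(value, high))


@pytest.mark.slow  # mandatory on every @given-decorated test
@given(value=st.integers())
@example(value=0)  # known corner case: the lower bound itself
@example(value=10)  # known corner case: the upper bound itself
def test_clamp_stays_within_bounds(value: int) -> None:
    """Property: the result always lies inside the closed interval."""
    assert 0 <= clamp(value, 0, 10) <= 10
```

The test name carries no hex suffix because unit tests have no `@id` traceability. A single edge case found during refactoring would be a plain `assert` test instead, without `@given`.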
tests/features/<feature-name>/<rule_slug>_test.py
def test_<rule_slug>_<8char_hex>() -> None:

@pytest.mark.skip(reason="not yet implemented")
def test_wall_bounce_a3f2b1c4() -> None:
    """
    Given: A ball moving upward reaches y=0
    When: The physics engine processes the next frame
    Then: The ball velocity y-component becomes positive
    """
    # Given
    # When
    # Then

- @pytest.mark.slow — takes > 50ms; applied to Hypothesis tests and any test with I/O, network, or DB
- @pytest.mark.deprecated — auto-skipped by conftest; used for superseded Examples
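The naming pattern `test_<rule_slug>_<8char_hex>` can be checked mechanically. The sketch below is one possible validator, not something the template ships: `parse_test_name` is a hypothetical helper, and it assumes the slug is lowercase snake_case and the `@id` is eight lowercase hex characters.

```python
import re
from typing import Optional

# Assumed shape: lowercase snake_case slug, then an 8-character lowercase hex @id.
TEST_NAME = re.compile(r"^test_(?P<slug>[a-z0-9_]+)_(?P<hex_id>[0-9a-f]{8})$")


def parse_test_name(name: str) -> Optional[tuple[str, str]]:
    """Return (rule_slug, hex_id) if the name follows the convention, else None."""
    match = TEST_NAME.match(name)
    if match is None:
        return None
    return match["slug"], match["hex_id"]
```

For the stub shown above, `parse_test_name("test_wall_bounce_a3f2b1c4")` yields `("wall_bounce", "a3f2b1c4")`, while a name without the hex suffix yields `None`.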
# Install dependencies
uv sync --all-extras
# Run the application (for humans)
uv run task run
# Run the application with timeout (for agents — prevents hanging)
timeout 10s uv run task run
# Run tests (fast, no coverage)
uv run task test-fast
# Run full test suite with coverage
uv run task test
# Run slow tests only
uv run task test-slow
# Lint and format
uv run task lint
# Type checking
uv run task static-check
# Serve documentation
uv run task doc-serve
- Principles (in priority order): YAGNI > KISS > DRY > SOLID > Object Calisthenics
- Linting: ruff, Google docstring convention, noqa forbidden
- Type checking: pyright, 0 errors required
- Coverage: 100% (measured against your actual package)
- Function length: ≤ 20 lines
- Class length: ≤ 50 lines
- Max nesting: 2 levels
- Instance variables: ≤ 2 per class (dataclasses, Pydantic models, value objects, and TypedDicts are exempt and may carry as many fields as the domain requires)
- Semantic alignment: tests must operate at the same abstraction level as the acceptance criteria they cover
- Integration tests: multi-component features require at least one test in tests/features/ that exercises the public entry point end-to-end
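Some of the structural limits above lend themselves to a quick automated check. As a sketch only, the following uses the standard-library `ast` module to flag functions longer than 20 lines. The helper `long_functions` is hypothetical, and it assumes length is counted from the `def` line through the last line of the body, docstring and blank lines included; the template may count differently.

```python
import ast

MAX_FUNCTION_LINES = 20  # limit stated in the list above


def long_functions(source: str, limit: int = MAX_FUNCTION_LINES) -> list[tuple[str, int]]:
    """Return (name, line_count) for every function longer than the limit."""
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Assumption: count from the def line to the last body line, inclusive.
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                offenders.append((node.name, length))
    return offenders
```

The same traversal could be extended to class length or nesting depth, but those need their own counting rules.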
During Step 3 (TDD Loop), correctness priorities are:
- Design correctness — YAGNI > KISS > DRY > SOLID > Object Calisthenics > appropriate design patterns
- One test green — the specific test under work passes, plus test-fast still passes
- Reviewer code-design check — reviewer verifies design + semantic alignment (no lint/pyright/coverage)
- Commit — only after reviewer APPROVED
- Quality tooling — lint, static-check, full test with coverage run only at software-engineer handoff (before Step 5)
Design correctness is far more important than lint/pyright/coverage compliance. A well-designed codebase with minor lint issues is better than a lint-clean codebase with poor design.
- Automated checks (lint, typecheck, coverage) verify syntax-level correctness — the code is well-formed.
- Human review (semantic alignment, code review, manual testing) verifies semantic-level correctness — the code does what the user needs.
- Both are required. All-green automated checks are necessary but not sufficient for APPROVED.
- Reviewer defaults to REJECTED unless correctness is proven.
This template does not support deprecation. Criteria changes are handled by adding new Examples with new @id tags.
Version format: v{major}.{minor}.{YYYYMMDD}
- Minor bump for new features; major bump for breaking changes
- Same-day second release: increment minor, keep same date
- Each release gets a unique adjective-animal name
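The versioning rules above can be sketched as a small function. `next_version` is a hypothetical helper, not the git-release skill itself, and it makes one assumption the rules do not state: the minor number resets to 0 on a major bump.

```python
import re
from datetime import date

VERSION = re.compile(r"^v(\d+)\.(\d+)\.(\d{8})$")


def next_version(current: str, today: date, breaking: bool = False) -> str:
    """Compute the next version tag from the current one and the release date."""
    match = VERSION.match(current)
    if match is None:
        raise ValueError(f"not a valid version: {current!r}")
    major, minor = int(match[1]), int(match[2])
    if breaking:
        # Assumption: minor resets to 0 on a major bump (the rules do not say).
        major, minor = major + 1, 0
    else:
        minor += 1
    return f"v{major}.{minor}.{today:%Y%m%d}"
```

A second release on the same day falls out naturally: the minor number increments again and the date component is unchanged.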
Use @software-engineer /skill git-release for the full release process.
Every session: load skill session-workflow. Read TODO.md first, update it at the end.
TODO.md is a session bookmark — not a project journal. See docs/workflow.md for the full structure including the Cycle State and Self-Declaration blocks used during Step 4.
To initialize a new project from this template:
@setup-project
The setup agent will ask for your project name, GitHub username, and author info, then configure all template placeholders.