tools

Developer tools useful for maintaining the repository

This document summarizes the developer tooling and workflows used in this repo.

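Install the package in editable mode together with the dev dependency group (the --group option requires pip 25.1 or later):

pip install -e . --group dev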

1) Pre-commit (recommended)

Enable the repository hooks locally:

pre-commit install

Run on all files:

pre-commit run --all-files

Steering committee members may edit NOTICE.yml to update the license header.

2) Ruff cleanup helpers

For local Ruff backlog work (not a substitute for CI or pre-commit), see ruff_cleanup_helpers.md in this directory. It documents generate_ruff_report.py (Markdown report from Ruff JSON) and fix_e501_with_autopep8.py (targeted long-line cleanup plus Ruff fix/format).

3) License headers

Code headers can be standardized by running tools/update_license_headers.py from the repository root. Update NOTICE.yml to change header content.

Please follow the instructions in CONTRIBUTING.md for contributing to the codebase, including running tests and pre-commit checks before opening a pull request.

4) Running tests locally

Run the full test suite

pytest

Run a specific test module or folder

pytest tests/tools/test_selector/

Run the suite with coverage

coverage run -m pytest
coverage report

5) Intelligent test selection (local + CI)

The repository includes a deterministic test-selection tool to reduce CI runtime by running only the relevant workflows and tests based on changed files.

What it outputs

The selector emits orthogonal workflow lanes plus structured selections:

lanes: which workflow lanes should run
- skip: skip test execution entirely (for lint-only changes)
- docs: run docs checks
- fast: run targeted pytest paths and optional functional scripts
- full: delegate to the full test workflow / matrix
pytest_paths: list of pytest path arguments (JSON)
functional_scripts: list of Python scripts to run (JSON)
provenance: mapping from each selected test/script to the category rule(s) that selected it

It also emits audit metadata:

selected_workflows: ordered list of enabled lanes (skip, docs, fast, full)
lane_reasons: reasons for each enabled lane
diff_mode: how the diff range was determined
reasons: aggregate machine-readable reasons for the decision
changed_files: files considered for the decision
schema_version: output schema version
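As an illustration of how these fields might be consumed, the sketch below summarizes a decision loaded from selection.json. It assumes that lanes is a mapping from lane name to a boolean and that provenance maps each selected item to a list of rule names. The real schema may differ, so check schema_version and the actual output first. The summarize_selection helper is hypothetical and not part of the repository.

```python
import json


def summarize_selection(decision: dict) -> str:
    """Return a short, human-readable summary of a selector decision.

    Assumed shape (not confirmed against the real selector output):
      lanes: {"skip": bool, "docs": bool, "fast": bool, "full": bool}
      pytest_paths / functional_scripts: lists of strings
      provenance: {selected item: [rule names]}
    """
    lanes = decision.get("lanes", {})
    enabled = [name for name in ("skip", "docs", "fast", "full") if lanes.get(name)]
    lines = [
        f"diff_mode: {decision.get('diff_mode', 'unknown')}",
        f"enabled lanes: {', '.join(enabled) or 'none'}",
    ]
    provenance = decision.get("provenance", {})
    for kind, key in (("pytest", "pytest_paths"), ("script", "functional_scripts")):
        for item in decision.get(key, []):
            rules = ", ".join(provenance.get(item, [])) or "no recorded rule"
            lines.append(f"{kind} {item} (selected by: {rules})")
    return "\n".join(lines)


# Hand-written example data; real values would come from selection.json.
example = json.loads(
    '{"lanes": {"skip": false, "docs": true, "fast": true, "full": false},'
    ' "diff_mode": "pr",'
    ' "pytest_paths": ["tests/tools/test_selector/"],'
    ' "functional_scripts": [],'
    ' "provenance": {"tests/tools/test_selector/": ["tools"]}}'
)
print(summarize_selection(example))
```

A summary like this can help when comparing two runs, since diff_mode and the enabled lanes are usually the first things to differ.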

Rule configuration

Routing rules are defined in tools/test_selector_config.py.

That file contains:

reusable path predicate helpers such as prefix(...), suffix(...), equals(...), case_insensitive_match(...), and all_of(...)
conservative FULL_SUITE_TRIGGERS
LINT_ONLY_FILES
validated CATEGORY_RULES built from the CategoryRule schema
CATEGORY_RULE_BY_NAME for stable lookup of named rules such as docs

The current refactor keeps the rule predicates simple and location-based while validating the rule structure at import time.
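To make the idea of location-based predicates and import-time validation concrete, here is a minimal sketch. It is not the code in tools/test_selector_config.py: the field names (matches, pytest_paths), the example rules, and the use of a standard-library dataclass in place of the project's actual schema are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

PathPredicate = Callable[[str], bool]


def prefix(value: str) -> PathPredicate:
    """Match paths that start with the given prefix."""
    return lambda path: path.startswith(value)


def suffix(value: str) -> PathPredicate:
    """Match paths that end with the given suffix."""
    return lambda path: path.endswith(value)


def equals(value: str) -> PathPredicate:
    """Match exactly one path."""
    return lambda path: path == value


def all_of(*predicates: PathPredicate) -> PathPredicate:
    """Match only when every predicate matches."""
    return lambda path: all(p(path) for p in predicates)


@dataclass(frozen=True)
class CategoryRule:
    """Hypothetical stand-in for the real CategoryRule schema."""

    name: str
    matches: PathPredicate
    pytest_paths: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        # Validation runs when the rule is constructed, i.e. at import time
        # for module-level rule definitions.
        if not self.name:
            raise ValueError("rule name must be non-empty")
        if not callable(self.matches):
            raise TypeError("matches must be callable")


# Example rules only; the repository defines its own.
CATEGORY_RULES = (
    CategoryRule("docs", all_of(prefix("docs/"), suffix(".md"))),
    CategoryRule("tools", prefix("tools/"), ("tests/tools/",)),
)

# Reject duplicate names so that lookup by name stays unambiguous.
_names = [rule.name for rule in CATEGORY_RULES]
if len(_names) != len(set(_names)):
    raise ValueError("duplicate rule names")

CATEGORY_RULE_BY_NAME = {rule.name: rule for rule in CATEGORY_RULES}
```

Keeping predicates as plain functions of the path makes each rule easy to test in isolation, and failing at construction time surfaces a malformed rule before any routing decision is made.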

Run locally (no CI env required)

Important: requires pydantic>=2,<3

Print the decision as JSON:

python tools/test_selector.py --json

Write the decision report (selection.json and decision.md) under tmp/test-selection/:

python tools/test_selector.py --report-dir tmp/test-selection --json

Write a GitHub job-summary-compatible Markdown report when GITHUB_STEP_SUMMARY is available:

python tools/test_selector.py --report-dir tmp/test-selection --write-summary

Override the diff range manually:

python tools/test_selector.py --base-sha <base> --head-sha <head> --json

In GitHub Actions, the workflow typically adds --write-github-output and --write-summary.
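When the selector is driven from another script, the command line can be assembled programmatically. The sketch below uses only flags shown above. The build_selector_command helper is hypothetical, and combining --report-dir with a manual diff range in one call is an assumption rather than documented behavior.

```python
import sys
from typing import Optional


def build_selector_command(
    base_sha: Optional[str] = None,
    head_sha: Optional[str] = None,
    report_dir: Optional[str] = None,
) -> list[str]:
    """Build an argument list for tools/test_selector.py (run from the repo root)."""
    # A manual diff range only makes sense with both ends given.
    if (base_sha is None) != (head_sha is None):
        raise ValueError("pass --base-sha and --head-sha together")
    cmd = [sys.executable, "tools/test_selector.py"]
    if base_sha is not None and head_sha is not None:
        cmd += ["--base-sha", base_sha, "--head-sha", head_sha]
    if report_dir is not None:
        cmd += ["--report-dir", report_dir]
    cmd.append("--json")
    return cmd


# Print the command instead of running it, so this snippet has no side effects.
print(build_selector_command(report_dir="tmp/test-selection"))
```

The resulting list can be passed to subprocess.run(cmd, check=True, capture_output=True, text=True), with the JSON decision then read from stdout.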

Diff modes

The selector records how the diff was determined in diff_mode:

pr: pull request diff using merge-base(base, head)..head
push: push diff using before..after
manual: explicit --base-sha / --head-sha
fallback: fallback to HEAD^..HEAD
initial: initial commit (empty-tree..HEAD)
fallback_no_head: could not resolve HEAD
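The modes above can be read as a small decision procedure. The following sketch shows one plausible ordering. The precedence between modes and the parameter names are assumptions, not taken from the selector's source; pull_request and push are the standard GitHub Actions event names.

```python
from typing import Optional


def resolve_diff_mode(
    *,
    manual_base: Optional[str] = None,
    manual_head: Optional[str] = None,
    event: Optional[str] = None,
    head_resolved: bool = True,
    has_parent: bool = True,
) -> str:
    """Return a diff_mode label for the given inputs (assumed precedence)."""
    if manual_base and manual_head:
        return "manual"  # explicit --base-sha / --head-sha
    if not head_resolved:
        return "fallback_no_head"  # HEAD could not be resolved
    if event == "pull_request":
        return "pr"  # merge-base(base, head)..head
    if event == "push":
        return "push"  # before..after
    if not has_parent:
        return "initial"  # empty-tree..HEAD
    return "fallback"  # HEAD^..HEAD


print(resolve_diff_mode(event="pull_request"))
```

A local run without CI variables or manual SHAs would land in fallback under this ordering, which is why the recorded diff_mode is worth checking before trusting a local result.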

Report files

The selector always writes report artifacts for transparency:

tmp/test-selection/selection.json: machine-readable output
tmp/test-selection/decision.md: human-readable summary with workflow lanes, reasons, explained changed files, selected tests, and provenance

These reports are especially useful when a change unexpectedly routes to full.

Notes

The selector can enable more than one lane at once. For example, a PR can legitimately enable both docs and fast, or docs and full.
Docs changes are orthogonal to test routing: docs changes can enable the docs lane while still contributing selected tests/scripts if such rules are configured.
LINT_ONLY_FILES are ignored for routing. If only lint-only files changed, the selector enables the skip lane.
If category rules match changed files but do not contribute explicit tests/scripts, the selector can fall back to the minimal pytest set defined by MINIMAL_PYTEST.
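The interaction between these notes can be sketched as a small routing function. This is a simplified illustration, not the selector's implementation: the Rule tuple, the contents of LINT_ONLY_FILES and MINIMAL_PYTEST, and the choice to route unmatched files to full are placeholders or assumptions chosen for the example.

```python
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule shape used only in this sketch."""

    name: str
    matches: Callable[[str], bool]
    lane: str  # "docs" or "fast"
    pytest_paths: tuple = ()


LINT_ONLY_FILES = frozenset({"ruff.toml"})  # placeholder value
MINIMAL_PYTEST = ("tests/smoke/",)  # placeholder value


def decide(changed_files, rules, full_triggers=()):
    """Return (ordered lanes, pytest paths) for a list of changed files."""
    # Lint-only files never take part in routing.
    routed = [f for f in changed_files if f not in LINT_ONLY_FILES]
    if not routed:
        return ["skip"], []
    lanes, paths = set(), []
    for path in routed:
        hits = [rule for rule in rules if rule.matches(path)]
        if any(trigger(path) for trigger in full_triggers) or not hits:
            lanes.add("full")  # conservative: trigger hit or nothing matched
            continue
        for rule in hits:
            lanes.add(rule.lane)
            for selected in rule.pytest_paths:
                if selected not in paths:  # deduplicate, keep first-seen order
                    paths.append(selected)
    if "fast" in lanes and not paths:
        paths = list(MINIMAL_PYTEST)  # matched rules contributed no tests
    if paths:
        lanes.add("fast")  # e.g. a docs rule that also contributes tests
    ordered = [lane for lane in ("skip", "docs", "fast", "full") if lane in lanes]
    return ordered, paths


rules = [
    Rule("docs", lambda p: p.startswith("docs/"), "docs"),
    Rule("tools", lambda p: p.startswith("tools/"), "fast", ("tests/tools/",)),
]
print(decide(["docs/index.md", "tools/test_selector.py"], rules))
```

Because lanes are collected in a set, the docs lane and a test lane do not exclude each other, which matches the first two notes above.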

Troubleshooting the selector

If a workflow run is unexpectedly selecting full, check:

tmp/test-selection/decision.md
tmp/test-selection/selection.json
lane_reasons
diff_mode
changed_files

Common causes include:

a file matched a conservative full-suite trigger
no category rule matched the routed files
selected paths configured by a rule no longer exist in the repository
diff resolution fell back because CI checkout history was incomplete
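One of these causes, selected paths that no longer exist, is easy to check by hand. The sketch below is a hypothetical helper, not part of the repository. It assumes that pytest_paths and functional_scripts in selection.json are lists of paths relative to the repository root, and that pytest entries may carry a ::node suffix.

```python
import json
from pathlib import Path


def missing_selected_paths(decision: dict, repo_root: Path) -> list[str]:
    """Return selected tests/scripts whose files are absent under repo_root."""
    selected = list(decision.get("pytest_paths", []))
    selected += list(decision.get("functional_scripts", []))
    missing = []
    for entry in selected:
        # Drop any pytest node id ("path::test_name") before checking the file.
        file_part = entry.split("::", 1)[0]
        if not (repo_root / file_part).exists():
            missing.append(entry)
    return missing


# Only runs when a report is present, so it is safe outside the repository.
report = Path("tmp/test-selection/selection.json")
if report.exists():
    decision = json.loads(report.read_text())
    print(missing_selected_paths(decision, Path(".")))
```

An empty list means every selected path resolved; any entries returned point at rules in tools/test_selector_config.py that may need updating.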

6) Docs: Jupyter Book build (local)

The repo uses Jupyter Book for docs:

python -m pip install -U pip
python -m pip install ".[docs]"
jupyter-book build .

.github/workflows/build-book.yml is the canonical CI implementation.

7) Testing the test selector

The selector has dedicated tests covering:

decision behavior for docs / fast / full / skip routing
provenance and deduplicated selections
CategoryRule schema validation
integrity checks for the currently defined rules

Run the selector-focused tests with:

pytest tests/tools/test_selector/

8) Troubleshooting tips

If a workflow run is unexpectedly selecting full, inspect the selector reports first.
If targeted tests fail due to missing dependencies, either:
- broaden the fast-lane install (for example by installing required extras), or
- adjust selection rules so that the fast lane only selects tests that run in the minimal environment.
If manual diff selection is used, always pass both --base-sha and --head-sha together.
In CI, ensure checkout history is deep enough for merge-base / diff operations (fetch-depth: 0 is typically safest).
