Order multipart parser states by frequency in the dispatch ladder #303

Closed
Kludex wants to merge 1 commit into main from order-parser-states-by-frequency

@Kludex Kludex commented Jun 4, 2026

What

The PART_DATA state machine walked its `if/elif state == ...` ladder in lifecycle order, so the hottest states (PART_DATA, the header states) sat near the bottom and paid a failed comparison for every state above them: ~84 `state ==` comparisons per part for a 100-field form.

Reorder the branches by the measured frequency with which each state is active: boundary and part-data states first, the once-per-stream START/END states last. The branches are mutually exclusive on `state`, so this is a pure reorder: no branch body changes, no behavior change.
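As a rough illustration of why branch order matters, the sketch below models the cost of an `if/elif` ladder as "position of the matching branch" and compares lifecycle order with frequency order. The state names, the trace, and the helper functions are all hypothetical and simplified; they are not the parser's real states or the measurement used for this PR.

```python
from collections import Counter

# Hypothetical state names in lifecycle order; not the parser's real enum.
LIFECYCLE_ORDER = [
    "START", "START_BOUNDARY", "HEADER_FIELD", "HEADER_VALUE", "PART_DATA", "END",
]


def comparisons_for(order, trace):
    """Count the `state == X` tests a ladder laid out in `order` performs over `trace`."""
    # Reaching the branch at index i costs i failed comparisons plus one success.
    cost = {state: index + 1 for index, state in enumerate(order)}
    return sum(cost[state] for state in trace)


def order_by_frequency(trace, all_states):
    """Return the states hottest-first; ties keep their original relative order."""
    counts = Counter(trace)
    return sorted(all_states, key=lambda state: -counts[state])


# A made-up trace: one START, three parts, one END.
part = ["START_BOUNDARY", "HEADER_FIELD", "HEADER_VALUE", "PART_DATA", "PART_DATA"]
trace = ["START"] + part * 3 + ["END"]

frequency_order = order_by_frequency(trace, LIFECYCLE_ORDER)
before = comparisons_for(LIFECYCLE_ORDER, trace)
after = comparisons_for(frequency_order, trace)
print(frequency_order, before, after)
```

Because each branch tests a distinct value of the same variable, any permutation of the ladder selects the same branch body; only the number of failed tests before the match changes.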

Impact

| Scenario | Before | After |
|---|---|---|
| large (100 fields) | ~400 us | ~363 us (~9%) |
| simple | ~15.1 us | ~14.0 us (~7%) |
| worstcase_crlf | ~56 us | ~50 us (~10%) |
| file upload | unchanged | unchanged |

This targets the per-part Python-level work that simple/large are bound by - the cases where this parser trailed the others - without touching the bulk paths where it already leads.
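For reference, the relative improvements can be rederived from the rounded before/after timings quoted above. This is only arithmetic on those approximate figures, so the results may differ slightly from the percentages reported; the helper name `percent_faster` is made up for this sketch.

```python
def percent_faster(before_us, after_us):
    """Relative reduction in run time, as a percentage of the 'before' figure."""
    return (before_us - after_us) / before_us * 100


# Rounded timings (microseconds) as quoted in the Impact section.
timings_us = {
    "large": (400.0, 363.0),
    "simple": (15.1, 14.0),
    "worstcase_crlf": (56.0, 50.0),
}

for name, (before_us, after_us) in timings_us.items():
    print(f"{name}: {percent_faster(before_us, after_us):.1f}% faster")
```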

Correctness

Verified byte-identical to the prior parser across ~135k differential comparisons of the full callback-event stream and error type/offset: every chunk-split strategy (whole, byte-by-byte, fixed-size, random) plus boundary-edge sweeps, crossed with enabled-callback subsets. Zero mismatches. 158 tests pass, 100% coverage. A content-level check confirms every non-comment line is unchanged - only the branch order and if/elif keywords differ.
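The differential approach described above can be sketched roughly as follows: feed the same input, split several ways, to a reference and a candidate implementation and compare the recorded event streams and error types. Everything here is a stand-in under stated assumptions: the two toy line parsers, the `feed` method, and the splitter names are hypothetical and are not the multipart parser's API or the actual harness used for this PR.

```python
import random


def split_whole(data):
    return [data]


def split_bytewise(data):
    return [data[i:i + 1] for i in range(len(data))]


def split_fixed(data, size=3):
    return [data[i:i + size] for i in range(0, len(data), size)]


def split_random(data, seed=0):
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    chunks, pos = [], 0
    while pos < len(data):
        step = rng.randint(1, 5)
        chunks.append(data[pos:pos + step])
        pos += step
    return chunks


class ReferenceLines:
    """Toy 'old' parser: scans byte by byte and emits complete lines."""

    def __init__(self, emit):
        self.emit, self.buf = emit, b""

    def feed(self, chunk):
        for i in range(len(chunk)):
            byte = chunk[i:i + 1]
            if byte == b"\n":
                self.emit(("line", self.buf))
                self.buf = b""
            else:
                self.buf += byte


class CandidateLines:
    """Toy 'new' parser: same events, different internal structure."""

    def __init__(self, emit):
        self.emit, self.buf = emit, b""

    def feed(self, chunk):
        self.buf += chunk
        while (end := self.buf.find(b"\n")) != -1:
            self.emit(("line", self.buf[:end]))
            self.buf = self.buf[end + 1:]


def run(parser_factory, chunks):
    """Feed chunks to a fresh parser; return its event stream plus any error type."""
    events = []
    parser = parser_factory(events.append)
    try:
        for chunk in chunks:
            parser.feed(chunk)
    except Exception as exc:
        events.append(("error", type(exc).__name__))
    return events


def differential(data):
    """Return the names of the split strategies on which the two parsers disagree."""
    strategies = [split_whole, split_bytewise, split_fixed, split_random]
    mismatches = []
    for strategy in strategies:
        chunks = strategy(data)
        if run(ReferenceLines, chunks) != run(CandidateLines, chunks):
            mismatches.append(strategy.__name__)
    return mismatches


print(differential(b"a=1\nbb=22\n\ntail"))
```

A fuller harness would also vary which callbacks are enabled and sweep chunk boundaries across every offset near a delimiter, as the description above indicates.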

AI Disclaimer

This PR was developed with the assistance of either Claude or Codex. I've reviewed and verified the changes.

The PART_DATA state machine walked its `if/elif state == ...` ladder in
lifecycle order, so the hottest states (PART_DATA, the header states) sat
near the bottom and paid a failed comparison for every state above them -
~84 state comparisons per part for a 100-field form.

Reorder the branches by how often each state is actually active: boundary
and part-data first, the once-per-stream START/END states last. The
branches are mutually exclusive on `state`, so this is a pure reorder with
no behavior change - verified byte-identical to the prior parser across
~135k differential comparisons (every chunk-split strategy incl.
byte-by-byte and boundary-edge sweeps, crossed with callback subsets).

large ~9% faster (400 to 363 us), simple ~7%, worstcase_crlf ~10%; file
upload unchanged.

codspeed-hq Bot commented Jun 4, 2026

Merging this PR will not alter performance

✅ 5 untouched benchmarks


Comparing order-parser-states-by-frequency (f3f8a42) with main (238ead6)

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 1 file

@Kludex Kludex closed this Jun 4, 2026