gh-152100: Move re compiler optimizations to Lib/re/_optimizer.py #152154
Merged
Move the compile-time optimizations (_optimize_charset, _compile_charset, _simple, _compile_info and the literal/charset prefix helpers) out of _compiler.py into a new Lib/re/_optimizer.py. _compiler.py keeps only the bytecode emitter and imports them. This is groundwork for a follow-up optimization; there is no behavior change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
81875f4 to
586b2b8
Compare
eendebakpt
reviewed
Jun 25, 2026
eendebakpt
left a comment
Contributor
I see the PR was just merged. Two minor comments anyway.
from ._constants import *

_CHARSET_ALL = [(NEGATE, None)]
_UNIT_CODES = {LITERAL, NOT_LITERAL, ANY, IN, CATEGORY}
Contributor
Since we are moving around quite some code anyway, should we change to
Suggested change
- _UNIT_CODES = {LITERAL, NOT_LITERAL, ANY, IN, CATEGORY}
+ _UNIT_CODES = frozenset({LITERAL, NOT_LITERAL, ANY, IN, CATEGORY})
Member
Author
It was a set before. frozenset does not have any advantage over set.
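For readers following the set versus frozenset exchange, here is a minimal sketch of where the two types behave the same and where they differ. The opcode values are hypothetical stand-ins (the real constants come from `re._constants`), and the names `unit_codes_set` and `unit_codes_frozen` are illustrative only. It does not argue for either side of the review discussion.

```python
# Hypothetical stand-ins for the opcode constants; the real values live in
# re._constants and are not reproduced here.
LITERAL, NOT_LITERAL, ANY, IN, CATEGORY, BRANCH = range(6)

unit_codes_set = {LITERAL, NOT_LITERAL, ANY, IN, CATEGORY}
unit_codes_frozen = frozenset(unit_codes_set)

# Membership tests, the only operation a lookup table like this needs,
# give the same answers for both types.
for op in (LITERAL, CATEGORY, BRANCH):
    assert (op in unit_codes_set) == (op in unit_codes_frozen)

# The differences: a frozenset is hashable and cannot be mutated.
frozen_hash = hash(unit_codes_frozen)   # works; hash(unit_codes_set) raises TypeError

unit_codes_set.add(BRANCH)              # allowed on a plain set
try:
    unit_codes_frozen.add(BRANCH)       # frozenset has no add() method
except AttributeError:
    frozen_rejected_mutation = True
else:
    frozen_rejected_mutation = False
```

In other words, the choice mainly signals intent (the collection should not change) rather than changing how membership checks behave.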
"""Internal support module for sre.

Optimization passes used by the compiler: character-set optimization
Contributor
_compile_charset is not listed here (but it is imported from this module).
Since these things can get outdated, I am also fine with leaving out this docstring.
Member
Author
It will be updated in a follow-up anyway.
Move the compile-time optimizations out of _compiler.py into a new Lib/re/_optimizer.py: _optimize_charset, _compile_charset, _simple, _compile_info, the literal/charset prefix helpers, _combine_flags and the related constants. _compiler.py keeps only the bytecode emitter and imports them.
The dependency is now one-directional (_compiler → _optimizer → _constants/_sre/_parser). There is no behavior change: the compiled bytecode is identical and test_re passes unchanged (one test repointed _generate_overlap_table to its new home).
This is groundwork for follow-up compile-side optimizations under gh-152100 that would otherwise accumulate in _compiler.py.
🤖 Generated with Claude Code
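One way a reader could spot-check a "compiled bytecode is identical" claim for a refactor like this is to dump the compiler output for a few patterns before and after the change and compare the dumps. The sketch below is an assumption about how that might be done, not the method used in this PR. It relies on the private `_code` function (in `re._compiler` on Python 3.11+, `sre_compile` before that), which is undocumented and may change. `bytecode_snapshot` and `snapshot_to_json` are hypothetical helper names.

```python
import json

try:
    # Python 3.11+ keeps the engine's Python side under the re package.
    from re import _compiler, _parser
except ImportError:
    # Older versions expose the same code as top-level modules.
    import sre_compile as _compiler
    import sre_parse as _parser


def bytecode_snapshot(patterns, flags=0):
    """Map each pattern to the list of integer opcodes the compiler emits.

    Uses the private _code() helper, so treat this as a debugging aid only.
    """
    snapshot = {}
    for pattern in patterns:
        parsed = _parser.parse(pattern, flags)
        # Convert to plain ints so the result serializes predictably.
        snapshot[pattern] = [int(op) for op in _compiler._code(parsed, flags)]
    return snapshot


def snapshot_to_json(patterns, flags=0):
    """Serialize a snapshot so two interpreter builds can be diffed as text."""
    return json.dumps(bytecode_snapshot(patterns, flags), sort_keys=True)
```

Running `snapshot_to_json` on the same pattern list under the old and new builds and comparing the two strings would flag any pattern whose bytecode changed. The running test suite remains the stronger check.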