Commit e645761
Implement _tokenize and update tokenize from v3.14.3 (#7392)
* Base implementation of _tokenize module
Port from PR #6240 by ShaharNaveh, adapted to current codebase.
Uses ruff_python_parser for tokenization via TokenizerIter.
* Update tokenize from v3.14.3
* Rewrite _tokenize with 2-phase model
Replace per-line reparsing with single-pass tokenization:
- Read all lines via readline, parse once, yield tokens
- Fix token type values (COMMENT=65, NL=66, OP=55)
- Fix NEWLINE/NL end positions and implicit newline handling
- Fix DEDENT positions via look-ahead to next non-DEDENT token
- Handle FSTRING_MIDDLE brace unescaping ({{ → {, }} → })
- Emit implicit NL before ENDMARKER when source lacks trailing newline
- Raise IndentationError from lexer errors
- Remove 13 expectedFailure marks for now-passing tests
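The single-pass model described above can be sketched against CPython's stdlib `tokenize` module. This is an illustrative sketch only: the names `two_phase_tokenize` and `unescape_braces` are hypothetical and do not exist in the commit, which implements the equivalent logic in Rust via ruff_python_parser.

```python
import io
import tokenize

def two_phase_tokenize(readline):
    """Sketch of the 2-phase model: drain readline first, then
    tokenize the collected source in one pass (vs. per-line reparsing)."""
    # Phase 1: read all lines up front via the readline callable.
    lines = []
    while True:
        line = readline()
        if not line:  # '' signals EOF for file-like readline
            break
        lines.append(line)
    source = "".join(lines)
    # Phase 2: parse the whole source once and yield every token.
    yield from tokenize.generate_tokens(io.StringIO(source).readline)

def unescape_braces(text):
    """Simplified FSTRING_MIDDLE brace unescaping: {{ -> {, }} -> }."""
    return text.replace("{{", "{").replace("}}", "}")

toks = list(two_phase_tokenize(io.StringIO("x = 1  # note\n").readline))
```

Reading everything before parsing is what lets NEWLINE/NL end positions and DEDENT look-ahead be computed from the full token stream rather than one line at a time.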
---------
Co-authored-by: ShaharNaveh <shaharnaveh@users.noreply.github.com>
Co-authored-by: CPython Developers <>
7 files changed: +2872 −378 lines, under Lib/test and crates/stdlib/src.