
[mypyc] Add librt.strings.isspace char primitive#21462

Open
VaggelisD wants to merge 3 commits into python:master from VaggelisD:isspace-upstream

@VaggelisD
Contributor

This PR serves as the foundation for the smaller alternative to the char proposal request.

It builds on the existing librt.strings module and the ord(char) specialization, so code like the following can lower to a direct codepoint check, i.e., without materializing a 1-character str:

from librt.strings import isspace
from mypy_extensions import i32
 
c: i32 = ...
 
if isspace(c):
    ...

Semantics:

  • Input type is i32
  • Negative inputs return False
  • For all valid Unicode codepoints, behavior matches str.isspace() on the corresponding 1-character string (an exhaustive test also verifies this)
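
For reviewers, the semantics above can be modelled in plain Python. This is only a sketch of the described behavior, not the C implementation in this PR: `isspace_reference` and `mismatches` are hypothetical names, and returning False above U+10FFFF is an assumption, since the description only covers negative inputs and valid codepoints.

```python
import sys
from typing import Callable, List


def isspace_reference(c: int) -> bool:
    """Pure-Python model of the semantics listed above (hypothetical helper)."""
    if c < 0:
        # Stated in the PR description: negative inputs return False.
        return False
    if c > sys.maxunicode:
        # Assumption: the description does not say what happens above U+10FFFF.
        return False
    # For valid codepoints, match str.isspace() on the 1-character string.
    return chr(c).isspace()


def mismatches(candidate: Callable[[int], bool]) -> List[int]:
    """Return every valid codepoint where `candidate` disagrees with str.isspace()."""
    return [
        c
        for c in range(sys.maxunicode + 1)
        if candidate(c) != chr(c).isspace()
    ]
```

An exhaustive check in the spirit of the one mentioned above would pass the real primitive as `candidate` and expect an empty list.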

This PR adds only isspace. It does not add any new type-system surface, and it does not introduce the rest of the codepoint helper family. I'll be contributing these next if this direction looks good.

Adds a codepoint-taking `librt.strings.isspace(c: i32) -> bool` that
wraps `Py_UNICODE_ISSPACE`. Combined with the existing `ord(s[i])`
specialization (python#20578), this lets per-character hot loops avoid the
1-character `PyUnicode` materialization that `s[i].isspace()` forces.

Microbenchmark (counting whitespace in a 12 KB SQL fragment, 5000
iterations): mypyc-compiled `s[i].isspace()` takes 0.075 ms; the
codepoint path `c: i32 = i32(ord(s[i])); isspace(c)` takes 0.034 ms,
roughly 2.2x faster. Wins compound for tokenizer-shaped workloads
mixing classification and literal compares.
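
The two loop shapes compared in the microbenchmark look roughly like the sketch below. It is plain Python so it runs without the new primitive: `isspace_codepoint` is a hypothetical stand-in for `librt.strings.isspace`, and the function names and sample string are illustrative only. The speedup quoted above applies to mypyc-compiled code using the real primitive, not to this interpreted stand-in.

```python
def isspace_codepoint(c: int) -> bool:
    # Hypothetical stand-in for librt.strings.isspace(c: i32) -> bool.
    return 0 <= c <= 0x10FFFF and chr(c).isspace()


def count_spaces_str(s: str) -> int:
    """Baseline: s[i] materializes a 1-character str on every iteration."""
    n = 0
    for i in range(len(s)):
        if s[i].isspace():
            n += 1
    return n


def count_spaces_codepoint(s: str) -> int:
    """Codepoint path: classify ord(s[i]) directly."""
    n = 0
    for i in range(len(s)):
        c = ord(s[i])  # under mypyc this would be c: i32 = i32(ord(s[i]))
        if isspace_codepoint(c):
            n += 1
    return n
```

Both functions should return the same count for any input string; only the per-character work differs.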

@github-actions
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

Without the _librt suffix, has_test_name_tag returns False and the test
imports the installed PyPI librt 0.11.0, which lacks isspace.