bpo-32677: Add .isascii() to str, bytes and bytearray #5342
Changes from all commits
1d4c0f4
4b01174
8b6452f
120579a
22a8400
56b7727
3fb3240
949b3ad
5289bae
40e08a0
4138202
91a6b18
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| Add ``.isascii()`` method to ``str``, ``bytes`` and ``bytearray``. | ||
| It can be used to test that string contains only ASCII characters. |
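A minimal sketch of the behaviour described in this NEWS entry, assuming Python 3.7 or later. The helper `is_ascii_reference` is a hypothetical name introduced here only to spell out the definition (every code point in U+0000-U+007F, with the empty string counting as ASCII); it is not part of the PR.

```python
def is_ascii_reference(text: str) -> bool:
    """Hypothetical reference check: True if every code point is <= U+007F.

    all() over an empty iterable is True, so the empty string counts as ASCII.
    """
    return all(ord(ch) <= 0x7F for ch in text)


# The built-in method should agree with the reference definition.
for sample in ["hello", "", "caf\u00e9", "\x7f", "\x80"]:
    assert sample.isascii() == is_ascii_reference(sample)

# bytes and bytearray gain the same method; every byte must be below 0x80.
print(b"hello".isascii())
print(b"\xff".isascii())
print(bytearray(b"abc").isascii())
```

For `bytes` and `bytearray` the test is applied per byte rather than per code point, so any byte with the high bit set makes the result false.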
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -11611,6 +11611,25 @@ unicode_index(PyObject *self, PyObject *args) | ||
| return PyLong_FromSsize_t(result); | ||
| } | ||
|
|
||
| /*[clinic input] | ||
| str.isascii as unicode_isascii | ||
|
|
||
| Return True if all characters in the string are ASCII, False otherwise. | ||
|
Member
Nitpick: maybe copy from the doc, "Return true if the string is empty or all characters in the string are ASCII," rather than "Empty string is ASCII too." below.
Member
Author
"Return true if the string is empty or all characters in the string are ASCII, False otherwise." exceeds 80 columns. All other docstrings in unicodeobject have short (<80 columns) summaries.
Member
Oh wow, that's a nasty issue. Ignore my comment and leave the docstring as it is ;-)
||
|
|
||
| ASCII characters have code points in the range U+0000-U+007F. | ||
| Empty string is ASCII too. | ||
| [clinic start generated code]*/ | ||
|
|
||
| static PyObject * | ||
| unicode_isascii_impl(PyObject *self) | ||
| /*[clinic end generated code: output=c5910d64b5a8003f input=5a43cbc6399621d5]*/ | ||
| { | ||
| if (PyUnicode_READY(self) == -1) { | ||
| return NULL; | ||
| } | ||
| return PyBool_FromLong(PyUnicode_IS_ASCII(self)); | ||
| } | ||
|
|
||
| /*[clinic input] | ||
| str.islower as unicode_islower | ||
|
|
||
| @@ -13801,6 +13820,7 @@ static PyMethodDef unicode_methods[] = { | ||
| UNICODE_UPPER_METHODDEF | ||
| {"startswith", (PyCFunction) unicode_startswith, METH_VARARGS, startswith__doc__}, | ||
| {"endswith", (PyCFunction) unicode_endswith, METH_VARARGS, endswith__doc__}, | ||
| UNICODE_ISASCII_METHODDEF | ||
| UNICODE_ISLOWER_METHODDEF | ||
| UNICODE_ISUPPER_METHODDEF | ||
| UNICODE_ISTITLE_METHODDEF | ||
|
|
||
If you want to optimize this function, I suggest you look at ascii_decode() in Objects/unicodeobject.c, which is heavily optimized to scan ASCII characters in a uint8_t* string. It works on "unsigned long" words rather than on individual bytes.
But that should be done in a second PR. Right now, I would prefer to push this PR before 3.7b1 (Monday).
agree.
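The word-at-a-time idea raised in the review can be illustrated with a rough sketch. This is not the code of ascii_decode(); it only shows the bit trick of testing the high bit of several bytes at once. The names `isascii_bytewise`, `isascii_wordwise`, `WORD_SIZE`, and `HIGH_BITS` are hypothetical, and an 8-byte word is an assumption. Python is used purely to show the logic; any real speed-up would come from doing this in C on machine words.

```python
WORD_SIZE = 8  # assumed word width in bytes (hypothetical choice)
# Mask with the high bit of every byte set: 0x8080808080808080 for 8 bytes.
HIGH_BITS = int.from_bytes(b"\x80" * WORD_SIZE, "little")


def isascii_bytewise(data: bytes) -> bool:
    """Baseline: check each byte individually."""
    return all(byte < 0x80 for byte in data)


def isascii_wordwise(data: bytes) -> bool:
    """Check WORD_SIZE bytes per step by masking their high bits together."""
    full = len(data) - len(data) % WORD_SIZE
    for start in range(0, full, WORD_SIZE):
        word = int.from_bytes(data[start:start + WORD_SIZE], "little")
        if word & HIGH_BITS:
            # At least one byte in this word is >= 0x80.
            return False
    # Remaining tail shorter than one word is checked byte by byte.
    return all(byte < 0x80 for byte in data[full:])
```

Both functions should return the same result for any input; the word-wise version simply performs one mask test per group of bytes instead of one comparison per byte.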