bpo-41887: omit leading whitespaces on ast.literal_eval#22469
Conversation
|
I am curious if there is a performance impact due to the need to create a substring. This may be non trivial for large springs... |
|
Only if there's whitespace, and even then I doubt that you can even measure the time -- surely the parser takes much more time looking at the characters one at a time than copying a string, even if the input is just a single long string literal (note that this would construct another copy for sure). |
gvanrossum
left a comment
There was a problem hiding this comment.
I actually discovered something a little disturbing. Some forms of whitespace cause syntax errors -- trailing for ast.literal_eval(), and both leading and trailing for eval()`. Example:
>>> eval("\n 12+12")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<string>", line 2
12+12
IndentationError: unexpected indent
>>> ast.literal_eval("123\n ")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 62, in literal_eval
node_or_string = parse(node_or_string, mode='eval')
File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 50, in parse
return compile(source, filename, mode, flags,
File "<unknown>", line 2
SyntaxError: unexpected EOF while parsing
>>>
Isn't this expected, we only strip away the initial spaces/tabs to be compatible with |
No, but I would not want to mislead in the docs. Can we make literal_eval use the same algorithm as eval? |
Re differences between eval and literal_eval, that came up recently here as well: https://bugs.python.org/issue17490 . The documentation says that literal_eval only understands "strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None." But the reality is that it doesn't understand all python numbers. |
|
@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace. |
No, I can't. I was referring to an expression. Point taken about this being off topic here. |
|
Hm... There's a slight problem with the way this currently works. It strips leading spaces and tabs, but no other whitespace characters. That's the same as I propose to just change the docs to say "strips leading spaces and tabs" everywhere. |
gvanrossum
left a comment
There was a problem hiding this comment.
One nit, I'll apply myself.
|
Will merge when the tests pass. |
|
@gvanrossum: Please replace |
|
Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9. |
|
Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8. |
Also document that eval() does this (the same way). (cherry picked from commit e799aa8) Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
|
GH-22533 is a backport of this pull request to the 3.9 branch. |
|
Sorry, @isidentical and @gvanrossum, I could not cleanly backport this to |
Also document that eval() does this (the same way).
https://bugs.python.org/issue41887