Skip to content

bpo-41887: omit leading whitespaces on ast.literal_eval#22469

Merged
gvanrossum merged 5 commits into
python:masterfrom
isidentical:bpo-41887
Oct 4, 2020
Merged

bpo-41887: omit leading whitespaces on ast.literal_eval#22469
gvanrossum merged 5 commits into
python:masterfrom
isidentical:bpo-41887

Conversation

@isidentical

@isidentical isidentical commented Sep 30, 2020

Copy link
Copy Markdown
Member

Comment thread Lib/ast.py Outdated
Comment thread Doc/library/functions.rst Outdated
Comment thread Doc/library/ast.rst Outdated
@pablogsal

Copy link
Copy Markdown
Member

I am curious if there is a performance impact due to the need to create a substring. This may be non trivial for large springs...

@gvanrossum

Copy link
Copy Markdown
Member

Only if there's whitespace, and even then I doubt that you can even measure the time -- surely the parser takes much more time looking at the characters one at a time than copying a string, even if the input is just a single long string literal (note that this would construct another copy for sure).

@gvanrossum gvanrossum left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually discovered something a little disturbing. Some forms of whitespace cause syntax errors -- trailing for ast.literal_eval(), and both leading and trailing for eval()`. Example:

>>> eval("\n 12+12")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    12+12
IndentationError: unexpected indent
>>> ast.literal_eval("123\n ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 2
    
SyntaxError: unexpected EOF while parsing
>>> 

Comment thread Doc/library/ast.rst Outdated
Comment thread Doc/library/functions.rst Outdated
Comment thread Lib/ast.py Outdated
Comment thread Misc/NEWS.d/next/Library/2020-09-30-23-49-42.bpo-41887.-ee2S-.rst Outdated
@isidentical

Copy link
Copy Markdown
Member Author

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

@gvanrossum gvanrossum left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, another comment

Comment thread Misc/NEWS.d/next/Library/2020-09-30-23-49-42.bpo-41887.-ee2S-.rst Outdated
@gvanrossum

Copy link
Copy Markdown
Member

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

No, but I would not want to mislead in the docs. Can we make literal_eval use the same algorithm as eval?

@iritkatriel

Copy link
Copy Markdown
Member

Can we make literal_eval use the same algorithm as eval?

Re differences between eval and literal_eval, that came up recently here as well: https://bugs.python.org/issue17490 .

The documentation says that literal_eval only understands "strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None." But the reality is that it doesn't understand all python numbers.

@gvanrossum

gvanrossum commented Oct 1, 2020

Copy link
Copy Markdown
Member

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

@iritkatriel

Copy link
Copy Markdown
Member

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

No, I can't. I was referring to an expression. Point taken about this being off topic here.

@gvanrossum

Copy link
Copy Markdown
Member

Hm... There's a slight problem with the way this currently works. It strips leading spaces and tabs, but no other whitespace characters. That's the same as eval(), but it's not what's usually considered "whitespace" -- that refers to a larger category of characters, including newline, FF, and a host of weird Unicode things.

I propose to just change the docs to say "strips leading spaces and tabs" everywhere.

Comment thread Lib/test/test_ast.py
Comment thread Doc/library/functions.rst Outdated

@gvanrossum gvanrossum left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit, I'll apply myself.

Comment thread Doc/library/ast.rst Outdated
@gvanrossum

Copy link
Copy Markdown
Member

Will merge when the tests pass.

@gvanrossum gvanrossum merged commit e799aa8 into python:master Oct 4, 2020
@bedevere-bot

Copy link
Copy Markdown

@gvanrossum: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington

Copy link
Copy Markdown
Contributor

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒⛏🤖

@miss-islington

Copy link
Copy Markdown
Contributor

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 4, 2020
Also document that eval() does this (the same way).
(cherry picked from commit e799aa8)

Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
@bedevere-bot

Copy link
Copy Markdown

GH-22533 is a backport of this pull request to the 3.9 branch.

@miss-islington

Copy link
Copy Markdown
Contributor

Sorry, @isidentical and @gvanrossum, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker e799aa8b92c195735f379940acd9925961ad04ec 3.8

xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
Also document that eval() does this (the same way).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants