Fix invalid JSXExpressions having identifier-ish things in their trivia, improve error messages for comma expressions in JSX#31480
Conversation
|
While that may be what's wrong with the node (and what causes a crash when the assertion is removed), based on the PR the assertion was added in ( #21581 ) there absolutely should not be an identifier within a trivia span as far as we know - meaning the node |
|
I mean, how else would you parse this? <div>{<div />x}</div>The parser hits the <div>{<div />""}</div>which parses the same, but since it's a token, doesn't hit the debug assertion.
So I gave it text here ¯\(ツ)/¯ |
(Also it looks like we usually more gracefully handle similar parse errors by synthesizing a binary expression with a comma operator - see |
A-ha, this is exactly what I was missing. I can do that. So as a general rule, what kinds of syntax is the parser allowed to skip over? To me, it feels just as odd for a string literal token to end up in another syntax's trivia as it does for an identifier. But the debug assertion only cares about identifiers. I'm trying to figure out how I could have known, without feedback, that this solution is better than “fixing” synthetic identifiers, which also does seem to solve the problem. Is it that the parser should try as hard as possible to create a node for everything it sees, even when invalid? |
This reverts commit 422b541.
Keywords/punctuation, whitespace and comments. (There is still some punctuation we capture for transformer reasons.) Things we don't make real AST nodes for when parsing well-formed code.
I think the assertion is just an early error to warn about the crash on access - finding a string literal or almost anything that would usually have a real AST node in the trivia is an error. The category of "trivia" is primarily just comments and whitespace and is expanded to include unparsed keywords and punctuation in this context.
Yeah, usually I think we just stuff invalid junk we want to consume into a "missing node" which is marked appropriately for incremental reparsing.
Deep software archaeology. When editing code I didn't write, I'm in the habit of looking up the blames of relevant code (in this case, both the PR adding the assertion and the PR adding the scanning in jsx expression parsing) I want to change to find the justification for it being as it is and seeing if the current state is well-reasoned or apparently ad-hoc. In the case of the later, finding and fixing the root cause of the prior ad-hoc change is often able to create higher quality code than stacking more ad-hoc fixes (which is what I'm suggesting here). |
|
🕵 related: #31510 |
andrewbranch
left a comment
There was a problem hiding this comment.
@RyanCavanaugh looks like all these horrifying baselines that changed were generated by you. Would you mind taking a look?
| var class2 = "bar"; | ||
| var elem = <div className={class1} class2/>; | ||
| />;; | ||
| var elem = <div className={class1, class2}/>; |
There was a problem hiding this comment.
This emit is invalid although slightly less so than the previous? We could potentially put these inside a ParenthesizedExpression, but we're already emitting a grammar error, and it seems like there's no requirement for the emit to be totally valid if the input wasn't totally valid.
| ~~ | ||
| !!! error TS1109: Expression expected. | ||
| } | ||
| // Shouldn't see any errors down here |
There was a problem hiding this comment.
Now we don't. I think this gets significantly better.
| ; | ||
| //// [20.jsx] | ||
| <a>{"str"}}</a>; | ||
| <a>{"str"};}</a>; |
There was a problem hiding this comment.
This is weird, but barely weirder than the previous
| }()); | ||
| var x; | ||
| x = <any> {test} <any></any> }; | ||
| x = <any> {test}: <any></any> }; |
There was a problem hiding this comment.
The TypeScript generating this was x = <any> { test: <any></any> };. Garbage in, garbage out?
| ~~~~~~ | ||
| !!! error TS2695: Left side of comma operator is unused and has no side effects. | ||
| ~~~~~~~~~~~~~~ | ||
| !!! error TS18007: JSX expressions may not use the comma operator. Did you mean to write an array? |
There was a problem hiding this comment.
These two errors make a lot more sense to me than the previous three, although I could do without the first. I considered making the error range start on the comma operator instead of at the beginning of the expression. Thoughts?
| // Only an AssignmentExpression is valid here per the JSX spec, | ||
| // but we can unambiguously parse a comma sequence and provide | ||
| // a better error message in grammar checking. | ||
| node.expression = parseExpression(); |
There was a problem hiding this comment.
When we successfully parse something we know is illegal, is that always a grammar error, or is there ever a reason why that should be a parse error? I know a parse error will prevent the node from being reused incrementally—in this case the tree should be well-formed, so I think reusing it isn’t a problem.
Thinking this was going to be the way forward, I improved error messages for comma expressions as JSX expressions. However, I don't think it's quite feasible to parse extra stuff in a By conditionally scanning for more JSX text based on whether the expected |
|
@weswigham did you ever have a chance to look at my second iteration of this? I think it’s much more aligned with your initial recommendation. |
I'm not totally sure why that debug assertion was there and specific to identifiers, but removing it only seems to require that the synthetically added identifier get its
escapedTextset.Fixes #25487