Normative: allow duplicate named capture groups #2721
| <emu-clause id="sec-patterns-static-semantics-can-both-participate" type="abstract operation"> | ||
| <h1> | ||
| Static Semantics: CanBothParticipate ( |
Opposite polarity might lead to a more intuitive name, e.g. MutuallyExclusive.
I agree with @gibson042, the sense of the CanBothParticipate AO is confusing. It returns true when x and y are in the same Alternative and therefore we should throw a Syntax Error. Can we rename CanBothParticipate to CannotBothParticipate?
It's more like MightBothParticipate. As in, both could be components of a single match.
Renamed to MightBothParticipate.
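To make the sense of MightBothParticipate concrete, here is a minimal sketch over a toy pattern tree. The node shapes (`alt`, `seq`, `group`, `char`) and the helper names are hypothetical and are not the spec's parse nodes; the sketch only assumes that two groups cannot both be part of one match when they sit in different branches of the same disjunction.

```javascript
// Hypothetical toy tree, not the spec's parse nodes:
//   { type: "alt", children }          branches separated by |
//   { type: "seq", children }          terms matched in sequence
//   { type: "group", name, children }  a named capturing group
//   { type: "char" }                   any other term

// Record every named group together with the chain of
// (ancestor node, child index) steps that leads to it.
function collectGroups(node, path = [], out = []) {
  if (node.type === "group") out.push({ name: node.name, path });
  (node.children || []).forEach((child, index) =>
    collectGroups(child, [...path, { node, index }], out));
  return out;
}

// Two groups might both participate unless the node where their
// paths diverge is a disjunction (they are in different branches).
function mightBothParticipate(x, y) {
  const shared = Math.min(x.path.length, y.path.length);
  for (let i = 0; i < shared; i++) {
    if (x.path[i].index !== y.path[i].index) {
      return x.path[i].node.type !== "alt";
    }
  }
  return true; // one group encloses the other
}

// Mirrors the early error: same name and both might participate.
function hasConflictingDuplicate(root) {
  const groups = collectGroups(root);
  return groups.some((x, i) =>
    groups.some((y, j) =>
      i < j && x.name === y.name && mightBothParticipate(x, y)));
}

const char = () => ({ type: "char" });
const group = (name) => ({ type: "group", name, children: [char()] });
const seq = (...children) => ({ type: "seq", children });
const alt = (...children) => ({ type: "alt", children });

// Shape of /(?<year>…)-…|…-(?<year>…)/
const acrossAlternatives = alt(seq(group("year"), char()), seq(char(), group("year")));
// Shape of /(?<a>x)(?<a>y)/
const sameAlternative = seq(group("a"), group("a"));
// Shape of /(?:(?<a>x)|y)(?<a>z)/
const nestedAlternative = seq(alt(seq(group("a")), seq(char())), group("a"));

console.log(hasConflictingDuplicate(acrossAlternatives)); // false
console.log(hasConflictingDuplicate(sameAlternative));    // true
console.log(hasConflictingDuplicate(nestedAlternative));  // true
```

With this polarity, a `true` result means the duplicate name is an error, which is the reading the rename is meant to suggest.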
| 1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do | ||
| 1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then |
Would this be better expressed more procedurally?
Suggested change:
- 1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do
-   1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
+ 1. For each integer _j_ such that _j_ ≥ 1 and _j_ ≤ _n_, in ascending order, do
+   1. If _j_ ≠ _i_ and the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then
| 1. Let _isMatchedElsewhere_ be *false*. | ||
| 1. For each integer _j_ such that _j_ ≠ _i_, _j_ ≥ 1, and _j_ ≤ _n_, do | ||
| 1. If the _j_<sup>th</sup> capture of _R_ was defined with a |GroupName|, then | ||
| 1. Let _sj_ be the CapturingGroupName of that |GroupName|. |
I would not object to refactoring these aliases, e.g. s → groupName and sj → otherName.
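For readers following the `isMatchedElsewhere` steps, here is a rough sketch of what they appear to be building toward. The excerpt stops after `_sj_` is defined, so the rest is an assumption: the sketch supposes that a group's value is written to the `groups` object only when no other capture with the same name participated in the match. The `captures` array shape and the `buildGroups` name are hypothetical.

```javascript
// captures: hypothetical list of { name, value } records, one per
// capturing group, where value is undefined if the group did not
// participate in the match.
function buildGroups(captures) {
  const groups = Object.create(null);
  captures.forEach((capture, i) => {
    if (capture.name === undefined) return; // unnamed group
    const groupName = capture.name;
    let isMatchedElsewhere = false;
    captures.forEach((other, j) => {
      // Assumed continuation of the quoted steps: another capture
      // with the same name that did participate takes precedence.
      if (j !== i && other.name === groupName && other.value !== undefined) {
        isMatchedElsewhere = true;
      }
    });
    if (!isMatchedElsewhere) groups[groupName] = capture.value;
  });
  return groups;
}

// Second alternative matched, so only the second `year` has a value.
const matched = buildGroups([
  { name: "year", value: undefined },
  { name: "year", value: "2024" },
]);
console.log(matched.year); // "2024"

// Neither group participated: the property exists but is undefined.
const unmatched = buildGroups([
  { name: "year", value: undefined },
  { name: "year", value: undefined },
]);
console.log("year" in unmatched, unmatched.year); // true undefined
```

The sketch uses `groupName` rather than `s`, in the spirit of the alias suggestion above.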
Happy to defer the editorial review; I'm told there've been enough eyes on this.
| </li> | ||
| <li> | ||
| It is a Syntax Error if |Pattern| contains two or more |GroupSpecifier|s for which CapturingGroupName of |GroupSpecifier| is the same. | ||
| It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ for which CapturingGroupName(_x_) is the same as CapturingGroupName(_y_) and such that CanBothParticipate(_x_, _y_) is *true*. |
- PR #2877 (various editorial changes for comparisons) says to avoid "is the same as" and to use "is" for comparing Strings.
- It's rare (though not unheard of) to invoke an SDO with parenthesis notation. On the other hand, using the normal notation would put `CapturingGroupName of _y_` on the RHS of `is`, which I don't think we ever do.
- `... for which A and such that B` is odd. Seems like either the two arms should agree re "for which" vs "such that", or else just use one that covers both arms: `... such that A and B`
Suggested change:
- It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ for which CapturingGroupName(_x_) is the same as CapturingGroupName(_y_) and such that CanBothParticipate(_x_, _y_) is *true*.
+ It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ such that CapturingGroupName(_x_) is CapturingGroupName(_y_) and CanBothParticipate(_x_, _y_) is *true*.
Changed to:
It is a Syntax Error if |Pattern| contains two distinct |GroupSpecifier|s _x_ and _y_ such that the CapturingGroupName of _x_ is the CapturingGroupName of _y_ and such that CanBothParticipate(_x_, _y_) is *true*.
michaelficarra left a comment
LGTM other than @jmdyck's nits.
Comments addressed.
This allows you to have a regex like `/(?<year>[0-9]{4})-[0-9]{2}|[0-9]{2}-(?<year>[0-9]{4})/` where a capturing group name is re-used across alternatives. It continues to be illegal to re-use a name within the same alternative.
As currently specified, it also enforces that named backreferences correspond to capturing groups in the same alternative, which would make the following (currently legal) program illegal: `/(?<a>x)|\k<a>/`. There is no reason to write this because the `\k` can never refer to anything, meaning it will always match the empty string. For this reason I think it should have been illegal in the first place. But if we want to preserve that behavior, it's easy enough to specify.
EDIT: updated so that the above remains legal, per plenary.
(I have a proposal repo for this, but figured it might as well be a PR.)
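The three cases in the description above can be probed from script. This is only a sketch: the `compiles` helper is hypothetical, and whether the first pattern compiles depends on whether the engine running it has shipped this change, so the snippet detects support rather than assuming it.

```javascript
// Hypothetical helper: does the engine accept this pattern source?
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

// Name re-used across alternatives: legal only with this change.
const crossAlternative = "(?<year>[0-9]{4})-[0-9]{2}|[0-9]{2}-(?<year>[0-9]{4})";
const supportsDuplicates = compiles(crossAlternative);

// Name re-used within one alternative: remains a Syntax Error.
const sameAlternativeCompiles = compiles("(?<a>x)(?<a>y)");

// Backreference in a different alternative: stays legal per the EDIT.
const backrefCompiles = compiles("(?<a>x)|\\k<a>");

console.log({ supportsDuplicates, sameAlternativeCompiles, backrefCompiles });

if (supportsDuplicates) {
  // Only the second alternative can match this input, so `year`
  // should come from the second group.
  const match = new RegExp(crossAlternative).exec("12-2024");
  console.log(match.groups.year);
}
```

Using the `RegExp` constructor instead of a literal keeps the file parseable on engines that do not yet accept duplicate names.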