Handle trailing commas in parser instead of scanner #14517
89ea0ec to 24ec0b8
@@ -1390,14 +1397,7 @@ object Parsers {
else
Function(params, t)
}
def funTypeArgsRest(first: Tree, following: () => Tree) = {
Inlined, since this is just reading a comma-separated list.
@@ -3210,13 +3209,6 @@ object Parsers {
if !idOK then syntaxError(i"named imports cannot follow wildcard imports")
namedSelector(termIdent())
}
val rest =
Another comma-separated list factored out.
in.nextToken()
ts += part()
if (in.isAfterLineEnd && (in.token == OUTDENT || (expectedEnd != EMPTY && in.token == expectedEnd))) {
This is the main change: the check moves out of the scanner, and here we no longer have to guess which lookahead token might indicate a trailing comma.
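A minimal sketch of the idea, using a toy token model rather than the dotty sources: once the list parser itself knows which closing token it expects, a trailing comma is simply a comma followed by that closer on a new line. `Tok`, `commaSeparated`, and the string-based tokens are hypothetical names chosen for illustration only.

```scala
// Toy model, NOT the dotty implementation: every name here is hypothetical.
// `afterLineEnd` mimics the scanner flag that says a token starts a new line.
final case class Tok(text: String, afterLineEnd: Boolean = false)

// Parses `elem {, elem} [, <newline>]` and stops in front of `expectedEnd`.
// Returns the elements plus the remaining tokens, or an error message.
def commaSeparated(
    tokens: List[Tok],
    expectedEnd: String
): Either[String, (List[String], List[Tok])] = {
  @annotation.tailrec
  def loop(rest: List[Tok], acc: List[String]): Either[String, (List[String], List[Tok])] =
    rest match {
      case Tok(id, _) :: tail if id != "," && id != expectedEnd =>
        tail match {
          // A comma followed by the expected closer on a new line is a
          // trailing comma: drop it and stop.
          case Tok(",", _) :: (after @ (Tok(`expectedEnd`, true) :: _)) =>
            Right(((id :: acc).reverse, after))
          // Any other comma must be followed by another element.
          case Tok(",", _) :: more =>
            loop(more, id :: acc)
          case _ =>
            Right(((id :: acc).reverse, tail))
        }
      case other =>
        val found = other.headOption.map(_.text).getOrElse("<eof>")
        Left(s"expected an element but found $found")
    }
  loop(tokens, Nil)
}
```

In this sketch the closer is passed in by the caller, so the function never has to enumerate every token that could plausibly end a list.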
@@ -654,13 +654,6 @@ object Scanners {
insert(OUTDENT, offset)
currentRegion = r.outer
case _ =>
lookAhead()
if isAfterLineEnd
&& (token == RPAREN || token == RBRACKET || token == RBRACE || token == OUTDENT)
This is the part that was moved to the parser.
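For contrast, here is a rough sketch of what a scanner-level rule amounts to, again with a toy token model and hypothetical names (`Lexeme`, `closers`, `dropTrailingCommas`). Without parser context, the rule can only look at the next token, so it also swallows a comma that is not part of any comma-separated list.

```scala
// Toy model, NOT the dotty scanner: all names are hypothetical.
final case class Lexeme(text: String, afterLineEnd: Boolean = false)

// Closing tokens the context-free rule treats as "end of a list".
val closers: Set[String] = Set(")", "]", "}")

// Drops every comma that is directly followed by a closer on a new line,
// regardless of what construct the comma appears in.
def dropTrailingCommas(tokens: List[Lexeme]): List[Lexeme] = tokens match {
  case Lexeme(",", _) :: (rest @ (Lexeme(c, true) :: _)) if closers(c) =>
    dropTrailingCommas(rest)
  case t :: rest =>
    t :: dropTrailingCommas(rest)
  case Nil =>
    Nil
}
```

Feeding it the tokens of a lambda body such as `{ a => a + 1,` followed by `}` on the next line yields the tokens of `{ a => a + 1 }`, which is why the parser never saw the stray comma.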
def f =
List(1, 2, 3).map {
a => a + 1, // error: weird comma
These used to pass.
b,
c,
= (1, 2, 3) // error
I don't think I changed the behavior here, just wanted to make sure it was declared somewhere.
You beat me to it!
There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.
The community build breaks on bad trailing commas in two repos (specs2 and protoquill). I put up PRs to remove them.
Were they not on latest Scala 2? The Scala 2 community build ought to have broken already. Or was it broken already? I guess the change was only since start of pandemic. The SIP language is, "trailing commas are only supported in comma-separated elements."
Must have been changes to Scala 3 code made after they forked off Scala 3 parts?
24ec0b8 to ebfd546
(The Scala 2 community build doesn't have protoquill. It does have specs2 4.x, whereas I think the Scala 3 community build has 5.x instead.)
ebfd546 to 0edf5a2
This looks good overall, and in the end preferable over a region-based approach. But I wonder why Scanner was not touched? Can't we delete the trailing comma handling code there now?
I have fixed the community build (CB) problem in a different PR. I left some instructions there on how to fix the CB in community builds.
It was touched here (https://github.com/lampepfl/dotty/pull/14517/files#diff-265361a32415e25bed50400e4bb357fb31b82ac9f3640dd7a6c9a3599b7701c9L653). It's part of the second commit. The first commit is from #14509.
0edf5a2 to 8a09cb5
Not sure who to ask, maybe @SethTisue: why isn't CI running on this PR?
I think you just happened to push during a GitHub Actions service outage. A force push should probably get it running.
Thinking more about it, I am not sure we want to go ahead with this. If we would go ahead, we'd have to change
All this looks like a LOT of work, and I am not sure whether the benefits are commensurate. As far as I can see, the benefit is that an argument like
We can look at it this way. If the original proposal in SIP 27 had specified extensive grammar changes, it probably would not have been adopted since the cost/benefit ratio would have been deemed too high. I believe SIP 27 was accepted in part since it proposed that the whole thing can be handled in the Lexer. The SIP proposal states this explicitly. So changing things now by introducing all the grammar changes seems like it undermines the original proposal, and the rationale for accepting it.
It would also be good to get @dwijnand's opinion on this.
Fwiw, I lost hours to confusion over the magic lines in the Scanner when I first started looking into bad error recovery behavior. The code in Scanner is small but it is not simple. This PR combines error recovery, a refactor to re-use commaSeparated in more places, and the trailing comma fix. I think if I separate those out, you will see that the trailing comma fix is not really a complication. I will put up a PR for just the refactor to better use commaSeparated. At this point, will you accept the changes to error recovery, or do you plan to merge your PR first? After that, we can see if the trailing comma fix is worth it. Fwiw, my team did complain bitterly about it because for a while, scalafmt would insert the trailing comma in every lambda with curlies. I believe that's the reason that @som-snytt removed the "weird trailing comma" from Scala 2 in the first place: everyone was confused about how it could parse at all.
Earlier It claims to be closer to the SIP. It doesn't look finicky. There is old error recovery that is finicky, and I noted it is skippable:
Oh but it does support it now. The other helpful comment was
I've refreshed my memory a bit before sleeping, but I'll need another look. I see and
We can certainly evaluate that. But the grammars have to be fixed as well. We cannot allow the Parser and the grammar to diverge.
As far as I can tell, syntax.md was never updated to reflect trailing commas at all? I don't see anything in #3463 and when I looked at either The SIP says:
and
Although not entirely unambiguous, I think any reasonable reader would interpret these sentences to mean that trailing commas are only permitted in comma-separated elements, and therefore not in
{
expr,
}
Unfortunately, I think the original SIP was simply incorrect in thinking that a Scanner-only implementation could accomplish the spec (without pushing an unreasonable amount of information from the Parser into the Scanner).
I don't have a personal preference on keeping the implementation less strict and simpler or more strict and complex - I see the merits of both. But I think the fact that Scalafmt inserted commas where it shouldn't have and the compiler wasn't complex enough to reject it is a weak reason. My reading of Martin's comment is he prefers the less strict and simpler setup we have. But if I were you I'd present separate PRs and we'll see from there.
It did not have to since it was treated as Lexer functionality.
Building on #14509, this PR does the extra work to ensure that a comma-separated list can have a trailing comma only when the list is delimited by parens, braces, or brackets. Pulls over a test from scala/scala#8780.
The first commit is from #14509, so you should only review the second commit in this PR.
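As an illustration of the rule described above, the following small example shows trailing commas in positions that are ordinary comma-separated lists closed by a paren or bracket on the next line. The names `xs` and `pair` are made up for this example and do not come from the PR's tests.

```scala
// Trailing comma in an argument list, closed by `)` on the next line.
val xs = List(
  1,
  2,
  3,
)

// Trailing commas in a type-parameter list and a parameter list.
def pair[
  A,
  B,
](
  a: A,
  b: B,
): (A, B) = (a, b)
```

By contrast, per the test added in this PR, a comma after a lambda body inside braces (`a => a + 1,`) is not part of a comma-separated list and is reported as an error.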