Handle trailing commas in parser instead of scanner #14517

adampauls · 2022-02-18T23:43:03Z

Building on #14509, does the extra work to make sure that comma-separated lists can have a trailing comma only in lists delimited by parens, braces, or brackets. Pulls over a test from scala/scala#8780.

The first commit is from #14509, so you should only review the second commit in this PR.

adampauls · 2022-02-18T23:49:42Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

@@ -1390,14 +1397,7 @@ object Parsers {
          else
            Function(params, t)
        }
-      def funTypeArgsRest(first: Tree, following: () => Tree) = {


Inlined, since this is just reading a comma-separated list.

adampauls · 2022-02-18T23:49:58Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

@@ -3210,13 +3209,6 @@ object Parsers {
                if !idOK then syntaxError(i"named imports cannot follow wildcard imports")
                namedSelector(termIdent())
        }
-        val rest =


Another comma-separated list factored out.

adampauls · 2022-02-18T23:50:32Z

compiler/src/dotty/tools/dotc/parsing/Parsers.scala

        in.nextToken()
-        ts += part()
+        if (in.isAfterLineEnd && (in.token == OUTDENT || (expectedEnd != EMPTY && in.token == expectedEnd))) {


Main change here, this is moved out of the scanner, but here we no longer have to guess what lookahead token might indicate a trailing comma.

adampauls · 2022-02-18T23:50:46Z

compiler/src/dotty/tools/dotc/parsing/Scanners.scala

@@ -654,13 +654,6 @@ object Scanners {
              insert(OUTDENT, offset)
              currentRegion = r.outer
            case _ =>
-              lookAhead()
-              if isAfterLineEnd
-                 && (token == RPAREN || token == RBRACKET || token == RBRACE || token == OUTDENT)


This is the part that was moved to the parser.

adampauls · 2022-02-18T23:50:59Z

tests/neg/t11900.scala

+
+  def f =
+    List(1, 2, 3).map {
+      a => a + 1,   // error: weird comma


These used to pass.

adampauls · 2022-02-18T23:51:17Z

tests/neg/trailingCommas.scala

+    b,
+    c,
+    = (1, 2, 3) // error


I don't think I changed the behavior here, just wanted to make sure it was declared somewhere.

som-snytt · 2022-02-18T23:58:35Z

You beat me to it!

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

adampauls · 2022-02-19T01:50:11Z

The community build breaks for bad trailing commas in two repos (specs2 and protoquill). I put up PRs to remove.

som-snytt · 2022-02-19T03:46:10Z

Were they not on latest Scala 2? The Scala 2 community build ought to have broken already. Or was it broken already?

I guess the change was only since start of pandemic.

The SIP language is, "trailing commas are only supported in comma-separated elements."

https://docs.scala-lang.org/sips/trailing-commas.html

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

adampauls · 2022-02-21T00:35:46Z

Must have been changes to Scala 3 code made after they forked off Scala 3 parts?

SethTisue · 2022-02-21T23:09:32Z

(The Scala 2 community build doesn't have protoquill. It does have specs2 4.x, whereas I think the Scala 3 community build has 5.x instead.)

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

odersky

This looks good overall, and in the end preferable over a region-based approach. But I wonder why Scanner was not touched? Can't we delete the trailing comma handling code there now?

I have fixed the CB problem in a different PR. I leave there some instructions how to fix CB in community builds.

adampauls · 2022-03-16T17:16:11Z

It was touched here (https://github.com/lampepfl/dotty/pull/14517/files#diff-265361a32415e25bed50400e4bb357fb31b82ac9f3640dd7a6c9a3599b7701c9L653). It's part of the second commit. The first commit is from #14509.

adampauls · 2022-03-16T19:12:17Z

@odersky I cherry picked your CB fix, so hopefully this will go fully green. I tried merging in #14695 but a new test fails because of error recover regression. Is the plan to merge this PR first and then yours, or the other way around?

adampauls · 2022-03-18T22:22:47Z

Not sure who to ask, maybe @SethTisue why isn't CI running on this PR?

griggt · 2022-03-18T22:29:31Z

Not sure who to ask, maybe @SethTisue why isn't CI running on this PR?

I think you just happened to push during a GitHub Actions service outage. A force push should probably get it running.

odersky · 2022-03-20T16:43:11Z

Thinking more about it, I am not sure we want to go ahead with this. If we would go ahead, we'd have to change syntax.md (both versions) to reflect the grammar changes and also have to update all comments in the parser to reflect the same change. We have the strict rule that parser should parse what's in syntax.md and this has to be documented.

All this looks like a LOT of work, and I am not sure whether the benefits are commensurate. As far as I can see, the benefit is that an argument like { expr, } would be ruled out. Does that justify what looks like a considerable complication of grammar and parser?

We can look at it this way. If the original proposal in SIP 27 had specified extensive grammar changes, it probably would not have been adopted since the cost/benefit ratio would have been deemed too high. I believe SIP 27 was accepted in part since it proposed that the whole thing can be handled in the Lexer. The SIP proposal states this explicitly.

So changing things now by introducing all the grammar changes seems like it undermines the original proposal, and the rationale for accepting it. It would also be good to get @dwijnand 's opinion on this.

adampauls · 2022-03-21T07:12:35Z

Fwiw, I lost hours of confusion to the magic lines in the Scanner when I first started looking into bad error recovery behavior. The code in Scanner is small but it is not simple.

This PR combines error recovery, a refactor to re-use commaSeparated in more places, and the trailing comma fix. I think if I separate those out, you will see that the trailing comma fix is not really a not complication.

I will put up a PR for just the refactor to better use commaSeparated. At this point, will you accept the changes to error recovery, or do you plan to merge your PR first?

After that, we can see if the trailing comma fix is worth it. Fwiw, my team did complain bitterly about it because for a while, scalafmt would insert the trailing comma in every lambda with curlies. I believe that's the reason that @som-snytt removed the "weird trailing comma" from Scala 2 in the first place -- everyone was confused how it could parse in the first place.

som-snytt · 2022-03-21T07:46:03Z

Earlier ~~this~~ yesterday evening, I did glance at the Scala 2 PR from beginning of pandemic.

It claims to be closer to the SIP. It doesn't look finicky.

There is old error recovery that is finicky, and I noted it is skippable:

Currently, doti doesn't even support case x * y => let alone case _ * y =>.

Oh but it does support it now.

The other helpful comment was

I plan to forward port to Scala 3, which is mechanically similar. I won't be able to port the name inGroupers, however, as Scala 3 already has a boring name for that method.

som-snytt · 2022-03-21T08:21:40Z

I've refreshed my memory a bit before sleeping, but I'll need another look.

I see nsc says commaSeparated uses the current region:

https://github.com/scala/scala/blob/2.13.x/src/compiler/scala/tools/nsc/ast/parser/Parsers.scala#L833

and inGroupers (parens, brackets) knows which right bracket induces a skipped comma.

https://github.com/scala/scala/blob/2.13.x/src/compiler/scala/tools/nsc/ast/parser/Parsers.scala#L66

odersky · 2022-03-21T09:21:39Z

This PR combines error recovery, a refactor to re-use commaSeparated in more places, and the trailing comma fix. I think if I separate those out, you will see that the trailing comma fix is not really a not complication.

We can certainly evaluate that. But the grammar's have to be fixed as well. We cannot allow a deviation of Parser and grammar.

adampauls · 2022-03-21T15:34:50Z

As far as I can tell, syntax.md was never updated to reflect trailing commas at all? I don't see anything in #3463 and when I looked at either syntax.md file in the repo, there is no reference to trailing commas, and every reference to ',' appears in a rule that only accepts non-trailing commas from what I could see. I'd happily update the the grammar to reflect the new rules, but please first let me know what version of this PR you will accept -- should I break it up first? Is this implementation acceptable?

The SIP says:

Trailing comma support is therefore restricted to only comma-separated elements that are on separate lines

and

Following Dr. Martin Odersky's suggestion, the proposal is that trailing commas are only supported in comma-separated elements that are enclosed by parentheses, square brackets or curly braces (), ], and }, respectively).

Although not entirely unambiguous, I think any reasonable reader would interpret these sentences to mean that trailing commas are only permitted in comma-separated elements, and therefore not in

{
  expr,
}

Unfortunately, I think the original SIP was simply incorrect in thinking that a Scanner-only implementation could accomplish the spec (without pushing an unreasonable amount of info into the Scanner from the Parse).

dwijnand · 2022-03-21T16:56:42Z

I don't have a personal preference on keeping the implementation less strict and simpler or more strict and complex - I see the merits of both. But I think the fact that Scalafmt inserted commas where it shouldn't have and the compiler wasn't complex enough to reject it as a weak reason. My reading of Martin's comment is he prefers the less strict and simpler setup we have. But if I were you I'd present separate PRs and we'll see from there.

adampauls · 2022-03-22T03:49:09Z

I have broken this up into #14741 and #14742, and will the port the rest back to #14509.

odersky · 2022-03-22T08:10:26Z

As far as I can tell, syntax.md was never updated to reflect trailing commas at all?

It did not have to since it was treated as a Lexer functionality. syntax.md works on the token stream that comes from the Lexer. The Lexer simply suppressed a trailing comma in front of a closing parenthesis.

adampauls force-pushed the no_weird_trailing_commas branch from 89ea0ec to 24ec0b8 Compare February 18, 2022 23:48

adampauls commented Feb 18, 2022

View reviewed changes

tests/neg/t11900.scala

def f =

List(1, 2, 3).map {

a => a + 1, // error: weird comma

Copy link

Contributor Author

adampauls Feb 18, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These used to pass.

adampauls commented Feb 18, 2022

View reviewed changes

adampauls marked this pull request as ready for review February 18, 2022 23:51

adampauls changed the title ~~[WIP] Handle trailing commas in parser instead of scanner~~ Handle trailing commas in parser instead of scanner Feb 18, 2022

adampauls added a commit to adampauls/zio-protoquill that referenced this pull request Feb 19, 2022

Remove bad trailing comma

6cb95ae

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

adampauls mentioned this pull request Feb 19, 2022

Remove bad trailing comma zio/zio-protoquill#79

Merged

adampauls added a commit to adampauls/specs2 that referenced this pull request Feb 19, 2022

Remove bad trailing comma

a6f2489

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

adampauls mentioned this pull request Feb 19, 2022

Remove bad trailing comma etorreborre/specs2#1067

Merged

etorreborre pushed a commit to etorreborre/specs2 that referenced this pull request Feb 19, 2022

Remove bad trailing comma (#1067)

6353a29

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

romanowski added the area:parser label Feb 21, 2022

romanowski requested a review from odersky February 21, 2022 13:51

adampauls mentioned this pull request Feb 21, 2022

Better error recovery in comma separated lists. #14509

Closed

adampauls force-pushed the no_weird_trailing_commas branch from 24ec0b8 to ebfd546 Compare February 21, 2022 21:33

deusaquilus pushed a commit to zio/zio-protoquill that referenced this pull request Feb 22, 2022

Remove bad trailing comma (#79)

418aedc

There is a [PR](scala/scala3#14517) up to remove bad trailing commas from dotty. This bad trailing comma is breaking the community build.

adampauls force-pushed the no_weird_trailing_commas branch from ebfd546 to 0edf5a2 Compare February 22, 2022 17:37

dwijnand assigned odersky Mar 4, 2022

adampauls mentioned this pull request Mar 5, 2022

Stop parsing incorrect trailing commas and improve error recovery in comma-separated lists. #14534

Closed

odersky reviewed Mar 16, 2022

View reviewed changes

adampauls added 2 commits March 16, 2022 10:08

Better error recovery in comma-separated lists

317b08c

Check for trailing commas in parser instead of scanner

fe88598

adampauls requested a review from odersky March 16, 2022 17:16

fix trailing commas in community build

8a09cb5

adampauls force-pushed the no_weird_trailing_commas branch from 0edf5a2 to 8a09cb5 Compare March 16, 2022 18:04

SethTisue closed this Mar 19, 2022

SethTisue reopened this Mar 19, 2022

odersky assigned dwijnand and unassigned odersky Mar 20, 2022

adampauls closed this Mar 22, 2022

Handle trailing commas in parser instead of scanner #14517

Handle trailing commas in parser instead of scanner #14517

Uh oh!

Conversation

adampauls commented Feb 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

adampauls Feb 18, 2022

Choose a reason for hiding this comment

Uh oh!

som-snytt commented Feb 18, 2022

Uh oh!

adampauls commented Feb 19, 2022

Uh oh!

som-snytt commented Feb 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

adampauls commented Feb 21, 2022

Uh oh!

SethTisue commented Feb 21, 2022

Uh oh!

odersky left a comment

Choose a reason for hiding this comment

Uh oh!

adampauls commented Mar 16, 2022

Uh oh!

adampauls commented Mar 16, 2022

Uh oh!

adampauls commented Mar 18, 2022

Uh oh!

griggt commented Mar 18, 2022

Uh oh!

odersky commented Mar 20, 2022

Uh oh!

adampauls commented Mar 21, 2022

Uh oh!

som-snytt commented Mar 21, 2022

Uh oh!

som-snytt commented Mar 21, 2022

Uh oh!

odersky commented Mar 21, 2022

Uh oh!

adampauls commented Mar 21, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dwijnand commented Mar 21, 2022

Uh oh!

adampauls commented Mar 22, 2022

Uh oh!

odersky commented Mar 22, 2022

Uh oh!

Uh oh!

adampauls commented Feb 18, 2022 •

edited

Loading

som-snytt commented Feb 19, 2022 •

edited

Loading

adampauls commented Mar 21, 2022 •

edited

Loading