-
Notifications
You must be signed in to change notification settings - Fork 213
Add working spec for "terminating tokens". #73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This is the maximally compatible robust approach to optional semicolons. I'll start working on a simpler more conservative alternative soon.
This is the draft feature spec for #72 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also check the comment I filed on the feature - I think it's wrong to say that eager semicolon insertion will not introduce errors on the next line. I'm sure I can do things with typing, but even without that, the --
/++
case there should be enough. You cannot detect ambiguity by only looking one token ahead, and the NO_TERM changes to the grammar only looks one token ahead.
* **11,222** (99.698%) have a newline between them and the previous string. | ||
|
||
So, in this case, we don't follow eagerness and ignore a newline between | ||
adjacent strings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we will mis-interpret:
var x = "some string"
"some other string".charCodes.forEach(importantAction)
?
What can the user do to avoid this? Parenthesizing either string literal. Use a literal semicolon (that's the smaller solution, so that's what I would do, but then the formatter cannot remove it).
That's also a thing to consider: There will be semicolons that cannot be removed (empty statements can be turned into {}
, but not removed). If we remove the empty statement ;
completely, and reserve only {}
for that (the other empty statement), then that issue should be gone.
(Let's do that no matter what! Please!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oof, yes. The proposal will elide the newline there.
What can the user do to avoid this? Parenthesizing either string literal.
Yeah, parenthesizing would work.
(Let's do that no matter what! Please!)
SGTM. Empty statement is weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious... what would happen in the case of those editors and IDEs which automatically force the code on to a new line when the width of the line gets too long?
Can just imagine in those cases and the cases of the code shown above by @irhn it will get very confusing.
``` | ||
|
||
One is in an expression context where the newline is ignored and one is not. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is also --
and ++
which can both end and start an expression (prefix and suffix operators):
a
++
b
can be parsed in two ways with optional semicolons.
I doubt that ever happens in practice, so using the eagerness principle should be safe.
it, after the `)` on `foo()`. The "should be significant" is vague. More | ||
precisely: | ||
|
||
* If a newline character appears after the preceding token's lexeme and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's something missing from this paragraph - it feels vacuous. All newline characters appear between two tokens, no? Maybe something about "and this token is one of a specific set of tokens"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but not all pairs of tokens do have newlines between them. :) If there's no newline between a token and the next, then the latter token is not terminating (unless it's }
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, something is still no coming across. Maybe it's a pronoun issue? What's the referent of "it" in "it is terminating"?
There's still more to do, but it's a step closer.
What's next for this PR? |
run experiments on a corpus of Dart code to try to empirically measure these to | ||
some degree. | ||
|
||
The corpus I used is the Flutter repository, the Dart SDK repository, and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why mention Flutter? Yes, it uses Dart, but so does GoogleAdware and Google Adsense. Additionally, there's the AngularDart team, plus all the frameworks the community has built.
Is this change being requested solely to benefit Flutter uses? I would urge that either these other uses of Dart are included or otherwise limit the mentions of Flutter so it's not so specific to one user base. Though you do go through packages on Pub, so I don't know why Flutter had to specifically be called out and the rest excluded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change is part of the "UI as code" charter, which is detailed more here and here. It's not only to benefit Flutter, but if it doesn't benefit Flutter, then I don't think we have the resources to work on it right now. In other words "useful for Flutter" is a necessary condition, but we certainly hope that all "UI as code" features are useful for much more than Flutter too.
Flutter's repo is an important use case because it's a very large, very popular corpus of Dart code and is also one that does not currently use dartfmt and has its own formatting style. So it's a good test case for how robust the optional semicolon rules are across a variety of Dart code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, thanks for the quick response @munificent. I appreciate the openness and you provided some clarity
@munificent can we merge this? |
I need to circle back on semicolons and tie up the loose ends. I think the right thing to do for this is to add a note to the top saying we probably aren't going to pursue this approach and then land it for reference. @leafpetersen and @lrhn, does that sound OK? |
SGTM |
This is the maximally compatible robust approach to optional semicolons.
I'll start working on a simpler more conservative alternative soon.
This PR is just the proposal doc. I have a bunch of other supplementary material:
A handful of small scripts for scraping a corpus and counting occurrences of certain forms.
A fork of the front_end and dart_style packages with a prototype implementation of the rules.
A Makefile for running tests on a big corpus of code.
The scripts reuse some existing helper code I wrote for other ui-as-code language features, so I'd have to duplicate some stuff to migrate it all into here. The front_end and dart_style packages are fairly large. Do you think it's worth bringing this stuff in?