-
Notifications
You must be signed in to change notification settings - Fork 213
Add digit-separators specification v1.0 #3846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@stereotype441 has mentioned to the analyzer team that he desires to keep better track of "possible lint rules" and "possible quick fixes" that would support a given language feature. To paraphrase, I think Paul said that these are often brought up in language team meetings, when discussing how developers could be kept on the happy path, or how to support a migration to a language feature, but they aren't really written down or otherwise tracked. To that end, I've written possible new lint rules and quick fixes in this specification. But if this inspires frowning, if the specification MD file is not considered a good place for such commentary, I can remove those sections, and just open separate issues against analyzer, and not cross-link in this specification. |
|
||
That leaves underscore, which could be the start of an identifier. Currently | ||
`100_000` would be tokenized as "integer literal 100" followed by "identifier | ||
_000". However, users would never write an identifier adjacent to another token |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code quote the _100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
separators every three digits of a decimal number. Also possibly a similar rule | ||
for hexadecimal numbers. If a developer ever uses digit separators for a | ||
different purpose (as in separating the digits of a phone number), the rule may | ||
not prove useful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another lint option could be "consistent digit separators", which triggers if the digit groups do not have the same size (except the most significant I one, which can be shorter).
If there are any __
separators, the number of _
separated groups between them should also be the same, and repeatedly for higher numbers of _
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it, added!
underscore can be detected as such, and we can offer quick fixes to remove | ||
unexpected errors (for example, `100_`, `100_e.2`, `100._00`). In a few cases, | ||
the intention is not as straightforward, such as `100._100`, where `_100` can | ||
be a legal name of an extension member. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But it's also possible to check whether it's an extension (in the same library, since it's a private name), and when it most likely isn't, the fix can be recommended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added
|
||
A separate lint rule could encourage _consistent_ digit separators, which | ||
triggers if the digit groups do not have the same size (except the most | ||
significant I one, which can be shorter). If there are any `__` separators, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this I
intentional?
significant I one, which can be shorter). If there are any `__` separators, the | |
significant one, which can be shorter). If there are any `__` separators, the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, thanks
|
||
The only legal digit separator that is introduced with this feature is the | ||
underscore character. If a developer attempts to use another character, for | ||
example commas, as a separator, we may be able to detect this, and offer a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this fall into the same problem that we have when we consider ,
as a separating digit? How can we be sure that the user intended to use ,
as a separating digit instead of two expressions separated by a comma?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't. Many automated fixes are offered with the understanding that they are not the only fix, and may not be the desired fix.
I personally doubt we will offer this fix. But there might be some cases, like if the user pastes 100 lines of:
var x1 = 1,000,000;
var x2 = 2,000,000;
var x3 = 3,000,000;
into their editor, from another source text, we could offer a fix to convert the commas, because it looks more likely that they meant to write x1 = 1_000_000
, since var x1 = 1,
cannot be followed by an int literal (it can only be followed by an identifier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the same person pasted 100 lines of var x = foo(1,000,000);
we'd be having a harder time helping them.
(Not hat we couldn't suggest 1_000_000
, especially if we can see that foo
takes only one argument, but it was not grammatically wrong to begin with.)
One thing that worries me is whether we can do the same fixes after the formatter has run.
If not, we may have a problem helping people who auto-format often.
If the code doesn't parse, it likely won't format either, but if it parses, and you get to var x1 = foo(1, 000, 000);
, it would be sad if it's too late to help the user.
This is ready for review; dunno how you assign reviewers, @dart-lang/language-team |
With the digit-separators feature, separators can be added between _digits_ of | ||
a number literal, but nowhere else. In most error cases, the unexpected | ||
underscore can be detected as such, and we can offer quick fixes to remove | ||
unexpected errors (for example, `100_`, `100_e.2`, `100._00`). In a few cases, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you can't put period after e. This should be 100_e1.2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is about combining multiple tokens into one number, because we guess that that's what the author intended.
The 100_
could just be one malformed numeral token, or it could be 100
followed by _
.
Not much better. The 100_e1
is already invalid because of the _e
, so guessing that someone intended .
to be a digit separator is a pretty wild guess.
We could probably correct this to 100e12
, but it's not at all obvious that that's what the author intended.
For the 100._00
, we should check first that there is no _00
extension member on int
. If not, we can probably fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to see this moving forward, thanks! Should probably get a sign off from Lasse who's thought about this a lot more, but FWIW LGTM!
Add a specification for the digit-separators feature, as specified by @lrhn at #2 and at #2 (comment). I add myself as an author as I added a few other sections.
Contribution guidelines:
dart format
.Note that many Dart repos have a weekly cadence for reviewing PRs - please allow for some latency before initial review feedback.