Skip to content

Canonicalization scorer #1455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 24 commits into from
Jun 13, 2017
Merged

Canonicalization scorer #1455

merged 24 commits into from
Jun 13, 2017

Conversation

jcollins-g
Copy link
Contributor

@jcollins-g jcollins-g commented Jun 2, 2017

Fixes #1370. This is a new canonicalization scorer to enable dartdoc to make good decisions for where to put docs for a reexported element. The scorer uses paths much like @Hixie suggested in the fifth bullet point of this comment, with some additions (variants on the rest of that comment are already implemented). It eliminates all but 4 of the ambiguous reexport warnings from angular2, dropping the number of ambiguous reexport warnings by 97%. The last few can be fixed with a new directive, @canonicalFor, that can be used to disambiguate or override bad scorer decisions.

I did manually go through the hundred and thirty three warnings this eliminated and am convinced a human would agree with me as to where they should be canonical -- the last few that display I wasn't completely sure of. The flag --ambiguous-reexport-scorer-min-confidence will allow you to adjust when these warnings appear if you think the scorer is broken.

Old warnings:
before
.... and they scroll off the screen.

New warnings:
after

To support testing the @canonicalFor feature I had to make minor changes elsewhere in dartdoc. Additionally, the warning system has been enhanced to permit enhanced warning descriptions -- the ambiguous reexport warning is particularly challenging to understand and I felt without some more infrastructure to help the user even the smaller number of warnings wouldn't be actionable.

@chalin @kwalrath

@jcollins-g jcollins-g requested a review from keertip June 2, 2017 22:30
@googlebot googlebot added the cla: yes Google CLA check succeeded. label Jun 2, 2017
@kwalrath
Copy link
Contributor

kwalrath commented Jun 2, 2017

Can't wait to see how this improves the Angular docs!

@Hixie
Copy link
Contributor

Hixie commented Jun 2, 2017

oooOOOooo!

Do we need to use @canonicalFor anywhere in Flutter? What does it do? :-)

@chalin
Copy link
Contributor

chalin commented Jun 3, 2017

Very nice! I'm also looking forward to see this in action.

Copy link
Member

@devoncarew devoncarew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a review, just a pass through.

Really great news about the scoring improvements, esp. the drop in angulardart ambiguity!

Does this reduce the number of canonicalization warnings from the sdk when generating docs for say dartdoc itself? I suspect that these canonicalization improvements should eliminate most of those.

bin/dartdoc.dart Outdated
DartSdk sdk = new FolderBasedDartSdk(PhysicalResourceProvider.INSTANCE,
PhysicalResourceProvider.INSTANCE.getFolder(sdkDir.path));

setConfig(
addCrossdart: addCrossdart,
addCrossdart: args['add-crossdart'] as bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if the analyzer needs this casting - do you get a warning without the as bool?

@@ -266,6 +266,14 @@ ArgParser _createArgsParser() {
"Generates `index.json` with indentation and newlines. The file is larger, but it's also easier to diff.",
negatable: false,
defaultsTo: false);
parser.addOption('ambiguous-reexport-scorer-min-confidence',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a super long name for a flag. Is this even a flag that user's should toggle? Should we just decide on a good value for this?

If you need a flag for dev purposes, you can have a hidden option with the hide: true param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added hide: true.

"ambiguous-reexport",
"A symbol is exported from private to public in more than one library and dartdoc can not determine which one is canonical",
[
"Use {@canonicalFor @@name@@} in the desired library's documentation to resolve",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@canonicalDeclarationFor? @canonicalDocLocation? @docLocationFor? @canonicalLocationFor? @dartdocCanonical?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I picked @canonicalFor since the tag is to appear in the library which is canonical for the symbol given as a parameter. It'll look like this:

/// Some docs for foo.
/// {@canonicalFor bar.Baz}
library foo;

Since we're not specifying a location as a parameter to the tag I tend to drop out options 2-4. I'm not sure what the first one would mean in this context. The last one seems to add unnecessary information (AFAIK there are no similar comment tags for other tools).

@jcollins-g
Copy link
Contributor Author

--auto-include-deps was broken and I didn't notice it until the upload (we added src/mylib from the test package). I have no idea how it managed to work before, it probably was accidental. So, added a checker to isPrivate to never declare public something in lib/src of a package.

@jcollins-g
Copy link
Contributor Author

jcollins-g commented Jun 5, 2017

@Hixie: The only ambiguous reexport warning in flutter is for Matrix4, since it is reexported from other packages directly. Flutter is constructed in such a way that it never crosses the private/public export boundary in more than one public library for a given symbol, so the canonicalization code knows to prioritize the library where the boundary was crossed and ignore other reexports.

For example:
lib/reexporter_of_foo.dart <--- lib/foo.dart <--- lib/src/my_private_library.dart
lib/another_reexporter_of_foo.dart <--- lib/foo.dart

As long as your reexports are structured this way, dartdoc knows to prioritize lib/foo.dart for symbols originally from lib/src/my_private_library.dart. Both the Dart SDK and flutter seem to be built this way.

Angular2 has a lot of things that look like this (simplified):

lib/foo.dart <--- lib/src/my_private_library.dart
lib/library.dart <--- lib/src/my_private_library.dart
lib/another_library.dart <--- lib/src/my_private_library.dart
lib/stuff_and_things.dart <--- lib/src/my_private_library.dart

With this structure, dartdoc used to just give up and pick one of these arbitrarily as canonical, emitting a warning. The scorer enables it to guess better. It sets values to things from the library names and directory structure, whether a library is deprecated, and so on, to determine where it should be canonical. But it's possible to guess wrong, and @canonicalFor makes up the difference there.

@jcollins-g jcollins-g requested a review from pq June 5, 2017 18:52
/// documented with this library, but these ModelElements and names correspond
/// to the defining library where each originally came from with respect
/// to inheritance and reexporting. Most useful for error reporting.
Iterable<String> get allOriginalModelElementNames {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious why you're not returning a List here. Is this to prevent write access?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not actually using the full interface I tend to try to limit what I return to what I need, in this case, Iterable. Still learning interfaces in Dart so I'm not doing that super consistently yet, but working on it.

}

@override
String toString() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short form?

=> "${library.name}: ${score.toStringAsPrecision(4)} - ${reasons.join(', ')}";

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Member

@pq pq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments but nothing substantial. (And hey it's hard to argue with the results; a 97% drop ain't bad!) Thanks for the context and screen shot. Much appreciated!

👍

.where((s) => s.isNotEmpty));
}

Set<String> get namePieces {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly here. I prefer the => form but I guess it's a matter of taste. 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and extracted into Nameable.

@@ -2024,6 +2255,10 @@ abstract class ModelElement implements Comparable, Nameable, Documentable {
return _canonicalLibrary;
}

List<ScoredCandidate> scoreCanonicalCandidates(List<Library> libraries) {
return libraries.map((l) => scoreElementWithLibrary(l)).toList()..sort();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But of course, feel free to ignore!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one, since it includes a map in the middle and includes function parameters seems more natural with the long form.


PackageWarningHelpText(this.warning, this.warningName, this.shortHelp,
[this.longHelp]) {
if (this.longHelp == null) this.longHelp = [];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could do this.longHelp ??= []...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

_countedWarnings.putIfAbsent(element, () => new Set());
_countedWarnings[element].add(warningData);
_writeWarning(kind, fullMessage);
_countedWarnings.putIfAbsent(element?.element, () => new Set());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you're allowing for mappings with null keys. Is that right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. Null keys are allowed when we have warnings that aren't related to a specific element.


test('@canonicalFor directive works', () {
expect(SomeOtherClass.canonicalLibrary, equals(reexportOneLib));
expect(SomeClass.canonicalLibrary, equals(reexportTwoLib));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can elide the equals?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, apparently so. Done.

@jcollins-g jcollins-g removed the request for review from keertip June 7, 2017 17:31
@jcollins-g
Copy link
Contributor Author

Realizing I didn't reply here to @devoncarew about reducing SDK errors. We talked about this briefly offline, but for the record:

This is only for ambiguous reexport warning. When determining canonicalization for a given symbol, there are three possible paths:

  1. There are multiple canonical locations and we have to choose between them.
  2. There is exactly one possible canonical location.
  3. There is no canonicalization available.

Earlier work in #1390 and #1368 implemented the second item above, and partially implemented the first in a way that worked well for Flutter but not AngularDart. This PR improves the first case to the maximum extent I can think of for now and gives control to users for disambiguating cases it can't handle. Improving the third case (no canonicalization found) is still in the queue and will require knowledge of packages not being documented.

@devoncarew
Copy link
Member

My comments mostly just applied to the UI of the tool; lgtm for my part

@jcollins-g jcollins-g merged commit 4a505cf into master Jun 13, 2017
@jcollins-g jcollins-g deleted the canonicalization-scorer branch June 13, 2017 19:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Google CLA check succeeded.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

scoring for ambiguous reexports
7 participants