Skip to content

disallow duplicate group/test names within a suite #1573

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Sep 15, 2021

Conversation

jakemac53
Copy link
Contributor

Sending this out mostly just for discussion. If we release it we would do so as a breaking change. Related to #1571.

We could also just make this a warning, or hide it behind an opt-in configuration option.

Making it a warning is actually very difficult, we have no concept of warnings at all in the test runner, and the Declarer currently has no way of reporting messages (other than printing).

@jakemac53 jakemac53 requested a review from natebosch September 9, 2021 16:30
@google-cla google-cla bot added the cla: yes label Sep 9, 2021
@jakemac53 jakemac53 marked this pull request as draft September 9, 2021 16:30
@jakemac53
Copy link
Contributor Author

I am running this through internally also to see how breaking it would be, but based on how many tests it broke in just this package it seems like this would not be something we can enable by default, at least not without an opt-out.

@natebosch
Copy link
Member

I think we should push forward with this. I agree we need an opt-out.

I'm not sure if it's best to add an opt-in flag first and let users try it out or let the IDE plugins enable it by default first before we flip the default in the runner. @DanTup do you have any thoughts on flags and opt-in by default?

@jakemac53 jakemac53 changed the title disallow duplicate test names within a suite disallow duplicate group/test names within a suite Sep 9, 2021
@DanTup
Copy link
Contributor

DanTup commented Sep 9, 2021

I usually prefer for the IDE to control its default because it's much easier for the user if we just have an on/off than some option that depends on the version of the underlying tool. So I'd normally have the IDE always send the flag (--foo and --no-foo).

However, we'd usually know the SDK version the flag is available in to know whether to send it. But if this will depend on the package:test version in the users own app, that's trickier to get, so we might need a tristate (unset/on/off) anyway, and leave the default as unset (which would use the tools default) for some time to ensure we don't prevent people running tests with older versions of the package.

@jakemac53
Copy link
Contributor Author

@DanTup Can you read files in the users package? You could get the version from the pubspec.lock file.

I think we are leaning towards turning this on by default anyways though, and adding an opt out setting. So you could have a toggle to opt out, but by default wouldn't pass any flag.

@jakemac53
Copy link
Contributor Author

Fwiw there were around 2400 failures internally for this, so definitely a lot more than we can fix by hand. We definitely need the opt out for internal code, and its unlikely anybody would have the time to champion an effort to clean up these tests.

I am not sure that running individual tests from the IDE works anyways internally though, @jacob314 do you have any idea?

@DanTup
Copy link
Contributor

DanTup commented Sep 13, 2021

@DanTup Can you read files in the users package? You could get the version from the pubspec.lock file.

I can, though I try to avoid poking in files that seem like implementation details that could change (especially from outside of the SDK code itself, because if it changes, Dart-Code would need to handle multiple versions). It's also not always obvious which file to use because the user may have a launch config that can override some things (like passing --packages or running from a different directory).

I think we are leaning towards turning this on by default anyways though, and adding an opt out setting. So you could have a toggle to opt out, but by default wouldn't pass any flag.

In that case, maybe we don't need to do anything in the IDE - we just inherit the new default and if the user does want to opt-out, they can add the flag to their launch configuration. That way we don't need to worry about sending a flag that's not supported or reading versions. Is it possible to make sure the error message that's printed when this fails includes details of the flag that can be used to allow this to make things easier?

@jakemac53
Copy link
Contributor Author

jakemac53 commented Sep 13, 2021

I can add a link to the message to the docs for this configuration. Fwiw I don't plan on adding a command line argument - this is something that should really only be configured via source files (committed to git), since if it is required it means the tests can't be ran without it. So it will likely only be configurable via the dart_test.yaml file.

@DanTup
Copy link
Contributor

DanTup commented Sep 13, 2021

So it will likely only be configurable via the dart_test.yaml file.

Based on the above (that I don't think it's worth exposing as an IDE config) that seems reasonable to me :-)

@jakemac53 jakemac53 marked this pull request as ready for review September 13, 2021 21:44
@jakemac53
Copy link
Contributor Author

Note that I opted for breaking changes across the board here... although it does feel a bit painful to do a 2.x package:test release just for this 🤷‍♂️

@DanTup
Copy link
Contributor

DanTup commented Sep 14, 2021

although it does feel a bit painful to do a 2.x package:test release just for this 🤷‍♂️

There's no rush to release this from me. It'll allow some tidying up (and less surprising behaviour) in Dart-Code, but the current behaviour isn't bad (and I'll probably leave the old workaround to catch older package versions for a little while anyway).

@jakemac53
Copy link
Contributor Author

If we want to avoid the breaking change we could leave this disabled by default for now, and just have it as an option you can enable (which we would flip on the next breaking release).

@jacob314
Copy link
Member

Fwiw there were around 2400 failures internally for this, so definitely a lot more than we can fix by hand. We definitely need the opt out for internal code, and its unlikely anybody would have the time to champion an effort to clean up these tests.

I am not sure that running individual tests from the IDE works anyways internally though, @jacob314 do you have any idea?

Running and debugging individual tests from the IDE does work internally in Google3 for Flutter apps and VM apps.

@jacob314
Copy link
Member

Actually I think this problem is probably bellow the bar. Maybe we just need a unified scheme where if you have ambiguous group + test name pairs we just have a consistent scheme for disambiguating that both the analyzer and test runner agree on.
For example:
testName, testName duplicate 1, testName duplicate 2
if you have 3 instances of testName. That way IDEs can correctly select the test to run andand the test runner does not need to fail and does not report ambiguous results.

@DanTup
Copy link
Contributor

DanTup commented Sep 15, 2021

For example:
testName, testName duplicate 1, testName duplicate 2

I think this would work - though the analyzer doesn't currently give us the test names, only the snippet of code for the test() call, so the IDE would need to add this (so it would have to maintain a list of all complete test names to check for dupes).

It could get weird if there are dynamic names, or the names overlap in other ways though. For example:

test('my test', () {}); // "my test 1"?
test('my test', () {}); // "my test 2"?
test('my test 2', () {}); // ?

Edit: Perhaps you meant to include the word "duplicate" in the name too.. that seems much less likely to have collisions (although we should perhaps at least agree what should happen in that case).

@jakemac53
Copy link
Contributor Author

jakemac53 commented Sep 15, 2021

I don't really like the idea of trying to come up with a scheme for magic names here - any such system will have its own failure modes? It also doesn't help users manually running single tests on the command line (they have to learn this scheme?). I would rather push for just disallowing duplicate names.

In general duplicate names are usually mistakes anyways, from what I have seen.

@jacob314
Copy link
Member

I don't really like the idea of trying to come up with a scheme for magic names here - any such system will have its own failure modes? It also doesn't help users manually running single tests on the command line (they have to learn this scheme?). I would rather push for just disallowing duplicate names.

In general duplicate names are usually mistakes anyways, from what I have seen.

I agree duplicate names are mistakes. The question is whether they are important mistakes.
An alternate first step would be to start by emitting a non fatal runtime warning when there are duplicate names along with a lint that warns if there are duplicate names. The trouble is I'm not sure fixing the 2400 dupes in g3 is worth the time. The 2400 bugs are just sloppy names not cases where test behavior is actually bad.
One other idea was a quickfix for the lint that fixes the duplicate names in an ugly way so users that care can apply a better fix.
for example names could be:
"vertical scroll"
"vertical scroll DUPLICATE TEST 2"
"vertical scroll DUPLICATE TEST 3"
or something equally obvious that if you care about your test case name quality you should give the duplicates better names.

@jakemac53
Copy link
Contributor Author

A lint is not really a viable approach here because we support dynamic names.

I would go for a warning except IDE users wouldn't generally see it anyways, and also we have no actual mechanism in this package for reporting warnings bizarrely enough. I don't really want to add that just to support this. If I just print that could break some reporters.

The trouble is I'm not sure fixing the 2400 dupes in g3 is worth the time.

Yes I have no plan to fix these. We would disable this check internally, or possibly we could just disable it on all the failing targets (that may be something we could automate).

There are probably also much fewer than 2400 actual dupes - tests are ran on multiple platforms and so a single one can account for multiple failures. But the real number is still probably too high to bother fixing manually.

@DanTup
Copy link
Contributor

DanTup commented Sep 15, 2021

I don't really want to add that just to support this. If I just print that could break some reporters.

There's a bunch of weirdness that having a way to send output back that isn't tied to a test might be able to improve. For example there are fake "loading foo.dart" tests that are emitted that we have to strip out in the editor (because showing them in the test tree is weird) as well as empty groups for any top-level tests (which again we have to strip out, because showing a nameless node for them is also weird).

I don't want to invent more work, but if JSON protocol changes are a future possibility (a "v2" JSON reporter?), it might be more worthwhile than just this :-)

@natebosch
Copy link
Member

There's a bunch of weirdness that having a way to send output back that isn't tied to a test might be able to improve.

Yeah we'd like to do that in general. #1547

@jakemac53
Copy link
Contributor Author

jakemac53 commented Sep 15, 2021

We do have a desire to totally revamp the entire reporter api, and I think if we did that we should add explicit levels of logging (info/warn/error) etc. The current way it is handled is extremely cobbled together and wonky. Each reporter has to listen to multiple different streams of things that might potentially emit messages etc...

cc @natebosch it would be cool if we could actually get something on OKRs to address this? Maybe as a part of a tech debt initiative?

EDIT: Oh I see you had already responded here haha :). I needed to refresh.

@natebosch
Copy link
Member

cc @natebosch it would be cool if we could actually get something on OKRs to address this? Maybe as a part of a tech debt initiative?

The reporter API refactor is on our OKRs already. I have a prototype I've had on the back burner for a bit but it's been tricky to dig through the existing interactions and put them behind well defined interfaces. The JSON reporter in particular reads a lot internal detail, and relies on some specific implicit behaviors.

@jakemac53
Copy link
Contributor Author

Ya, we could just do #1547 I guess to help with the concrete problems we face with reporting this type of warning.

@DanTup
Copy link
Contributor

DanTup commented Sep 15, 2021

We do have a desire to totally revamp the entire reporter api

FWIW, VS Code just added a new set of testing APIs. Unfortunately it's not like LSP/DAP where you just implement something that creates JSON and a generic client can consume it, it still requires writing code inside a VS Code extension, but it might be interesting to scan through before agreeing any changes (for Dart-Code we'd basically need to map from one to the other, and I wouldn't be surprised if it grew into a published standard like DAP/LSP):

https://code.visualstudio.com/api/extension-guides/testing

It has some interesting things, like structures for the test output (for example expected/actual output, which it can then render nicely in a UI rather than just flat text output: https://code.visualstudio.com/api/references/vscode-api#TestMessage).

(It also does have a mechanism for sending general output for a "test run" that isn't tied to a test - although it's just text and doesn't have levels)

@jakemac53
Copy link
Contributor Author

That use case sounds like it could just be its own custom reporter as well. It would probably be fairly vscode specific?

@jakemac53 jakemac53 merged commit 4de96a7 into master Sep 15, 2021
@jakemac53 jakemac53 deleted the duplicate-test-names branch September 15, 2021 17:22
@DanTup
Copy link
Contributor

DanTup commented Sep 15, 2021

That use case sounds like it could just be its own custom reporter as well.

Yeah, though if the reporter API only gets text output (eg. not separate expected/actual text) it might be hard for it to map that back into that API, so it might be worth considering the capabilities of it if there are going to be changes to the reporter APIs.

It would probably be fairly vscode specific?

The API above is, but if it becomes a publish JSON standard, I'd expect to see other editors pick it up like they have with LSP and DAP - there are clients for Vim, Emacs, even Visual Studio proper. I don't know that there are plans for this though, but given the uptake of DAP/LSP, it might be a possibility (being able to get free support for lots of languages in your editor - or free support for lots of editors for your language seems fairly appealing :-) )

@jakemac53
Copy link
Contributor Author

Ya trying to output expected/actual text as a structured thing would be pretty interesting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants