Produce a warning if multiple tests run with the exact same group/test name #1571

DanTup · 2021-08-31T16:23:19Z

There's some functionality in Dart-Code that tries to handle this, but it's a rather clunky and I think could be better dealt with here. (I'm not certain it's a common issue, but it did come up for Dart-Code at least once).

Consider a user duplicated a test, but forgot to update the name:

main() {
group('foo', () {
test('bar', () {});
test('bar', () {});
});

Since IDEs can only run tests by name, it can be confusing to click Run next to one test and find yourself debugging code that originated from the other.

Unless there's a good reason to support multiple tests with the same name, it would be nice if the test package could warn when this happens (or even just refuse to run). If it's opt-in with a flag, that would be fine too (the IDE could pass it in the case where it expects exactly one test to run - eg. "Run" is clicked on a test, and not a group).

Another possibility could be to detect this before running with a lint or something (although dynamic test names might complicate that a little)? @pq

The text was updated successfully, but these errors were encountered:

jakemac53 · 2021-08-31T16:42:37Z

Ya this is uncommon and a bad practice, we probably should just block/warn for it.

I think this would be hard to lint for, and if you run tests in the IDE you don't necessarily see the warnings.

We could add a warning and an option to upgrade that to an error though, which it would possibly make sense for the IDE to use.

jakemac53 · 2021-08-31T16:45:47Z

Running tests by name has other problems too though (especially around dynamic names).

We could also explore different options for that. For instance one idea that might work better for IDEs specifically is the ability to run a test based on the column/line location in the file where the test invocation is? A bit funky to be sure but we should be able to gather that info by inspecting the stack trace. That requires source maps to be working for web tests, so it could be brittle.

DanTup · 2021-08-31T16:54:45Z

I think this would be hard to lint for, and if you run tests in the IDE you don't necessarily see the warnings.

Not sure I understand - if it was a real lint, it would show up in the IDE in the normal errors list? (this would show the user issue without needing to run the test, but I think also be less reliable).

Running tests by name has other problems too though (especially around dynamic names).

Yeah, this is another place I have some hacks - for example we try to discover tests using the analyzer outline information (so we can populate the test tree when you open a file without having run it), and currently just skip over any test names with a $ in. It would be nice to improve this area, although I don't have any great ideas.

For instance one idea that might work better for IDEs specifically is the ability to run a test based on the column/line location in the file where the test invocation is?

Yeah, this could work. We'd to make sure as the user types in the file we keep the location updated (for ex. they may modify the file then click Run without saving). Currently we only use the analysis server outline information for the initial discovery, but perhaps we could use it to maintain a live set of tests instead (this could fix one of the outstanding niggles where if you rename a test, the old name stays around until you re-run the whole file since we don't otherwise know it's not there anymore).

jakemac53 · 2021-08-31T17:00:41Z

Not sure I understand - if it was a real lint, it would show up in the IDE in the normal errors list? (this would show the user issue without needing to run the test, but I think also be less reliable).

Sorry what I meant is if we log warnings on the command line, users won't see them since they don't see command line output (and actually we probably wouldn't even log the warning in the json reporter?).

DanTup · 2021-09-01T09:29:21Z

Sorry what I meant is if we log warnings on the command line, users won't see them since they don't see command line output (and actually we probably wouldn't even log the warning in the json reporter?).

Ah, gotcha. That could be a reason to just reject it and fail with an error (to stderr) instead.

Although @jacob314 suggested a lint may be better (although I guess that could only detect constant test names) and @natebosch may have input from Flutter test perspective.

I'm not sure this is a common issue, though it came up in some discussions about trying to make it easier to match up tests from the source code (eg. using analysis outlines) versus the test runner. Right now it's quite easy to make what's shown in the VS Code test runner not really match what's in the editor and we sometimes have to guess which node in the tree relates to which result (eg. in the case of multiple tests with the same name, we then start looking at line numbers to find the "closest" one - to allow for edits to the file). With some restrictions (eg. don't use tests with the same name or non-constant names) some of that logic could perhaps be simplified.

jakemac53 · 2021-09-01T14:41:28Z

Ah, gotcha. That could be a reason to just reject it and fail with an error (to stderr) instead.

This is what I would prefer in an ideal world, but I think it is probably pretty breaking for folks and maybe not worth doing a breaking release of the package? Although breaking releases of package:test are actually possible, so maybe we could do it.

With some restrictions (eg. don't use tests with the same name or non-constant names) some of that logic could perhaps be simplified.

Note that non-constant names do have real use cases so I wouldn't want to block this. They are problematic for the IDE but useful.

jacob314 · 2021-09-01T16:06:43Z

It would be great if we could have lints and runtime errors that enforce that the tests for a file are exactly the tests that static analysis believes are run.

With that in mind, would be great to see the following static lints and runtime checks which I believe could together ensure that the list of tests the outline detects match the list of tests in the app.

Make non-constant test names a lint error
Make duplicate test names and duplicate group names a lint error as well as a runtime error in the test framework.
Enforce that test group and test block definitions always run if they are defined in a file with a lint and perhaps a runtime check (would require passing in the expected list of tests determined by static analysis)

Examples:

main() {
  group('foo', () {
    test('bar', () {}); // Lint error: duplicate test name
    test('bar', () {});
    test('$baz', () {}); // Lint error:  non constant test name
  });
}

main() {
  group('foo', () {
    test('bar', () {});
  });
  group('baz', () {
    test('bar', () {}); // Ok as bar tests are in separate groups?
  });
}

main() {
  group('foo', () {
    test('bar', () {});
  });
  group('foo', () { // Lint error: duplicate group name.
    test('baz', () {}); 
  });
}

main() {
  if (foo == bar) { // Lint error: use `skip` if a test or group should only sometimes run. Runtime error if test 'foo' is not run.
    group('foo', () {
      test('bar', () {});
    });
  }
}

DanTup · 2021-09-01T16:10:35Z

// Ok as bar tests are in separate groups?

Yep, tests with overlapping in groups should be fine (the IDE considers the test name to be fully qualified including all parent groups and the test name concatenated with a space).

I think (optional) lints for the above would be neat. It would make it easier to convey to users what rules they should follow to get the best experience in the editor and it may also make it more reasonable for us to drop results out of the runner that we think (from static analysis) no longer exist in the file (so rename tests don't hang around until you run the entire file).

jakemac53 · 2021-09-01T16:13:08Z

Make non-constant test names a lint error

I am fine with this as an optional lint, but not a default one. There are a lot of situations where it is valid to build up test names today, it is effectively the only way to write parameterized tests in Dart.

jakemac53 · 2021-09-01T16:16:29Z

main() {
  group('foo', () {
    test('bar', () {});
  });
  group('baz', () {
    test('bar', () {}); // Ok as bar tests are in separate groups?
  });
}

Yes this is fine

main() {
  group('foo', () {
    test('bar', () {});
  });
  group('foo', () { // Lint error: duplicate group name.
    test('baz', () {}); 
  });
}

In theory this might be ok but it seems reasonable enough to block. It would lead to weird behavior anyways (you would end up running both groups).

jakemac53 · 2021-09-01T16:18:40Z

Another case that gets interesting:

main() {
  group('foo', () {
    test('bar', () {});
  });
  test('foobaz', () {});
}

I believe that running the foo group today would also end up running the foobaz test.

This might be rare enough that we don't have to care about it.

DanTup · 2021-09-01T16:27:54Z

I think #908 discusses that issue. We have to use an open-ended regex for groups today, but it we could make it closed (eg. run things in groups matching ^foo$) I think that might solve that.

jacob314 · 2021-09-01T16:30:24Z

Make non-constant test names a lint error

I am fine with this as an optional lint, but not a default one. There are a lot of situations where it is valid to build up test names today, it is effectively the only way to write parameterized tests in Dart.

Can you give some examples of this? I'd only be interested in adding a lint for it if it was in the default set of lints users were opted into. Our IDE support will be lousy if you author test names like this.

jakemac53 · 2021-09-01T16:32:47Z

I think #908 discusses that issue. We have to use an open-ended regex for groups today, but it we could make it closed (eg. run things in groups matching ^foo$) I think that might solve that.

Ya adding support for running a group by name should be doable I would think, and would probably solve the problem.

Can you give some examples of this? I'd only be interested in adding a lint for it if it was in the default set of lints users were opted into. Our IDE support will be lousy if you author test names like this.

https://github.com/dart-lang/source_gen/blob/master/source_gen/test/generator_for_annotation_test.dart#L14

jakemac53 · 2021-09-01T16:55:51Z

It seems to me like what is really needed here is some sort of interactive mode for the test runner. Something that you can launch and then talk to via a JSON protocol (or some other protocol).

You would be able to ask it for instance what are all the available tests, and what are their IDs. It would also return to you the line/column information for that test. Then you would be able to ask it to run specific tests by ID etc. This could also be used as the basis for a general interactive test runner UI.

I think it gets weird still when you start talking about the IDE experience and what happens when somebody is making edits to a file that they haven't saved yet though. But that is probably weird today as well? Maybe we could actually make that situation better with this model and have it use the unsaved content of the file for the test or something.

DanTup · 2021-09-01T17:04:39Z

Having some sort of server would be nice, though sending unsaved files to it starts to feel like it's overlapping somewhat with the analysis server.

I think it gets weird still when you start talking about the IDE experience and what happens when somebody is making edits to a file that they haven't saved yet though. But that is probably weird today as well?

For things like CodeLens, it works suprisingly well. As you type, the analysis server is feeding us new Outlines for the file. When you click the Run/Debug link, we find the Outline node that corresponds to the location in the file (which is based on the latest info from the analysis server, so didn't need to have been saved) and extracts the test name from that. This means you can rename a test and click the Run CodeLens and it "just works" (VS Code automatically saves the file as the debug session starts, so we do also run the latest version of the code).

jakemac53 · 2021-09-01T18:12:19Z

Having some sort of server would be nice, though sending unsaved files to it starts to feel like it's overlapping somewhat with the analysis server.

Ya, I think the problem is that the analysis server is only doing a rough approximation of where tests are (or maybe that is done on the vscode side, I don't know). It isn't executing the code for the test and so all these weird scenarios exist where it doesn't actually know enough information.

Whatever code that does exist for this today could be removed.

This means you can rename a test and click the Run CodeLens and it "just works" (VS Code automatically saves the file as the debug session starts, so we do also run the latest version of the code).

Ah interesting so it saves right before debugging automatically, that makes sense. The same approach should generally work here then. We would want to ensure that test IDs are stable (given the same input file), but that should be doable.

This actually makes me think of another, possibly terrible idea though. We could just make the IDs be something that you can compute based on the offset in the file that the test call appears (it could just be the literal character offset even). And then allow you to run tests based on that "ID".

DanTup · 2021-09-02T10:37:13Z

Ya, I think the problem is that the analysis server is only doing a rough approximation of where tests are (or maybe that is done on the vscode side, I don't know). It isn't executing the code for the test and so all these weird scenarios exist where it doesn't actually know enough information.

It's a bit of both. The analysis server detects the group/test calls and marks Outline nodes in the files outline as tests/groups. VS Code then tries to extract the name from the text (it's the raw string with quotes etc. like 'foo' that we get from the outline). It is all static with no execution.

Right now, we only use that information once at the start. As soon as you run a test file, we switch to using the latest results from the run and never consult the outline again. However, this means we never update if you rename tests (until you re-run the whole file, where we'll prune stale results). It would be nice if we could get somewhere inbetween, where you can run a test, then rename it, and the test tree updates, and even the results from the previous run remain on that newly-named node. I have some ideas about that (that don't require changes), but haven't them out yet (but once I've fully migrated to the VS Code test runner might be a good time to do so).

This actually makes me think of another, possibly terrible idea though. We could just make the IDs be something that you can compute based on the offset in the file that the test call appears (it could just be the literal character offset even). And then allow you to run tests based on that "ID".

In some cases I think this would work - though perhaps taking the name and the offset (or just line number) would be a little better and just pick the closest one. Where it may fall down is if you make changes to the file then try to re-run from the test explorer (since right now the offset the test explorer knows about was from the last run, and not where the test necessarily is now).

jakemac53 · 2021-09-14T17:58:00Z

The functionality for this is going to ship, but given it is a breaking change we aren't going to swap the default right now.

Users can enable it in their dart_test.yaml though, and we can easily flip the default when ready to do a breaking release.

DanTup · 2021-09-16T17:07:33Z

I've been working on migrating to the VS Code test runner, and it turns out supporting these duplicate tests is now quite complicated. VS Code requires a unique ID for each test, but it must be stable across runs (otherwise the history will be lost on each run). The natural thing is to use the test name, but it means if you run two tests with the same name, they'll both write results to the same node in the tree.

I got around this with the custom runner by marking a tree node as "allocated" within each run and if I got a test result that matched a nodes name, but that node was already allocated to a previous test result, it would look for another (or create one). It also used the line number to try to pick the closest node by line number when allocated (so if you ran only the second test, it would attach to the second node in the tree, not the first).

I don't think it's going to be possible to do this reliably anymore, so think I'll probably just leave it as-is (that is - only one node will appear in the tree and will have results for both - you'll still be able to run tests and see both sets of results). I'll still produce a warning if multiple tests ran when only one was expected (our previous workaround for not having this feature) so it's clear to the users why they aren't seeing two nodes, and if this feature ships on by default in future, I may remove that to let it be handled by this.

jakemac53 mentioned this issue Sep 9, 2021

disallow duplicate group/test names within a suite #1573

Merged

jakemac53 added the next-breaking-release Issues that are worth doing but need to wait for a breaking version bump label Sep 14, 2021

DanTup mentioned this issue Oct 13, 2021

Add a more precise way to start multiple specific tests at the same time #1598

Closed

natebosch mentioned this issue Feb 3, 2023

how to express that a large number of tests are expected to fail? #1832

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Produce a warning if multiple tests run with the exact same group/test name #1571

Produce a warning if multiple tests run with the exact same group/test name #1571

DanTup commented Aug 31, 2021

jakemac53 commented Aug 31, 2021 •

edited

Loading

jakemac53 commented Aug 31, 2021

DanTup commented Aug 31, 2021

jakemac53 commented Aug 31, 2021

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jacob314 commented Sep 1, 2021

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

DanTup commented Sep 1, 2021

jacob314 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021 •

edited

Loading

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021 •

edited

Loading

DanTup commented Sep 2, 2021

jakemac53 commented Sep 14, 2021

DanTup commented Sep 16, 2021

Produce a warning if multiple tests run with the exact same group/test name #1571

Produce a warning if multiple tests run with the exact same group/test name #1571

Comments

DanTup commented Aug 31, 2021

jakemac53 commented Aug 31, 2021 • edited Loading

jakemac53 commented Aug 31, 2021

DanTup commented Aug 31, 2021

jakemac53 commented Aug 31, 2021

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jacob314 commented Sep 1, 2021

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

DanTup commented Sep 1, 2021

jacob314 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021

jakemac53 commented Sep 1, 2021 • edited Loading

DanTup commented Sep 1, 2021

jakemac53 commented Sep 1, 2021 • edited Loading

DanTup commented Sep 2, 2021

jakemac53 commented Sep 14, 2021

DanTup commented Sep 16, 2021

jakemac53 commented Aug 31, 2021 •

edited

Loading

jakemac53 commented Sep 1, 2021 •

edited

Loading

jakemac53 commented Sep 1, 2021 •

edited

Loading