Smarter versioned tests #1744

munificent · 2025-07-17T22:53:53Z

Redo how versioned tests work.

The README.md diff and the doc comments in the changed .dart files should explain what's going on here. The basic idea is that a test is now either completely unversioned with only a single output, or completely versioned, where every output section has a version number.

I think that's easier to read than a test like:

<<<
some.code();
>>> 3.8
some.code();
>>>
some.code();

And you have to try to figure out what versions the last section applies to.
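As a rough illustration of the all-or-nothing rule, here is a minimal sketch in Python (not the Dart code in this PR). The header pattern and the function names `output_versions` and `is_well_formed` are assumptions made up for this example; the real parser in lib/src/testing/test_file.dart may work differently.

```python
import re

# Hypothetical pattern for an output header such as ">>>", ">>> 3.8", or
# ">>> 3.8 Some description.". This is an assumption, not the real parser.
_OUTPUT_HEADER = re.compile(r"^>>>(?: (\d+\.\d+))?(?: .*)?$")


def output_versions(test_text):
    """Return the version string (or None) of each ">>>" section, in order."""
    versions = []
    for line in test_text.splitlines():
        match = _OUTPUT_HEADER.match(line)
        if match:
            versions.append(match.group(1))
    return versions


def is_well_formed(test_text):
    """True if the test is fully unversioned or fully versioned."""
    versions = output_versions(test_text)
    if len(versions) == 1 and versions[0] is None:
        return True  # Unversioned: exactly one output, no version number.
    # Versioned: at least one output, and every output has a version.
    return bool(versions) and all(v is not None for v in versions)
```

Under these assumptions, the mixed example above (one `>>> 3.8` section followed by a bare `>>>` section) is rejected, while a test with a single bare `>>>` or with only versioned sections is accepted.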

This commit also updates the tests to match the new style. This just means adding 3.8 to a bunch of output sections where that was previously implicit. The set of tested versions and their expectations are the same.

I also updated the test updater to support the new output styles.

This change also speeds up the formatter tests by skipping language versions where the output is the same as at a previous and a subsequent version. Prior to this change, running tall_format_test.dart ran 8,674 tests. If you bump the highest supported language version from 3.9 to 3.10 (which I'm about to do for dot shorthands), that jumps to 11,556 because it runs every test at every language version.

With this change, it runs 5,940 tests whether the maximum supported version is 3.9 or 3.10. The number of tests being run only increases when there are actual new tests or new version-specific expectations.
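One way to read the skipping rule is sketched below in Python. This is an interpretation, not the logic in this PR: it assumes that each output section applies from its version until the next section's version, and that a language version is skipped only when both its neighbors expect the same output. The names `expected_at` and `versions_to_run` are hypothetical.

```python
def expected_at(outputs, version):
    """Expected output at `version`, or None if no section applies yet.

    `outputs` maps (major, minor) tuples to expected text. A section is
    assumed to apply from its version up to the next section's version.
    """
    applicable = [v for v in outputs if v <= version]
    return outputs[max(applicable)] if applicable else None


def versions_to_run(outputs, supported):
    """Pick the supported versions worth running for one test."""
    supported = sorted(supported)
    expected = [expected_at(outputs, v) for v in supported]
    selected = []
    for i, version in enumerate(supported):
        if expected[i] is None:
            continue  # Feature not supported at this version: don't run.
        same_as_prev = i > 0 and expected[i - 1] == expected[i]
        same_as_next = i + 1 < len(supported) and expected[i + 1] == expected[i]
        if same_as_prev and same_as_next:
            continue  # Redundant: both neighbors expect the same output.
        selected.append(version)
    return selected
```

With a single `3.8` section, this sketch selects two versions whether the supported range ends at 3.9 or at 3.10, which matches the observation that the test count stays flat when only the maximum version is bumped.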

When I first added support for version-specific tests, I had an idea that there would be a "default" way to format a given test and then one or more degenerate "old" ways to format it. Concretely, the 3.7 style would be the old way and 3.8 (and later) the new way.

I thought it would be easier to read if the default newest style was closest to the input, so the tests were ordered like:

```
<<<
foo.bar();
>>> No version default latest style here.
foo.bar();
>>> 3.7 Older worse style here.
foo.bar();
```

This works OK when there are only two outputs, a single old degenerate style and the default style. But there are an increasing number of tests for new language features (null-aware elements, dot shorthands). Those tests shouldn't be run at all on certain older versions. And at some point, I'm sure we will end up making a further style tweak that causes some test to have multiple older styles.

At that point, I think it's clearer if the test outputs are in chronological/version order, like:

```
<<<
foo.bar();
>>> 3.8 First version at which this feature is supported.
foo.bar();
>>> 3.9 Later style tweak.
foo.bar();
>>> 3.10 Another style tweak.
foo.bar();
```

Handling multiple versions like this will require further changes to the test runner. This commit only reorders the test outputs in version order to minimize diff clutter. There are no meaningful changes here, it's just the output of a hacked version of the test_updater which normalizes the order.
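Putting outputs in version order requires comparing versions numerically, since a plain string sort would place `3.10` before `3.8`. The Python sketch below shows one way to do that. The helper names `parse_version` and `in_version_order` are hypothetical and are not taken from the test_updater.

```python
def parse_version(text):
    """Turn a "major.minor" string like "3.10" into a comparable tuple."""
    major, minor = text.split(".")
    return int(major), int(minor)


def in_version_order(sections):
    """Sort (version_string, output_text) pairs from oldest to newest."""
    return sorted(sections, key=lambda section: parse_version(section[0]))
```

For example, sections tagged `3.10`, `3.8`, and `3.9` come back as `3.8`, `3.9`, `3.10`.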

See the previous commit for some more context. This commit has the substantive changes.

munificent · 2025-07-17T22:55:08Z

test/tall/statement/switch_legacy.stmt

This test was deleted because it doesn't actually make sense.

It was testing how the tall style formatter would format a switch statement at language version 2.19. But the tall style formatter can't be used at any language version older than 3.7, so there's no way for a user to access this.

munificent · 2025-07-17T23:02:18Z

Oops, don't review this just yet. Let me fix the short style test failure.

munificent · 2025-07-18T00:16:15Z

Nate beat me to it, but to be clear this is ready to review now. :)

munificent added 2 commits July 17, 2025 15:43

munificent requested review from kallentu and natebosch July 17, 2025 22:53

munificent added 2 commits July 17, 2025 16:29

Fix example script and short style test.

9945836

Remove unused code.

e2c5784

natebosch approved these changes Jul 17, 2025

Apply review feedback.

ad0280f

munificent merged commit 6c4166a into main Jul 18, 2025
7 checks passed

munificent deleted the smarter-versioned-tests branch July 18, 2025 00:19
