cargo test: option to run all tests under the same binary #13450

Open
Astlaan opened this issue Feb 16, 2024 · 5 comments
Labels
A-cargo-targets Area: selection and definition of targets (lib, bins, examples, tests, benches) C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` Command-test S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted.

Comments

@Astlaan

Astlaan commented Feb 16, 2024

Problem

Currently, if a crate has multiple test files (each containing multiple test functions) under its tests folder, cargo test generates a separate binary for each test file. For example:

crate_directory
|_ src/
|_ tests/
    |_ test_file1.rs
    |_ test_file2.rs

cargo test will generate a binary for test_file1.rs and another one for test_file2.rs.

This article reports some of the issues with this approach, especially for larger projects/projects with some slow tests:

  1. Increased linking time: rustc will need to link the library for each of these binaries, potentially increasing compilation times.
  2. Critical path execution bottleneck: The binaries are run sequentially (even if the tests inside each one are multithreaded). If a binary contains many tests and one of them is abnormally slow, then even after all the quicker tests have finished we have to wait for the slow one before the tests in the next binary can start. If this repeats across several binaries, it can add up to a lot of wasted time in which most of the CPU sits idle, just waiting for the slow test to finish.
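
To make the critical-path effect in item 2 concrete, here is a rough sketch in Rust. It is a toy model and not how cargo or libtest actually schedule tests: it assumes each test has a known duration, that a binary hands the next test to whichever worker thread frees up first, and that binaries run strictly one after another. The function names (makespan, split_binaries_time, unified_time) and all durations are made up for illustration.

```rust
/// Toy model: wall-clock time for one binary whose tests are handed, in
/// order, to whichever of `workers` threads becomes free first.
fn makespan(durations: &[u64], workers: usize) -> u64 {
    // `free_at[i]` is the time at which worker `i` finishes its current test.
    let mut free_at = vec![0u64; workers.max(1)];
    for &d in durations {
        // Pick the worker that frees up first (lowest index on ties).
        let idx = (0..free_at.len())
            .min_by_key(|&i| free_at[i])
            .expect("at least one worker");
        free_at[idx] += d;
    }
    free_at.into_iter().max().unwrap_or(0)
}

/// One binary per test file, run one after another (assumed model of the
/// behavior described in this issue).
fn split_binaries_time(binaries: &[Vec<u64>], workers: usize) -> u64 {
    binaries.iter().map(|tests| makespan(tests, workers)).sum()
}

/// All tests compiled into a single binary and scheduled together.
fn unified_time(binaries: &[Vec<u64>], workers: usize) -> u64 {
    let all: Vec<u64> = binaries.iter().flatten().copied().collect();
    makespan(&all, workers)
}
```

With two binaries that each hold one 10-unit test and three 1-unit tests, and 4 worker threads, this model gives 20 time units for split binaries versus 11 for a single binary, because the quick tests of the second file no longer wait behind the slow test of the first.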

This can also have some other complications:

3.1) Tests cannot share expensive initialization steps: If many tests, even ones split across different files, require a common expensive initialization step, this is easy to handle when everything is in the same binary/process: we can execute the computationally expensive step once, store the result with something like lazy_static, and then load it in the tests. This way the expensive step is calculated only once and used many times. Of course, this is only possible if all the tests are in the same binary. If they are split across different binaries, the expensive step has to run once per binary, wasting time (considering that binaries are run sequentially).

3.2) System resources have to be shared among parallel tests. This is not an issue in cargo test but is, for example, in cargo nextest, since it does not use the threading model. If some tests require a step that is not computationally expensive but uses a large fraction of the RAM (loading a very big matrix, for example), this limits parallelization: the large amount of data has to be loaded in every binary, which limits the number of binaries that can run concurrently.
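
A minimal sketch of the shared-initialization pattern from 3.1, assuming all tests end up in one binary. It uses std::sync::OnceLock from the standard library in place of the lazy_static crate mentioned above; the fixture contents, the INIT_COUNT counter, and the test names are hypothetical and only there to show that the expensive step runs once per process.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how many times the expensive step actually ran (illustration only).
static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

// Process-wide cache: filled by the first test that asks for the fixture.
static FIXTURE: OnceLock<Vec<u64>> = OnceLock::new();

/// Stand-in for an expensive initialization step shared by many tests.
fn expensive_fixture() -> &'static Vec<u64> {
    FIXTURE.get_or_init(|| {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        (0..1_000).map(|n| n * n).collect()
    })
}

// Would live in test_file1.rs in the layouts discussed in this issue.
#[test]
fn file1_reads_fixture() {
    assert_eq!(expensive_fixture()[3], 9);
}

// Would live in test_file2.rs; reuses the value computed above, but only
// when both tests are linked into the same binary.
#[test]
fn file2_reads_fixture() {
    assert_eq!(expensive_fixture().len(), 1_000);
}
```

If the two tests were compiled into separate binaries instead, each process would have its own FIXTURE and the closure would run once per binary.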

Proposed Solution

One way to currently solve this is to structure the tests according to the following:

crate_directory
|_ src/
|_ tests/
    |_ test_files/
        |_ main.rs (with as many lines as files, "mod test_file1;",  "mod test_file2;", etc)
        |_ test_file1.rs
        |_ test_file2.rs

However, I suggest that cargo test could allow compiling all the tests to the same binary even if we use the original organization:

crate_directory
|_ src/
|_ tests/
    |_ test_file1.rs
    |_ test_file2.rs

Hypothetical solutions:
a) Change the behaviour of cargo test so that everything is run under the same binary by default. I would imagine, however, that this would raise backward compatibility issues, especially for multithreaded apps using unsafe code?
b) Simply add a flag that would enable this.

Note: cargo nextest, with its parallelization model using jobs, is able to solve issue (2). However, using processes instead of threads has the disadvantage of potentially making issues (3.1) and (3.2) worse.

Notes

No response

@Astlaan Astlaan added C-feature-request Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted` S-triage Status: This issue is waiting on initial triage. labels Feb 16, 2024
@Astlaan Astlaan changed the title cargo test: option to run all tests in the same binary cargo test: option to run all tests under the same binary Feb 16, 2024
@epage
Contributor

epage commented Feb 16, 2024

Note that with #5609, we plan to run test binaries in parallel. We are actively working towards this with the T-testing-devex team. The first step is json output for libtest for cargo to read and process.

Reducing link times would be great though.

Ways to opt-in

  • Edition
  • New autotests value
  • New test target value (like the proposal for opting custom test harnesses into the json output for parallel test running)

My main question is how to set up cargo/rustc to enable this, both with and without harness = false.

@epage epage added S-needs-design Status: Needs someone to work further on the design for the feature or fix. NOT YET accepted. and removed S-triage Status: This issue is waiting on initial triage. labels Feb 16, 2024
@Astlaan
Author

Astlaan commented Feb 16, 2024

@epage

Hi :)

The link you shared is about multi-crate testing though, rather than a single crate? Or are you planning on running binaries in the same crate in parallel too?
In any case, this would alleviate (2), but not the other issues, I think? And you would manage the number of running threads across all binaries?

Anyway, I think it would be nice if running everything in the same binary were facilitated, since single-process multithreading (for the tests in a crate) solves all 4 listed issues.

@epage
Contributor

epage commented Feb 16, 2024

> The link you shared is for a multicrate testing though, instead of single crate? Or are you planning on running binaries in the same crate in parallel too?
> In any case,

I see that issue as being about running all selected test targets in parallel, whether they belong to the same package or not. There is another issue about running the available test binaries in parallel with compilation.

My plan is not to run a process per test, unlike nextest, because that runs into shared state issues.

> This would alleviate (2), but not the other issues I think?

Correct, this does not help with reducing link time.

> And you would manage the number of running threads across all binaries?

My assumption is that we'll use jobserver to limit the number of threads across all processes.

@epage
Contributor

epage commented Feb 21, 2024

Had some more thoughts on reducing the link time.

A very rough sketch:

package.autotests and package.autobins gain two new values

  • split
  • unified

(names to be bikeshedded)

split is the current behavior and true will map to split for the current Edition.

unified will

  • enumerate test files / directories
  • strip those out that have an explicit target
  • create an implicit build target called tests (so cargo test --test tests shows this is redundant with cargo test --tests)
    • This will error if the user has an explicit tests target but I don't see a way to avoid this without a confusing fallback scheme
  • pass these with a new rustc --mod <mod>=<path> command-line flag with an empty string being piped in for the crate root.
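
The steps listed for unified can be sketched roughly as follows. This is only an illustration of the idea, not cargo code: the UnifiedPlan type and plan_unified_tests function are hypothetical, and --mod is the proposed rustc flag from this comment, which does not exist today. The sketch assumes discovery has already produced (module name, path) pairs and does not model piping an empty crate root over stdin.

```rust
use std::collections::BTreeSet;

/// Hypothetical description of the implicit unified target.
#[derive(Debug, PartialEq)]
struct UnifiedPlan {
    target_name: String,
    rustc_args: Vec<String>,
}

/// `discovered`: (module name, path) pairs enumerated under `tests/`.
/// `explicit`: names that already have an explicit test target.
fn plan_unified_tests(
    discovered: &[(&str, &str)],
    explicit: &[&str],
) -> Result<UnifiedPlan, String> {
    // An explicit target named `tests` would clash with the implicit one.
    if explicit.contains(&"tests") {
        return Err("explicit `tests` target conflicts with the implicit one".to_string());
    }
    let explicit: BTreeSet<&str> = explicit.iter().copied().collect();

    // Strip files that have an explicit target, then emit one proposed
    // `--mod <mod>=<path>` pair for each remaining file.
    let rustc_args: Vec<String> = discovered
        .iter()
        .filter(|(name, _)| !explicit.contains(name))
        .flat_map(|(name, path)| ["--mod".to_string(), format!("{name}={path}")])
        .collect();

    Ok(UnifiedPlan {
        target_name: "tests".to_string(),
        rustc_args,
    })
}
```

For example, with test_file1.rs and test_file2.rs discovered and test_file2 declared as an explicit target, the plan would contain a single --mod test_file1=tests/test_file1.rs pair under the implicit target tests.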

This still leaves out custom test harnesses. Our working approach to testing is that libtest is effectively frozen and we need to make custom test harnesses a first class experience to allow evolution outside of libtest.

One approach is if we move forward with a target field to specify main from a dep. In this case, the above scheme works.

If we do not have a way to delegate main, we'd likely want to infer tests/main.rs is the main and pass --mods to that (instead of using stdin with an empty string). The problem with this approach is that this will be inconsistent with further nested modules and with other target types. To make this consistent would be re-hashing the mod system discussions from the 2018 Edition.

We likely will need to defer the custom test harness discussion to have a better idea of what other pieces may be available (e.g. delegated main)

Benefits

  • Decrease test suite compile time by 3x for cargo itself (source)
  • Decrease on-disk artifacts by 5x for cargo itself (source)
  • Faster test times as more tests run in parallel, much like cargo nextest but without the orchestration complexity

In future editions we could migrate package.auto* = true to unified (details to be worked out)

Questions for what the long-term meaning of true should be for autotests / autobenches:

  • Are there sufficient end-user benefits outside of reduced link times / target/ size?
  • Would a reduction in link times (e.g. changing the default linker, whether lld, mold, or wild) change the answer to what the default should be? If so, where is the line?

Open questions

  • Is --mod <mod>=<path> the right approach?

Prior discussions and related links

@epage epage added the A-cargo-targets Area: selection and definition of targets (lib, bins, examples, tests, benches) label Feb 21, 2024
@epage
Contributor

epage commented Feb 21, 2024

Asked T-compiler for input on --mod at https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Unified.20test.20binaries/near/422637295
