Optimize CockroachDB setup for tests (96% reduction)#493
Conversation
|
Life-changing! Tried it myself and got this on build, which I don't get on |
|
Ah good, at least the macOS CI build can reproduce that. |
|
Confirmed that fixed it. # before
cargo test --package omicron-nexus --test test_console_api -- test_sessions
9.48s user 2.08s system 31% cpu 36.163 total
# after
cargo test --package omicron-nexus --test test_console_api -- test_sessions
1.71s user 0.58s system 53% cpu 4.303 totalThis is the best day of my life. |
| "common", | ||
| "nexus", | ||
| "nexus/src/db/db-macros", | ||
| "nexus/test-utils", |
There was a problem hiding this comment.
Note: this has been removed from the default members, but we're definitely still using / testing it.
This just means that cargo build doesn't, by default, build a "test-only" helper crate - doing so would also require "cockroachdb" to be on the PATH earlier for some of our checks.
Note: This still gets built with cargo build --all-targets and cargo test, so no worries on coverage.
| @@ -14,6 +14,18 @@ set -o xtrace | |||
| cargo --version | |||
There was a problem hiding this comment.
The change here - and in the ".github/workflows/rust.yml" file - just make sure cockroach is downloaded and on the PATH before we cargo build --all-targets, since we're now using it in a build script.
There was a problem hiding this comment.
Could you update the README to reflect this? Basically this part:
CockroachDB v21.1.10.
The test suite expects to be able to start a single-node CockroachDB cluster using the cockroach executable on your PATH.
I think that should now say "the build and test suite expect..."
davepacheco
left a comment
There was a problem hiding this comment.
Very nice improvement!
test-utils/src/dev/mod.rs
Outdated
| /// By creating a "seed" version of the database, we can cut down | ||
| /// on the time spent performing this operation. Instead, we opt | ||
| /// to copy the database from this seed location. | ||
| pub const SEED_DB_DIR: &str = "/tmp/crdb-base"; |
There was a problem hiding this comment.
It'd be good to make this use one of the crates that finds a temp dir for you. I think elsewhere we do this, and it winds up using $TMPDIR from the environment in at least some cases.
Context: /tmp on Helios is generally tmpfs, which comes from swap. This uses up physical memory. We already consume a lot of that during builds and tests and it's led to exhaustion even on decently large systems. There's no need for this to be in-memory. Plus, allowing people to override this in a standard way allows folks to put this on a ZFS dataset that they've configured differently.
There was a problem hiding this comment.
Updated to use https://doc.rust-lang.org/std/env/fn.temp_dir.html
| fn copy_dir( | ||
| src: impl AsRef<Path>, | ||
| dst: impl AsRef<Path>, | ||
| ) -> std::io::Result<()> { |
There was a problem hiding this comment.
I'm surprised there's not a standard crate for doing this. There are a lot of edge cases, which I guess we haven't run into in this case? (what if there's a named pipe here or a symlink?)
Also, this is all synchronous, even though the caller is async. I guess that's okay here?
There was a problem hiding this comment.
Yeah, I looked around, and couldn't find anything much better than doing this myself.
I can make it async, but given the "test setup" context, I'm not sure how worthwhile this is to optimize for concurrency?
| @@ -14,6 +14,18 @@ set -o xtrace | |||
| cargo --version | |||
There was a problem hiding this comment.
Could you update the README to reflect this? Basically this part:
CockroachDB v21.1.10.
The test suite expects to be able to start a single-node CockroachDB cluster using the cockroach executable on your PATH.
I think that should now say "the build and test suite expect..."
Optimizes setup time for CRDB-based tests by seeding + copying a seeded database, rather than populating the database anew during each test. This significantly cuts down on the setup time for any test which wants to use even a simple, single-node Cockroach instance.
Comparing the benchmark originally added in #492:
Before:
After: