@beautifulentropy beautifulentropy commented Nov 5, 2025

DDL (Data Definition Language) operations such as ALTER and TRUNCATE are extremely expensive in Vitess. Each time Vitess detects a DDL statement, it triggers a full schema reload, rebuilding the in-memory schema models that are used for query planning and routing.

Our Storage Authority, Registration Authority, bad-key-revoker (cmd), and cert-checker (cmd) unit tests repeatedly execute ALTER (and some TRUNCATE) statements during routine cleanup of every table in the Boulder database at the end of each test case. As a result, these tests run dramatically slower on Vitess.

The good news is that we can omit these executions entirely by fixing our unit tests so they no longer make implicit assumptions about the Registration ID of the account under test.

  • Remove ALTER TABLE ... AUTO_INCREMENT = 1 from cleanup to avoid unnecessary DDL
  • Replace per-table SET FOREIGN_KEY_CHECKS toggles with a single pair of toggles at the beginning and end of the transaction. This reduces 20 tables x 2 statements to just 2 statements.
  • Replace TRUNCATE TABLE with DELETE FROM ... WHERE 1 = 1 to avoid unnecessary DDL
  • Remove implicit dependencies on RegID = 1 (or 2 or 3 or 4) in ra, sa, and cmd/bad-key-revoker unit tests
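
As a rough illustration of the cleanup shape described in the list above, the following Go sketch builds the statements for a single cleanup transaction: one FOREIGN_KEY_CHECKS toggle at each end and a DELETE per table, with no ALTER or TRUNCATE. The function name `cleanupStatements` and the table names are hypothetical and are not taken from the Boulder code.

```go
package main

import (
	"fmt"
)

// cleanupStatements returns the SQL to run, in order, inside one cleanup
// transaction. It is a hypothetical helper for illustration only.
func cleanupStatements(tables []string) []string {
	stmts := make([]string, 0, len(tables)+2)
	// Disable foreign key checks once, instead of once per table.
	stmts = append(stmts, "SET FOREIGN_KEY_CHECKS = 0")
	for _, table := range tables {
		// DELETE is DML, so it avoids the DDL cost of TRUNCATE.
		stmts = append(stmts, fmt.Sprintf("DELETE FROM `%s` WHERE 1 = 1", table))
	}
	// Re-enable foreign key checks once at the end.
	stmts = append(stmts, "SET FOREIGN_KEY_CHECKS = 1")
	return stmts
}

func main() {
	// Table names here are illustrative.
	for _, stmt := range cleanupStatements([]string{"registrations", "certificates"}) {
		fmt.Println(stmt + ";")
	}
}
```

Because AUTO_INCREMENT is no longer reset between test cases, a test written against this kind of cleanup should use the registration ID returned when the account is created rather than assuming a fixed value.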

The results on both MariaDB and Vitess are extremely good: the SA and RA unit tests run roughly 3x to 4x faster on MariaDB and 16x to 19x faster on Vitess.

Local unit test times with ProxySQL + MariaDB:
SA: 7.1s to 2.5s
RA: 4.5s to 1.1s

Local unit test times with Vitess + MySQL 8.0:
SA: 43s to 2.7s
RA: 28s to 1.5s

Part of #7736

@beautifulentropy beautifulentropy marked this pull request as ready for review November 5, 2025 22:28
@beautifulentropy beautifulentropy requested a review from a team as a code owner November 5, 2025 22:28

github-actions bot commented Nov 5, 2025

@beautifulentropy, this PR appears to contain configuration and/or SQL schema changes. Please ensure that a corresponding deployment ticket has been filed with the new values.

@jprenken jprenken requested review from a team and aarongable and removed request for a team November 6, 2025 03:11
@beautifulentropy beautifulentropy merged commit 76279b0 into main Nov 6, 2025
15 checks passed
@beautifulentropy beautifulentropy deleted the no-ddl-in-units branch November 6, 2025 15:19
aarongable pushed a commit that referenced this pull request Dec 8, 2025
The original plan for getting the Vitess infrastructure running was to
use
[vttestserver](https://vitess.io/docs/22.0/reference/programs/vttestserver)
as a starting point to reach a minimum viable setup. However,
vttestserver didn’t work out because some of its defaults conflicted
with how we clean up rows and the level of resources (threads) we need.

Fortunately, vttestserver is just a wrapper around
[vtcombo](https://vitess.io/docs/21.0/reference/programs/vtcombo) that
generates a [vttest
protobuf](https://github.com/vitessio/vitess/blob/v22.0.1/proto/vttest.proto)
describing the configuration for an in-memory topology server started by
vtcombo, encoded in JSON. By modifying vttestserver’s
[run.sh](https://github.com/vitessio/vitess/blob/v22.0.1/docker/vttestserver/run.sh),
we're able to interact with vtcombo directly, passing the JSON
configuration along with other vttestserver defaults reverse-engineered
from run.sh and
[vtprocess.go](https://github.com/vitessio/vitess/blob/v22.0.1/go/vt/vttest/vtprocess.go).
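
To make the JSON-encoded topology more concrete, here is a minimal Go sketch that emits a small topology document. The struct fields (`keyspaces`, `shards`, `name`, `cells`) are my assumption about a subset of the vttest topology message. The cell name, keyspace name, and the `topologyJSON` helper are hypothetical, and the configuration this commit actually passes to vtcombo may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shard, keyspace, and topology mirror an assumed subset of the vttest
// topology message. They are illustrative, not generated protobuf types.
type shard struct {
	Name string `json:"name"`
}

type keyspace struct {
	Name   string  `json:"name"`
	Shards []shard `json:"shards"`
}

type topology struct {
	Keyspaces []keyspace `json:"keyspaces"`
	Cells     []string   `json:"cells"`
}

// topologyJSON builds an unsharded topology (a single shard named "0" per
// keyspace) and returns it encoded as JSON.
func topologyJSON(cell string, keyspaces ...string) (string, error) {
	topo := topology{Cells: []string{cell}}
	for _, name := range keyspaces {
		topo.Keyspaces = append(topo.Keyspaces, keyspace{
			Name:   name,
			Shards: []shard{{Name: "0"}},
		})
	}
	out, err := json.Marshal(topo)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// "test" and "example_keyspace" are placeholder values.
	out, err := topologyJSON("test", "example_keyspace")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```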

Vitess doesn’t provide a `vtcombo` image, so we must build our own. Build
and upload a [boulder-vtcomboserver
image](https://hub.docker.com/repository/docker/letsencrypt/boulder-vtcomboserver)
on top of Docker's official MySQL 8.4 image, which provides native arm64
support. The accompanying tag-and-upload shell script defaults to amd64
for CI.

As an aside, Vitess’s official Dockerfiles are only published for amd64,
and modifying them to build for arm64 would prove difficult because
Oracle doesn’t publish MySQL arm64 binaries in its [Debian apt
repository](https://repo.mysql.com/apt/debian/pool/mysql-8.0/m/mysql-community).

With boulder-vtcomboserver up and running I was able to find/validate
the following issues and provide workarounds:

- **Problem:** sql-migrate, the tool we use to apply database migrations,
must be configured to talk to MariaDB and to MySQL through Vitess
(vtgate + vttablet).
**Solution:** Create two new dbconfig YAML files (mariadb and vitess)
and use `test/entrypoint.sh` to set the appropriate file for
`sql-migrate` (`test/create_db.sh`) to use. Also, symlink each of these
two new files from db to db-next just like the old dbconfig.yml file.

- **Problem:** Vitess does not allow database `CREATE` statements and
any DDL containing them will be rejected by vtgate.
**Solution:** These databases are already created by vtcombo since
they’re defined as KEYSPACES. Skip database creation in
`test/create_db.sh`.

- **Problem:** Vitess does not allow user creation or grants (`CREATE
USER`, `GRANT`), and any DDL containing these commands will be blocked
by vtgate.
**Solution:** Skip user creation and grant steps in `test/create_db.sh`.
Set `%` for `--vschema_ddl_authorized_users` as vttestserver does, and
revisit this later for a more complete approach.

- **Problem:** The vttablet default for the maximum number of rows returned
from a (non-streaming) query (10,000) is too low for Boulder’s needs, so
vttablet rejects some of Boulder’s queries.
**Solution:** Increase `--queryserver-config-max-result-size` to
1,000,000 and `--queryserver-config-warn-result-size` to 1,000,000.

- **Problem:** The vttablet defaults for connection pool size (16) and
maximum number of concurrent transactions (20) are too low for Boulder’s
needs, causing queries to fail because vttablet is overloaded.
**Solution:** Increase `--queryserver-config-pool-size` to 64 and
`--queryserver-config-transaction-cap` to 80.
  
- **Problem:** Vitess does not allow `TRIGGER` statements and any DDL
containing them will be rejected by vtgate. Without TRIGGER statements
TestIssuanceCertStorageFailed, an integration test, will fail.
**Solution:** Run these TRIGGER statements in an entrypoint script,
`test/vtcomboserver/install_trigger.sh`, bypassing vtgate entirely.
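
The flag overrides named in the solutions above can be collected in one place. The Go sketch below only assembles the flags and values stated in this message. The helper name `vtcomboOverrides` is hypothetical, the `--flag=value` spelling is an assumption, and a real vtcombo invocation needs additional arguments that are not shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// vtcomboOverrides returns the non-default flags discussed above. It is a
// hypothetical helper, not part of Boulder or Vitess.
func vtcomboOverrides() []string {
	return []string{
		// Allow larger non-streaming result sets than the 10,000-row default.
		"--queryserver-config-max-result-size=1000000",
		"--queryserver-config-warn-result-size=1000000",
		// Raise the connection pool size (default 16) and the
		// concurrent transaction cap (default 20).
		"--queryserver-config-pool-size=64",
		"--queryserver-config-transaction-cap=80",
		// Authorize all users for vschema DDL, as vttestserver does.
		"--vschema_ddl_authorized_users=%",
	}
}

func main() {
	// Print the overrides as they might appear on a command line.
	fmt.Println("vtcombo " + strings.Join(vtcomboOverrides(), " "))
}
```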

Depends on #8479
Depends on #8489
Depends on #8490
Depends on #8494
Fixes #7736