Add dbtool fold: emit self-contained baseline from registered migrations #481

Open
christianparpart wants to merge 1 commit into master from feature/dbtool-fold

Conversation

@christianparpart
Member

Adds a new offline dbtool fold subcommand that walks all registered migrations and emits a single self-contained baseline — either a .cpp migration plugin or a .sql script — that reproduces the post-migration schema and schema_migrations rows from an empty database. Useful for collapsing a long migration history into a fast-to-apply starting point, or for shipping a snapshot baseline alongside a release.

The command is purely offline: it never opens a DB connection, never queries a live schema. It loads plugins, walks each migration's Up() plan in timestamp order, folds the cumulative effect into a per-table view + chronological data steps, and emits via the existing ToSql() formatter path so each dialect's CREATE TABLE / INSERT codegen stays the single source of truth.
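The plan walk described above can be pictured with a small standalone sketch. This is not the Lightweight implementation: `Step`, `Migration`, `FoldResult`, and `Fold` are hypothetical names, and the step kinds are reduced to create table, add column, drop table, and a data step purely for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified plan step; real plan elements are richer.
enum class StepKind { CreateTable, AddColumn, DropTable, Data };

struct Step {
    StepKind kind;
    std::string table;
    std::string payload; // column name for AddColumn, statement text for Data
};

struct Migration {
    std::uint64_t timestamp;
    std::vector<Step> up;
};

struct FoldResult {
    std::map<std::string, std::vector<std::string>> columnsByTable; // final shape per table
    std::vector<std::string> creationOrder;      // surviving tables, in creation order
    std::vector<std::string> dataSteps;          // kept in chronological order
    std::vector<std::uint64_t> foldedTimestamps; // every migration that was folded
};

// Walks the Up() steps in timestamp order and accumulates their net effect.
// No database is involved; everything happens in memory.
FoldResult Fold(std::vector<Migration> migrations,
                std::uint64_t upToInclusive = std::numeric_limits<std::uint64_t>::max())
{
    std::stable_sort(migrations.begin(), migrations.end(),
                     [](Migration const& a, Migration const& b) { return a.timestamp < b.timestamp; });

    FoldResult result;
    for (Migration const& migration : migrations) {
        if (migration.timestamp > upToInclusive)
            break; // truncation point, comparable to --up-to
        result.foldedTimestamps.push_back(migration.timestamp);

        for (Step const& step : migration.up) {
            switch (step.kind) {
                case StepKind::CreateTable:
                    result.columnsByTable[step.table]; // start with an empty column list
                    result.creationOrder.push_back(step.table);
                    break;
                case StepKind::AddColumn:
                    result.columnsByTable[step.table].push_back(step.payload);
                    break;
                case StepKind::DropTable:
                    // A dropped table leaves no trace in the folded baseline.
                    result.columnsByTable.erase(step.table);
                    result.creationOrder.erase(
                        std::remove(result.creationOrder.begin(), result.creationOrder.end(), step.table),
                        result.creationOrder.end());
                    break;
                case StepKind::Data:
                    result.dataSteps.push_back(step.payload);
                    break;
            }
        }
    }
    return result;
}
```

In this simplified model a table that is created and later dropped disappears from the result entirely, while data steps keep their original order.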

Changes

  • New MigrationManager::FoldRegisteredMigrations(formatter, upToInclusive) primitive — pure plan-walk, returns PlanFoldingResult (per-table state, creation order, indexes, chronological data steps, in-range releases). Used by the new module and available to any future caller.
  • New Lightweight::MigrationFold library module under src/Lightweight/MigrationFold/:
    • Folder — thin facade plus ResolveUpTo() which accepts an empty string (latest registered release), a numeric timestamp, or a release version string.
    • SqlEmitter — emits a flat dialect-specific .sql script, including a CREATE TABLE schema_migrations and a stamping INSERT per folded timestamp so a freshly-loaded DB looks identical to a real apply-all run.
    • CppEmitter — emits a .cpp baseline plugin wrapped in LIGHTWEIGHT_SQL_MIGRATION, with optional --emit-cmake and --max-lines-per-file for splitting very large baselines across multiple files.
  • Shared CodeGen/SplitFileWriter helper that bin-packs blocks within a per-file line budget; used by CppEmitter and intentionally factored out for reuse.
  • dbtool fold --output FILE [--up-to X] [--dialect D] [--emit-cmake] [--plugin-name N] [--max-lines-per-file N] — output format is picked from the file extension. .sql requires --dialect (sqlite, postgres, mssql, mysql); .cpp is dialect-agnostic. Dispatched before SetupConnectionString since fold never touches a DB; uses a connection-less GetMigrationManagerOffline variant.
  • Unit tests: 10 fold cases (create + altercolumn, drop-table cleanup, chronological ordering, --up-to truncation, RawSql passthrough, column rename FK propagation, release-range filtering, ResolveUpTo parsing) + 4 SplitFileWriter cases + 2 emitter round-trip cases. Green against sqlite3, mssql2022, and postgres.
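As a rough illustration of the three input forms `ResolveUpTo()` is described as accepting, the sketch below resolves an empty string, a numeric timestamp, or a release version string to a timestamp. The `Release` struct, the `ResolveUpToSketch` name, and the choice to return an empty optional for unknown input are assumptions made for this example, not the actual Folder API.

```cpp
#include <algorithm>
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical release record: a version label plus the highest migration
// timestamp that belongs to it.
struct Release {
    std::string version;
    std::uint64_t highestTimestamp;
};

std::optional<std::uint64_t> ResolveUpToSketch(std::string const& input,
                                               std::vector<Release> const& releases)
{
    // Empty input: fall back to the latest registered release.
    if (input.empty()) {
        if (releases.empty())
            return std::nullopt;
        auto const latest = std::max_element(
            releases.begin(), releases.end(),
            [](Release const& a, Release const& b) { return a.highestTimestamp < b.highestTimestamp; });
        return latest->highestTimestamp;
    }

    // Numeric input: accepted only if the whole string parses as an integer.
    std::uint64_t value = 0;
    char const* const first = input.data();
    char const* const last = first + input.size();
    auto const [ptr, ec] = std::from_chars(first, last, value);
    if (ec == std::errc{} && ptr == last)
        return value;

    // Otherwise treat the input as a release version string.
    auto const match = std::find_if(releases.begin(), releases.end(),
                                    [&](Release const& r) { return r.version == input; });
    if (match != releases.end())
        return match->highestTimestamp;

    return std::nullopt; // unknown input; the real code may report an error instead
}
```

Requiring the full string to parse as a number keeps a version such as `1.0.0` from being mistaken for the timestamp `1`.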

@christianparpart christianparpart requested a review from a team as a code owner April 30, 2026 05:36
@github-actions github-actions Bot added CLI command line interface tools tests Core API labels Apr 30, 2026
@christianparpart christianparpart force-pushed the feature/dbtool-fold branch 2 times, most recently from 1c305bc to d8bf4e2 Compare April 30, 2026 08:57
@github-actions github-actions Bot added Query Builder Data Binder SQL Data Binder support Query Formatter SQL dialect implementations labels Apr 30, 2026
@christianparpart christianparpart force-pushed the feature/dbtool-fold branch 4 times, most recently from 1bf3c43 to 35bcc7e Compare April 30, 2026 10:50
@github-actions github-actions Bot removed Query Builder Data Binder SQL Data Binder support Query Formatter SQL dialect implementations labels Apr 30, 2026
@christianparpart christianparpart force-pushed the feature/dbtool-fold branch 2 times, most recently from 35cbced to 83bd900 Compare April 30, 2026 17:58
Member

@Yaraslaut Yaraslaut left a comment

Thanks, left a few small comments, mostly nitpicks.

Comment thread src/Lightweight/MigrationFold/SqlEmitter.cpp Outdated
Comment thread src/Lightweight/MigrationFold/CppEmitter.hpp
Comment thread src/Lightweight/MigrationFold/CppEmitter.cpp Outdated
Comment thread src/Lightweight/MigrationFold/CppEmitter.cpp
dbtool fold --output FILE emits a self-contained baseline (.cpp plugin
or .sql script) that reproduces the post-migration state from an empty
DB. .sql output requires --dialect (sqlite, postgres, mssql, mysql);
.cpp output is dialect-agnostic. Runs without any DB connection - loads
plugins, walks migrations in memory, writes a file.
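
The extension-based dispatch described above might look roughly like the following. `OutputFormat`, `FoldOutputPlan`, and `PlanOutput` are hypothetical names, and ignoring a dialect passed together with a .cpp target is an assumption of this sketch rather than documented behaviour.

```cpp
#include <algorithm>
#include <array>
#include <filesystem>
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>

enum class OutputFormat { Cpp, Sql };

struct FoldOutputPlan {
    OutputFormat format;
    std::optional<std::string> dialect; // set only for .sql output
};

// Picks the output format from the file extension and validates --dialect.
FoldOutputPlan PlanOutput(std::filesystem::path const& output,
                          std::optional<std::string> const& dialect)
{
    std::string const extension = output.extension().string();

    if (extension == ".cpp")
        return FoldOutputPlan{OutputFormat::Cpp, std::nullopt}; // dialect-agnostic

    if (extension == ".sql") {
        if (!dialect)
            throw std::invalid_argument(".sql output requires --dialect");

        // Dialect names as listed in the commit message.
        static constexpr std::array<std::string_view, 4> known{"sqlite", "postgres", "mssql", "mysql"};
        if (std::find(known.begin(), known.end(), std::string_view{*dialect}) == known.end())
            throw std::invalid_argument("unknown dialect: " + *dialect);

        return FoldOutputPlan{OutputFormat::Sql, dialect};
    }

    throw std::invalid_argument("unsupported output extension: " + extension);
}
```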

Built on a new pure plan-walk primitive
MigrationManager::FoldRegisteredMigrations(formatter, upToInclusive)
that folds every registered migration into a per-table view of the
final shape plus a chronological list of data steps, indexes, and
releases.

The fold module (src/Lightweight/MigrationFold/{Folder,CppEmitter,
SqlEmitter}.{hpp,cpp}) emits via the existing ToSql() formatter path so
each dialect's CREATE TABLE / CREATE INDEX / INSERT codegen stays the
single source of truth. The .cpp emitter wraps the body in
LIGHTWEIGHT_SQL_MIGRATION; the .sql emitter additionally emits CREATE
TABLE schema_migrations and a stamping INSERT for every folded
timestamp so the post-fold DB looks identical to a real apply-all run.

Also pulls in CodeGen/SplitFileWriter shared codegen helper used by the
.cpp emitter to bin-pack large baselines across multiple files.
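
One way to picture the packing is a greedy, order-preserving next-fit pass, sketched below. This is an assumption about the general approach, not the SplitFileWriter code: `Block` and `PackBlocks` are hypothetical names, a budget of zero is treated as "no limit", and a block larger than the budget is given a file of its own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical unit of generated code that must not be split across files.
struct Block {
    std::string text;
    std::size_t lineCount;
};

// Next-fit packing: blocks keep their order, and a new file is started
// whenever the next block would push the current file over the budget.
std::vector<std::vector<Block>> PackBlocks(std::vector<Block> const& blocks,
                                           std::size_t maxLinesPerFile)
{
    std::vector<std::vector<Block>> files;
    std::size_t usedLines = 0;

    for (Block const& block : blocks) {
        bool const unlimited = (maxLinesPerFile == 0); // assumption: 0 disables splitting
        bool const fits = unlimited || usedLines + block.lineCount <= maxLinesPerFile;

        if (files.empty() || !fits) {
            files.emplace_back(); // open a new file
            usedLines = 0;
        }

        // An oversize block still lands here; it simply ends up alone in its file.
        files.back().push_back(block);
        usedLines += block.lineCount;
    }
    return files;
}
```

Keeping the blocks in order matters for generated migrations, since later statements can depend on earlier ones; a packing strategy that reorders blocks to fill files more tightly would not be safe here.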

Tests: fold unit tests cover create/altercolumn/drop-table cleanup,
data-step chronological order, --up-to truncation, RawSql passthrough,
column rename FK propagation, release-range filtering, ResolveUpTo
parsing. SqlEmitter/CppEmitter round-trip tests verify the emitted
artifacts match the expected shape. SplitFileWriter tests cover bin-
packing, single-chunk, zero-budget, and oversize-block boundaries.

All [Fold] and [SplitFileWriter] tests pass against sqlite3,
mssql2022, and postgres. Full SqlMigration suite (44 cases / 210
assertions) green on all three.

Signed-off-by: Christian Parpart <christian@parpart.family>
Labels

CLI command line interface tools Core API tests
