Feat(experimental): DBT project conversion #4495

Open · wants to merge 1 commit into main
Conversation

@erindru (Collaborator) commented May 22, 2025

This PR contains an initial implementation of a command that takes a DBT project, reads it into memory, and writes the result out as a native SQLMesh project.

To invoke it, use:

$ sqlmesh dbt convert -i <input_path> -o <output_path>

The way this works is that the project is first loaded into memory using our existing DbtLoader.

  • This means that the existing SQLMesh mappings from DBT model types -> SQLMesh model types are utilized
  • It also means that our existing DBT shims in the Jinja context can be utilized

The resulting models and macros are extracted from the context and their Jinja is parsed into an AST (using the Jinja library). A series of AST transforms is then run to replace DBT-isms with SQLMesh-native concepts as much as possible:

  • {{ ref() }} and {{ source() }} calls are replaced with the actual model names they reference (where possible)
  • {% is_incremental() %} blocks are removed
  • etc.
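As an illustrative sketch of the first step (this is not the PR's actual transform code), jinja2 can parse a model's template into an AST, and `ref()` calls can then be located via `find_all`; a real transform would go on to rewrite those `Call` nodes into resolved model names:

```python
from jinja2 import Environment, nodes

# Parse a DBT-style model body into a Jinja AST.
env = Environment()
ast = env.parse("select * from {{ ref('orders') }} join {{ source('raw', 'events') }}")

# Collect the literal names passed to ref() calls; source() calls
# could be collected the same way by matching on the callable's name.
refs = [
    call.args[0].value
    for call in ast.find_all(nodes.Call)
    if isinstance(call.node, nodes.Name) and call.node.name == "ref"
]
# refs == ["orders"]
```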

Jinja has no way of turning a Jinja AST back into a Jinja template string (since its goal is to generate Python code, not more Jinja), so I wrote a JinjaGenerator class to go from AST back to str.
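A minimal sketch of the idea behind such a generator (not the PR's JinjaGenerator, which handles far more node types: statements, filters, loops, whitespace control) is to recursively walk the AST and emit template source for each node:

```python
from jinja2 import Environment, nodes

def expr(n: nodes.Node) -> str:
    # Render an expression node back to Jinja expression syntax.
    if isinstance(n, nodes.Name):
        return n.name
    if isinstance(n, nodes.Const):
        return repr(n.value)
    if isinstance(n, nodes.Call):
        return f"{expr(n.node)}({', '.join(expr(a) for a in n.args)})"
    raise NotImplementedError(type(n).__name__)

def generate(n: nodes.Node) -> str:
    # Render top-level template structure back to template source.
    if isinstance(n, nodes.Template):
        return "".join(generate(child) for child in n.body)
    if isinstance(n, nodes.Output):
        parts = []
        for child in n.nodes:
            if isinstance(child, nodes.TemplateData):
                parts.append(child.data)  # literal text between tags
            else:
                parts.append("{{ " + expr(child) + " }}")
        return "".join(parts)
    raise NotImplementedError(type(n).__name__)

src = "select * from {{ ref('orders') }}"
roundtripped = generate(Environment().parse(src))
# roundtripped == src for this simple input
```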

If, after applying all the Jinja AST transforms, there is still some Jinja left in the result, then it is written to the SQLMesh model file as a Jinja model surrounded with JINJA_QUERY_BEGIN; JINJA_END; blocks.

If there is no Jinja left after applying the transforms, it is written directly as a native SQL model with no Jinja wrapping.
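For illustration, a converted model with residual Jinja might look roughly like the following (the model name and macro are hypothetical; the JINJA_QUERY_BEGIN / JINJA_END delimiters are SQLMesh's Jinja model syntax):

```sql
MODEL (
  name my_project.orders,
  kind FULL
);

JINJA_QUERY_BEGIN;
SELECT *
FROM {{ some_remaining_macro() }}
JINJA_END;
```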

Macros that call into DBT packages are also handled. The dependency tree is migrated and put in the target folder under macros/__dbt_packages__ so that the macro hierarchy is still available when the macros are called on the SQLMesh side.
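Assuming a dependency on a package such as dbt_utils (a hypothetical example), the migrated output layout would look roughly like:

```
output_project/
  macros/
    __dbt_packages__/
      dbt_utils/    # migrated package macros keep their original hierarchy
        ...
```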

A migrated project isn't truly native until all the dbt-isms have been removed. If the native loader detects it is loading a migrated project, it injects DBT shims into the Jinja context to make the migrated macros still work. The bulk of the DBT shim code is re-used from our existing DBT loader.
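The shim mechanism can be sketched as follows (the names here are illustrative, not SQLMesh's actual API): DBT-style callables are registered in the Jinja globals so migrated macros that still call `ref()` or `var()` keep working:

```python
from jinja2 import Environment

def make_shimmed_env(model_names: dict) -> Environment:
    # Hypothetical shim injection: map DBT-style calls onto native lookups.
    env = Environment()
    env.globals["ref"] = lambda name: model_names[name]
    env.globals["var"] = lambda name, default=None: {"start_date": "2024-01-01"}.get(name, default)
    return env

env = make_shimmed_env({"orders": "db.orders"})
rendered = env.from_string("select * from {{ ref('orders') }}").render()
# rendered == "select * from db.orders"
```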

Known limitations:

  • The source DBT project must be loadable by the SQLMesh DBT loader. It doesn't need to be runnable, just loadable.
  • Currently only works for BigQuery and DuckDB, as these were the initial focus. Other DB types can be added by implementing from_sqlmesh() in the relevant dbt TargetConfig class.
  • Jinja handles whitespace stripping at parse time, so the AST has no idea if, e.g., a {%- or {% block was used. This can lead to less-than-ideal formatting once the AST transforms are run and the resulting AST is turned back into a Jinja string.
  • Only the Jinja constructs that were present in the test projects are handled by the Jinja generator; some more esoteric AST nodes are not yet handled.
  • Audits are not generalized by our DBT loader, so each model currently gets its own version as an inline audit.
  • {{ source() }} calls on the DBT side that used dynamic inputs and were aliased in the DBT config are not correctly migrated
  • pre/post hooks / statements are not currently handled

@erindru force-pushed the erin/dbt-convert branch from 0f4ddc6 to 738b2c7 on May 22, 2025 01:39
@erindru marked this pull request as ready for review on May 22, 2025 02:46
@click.pass_obj
@error_handler
@cli_analytics
def dbt_convert(

Member:

Should we instead extend the init command like we do for dlt generation?

# extract {{ var() }} references used in all jinja macro dependencies to check for any variables specific
# to a migrated DBT package and resolve them accordingly
# vars are added into __sqlmesh_vars__ in the Python env so that the native SQLMesh var() function can resolve them
if migrated_dbt_project_name:

Member:

Can this be encapsulated into its own function?

@@ -491,6 +491,18 @@ def _merge_filter_validator(

return v.transform(d.replace_merge_table_aliases)

@field_validator("batch_concurrency", mode="before")

Member:

Why is this needed? There's already a validator for this field

)


class DbtConversionConsole(TerminalConsole):

Member:

Does this need to inherit TerminalConsole?

yield prev, curr


class JinjaGenerator:

Member:

Just curious: any reason to have this class? It doesn't look like the methods benefit from the shared self instance in any way. Should these just be top-level functions?
