Skip to content

Conversation

@jorritsandbrink
Copy link
Collaborator

Description

Currently, the snowflake destination only interprets cluster hints on table creation. Any subsequent changes in cluster hints are ignored. This PR changes that, such that cluster hints are also interpreted on table alterations.

These three cases are supported:
(1) add clustering: without clustering -> with clustering
(2) change clustering: change set of cluster columns or column order
(3) drop clustering: with clustering -> without clustering

Note that there is no active hint change detection. Any changes in cluster hints are only processed when the current table alteration mechanism is triggered, which is only when a new column is added.

Related Issues

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Nov 22, 2025

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
docs 5bb2365 Commit Preview URL

Branch Preview URL
Nov 25 2025, 07:28 AM

@jorritsandbrink jorritsandbrink marked this pull request as ready for review November 22, 2025 12:52
@jorritsandbrink
Copy link
Collaborator Author

Hey @rudolfix, could you review this PR?

Copy link
Collaborator

@rudolfix rudolfix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is excellent! end to end test is missing though: where we actually run the queries on snowflake. you can add a pipeline testing this in test_snowflake_pipeline.py.

@jorritsandbrink
Copy link
Collaborator Author

@rudolfix Good call! I added a pipeline test and it exposed an issue with my implementation: it relied on SqlJobClientBase.get_storage_tables, which does not include cluster hints. More so, even when overriding with SnowflakeClient.get_storage_tables and adding the cluster hint, it doesn't work because the hints are only applied when a new column is added. Any cluster hint change without new column won't be applied, and thus will not show up in the return value of get_storage_tables in subsequent runs.

Long story short, I switch to reading the cluster hints from the table schema, which does work and actually simplifies the implementation.

So, could you review again?

Copy link
Collaborator

@rudolfix rudolfix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! thanks for adding super clear end to end test!

@rudolfix rudolfix merged commit 9619002 into devel Nov 25, 2025
69 of 71 checks passed
@rudolfix rudolfix deleted the feat/3306-snowflake-cluster-hint-change branch November 25, 2025 16:39
sh-rp added a commit that referenced this pull request Nov 26, 2025
* override local marimo theme for dashboard (#3337)

This ensures custom CSS is always readable.

* Fix DocSearch v4 styles (#3338)

* Fix DocSearch v4 styles
* Fix search input styles for light and dark modes

* docs: update weaviate destination docs and version (#3352)

* Redshift feature: Include STS session token in COPY CREDENTIALS. (#3307)

* Redshift feature: Include STS session token in COPY CREDENTIALS. If aws_session_token is present, append the session token. Keeps IAM_ROLE path and long-lieved keys unchanged

---------

Co-authored-by: Tim Hable <[email protected]>

* fixes sqlglot from find (#3357)

* fixes athena refresh mode (#3313)

* adds filter to exclude dropped tables in staging destination, implements for athena

* enables refresh mode tests for athena, fixes tests

* fixes staging_allowed_local_path on databricks, bumps databricks connector in lockfile

* passes dropped tables schemas to filter, adjust athena filter

* allows to disable lake formation

* fix: backwards compatible traces (#3354)

* makes trace backward compat with 1.17.0 and earlier

* skips trace if any error in unpickle

* always saves merged pipeline trace to have consistent pipeline.last_trace property

* tests for past traces, broken traces and other improvements

* (docs) adds community destinations (#3326)

* adds community destinations

* Apply suggestions from code review

applies crate fixes

Co-authored-by: Andreas Motl <[email protected]>

---------

Co-authored-by: Andreas Motl <[email protected]>

* fix: dashboard no longer crashes on broken home cell (#3348)

* split home and workspace render methods

* header row dry-er

* catch-all errors in home()-cell

* local try-catch for broken traces

* e2e test for broken trace

* removes this

* shows navigation on pipeline attach error

---------

Co-authored-by: Marcin Rudolf <[email protected]>

* (fix) use sparse checkout for dlt init dlthub (#3356)

* adds option to sparse checkout repo

* use sparse checkout for llm context

* fixes sqlglot from find

* adds checkout after sparse clone

* explains unknown path tests

* Fix: The child table column remains in the schema as a partial column with seen-null-first=True (#3131)

* child table column removed from parent

* A utility functin that checks whether a column has seen-null-first set

* Improved comments and docstrings, separate method in worker

* null column not inferred if exists as compound

* Column level x-normalizer cleaning moved outside of worker

* Test for empty column becoming compound

* Test clean_seen_null_first_hint

* Uncalled source in pipeline.run( (#3369)

* fix flaky dashboard tests (#3370)

* improves dashboard multi schema test

* closes and waits for sections in multi-schema test

* removes command line snippet with generic text in exceptions

* disables transformers pokeapi test

* feat: `Schema.to_mermaid()` (#3364)


* Add dlt.Schema.to_mermaid() method

---------

Co-authored-by: jayant <[email protected]>

* Refactor boundary timestamp handling in SqlMergeFollowupJob and SqlalchemyMergeFollowupJob to ensure current load package creation time is used when no boundary timestamp is provided. Update DltResourceHints class to streamline timestamp validation for active_record_timestamp and boundary_timestamp. Adjust tests accordingly. (#3378)

* feat: `snowflake` clustering key modifications (#3365)

* add support for snowflake clustering key modifications

* add cluster column order test case

* update snowflake cluster hint docs

* switch to reading snowflake cluster hints from table schema

* docs: lifecycle of `@dlt.hub.transformation` and `dlt.Relation` (#3329)

* Lifecycle of a dlt transformation

* Added test to match lifecycle docs

* (fix) 3351 fixes default type var (#3373)

* tests minimal typing extensions in alpine docker

* keeps typevar default but does not use it in the code for backwart compat

---------

Co-authored-by: djudjuu <[email protected]>
Co-authored-by: Anton Burnashev <[email protected]>
Co-authored-by: Violetta Mishechkina <[email protected]>
Co-authored-by: Tim Hable <[email protected]>
Co-authored-by: Tim Hable <[email protected]>
Co-authored-by: rudolfix <[email protected]>
Co-authored-by: Andreas Motl <[email protected]>
Co-authored-by: anuunchin <[email protected]>
Co-authored-by: Thierry Jean <[email protected]>
Co-authored-by: jayant <[email protected]>
Co-authored-by: Menna <[email protected]>
Co-authored-by: Jorrit Sandbrink <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add CLUSTER support on the ALTER TABLE for the Snowflake destination

3 participants