Fix/3047 prevent same naming for staging and final datasets #3096

alkaline-0 · 2025-09-17T15:29:32Z

Description

Created create_dataset_names static method and added validation to prevent a setup where final and staging datasets have the same name.

Changes:

Created WithStagingDataset.create_dataset_names() as a @staticmethod
Added validation logic to raise ValueError when dataset names are equal
Added error message with clear explanation of the issue and solution
Added tests covering both the success and error scenarios to ensure the error is raised.

Why this change:
As mentioned in ticket-3047, users can configure the staging dataset name, which could end up identical to the final dataset name. Validation was added so that a `ValueError` is raised when the staging and final dataset names match; otherwise, a tuple containing the dataset name and the staging dataset name is returned.
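A minimal sketch of the check described above, assuming both names are already normalized strings. The function name mirrors `create_dataset_names` from this PR, but the signature, parameter names, and error text are illustrative assumptions and not the code merged into dlt, where the method is a `@staticmethod` on `WithStagingDataset`.

```python
from typing import Tuple


def create_dataset_names(dataset_name: str, staging_dataset_name: str) -> Tuple[str, str]:
    """Return (dataset_name, staging_dataset_name), refusing identical names.

    Hypothetical stand-in: the real method may take different arguments
    and produce a different message.
    """
    if dataset_name == staging_dataset_name:
        raise ValueError(
            f"Staging dataset name {staging_dataset_name!r} is the same as the "
            "final dataset name. Configure a distinct staging dataset name, "
            "otherwise commands that truncate staging tables would also "
            "truncate the final dataset."
        )
    return dataset_name, staging_dataset_name


# Distinct names pass through unchanged as a tuple.
final_name, staging_name = create_dataset_names("analytics", "analytics_staging")
```

Raising at name-creation time, before any client touches the destination, means a misconfiguration fails fast instead of surfacing during a load.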

Related Issues

ticket-3047

Additional Context

This change prevents potential data loss scenarios where users might accidentally configure staging and final dataset names to be identical. The validation ensures that setup commands that should only truncate staging datasets don't accidentally truncate the final dataset.
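To illustrate the failure mode, here is a toy sketch using an in-memory dictionary in place of a real destination. `Warehouse` and `truncate_staging` are hypothetical names invented for this example and do not correspond to dlt APIs.

```python
from typing import Dict, List

# Toy in-memory "warehouse": dataset name -> rows. Purely illustrative.
Warehouse = Dict[str, List[dict]]


def truncate_staging(warehouse: Warehouse, staging_dataset_name: str) -> None:
    """Empty the staging dataset, as a setup step might do before a load."""
    warehouse[staging_dataset_name] = []


# Distinct names: only the staging dataset is emptied.
safe: Warehouse = {"analytics": [{"id": 1}], "analytics_staging": [{"id": 2}]}
truncate_staging(safe, "analytics_staging")

# Identical names: the "staging" truncate wipes the final data.
clash: Warehouse = {"analytics": [{"id": 1}]}
truncate_staging(clash, "analytics")
```

In the second case the final dataset ends up empty even though only a staging operation was requested, which is the scenario the new validation rules out.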

…dataset (#XXXX)
* Introduced `create_dataset_names` static method in `WithStagingDataset` class.
* Added validation to ensure staging dataset name is not the same as the final dataset name, raising a ValueError if they match.
* Updated documentation for the new method and error handling.

…saging and add unit tests
* Reformatted the `create_dataset_names` method for better readability.
* Improved error messages to clarify the consequences of identical dataset names.
* Added unit tests for `create_dataset_names` to validate functionality and error handling.

* Replaced direct calls to `normalize_dataset_name` and `normalize_staging_dataset_name` with a new method `create_dataset_names` in multiple destination client classes.
* This change improves consistency in dataset name generation across various implementations, ensuring proper handling of dataset names and staging datasets.

…nts in destination clients
* Updated dataset name assignments in multiple destination client classes to enhance readability by breaking long lines.
* Maintained consistency in the use of the `create_dataset_names` method across implementations.

netlify · 2025-09-17T15:29:36Z

✅ Deploy Preview for dlt-hub-docs ready!

Name	Link
🔨 Latest commit	`861d611`
🔍 Latest deploy log	https://app.netlify.com/projects/dlt-hub-docs/deploys/68cbf3c75bdad90008275301
😎 Deploy Preview	https://deploy-preview-3096--dlt-hub-docs.netlify.app

sh-rp

Very good PR, thank you! I have one comment unrelated to the code; you can merge this as is and do a follow-up PR, or integrate it into this one:

On the page you reference in the exception: https://dlthub.com/docs/dlt-ecosystem/staging#staging-dataset, we should add a sentence that staging and final datasets need to have separate names, similar to what you have written in the exception.

* disable most tests
* try correct windows command for running marimo e2e tests
* try without timeout
* test only launch marimo
* bump python version
* try install playwright deps
* fix e2e tests for dashboard on windows
* enable e2e tests for dashboard
* test macos 14 for dashboard e2e tests
* add basic tests for ui elements
* improve ui elements tests
* revert changes to main github workflow
* review fixes
---------
Co-authored-by: Your Name <[email protected]>

* add code to fix behavior of normalizer when None or primitives are encountered for child tables (cherry picked from commit 5f44278)
* fixes one existing test that would not work with cached schema otherwise
* add tests and small fixes to dashboard
* fix implementation and add more tests
* Long names handled, get_nested_tables test, cached table lookups
* relational normalizer returns unshortened parent path
* Schema contract test added
---------
Co-authored-by: anuunchin <[email protected]>

Added important notes and examples to the staging dataset configuration section, emphasizing the need for unique names between staging and final datasets to avoid `ValueError` and potential data loss during setup commands.

…ng-for-staging-and-final-datasets

alkaline-0 added 5 commits September 17, 2025 12:28

alkaline-0 self-assigned this Sep 17, 2025

alkaline-0 marked this pull request as ready for review September 17, 2025 18:56

alkaline-0 requested review from anuunchin and sh-rp September 17, 2025 18:56

sh-rp linked an issue Sep 18, 2025 that may be closed by this pull request

Prevent users from setting the same name for final and staging dataset #3047

Closed

sh-rp previously approved these changes Sep 18, 2025


sh-rp and others added 4 commits September 18, 2025 13:52

Add redirect from dlt-plus page (#3084)

12f1058

alkaline-0 dismissed sh-rp’s stale review via 09d5ced September 18, 2025 11:52

Merge remote-tracking branch 'origin' into fix/3047-prevent-same-nami…

861d611

…ng-for-staging-and-final-datasets

alkaline-0 requested a review from sh-rp September 18, 2025 11:58

sh-rp approved these changes Sep 18, 2025


alkaline-0 merged commit 8d67e86 into devel Sep 19, 2025
67 checks passed

alkaline-0 deleted the fix/3047-prevent-same-naming-for-staging-and-final-datasets branch September 19, 2025 06:36

4 participants
