Prevent users from setting the same name for final and staging dataset

**TLDR**
We use staging datasets for transactional safety with the `merge` write_disposition as well as some variants of the `replace` write_disposition. By default, the staging dataset is a second dataset called `<dataset_name>_staging`. Users can change this name, which can lead to a setup where the final and staging datasets have the same name. In that case, data in the final dataset is truncated by the setup commands that should only truncate the staging dataset. We should prevent this, or at least print a big fat warning if users try to do it.

**ToDo**
* Learn about the staging dataset: https://dlthub.com/docs/dlt-ecosystem/staging#staging-dataset
* Add a new method to the `WithStagingDataset` class: `def create_dataset_names(self, schema: Schema, config: DestinationClientDwhConfiguration) -> Tuple[str, str]:`. It creates the regular and the staging dataset names for a given schema and config, and raises an exception if both are the same. See the point below to find the places where these normalized names are created.
* Use this new method to create the normalized regular and staging dataset names for all destinations (including the filesystem destination). You can find all destination implementations under dlt/destinations/impl, or just search for all the places where `normalize_staging_dataset_name()` is used.
* Write tests that demonstrate that this exception is raised if both datasets end up having the same name after normalization.
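
The check described in the ToDo could look roughly like the sketch below. It is a standalone illustration, not dlt code: `DatasetNameClashError`, `normalize_identifier`, and the `staging_layout` parameter are hypothetical stand-ins for the real exception type, the destination's naming convention, and the config field that holds the staging name. The point it shows is that the comparison has to happen after normalization, since two different raw names can normalize to the same identifier.

```python
import re
from typing import Callable, Tuple


class DatasetNameClashError(ValueError):
    """Hypothetical exception: final and staging dataset names are identical."""


def normalize_identifier(name: str) -> str:
    # Stand-in for a destination naming convention (an assumption, not dlt's
    # implementation): lowercase, collapse non-alphanumeric runs to "_",
    # and strip leading/trailing underscores.
    return re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")


def create_dataset_names(
    dataset_name: str,
    staging_layout: str = "%s_staging",
    normalize: Callable[[str], str] = normalize_identifier,
) -> Tuple[str, str]:
    """Return (final_name, staging_name) and refuse identical names."""
    final_name = normalize(dataset_name)
    # The layout may contain a "%s" placeholder or be a fixed name.
    if "%s" in staging_layout:
        raw_staging = staging_layout % dataset_name
    else:
        raw_staging = staging_layout
    staging_name = normalize(raw_staging)
    # Compare only after normalization: the raw names may differ while the
    # normalized identifiers collide.
    if final_name == staging_name:
        raise DatasetNameClashError(
            f"Final and staging dataset both resolve to '{final_name}'. "
            "Loading would truncate the final dataset."
        )
    return final_name, staging_name
```

With this sketch, a layout of `"%s_"` applied to `"My Data"` produces a raw staging name that differs from the dataset name but normalizes to the same identifier, so the call raises. A test for the real method would want to cover that case as well as the obvious one where the layout is just `"%s"`.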


Prevent users from setting the same name for final and staging dataset #3047
