DBT Snapshot filling in dbt_valid_to for most recent record #52

Closed
@dhodge250

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

After running the dbt snapshot command to build a snapshot table with the invalidate_hard_deletes parameter set to True, some records have their dbt_valid_to populated with the latest dbt_updated_at value, even though these records are still valid and have not been removed from their source (staging) tables. We are using the following code to build each snapshot:

{% snapshot tmp_device_ad %}

{{
    config(
           target_schema = 'dbt',
           strategy = 'timestamp',
           unique_key = 'snap_id',
           updated_at = 'update_dt',
           invalidate_hard_deletes = True
          )
}}

select ad.*

  from {{ ref('stg_device_ad') }} ad

{% endsnapshot %}

The number of affected records varies between snapshots (sometimes a few dozen, sometimes a few hundred, in tables of up to 30k records), but the same records appear to be affected on each run.

Expected Behavior

When building a dbt snapshot table with the invalidate_hard_deletes parameter set to True, I expect dbt to fill in the dbt_valid_to value ONLY if a record no longer exists in the source table, OR if a record has changed in the source table, in which case a new record is created in the snapshot table and the previous record is marked as invalid. An active record in the snapshot table SHOULD NOT have a dbt_valid_to date; its value should be null.
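One way to list the records that break this expectation is a query like the sketch below. It is not part of the original report. It assumes the snapshot can be referenced as `ref('tmp_device_ad')` and that `snap_id` is present in the staging model; the aliases `snap`, `src`, and `newer` are only illustrative. It returns snapshot rows that are the most recent version for their key, have a non-null dbt_valid_to, and still have a matching row in the source.

```sql
-- Hypothetical diagnostic: rows returned here are "closed" in the snapshot
-- even though they are the latest version and still exist in staging.
select snap.snap_id,
       snap.dbt_valid_from,
       snap.dbt_valid_to,
       snap.dbt_updated_at
  from {{ ref('tmp_device_ad') }} snap
  join {{ ref('stg_device_ad') }} src
    on src.snap_id = snap.snap_id          -- record still exists in the source
 where snap.dbt_valid_to is not null       -- but the snapshot row is marked invalid
   and not exists (                        -- and no newer version supersedes it
         select 1
           from {{ ref('tmp_device_ad') }} newer
          where newer.snap_id = snap.snap_id
            and newer.dbt_valid_from > snap.dbt_valid_from
       )
```

Saved as a singular test or an analysis file, this should return zero rows when the snapshot behaves as expected. A record that was hard-deleted and later re-added to the source without a new snapshot row would also show up, so any hits need to be checked against the source history.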

Steps To Reproduce

  1. Environment: Windows 10 Enterprise 21H2 19044.2251
  2. Config:
{% snapshot tmp_device_ad %}

{{
    config(
           target_schema = 'dbt',
           strategy = 'timestamp',
           unique_key = 'snap_id',
           updated_at = 'update_dt',
           invalidate_hard_deletes = True
          )
}}

select ad.*

  from {{ ref('stg_device_ad') }} ad

{% endsnapshot %}
  3. Run: dbt snapshot
  4. No errors are generated

Relevant log output using --debug flag enabled

No response

Environment

- OS: Windows 10 Enterprise 21H2 19044.2251
- Python: 3.9.11
- dbt: 1.3.0

What Oracle database version are you using dbt with?

19c

Additional Context

This issue looks to be similar, if not identical, to #2390 from dbt-core a few years ago that was resolved in v0.17.0. I've created a fork to play around with this and see if the two issues are related.

Labels: bug (Something isn't working)

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions