-
Notifications
You must be signed in to change notification settings - Fork 418
Closed
Labels
bugSomething isn't workingSomething isn't workingcommunityIssue from dlt Slack communityIssue from dlt Slack community
Description
dlt version
0.5.2
Describe the problem
When using the scd2 merge strategy, records do not get reinserted after retirement:
- load 1: source snapshot contains record x ➜ record x is inserted in destination
- load 2: source snapshot does not contain record x ➜ record x is retired in destination ("valid to" field is filled with timestamp)
- load 3: source snapshot contains record x ➜ nothing happens (this is the bug)
Issue only occurs if record x is exactly the same. If the natural key is the same, but one or more attributes is different, the record gets inserted as expected.
Expected behavior
Record x gets reinserted in load 3. Resultant destination table has two records for record x with different validity windows.
Steps to reproduce
import dlt
@dlt.resource(
table_name="my_table", write_disposition={"disposition": "merge", "strategy": "scd2"}
)
def r(data):
yield data
pipe = dlt.pipeline(destination="postgres", dataset_name="scd2_reinsert_bug")
# load 1 — initial load
dim_snap = [
{"nk": 1, "c1": "foo", "c2": "foo"},
{"nk": 2, "c1": "bar", "c2": "bar"},
]
pipe.run(r(dim_snap))
# result: inserted 2 records (table size: 2 records)
# load 2 — delete natural key 1
dim_snap = [
{"nk": 2, "c1": "bar", "c2": "bar"},
]
pipe.run(r(dim_snap))
# result: retired record for natural key 1 (table size: 2 records)
# load 3 — reinsert natural key 1
dim_snap = [
{"nk": 1, "c1": "foo", "c2": "foo"},
{"nk": 2, "c1": "bar", "c2": "bar"},
]
pipe.run(r(dim_snap))
# result: nothing happened (table size: 2 records)
# desired result: reinsert record for natural key 1 (table size: 3 records)Operating system
Windows
Runtime environment
Local
Python version
3.8
dlt data source
No response
dlt destination
Postgres
Other deployment details
No response
Additional information
No response
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't workingcommunityIssue from dlt Slack communityIssue from dlt Slack community
Type
Projects
Status
Done