-
Notifications
You must be signed in to change notification settings - Fork 418
Closed
Labels
bugSomething isn't workingSomething isn't working
Description
dlt version
0.4.4
Describe the problem
The append disposition is supposed to not append if there's no new data and it works that way in postgres destination, but with duckdb it reinserts the data.
Expected behavior
in the second iteration, the output should not have an extra row since there's no new data, however it appends anyway.
Steps to reproduce
use the following code:
import os
import dlt
from dlt.sources.credentials import ConnectionStringCredentials
from dlt.common import pendulum
import shutil
import duckdb
from ingestr.src.sql_database import sql_table
def load_standalone_table_resource() -> None:
pipeline = dlt.pipeline(
pipeline_name="output",
destination=dlt.destinations.duckdb(
credentials=uri,
),
pipelines_dir="pipeline_dataappendprob",
dataset_name="testschema_append",
)
# Run the resources together
info = pipeline.run(
sql_table(
credentials=uri,
schema="testschema_append",
table="input",
incremental=dlt.sources.incremental(
"updated_at",
),
),
dataset_name="testschema_append",
table_name="output",
write_disposition="append",
)
print(info)
def get_abs_path(relative_path):
return os.path.abspath(os.path.join(os.path.dirname(__file__), relative_path))
abs_db_path = get_abs_path("sample_append.db")
uri = f"duckdb:///sample_append.db"
conn = duckdb.connect(abs_db_path)
def setup_db():
try:
shutil.rmtree(get_abs_path("pipeline_dataappendprob"))
except:
pass
conn.execute("DROP SCHEMA IF EXISTS testschema_append CASCADE")
conn.execute("CHECKPOINT")
conn.execute("CREATE SCHEMA testschema_append")
conn.execute("CREATE TABLE testschema_append.input (id INTEGER, val VARCHAR, updated_at DATE)")
conn.execute("INSERT INTO testschema_append.input VALUES (1, 'val1', '2022-01-01'), (2, 'val2', '2022-01-02')")
conn.execute("CHECKPOINT")
def run(iteration: int):
print("running iteration", iteration)
load_standalone_table_resource()
print("--------------")
print("source:")
conn.sql("select * from testschema_append.input").show()
print()
print("destination:")
conn.sql("select * from testschema_append.output").show()
print("--------------")
setup_db()
run(1)
run(2)Operating system
macOS
Runtime environment
Local
Python version
3.11
dlt data source
dlt sql_table source
dlt destination
No response
Other deployment details
official duckdb
Additional information
No response
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't working
Type
Projects
Status
Done