fully support naive and tz-aware timestamp/time data types #2570
Conversation
```python
from dlt.sources.credentials import ConnectionStringCredentials


class MSSQLSourceDB:
```
I would've probably created a generic source db that uses sqlalchemy and maybe sqlmodel, so we can create an example dataset on any database, including an abstraction for manipulating rows. We already have more or less the exact same code for postgres. But that is something for the future, maybe.
right! we can have a few tables with standard types that should work on all source databases. for that we can extract a few standard tests.
tables with specific datatypes could at least have standardized names. but this is a significant amount of work. I did a first step by enabling more source databases here.
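Not part of this PR, but a minimal sketch of what such a generic source db helper could look like, using plain SQLAlchemy (table name, columns and the `create_example_dataset` helper are illustrative assumptions):

```python
import sqlalchemy as sa

metadata = sa.MetaData()

# one "standard types" table that should work on any SQLAlchemy-supported database
example_rows = sa.Table(
    "example_rows",
    metadata,
    sa.Column("id", sa.Integer, primary_key=True),
    sa.Column("name", sa.Text),
    sa.Column("created_at", sa.DateTime(timezone=True)),
)


def create_example_dataset(dsn: str) -> None:
    """Create the example table and seed a few rows on whatever database the DSN points to."""
    engine = sa.create_engine(dsn)
    metadata.create_all(engine)
    with engine.begin() as conn:
        conn.execute(
            sa.insert(example_rows),
            [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
        )
```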
dlt/extract/incremental/transform.py
Outdated
```python
def _adapt_if_datetime(row_value: Any, last_value: Any) -> Any:

# For datetime cursor, ensure the value is a timezone aware datetime.
# The object saved in state will always be a tz aware pendulum datetime so this ensures values are comparable
def _adapt_timezone(row_value: datetime, cursor_value: datetime, cursor_value_name: str) -> Any:
```
This is smart! If we have incoming rows with varying timezone awareness there will be a big mess though, right? Maybe we should somehow raise in the normalizer when we detect this. Or is there a mechanism somewhere? I have not seen this coming up in the community, but if there is unstructured data being loaded this could always be a possibility.
huh! why is this comment on outdated code? Anyway - if tz-awareness changes on the cursor column, the comparison will fail and the extract step will abort. we normalize data after the extract phase, so the user should use a map to normalize this data.
incremental is used mostly for sql databases and rest apis, so this problem is not popping up - like you say
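A sketch of the map-based workaround mentioned above; the resource, column name and sample data are made up, and `insert_at=1` is used assuming (as in the dlt docs pattern) that it places the map before the incremental step:

```python
import dlt
from datetime import datetime, timezone


def ensure_utc(item):
    # make the cursor value tz-aware so comparisons with the stored state never mix naive and aware datetimes
    ts = item["updated_at"]
    if ts.tzinfo is None:
        item["updated_at"] = ts.replace(tzinfo=timezone.utc)
    return item


@dlt.resource
def events(updated_at=dlt.sources.incremental("updated_at")):
    # deliberately mixed awareness to illustrate the problem
    yield {"id": 1, "updated_at": datetime(2024, 1, 1, 12, 0)}
    yield {"id": 2, "updated_at": datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)}


# run the normalization before the incremental step so values are comparable
events.add_map(ensure_utc, insert_at=1)
```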
```python
# limit works with chunks (by default)
limit = self.limit.limit(self.chunk_size)
if limit is not None:
    query = query.limit(limit)
```
Just out of curiosity, is it faster to apply a limit on the query even when you are not retrieving all rows from the cursor?
this is very dependent on the particular implementation. what really makes a difference is an index on the cursor column - then the query engine can stream that data in chunks without scanning, and in that case the limit will not change much AFAIK. it does however allow loading data from connectorx in chunks
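For illustration only (not the `sql_database` internals; table, column and DSN are placeholders): with an index on the cursor column, an ordered query plus LIMIT lets the engine serve each chunk without a full scan:

```python
import sqlalchemy as sa

engine = sa.create_engine("postgresql://user:pass@localhost/db")  # placeholder DSN
events = sa.table("events", sa.column("id"), sa.column("updated_at"))

chunk_size = 10_000
query = (
    sa.select(events)
    .where(events.c.updated_at > sa.bindparam("last_value"))
    .order_by(events.c.updated_at)  # an index on updated_at lets the engine stream this without scanning
    .limit(chunk_size)  # the LIMIT caps what a backend such as connectorx materializes per request
)
```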
```python
# TIMESTAMP is always timezone-aware in BigQuery
# DATETIME is always timezone-naive in BigQuery
# NOTE: we disable DATETIME because it does not work with parquet
return "TIMESTAMP" if timezone else "TIMESTAMP"
```
Is this line correct? The condition does not change anything.
it is, but I'll remove it to make it 100% clear. DATETIME does not work on BigQuery in practice
```python
    return value


def _apply_lag_to_datetime(
```
should be apply_lag_to_date I think
yeah, the typing is wrong.
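A sketch of how the two cases could be kept apart; these are not dlt's actual helpers, and the units (days for dates, seconds for datetimes) are an assumption based on how lag is documented:

```python
from datetime import date, datetime, timedelta


def apply_lag_to_date(value: date, lag: float) -> date:
    # for date cursors the lag is interpreted as whole days
    return value - timedelta(days=lag)


def apply_lag_to_datetime(value: datetime, lag: float) -> datetime:
    # for datetime cursors the lag is interpreted as seconds
    return value - timedelta(seconds=lag)
```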
Description
timestamp and time handling
This PR makes the way `dlt` handles timestamps with timezone or without (naive) consistent across normalizers and destinations. The core of the change is described in #2591.
It also fixes several edge cases for timestamps with precision (i.e. nanosecond or 100-nanosecond precision in mssql):
#2877
#2486
Standardizing timestamp behavior surfaced several problems with incremental datetime cursors. This PR makes the incremental cursor always preserve the exact timestamp type of the source data (see the sketch after the linked issues):
#2658
#2460
#2225
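A sketch of the behavior this enables (the resource and column are made up): a naive `initial_value` is kept naive instead of being coerced to a tz-aware UTC datetime, so it compares cleanly against naive source data:

```python
import dlt
from datetime import datetime


@dlt.resource
def events(
    updated_at=dlt.sources.incremental(
        "updated_at",
        initial_value=datetime(2024, 1, 1),  # naive on purpose; the cursor keeps it naive
    )
):
    yield {"id": 1, "updated_at": datetime(2024, 1, 2, 8, 30)}  # naive source data
```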
- handling of the `time` type was changed to behave as documented in all cases (always naive, in UTC)
- naive timestamps are enabled for all destinations that support them; destination capabilities define timestamp support with additional flags
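A sketch of declaring a naive timestamp explicitly, assuming the `timezone` column hint used by this line of work (destinations without naive timestamp support fall back according to their capabilities):

```python
import dlt


@dlt.resource(
    columns={"created_at": {"data_type": "timestamp", "timezone": False}}  # naive; timezone=True keeps it tz-aware
)
def orders():
    yield {"id": 1, "created_at": "2024-01-01T12:00:00"}
```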
sql_database timestamp cursor tests and extensions
This PR adds an mssql source test - all data types and tz-aware and naive cursors (as many bugs were reported here). To unify behavior on ConnectorX and other backends, `sql_database` will convert the `LimitItem` step into a LIMIT SQL clause (to load in chunks).
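A sketch of how this looks from the user side; credentials and table name are placeholders, and `add_limit` adds the `LimitItem` step that, per this PR, is pushed down as a LIMIT clause:

```python
import dlt
from dlt.sources.sql_database import sql_table

events = sql_table(
    credentials="mssql://user:pass@host:1433/db",  # placeholder credentials
    table="events",
    backend="connectorx",  # with the pushed-down LIMIT, connectorx can load in chunks
)

pipeline = dlt.pipeline(pipeline_name="sql_limit_demo", destination="duckdb")
pipeline.run(events.add_limit(2))
```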
Schema object cleanup
The data normalization part got kicked out from the Schema object and moved to `items_normalizer` (identifier normalization stays!). This separates the concerns better (we have many normalizers now, so the old concept of having everything in the Schema object is outdated). This should be followed up with removing the json relational normalizer from Schema.
Why this change happened here: because I started to normalize timestamps, and that requires destination capabilities to be present.
Resources maintain parent-child (transformer) relationship
Previously this relationship was maintained by the Pipe class (which represents data processing steps, without metadata). Now `DltResource` has a `_parent` field which points to the parent. This allows maintaining the full resource tree and providing resource metadata and the correct list of extracted resources when grouped in a source. Previously, "mock" resources were created for resources added to the source implicitly. Example: if you add a transformer named "TR" that has parent resource "R", only "TR" is visible in the source (explicit), but when extracted, both "TR" and "R" will be evaluated. "R" is then implicitly added to the source, and previously its metadata was not available.
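A sketch of the "TR"/"R" example using the standard dlt decorators (names and data are illustrative):

```python
import dlt


@dlt.resource(name="R")
def r():
    yield from ({"id": i} for i in range(3))


@dlt.transformer(data_from=r, name="TR")
def tr(item):
    item["doubled"] = item["id"] * 2
    yield item


@dlt.source
def my_source():
    # only the transformer is added explicitly; its parent "R" is pulled in implicitly
    # and, with the _parent link, its metadata is available too
    return [tr]


print(list(my_source()))  # evaluates both "TR" and its parent "R"
```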