docs/website/docs/dlt-ecosystem/verified-sources/sql_database/configuration.md (+6 −8)
@@ -160,14 +160,12 @@ Incremental loading uses a cursor column (e.g., timestamp or auto-incrementing I
 If your cursor column name contains special characters (e.g., `$`) you need to escape it when passing it to the `incremental` function. For example, if your cursor column is `example_$column`, you should pass it as `"'example_$column'"` or `'"example_$column"'` to the `incremental` function: `incremental("'example_$column'", initial_value=...)`.
 :::

-### Configure timezone aware and naive timestamp cursors
-If your cursor is on timestamp/datetime column, make sure you set up your initial and end values correctly. You will avoid implicit
-type conversions, invalid date time literals or column comparisons in database queries. Note that if implicit conversions may
-result in data loss ie. if naive datetime has different local timezone on the machine where Python is executing vs. your dbms.
-
-* If your datetime columns is naive, use naive Python datetime. Note that `pendulum` datetime is tz-aware by default and standard `datetime` is naive.
-* Use `full` reflection level or above to reflect `timezone` (awareness hint) on the datetime columns.
-* read about the [timestamp handling](../../../general-usage/schema.md#handling-of-timestamp-and-time-zones) in `dlt`
+### Configure timezone-aware and naive timestamp cursors
+If your cursor is on a timestamp/datetime column, make sure you set up your initial and end values correctly. This will help you avoid implicit type conversions, invalid datetime literals, or failing column comparisons in database queries. Note that implicit conversions may result in data loss, for example if a naive datetime has a different local timezone on the machine where Python is executing versus your DBMS.
+
+* If your datetime column is naive, use a naive Python datetime. Note that `pendulum` datetime is timezone-aware by default while standard `datetime` is naive.
+* Use `full` reflection level or above to reflect the `timezone` (awareness hint) on datetime columns.
+* Read about [timestamp handling](../../../general-usage/schema.md#handling-of-timestamp-and-time-zones) in `dlt`.
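For illustration, a minimal sketch of the naive-cursor setup described in this hunk, assuming a hypothetical `chat_message` table with a naive `updated_at` column and a duckdb destination (database credentials are expected to come from `dlt` config/secrets):

```py
from datetime import datetime  # standard datetime objects are naive

import dlt
from dlt.sources.sql_database import sql_database

# "full" reflection also reflects the `timezone` awareness hint
# on datetime columns.
source = sql_database(reflection_level="full").with_resources("chat_message")

# The cursor column is naive in the source database, so the initial
# value is a naive datetime (no tzinfo); a pendulum datetime would be
# timezone-aware by default.
source.chat_message.apply_hints(
    incremental=dlt.sources.incremental(
        "updated_at", initial_value=datetime(2024, 1, 1)
    )
)

pipeline = dlt.pipeline(pipeline_name="sql_naive_cursor", destination="duckdb")
pipeline.run(source)
```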
docs/website/docs/dlt-ecosystem/verified-sources/sql_database/troubleshooting.md (+9 −9)
@@ -10,18 +10,18 @@ import Header from '../_source-info-header.md';

 <Header/>

-## Timezoneaware and non-aware data types
+## Timezone-aware and non-aware data types

-### I see UTC datetime column in my destination, but my data source has naive datetime column
-Use `full` or `full_with_precision` reflection level to get explicit `timezone` hint in reflected table schemas. Without that
-hint, `dlt` will coerce all timestamps into tz-aware UTC ones.
+### I see a UTC datetime column in my destination, but my data source has a naive datetime column
+Use the `full` or `full_with_precision` reflection level to get an explicit `timezone` hint in reflected table schemas. Without that
+hint, `dlt` will coerce all timestamps into timezone-aware UTC ones.

-### I have incremental cursor on datetime column and I see query errors
-Queries used to query data in the `sql_database` are created from `Incremental` instance attached to table resource. [Initial end and last values
-must match tz-awareness of the cursor column](setup.md) because they will be used as parameters to the `WHERE` clause.
+### I have an incremental cursor on a datetime column and I see query errors
+Queries used to query data in the `sql_database` source are created from the `Incremental` instance attached to the table resource. [Initial, end, and last values
+must match the timezone-awareness of the cursor column](setup.md) because they will be used as parameters in the `WHERE` clause.

-In rare cases where last value is already stored in pipeline state and has wrong tz-awareness you may not be able to recover your pipeline automatically. You may
-modify local pipeline state (after syncing with destination) to add/remove timezone.
+In rare cases where the last value is already stored in the pipeline state and has incorrect timezone-awareness, you may not be able to recover your pipeline automatically. You can
+modify the local pipeline state (after syncing with the destination) to add or remove timezone information.
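A hedged sketch of the tz-awareness matching this hunk describes, assuming a hypothetical `events` table whose `created_at` cursor column is timezone-aware:

```py
import dlt
import pendulum
from dlt.sources.sql_database import sql_database

source = sql_database(reflection_level="full").with_resources("events")

# Both bounds are timezone-aware pendulum datetimes, matching the
# tz-aware cursor column; a naive bound would end up as a mismatched
# parameter in the generated WHERE clause and can raise database errors.
source.events.apply_hints(
    incremental=dlt.sources.incremental(
        "created_at",
        initial_value=pendulum.datetime(2024, 1, 1, tz="UTC"),
        end_value=pendulum.datetime(2024, 6, 1, tz="UTC"),
    )
)
```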
docs/website/docs/general-usage/schema.md (+10 −10)
@@ -197,7 +197,7 @@ Now go ahead and try to add a new record where `id` is a float number; you shoul


 ### Handling of timestamp and time zones
-By default `dlt` normalizes timestamps (tz-aware an naive) into time zone aware type in UTC timezone. Since `1.16.0` it fully honors `timezone` boolean hint if set
+By default, `dlt` normalizes timestamps (tz-aware and naive) into a time zone aware type in the UTC timezone. Since `1.16.0`, it fully honors the `timezone` boolean hint if set
 explicitly on a column or by a source/resource. Normalizers do not infer this hint from data. The same rules apply for tabular data (arrow/pandas) and Python objects:

 | input timestamp |`timezone` hint | normalized timestamp |
@@ -212,12 +212,12 @@ explicitly on a column or by a source/resource. Normalizers do not infer this hi
 naive timestamps will **always be considered as UTC**, system timezone settings are ignored by `dlt`
 :::

-Ultimately destination will interpret the timestamp values. Some destinations:
-- do not support naive timestamps (ie. BigQuery) and will interpret them as naive UTC by attaching UTC timezone
-- do not support tz-aware timestamps (ie. Dremio, Athena) and will strip timezones from timestamps being loaded
+Ultimately, the destination will interpret the timestamp values. Some destinations:
+- do not support naive timestamps (e.g., BigQuery) and will interpret them as naive UTC by attaching the UTC timezone
+- do not support tz-aware timestamps (e.g., Dremio, Athena) and will strip timezones from timestamps being loaded
 - do not store timezone at all and all timestamps are converted to UTC
-- store timezone as column level property and internally convert timestamps to UTC. (ie. postgres)
-- store timezone and offset (ie. MSSQL). however we could not find any destination that can read back the original timezones
+- store timezone as a column-level property and internally convert timestamps to UTC (e.g., postgres)
+- store timezone and offset (e.g., MSSQL). However, we could not find any destination that can read back the original timezones

 `dlt` sets sessions to UTC timezone to minimize chances of erroneous conversion.

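As an illustration of setting the `timezone` hint explicitly (hypothetical resource and column names), a minimal sketch that keeps a column naive:

```py
import dlt

# An explicit `timezone: False` hint keeps `created_at` naive instead
# of letting the normalizer coerce it to timezone-aware UTC; the hint
# is never inferred from data.
@dlt.resource(
    columns={"created_at": {"data_type": "timestamp", "timezone": False}}
)
def events():
    yield {"id": 1, "created_at": "2025-01-01T10:00:00"}
```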
@@ -226,20 +226,20 @@ The precision and scale are interpreted by the particular destination and are va

 The precision for **bigint** is mapped to available integer types, i.e., TINYINT, INT, BIGINT. The default is 64 bits (8 bytes) precision (BIGINT).

-Selected destinations honor precision hint on **timestamp**. Precisions is numeric value in range of 0 (seconds) to 9 (nanoseconds) and set the fractional
-number of seconds stored in a column. The default value is 6 (microseconds) which is Python `datetime` precision. `postgres`, `duckdb`, `snowflake`, `synapse` and `mssql` allow to set precision. Additionally `duckdb` and `filesystem` (via. parquet) allow for nanosecond precision if:
+Selected destinations honor the precision hint on **timestamp**. Precision is a numeric value in the range of 0 (seconds) to 9 (nanoseconds) and sets the fractional
+number of seconds stored in a column. The default value is 6 (microseconds), which is Python `datetime` precision. `postgres`, `duckdb`, `snowflake`, `synapse` and `mssql` allow setting precision. Additionally, `duckdb` and `filesystem` (via parquet) allow for nanosecond precision if:
 * you configure [parquet version](../dlt-ecosystem/file-formats/parquet.md#writer-settings) to **2.6**
 * you yield tabular data (arrow tables/pandas). `dlt` coerces all Python datetime objects into `pendulum` with microsecond precision.
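A short sketch of the nanosecond-precision conditions just listed, assuming a duckdb destination and a hypothetical `readings` resource; the data is yielded as an arrow table because Python datetimes would be coerced to microsecond precision:

```py
import dlt
import pyarrow as pa

# precision=9 requests nanosecond fractional seconds, honored by duckdb
# and filesystem (with parquet version 2.6 configured as above).
@dlt.resource(
    columns={"observed_at": {"data_type": "timestamp", "precision": 9}}
)
def readings():
    # integer epoch values are interpreted as nanoseconds here
    yield pa.table(
        {"observed_at": pa.array([1_700_000_000_123_456_789], type=pa.timestamp("ns"))}
    )
```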

 ### Handling nulls
 In general, destinations are responsible for NULL enforcement. `dlt` does not verify nullability of data in arrow tables and Python objects. Note that:

-* there's an exception to that rule if Python object (`dict`) contains explicit `None` for non-nullable key. This check will be eliminated. Note that if value
+* there's an exception to that rule if a Python object (`dict`) contains an explicit `None` for a non-nullable key. This check will be eliminated. Note that if a value
 for a key is not present at all, nullability check is not done
 * nullability is checked by Arrow when saving parquet files. This is a new behavior and `dlt` normalizes it for older arrow versions.

 ### Structured types
-`dlt` has experimental support for structured types that currently piggyback on `json` data type and may be set only by yielding arrow tables. `dlt`` does not
+`dlt` has experimental support for structured types that currently piggyback on the `json` data type and may be set only by yielding arrow tables. `dlt` does not
 evolve nested types and will not migrate destination schemas to match. Nested types are enabled for `filesystem`, `iceberg`, `delta` and `lancedb` destinations.
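To show the arrow-only path for structured types, a minimal sketch (hypothetical `users` resource with a struct `address` column):

```py
import dlt
import pyarrow as pa

# A struct column in an arrow table piggybacks on the `json` data type;
# dlt will not evolve this nested schema afterwards.
@dlt.resource
def users():
    yield pa.table({
        "id": pa.array([1, 2]),
        "address": pa.array(
            [{"city": "Berlin", "zip": "10115"}, {"city": "Paris", "zip": "75001"}],
            type=pa.struct([("city", pa.string()), ("zip", pa.string())]),
        ),
    })
```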