Commit 4af60c8

fix grammar on timezone specific docs sections (#3044)

1 parent 823bf38 commit 4af60c8

3 files changed: +25 -27 lines changed
docs/website/docs/dlt-ecosystem/verified-sources/sql_database/configuration.md

Lines changed: 6 additions & 8 deletions
@@ -160,14 +160,12 @@ Incremental loading uses a cursor column (e.g., timestamp or auto-incrementing I
 If your cursor column name contains special characters (e.g., `$`) you need to escape it when passing it to the `incremental` function. For example, if your cursor column is `example_$column`, you should pass it as `"'example_$column'"` or `'"example_$column"'` to the `incremental` function: `incremental("'example_$column'", initial_value=...)`.
 :::

-### Configure timezone aware and naive timestamp cursors
-If your cursor is on timestamp/datetime column, make sure you set up your initial and end values correctly. You will avoid implicit
-type conversions, invalid date time literals or column comparisons in database queries. Note that if implicit conversions may
-result in data loss ie. if naive datetime has different local timezone on the machine where Python is executing vs. your dbms.
-
-* If your datetime columns is naive, use naive Python datetime. Note that `pendulum` datetime is tz-aware by default and standard `datetime` is naive.
-* Use `full` reflection level or above to reflect `timezone` (awareness hint) on the datetime columns.
-* read about the [timestamp handling](../../../general-usage/schema.md#handling-of-timestamp-and-time-zones) in `dlt`
+### Configure timezone-aware and naive timestamp cursors
+If your cursor is on a timestamp/datetime column, make sure you set up your initial and end values correctly. This will help you avoid implicit type conversions, invalid datetime literals, or column comparisons in database queries. Note that implicit conversions may result in data loss, for example if a naive datetime has a different local timezone on the machine where Python is executing versus your DBMS.
+
+* If your datetime column is naive, use naive Python datetime. Note that `pendulum` datetime is timezone-aware by default while standard `datetime` is naive.
+* Use `full` reflection level or above to reflect the `timezone` (awareness hint) on datetime columns.
+* Read about [timestamp handling](../../../general-usage/schema.md#handling-of-timestamp-and-time-zones) in `dlt`

 ### Examples
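To illustrate the guidance in this change, a minimal sketch of matching a cursor's timezone-awareness with `dlt.sources.incremental` (the `events` table, the `updated_at` cursor column, and the pipeline name are illustrative assumptions; database credentials are expected to be configured in `secrets.toml`):

```python
import dlt
from datetime import datetime
from dlt.sources.sql_database import sql_database

# Reflect timezone-awareness hints from the database schema.
source = sql_database(reflection_level="full").with_resources("events")

# Naive cursor column: pass a naive Python datetime (no tzinfo) so the
# WHERE-clause parameters match the column's type.
source.events.apply_hints(
    incremental=dlt.sources.incremental(
        "updated_at",
        initial_value=datetime(2024, 1, 1),  # naive on purpose
    )
)
# For a timezone-aware column, pass an aware value instead, e.g.
# datetime(2024, 1, 1, tzinfo=timezone.utc) using datetime.timezone.

pipeline = dlt.pipeline(pipeline_name="sql_to_duckdb", destination="duckdb")
pipeline.run(source)
```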

docs/website/docs/dlt-ecosystem/verified-sources/sql_database/troubleshooting.md

Lines changed: 9 additions & 9 deletions
@@ -10,18 +10,18 @@ import Header from '../_source-info-header.md';

 <Header/>

-## Timezone aware and non-aware data types
+## Timezone-aware and Non-aware Data Types

-### I see UTC datetime column in my destination, but my data source has naive datetime column
-Use `full` or `full_with_precision` reflection level to get explicit `timezone` hint in reflected table schemas. Without that
-hint, `dlt` will coerce all timestamps into tz-aware UTC ones.
+### I see a UTC datetime column in my destination, but my data source has a naive datetime column
+Use `full` or `full_with_precision` reflection level to get an explicit `timezone` hint in reflected table schemas. Without that
+hint, `dlt` will coerce all timestamps into timezone-aware UTC ones.

-### I have incremental cursor on datetime column and I see query errors
-Queries used to query data in the `sql_database` are created from `Incremental` instance attached to table resource. [Initial end and last values
-must match tz-awareness of the cursor column](setup.md) because they will be used as parameters to the `WHERE` clause.
+### I have an incremental cursor on a datetime column and I see query errors
+Queries used to query data in the `sql_database` are created from an `Incremental` instance attached to the table resource. [Initial end and last values
+must match timezone-awareness of the cursor column](setup.md) because they will be used as parameters in the `WHERE` clause.

-In rare cases where last value is already stored in pipeline state and has wrong tz-awareness you may not be able to recover your pipeline automatically. You may
-modify local pipeline state (after syncing with destination) to add/remove timezone.
+In rare cases where the last value is already stored in the pipeline state and has incorrect timezone-awareness, you may not be able to recover your pipeline automatically. You can
+modify the local pipeline state (after syncing with destination) to add/remove timezone information.

 ## Troubleshooting connection
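For the state-repair scenario above, a minimal sketch of inspecting the locally stored last value. The pipeline, source, and resource names are illustrative, and the exact nesting of the state dictionary can vary between `dlt` versions, so print `pipeline.state` first and adjust the lookup path:

```python
import dlt

pipeline = dlt.pipeline(pipeline_name="sql_to_duckdb")
pipeline.sync_destination()  # align local state with the destination first

# Drill into the incremental state of a single resource (path is illustrative).
resources_state = pipeline.state["sources"]["sql_database"]["resources"]
last_value = resources_state["events"]["incremental"]["updated_at"]["last_value"]

# tzinfo=None means the stored value is naive; a mismatch with the cursor
# column's awareness is what produces the query errors described above.
print(last_value, getattr(last_value, "tzinfo", None))
```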
docs/website/docs/general-usage/schema.md

Lines changed: 10 additions & 10 deletions
@@ -197,7 +197,7 @@ Now go ahead and try to add a new record where `id` is a float number; you shoul


 ### Handling of timestamp and time zones
-By default `dlt` normalizes timestamps (tz-aware an naive) into time zone aware type in UTC timezone. Since `1.16.0` it fully honors `timezone` boolean hint if set
+By default, `dlt` normalizes timestamps (tz-aware and naive) into time zone aware types in UTC timezone. Since `1.16.0`, it fully honors the `timezone` boolean hint if set
 explicitly on a column or by a source/resource. Normalizers do not infer this hint from data. The same rules apply for tabular data (arrow/pandas) and Python objects:

 | input timestamp | `timezone` hint | normalized timestamp |
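Since normalizers will not infer the hint from data, it has to be set explicitly. A minimal sketch (resource and column names are illustrative):

```python
import dlt
from datetime import datetime

# `timezone: False` keeps the column naive; without the hint (or with True),
# dlt normalizes the timestamp into timezone-aware UTC.
@dlt.resource(columns={"created_at": {"data_type": "timestamp", "timezone": False}})
def events():
    yield {"id": 1, "created_at": datetime(2025, 1, 1, 12, 30)}  # stays naive

pipeline = dlt.pipeline(pipeline_name="tz_hint_demo", destination="duckdb")
pipeline.run(events())
```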
@@ -212,12 +212,12 @@ explicitly on a column or by a source/resource. Normalizers do not infer this hi
 naive timestamps will **always be considered as UTC**, system timezone settings are ignored by `dlt`
 :::

-Ultimately destination will interpret the timestamp values. Some destinations:
-- do not support naive timestamps (ie. BigQuery) and will interpret them as naive UTC by attaching UTC timezone
-- do not support tz-aware timestamps (ie. Dremio, Athena) and will strip timezones from timestamps being loaded
+Ultimately, the destination will interpret the timestamp values. Some destinations:
+- do not support naive timestamps (i.e. BigQuery) and will interpret them as naive UTC by attaching UTC timezone
+- do not support tz-aware timestamps (i.e. Dremio, Athena) and will strip timezones from timestamps being loaded
 - do not store timezone at all and all timestamps are converted to UTC
-- store timezone as column level property and internally convert timestamps to UTC. (ie. postgres)
-- store timezone and offset (ie. MSSQL). however we could not find any destination that can read back the original timezones
+- store timezone as column level property and internally convert timestamps to UTC (i.e. postgres)
+- store timezone and offset (i.e. MSSQL). However, we could not find any destination that can read back the original timezones

 `dlt` sets sessions to UTC timezone to minimize chances of erroneous conversion.
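A small round-trip on `duckdb` makes the UTC normalization visible (pipeline and table names are illustrative):

```python
import dlt
from datetime import datetime, timedelta, timezone

pipeline = dlt.pipeline(pipeline_name="tz_roundtrip", destination="duckdb")

# Load one timestamp carrying a +02:00 offset.
plus_two = timezone(timedelta(hours=2))
pipeline.run(
    [{"id": 1, "created_at": datetime(2025, 1, 1, 12, 0, tzinfo=plus_two)}],
    table_name="stamps",
)

# Read it back: the instant is preserved but returned in UTC; the original
# +02:00 offset is not recoverable from the destination.
with pipeline.sql_client() as client:
    with client.execute_query("SELECT created_at FROM stamps") as cursor:
        print(cursor.fetchall())
```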
@@ -226,20 +226,20 @@ The precision and scale are interpreted by the particular destination and are va

 The precision for **bigint** is mapped to available integer types, i.e., TINYINT, INT, BIGINT. The default is 64 bits (8 bytes) precision (BIGINT).

-Selected destinations honor precision hint on **timestamp**. Precisions is numeric value in range of 0 (seconds) to 9 (nanoseconds) and set the fractional
-number of seconds stored in a column. The default value is 6 (microseconds) which is Python `datetime` precision. `postgres`, `duckdb`, `snowflake`, `synapse` and `mssql` allow to set precision. Additionally `duckdb` and `filesystem` (via. parquet) allow for nanosecond precision if:
+Selected destinations honor precision hint on **timestamp**. Precision is a numeric value in range of 0 (seconds) to 9 (nanoseconds) and sets the fractional
+number of seconds stored in a column. The default value is 6 (microseconds) which is Python `datetime` precision. `postgres`, `duckdb`, `snowflake`, `synapse` and `mssql` allow setting precision. Additionally, `duckdb` and `filesystem` (via parquet) allow for nanosecond precision if:
 * you configure [parquet version](../dlt-ecosystem/file-formats/parquet.md#writer-settings) to **2.6**
 * you yield tabular data (arrow tables/pandas). `dlt` coerces all Python datetime objects into `pendulum` with microsecond precision.

 ### Handling nulls
 In general, destinations are responsible for NULL enforcement. `dlt` does not verify nullability of data in arrow tables and Python objects. Note that:

-* there's an exception to that rule if Python object (`dict`) contains explicit `None` for non-nullable key. This check will be eliminated. Note that if value
+* there's an exception to that rule if a Python object (`dict`) contains explicit `None` for a non-nullable key. This check will be eliminated. Note that if a value
 for a key is not present at all, nullability check is not done
 * nullability is checked by Arrow when saving parquet files. This is a new behavior and `dlt` normalizes it for older arrow versions.

 ### Structured types
-`dlt` has experimental support for structured types that currently piggyback on `json` data type and may be set only by yielding arrow tables. `dlt`` does not
+`dlt` has experimental support for structured types that currently piggyback on `json` data type and may be set only by yielding arrow tables. `dlt` does not
 evolve nested types and will not migrate destination schemas to match. Nested types are enabled for `filesystem`, `iceberg`, `delta` and `lancedb` destinations.
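A sketch of requesting nanosecond precision under the conditions listed in the changed text (resource and column names are illustrative; arrow data is yielded deliberately, since Python datetimes are coerced to microsecond-precision `pendulum` objects):

```python
import dlt
import pyarrow as pa

# precision=9 requests nanoseconds; the default 6 (microseconds) matches
# Python datetime precision.
@dlt.resource(columns={"ts": {"data_type": "timestamp", "precision": 9}})
def readings():
    yield pa.table(
        {
            "id": [1],
            # integer values are interpreted as nanoseconds since epoch
            "ts": pa.array([1_735_689_600_123_456_789], type=pa.timestamp("ns", tz="UTC")),
        }
    )

pipeline = dlt.pipeline(pipeline_name="ns_precision_demo", destination="duckdb")
pipeline.run(readings())
```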