
Commit 3140978

Add documentation for result_set_type_hints feature
Document the motivation (Athena API lacks nested type info), usage, constraints (nested arrays in native format, Arrow/Pandas/Polars), and the breaking change in 3.30.0 (complex type internals kept as strings without hints).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 5f5e6d5 commit 3140978

1 file changed: 74 additions, 0 deletions

docs/usage.md

Lines changed: 74 additions & 0 deletions
@@ -389,6 +389,80 @@ The `on_start_query_execution` callback is supported by the following cursor typ
Note: `AsyncCursor` and its variants do not support this callback as they already
return the query ID immediately through their different execution model.

## Type hints for complex types

*New in version 3.30.0.*

The Athena API does not return element-level type information for complex types
(array, map, row/struct). PyAthena parses the string representation returned by
Athena, but without type metadata the converter can only apply heuristics, which
may produce incorrect Python types for nested values (e.g. integers left as strings
inside a struct).

The `result_set_type_hints` parameter solves this by letting you provide Athena DDL
type signatures for specific columns. The converter then uses precise, recursive
type-aware conversion instead of heuristics.

```python
from pyathena import connect

cursor = connect(s3_staging_dir="s3://YOUR_S3_BUCKET/path/to/",
                 region_name="us-west-2").cursor()
cursor.execute(
    "SELECT col_array, col_map, col_struct FROM one_row_complex",
    result_set_type_hints={
        "col_array": "array(integer)",
        "col_map": "map(integer, integer)",
        "col_struct": "row(a integer, b integer)",
    },
)
row = cursor.fetchone()
# col_struct values are now integers, not strings:
# {"a": 1, "b": 2} instead of {"a": "1", "b": "2"}
```

Column name matching is case-insensitive. Type hints support arbitrarily nested types:

```python
cursor.execute(
    """
    SELECT CAST(
        ROW(ROW('2024-01-01', 123), 4.736, 0.583)
        AS ROW(header ROW(stamp VARCHAR, seq INTEGER), x DOUBLE, y DOUBLE)
    ) AS positions
    """,
    result_set_type_hints={
        "positions": "row(header row(stamp varchar, seq integer), x double, y double)",
    },
)
row = cursor.fetchone()
positions = row[0]
# positions["header"]["seq"] == 123  (int, not "123")
# positions["x"] == 4.736  (float, not "4.736")
```
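
To make "recursive type-aware conversion" concrete, here is a minimal sketch of how a DDL type signature could drive conversion of nested values. It is not PyAthena's implementation: `SCALARS`, `split_top_level`, and `convert` are hypothetical names, the input is assumed to be already parsed into dicts and lists with string leaves, and only a few scalar types are covered (parameterized types such as `decimal(10, 2)` are not handled).

```python
from typing import Any

# Hypothetical scalar table; a real converter supports many more types.
SCALARS = {"integer": int, "bigint": int, "double": float, "varchar": str}


def split_top_level(text: str) -> list[str]:
    """Split on commas that are not nested inside parentheses."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(text):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            parts.append(text[start:i].strip())
            start = i + 1
    parts.append(text[start:].strip())
    return parts


def convert(value: Any, type_sig: str) -> Any:
    """Recursively convert string leaves according to a DDL type signature."""
    if value is None:
        return None
    sig = type_sig.strip().lower()
    if "(" not in sig:
        return SCALARS[sig](value)
    kind, inner = sig.split("(", 1)
    kind = kind.strip()
    inner = inner.rsplit(")", 1)[0]  # drop the matching outer parenthesis
    if kind == "array":
        return [convert(v, inner) for v in value]
    if kind == "map":
        key_sig, val_sig = split_top_level(inner)
        return {convert(k, key_sig): convert(v, val_sig) for k, v in value.items()}
    if kind == "row":
        # Each field is "<name> <type>"; the type itself may be nested.
        fields = dict(f.split(None, 1) for f in split_top_level(inner))
        return {name: convert(v, fields[name]) for name, v in value.items()}
    raise ValueError(f"unsupported type: {type_sig}")


raw = {"header": {"stamp": "2024-01-01", "seq": "123"}, "x": "4.736", "y": "0.583"}
typed = convert(raw, "row(header row(stamp varchar, seq integer), x double, y double)")
```

With the signature from the example above, `typed["header"]["seq"]` becomes an `int` and `typed["x"]` a `float`, because every leaf is converted by the type named at its position in the signature rather than guessed from its text.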

### Constraints

* **Nested arrays in native format**: Athena's native (non-JSON) string representation
  does not clearly delimit nested arrays. If your query returns nested arrays
  (e.g. `array(array(integer))`), use `CAST(... AS JSON)` in your query to get
  JSON-formatted output, which is parsed reliably.
* **Arrow, Pandas, and Polars cursors**: These cursors accept `result_set_type_hints`
  but their converters do not currently use the hints because they rely on their own
  type systems. The parameter is passed through for forward compatibility and for
  result sets that fall back to the default conversion path.
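
One way to follow the `CAST(... AS JSON)` advice for nested arrays is sketched below. The column name `col_nested` is a placeholder, the sample string is illustrative rather than captured Athena output, and `decode_json_column` is a hypothetical helper for the case where the cell reaches your code as JSON text instead of an already decoded value.

```python
import json

# Placeholder query: cast the nested array to JSON so its nesting is explicit.
QUERY = "SELECT CAST(col_nested AS JSON) AS col_nested FROM one_row_complex"


def decode_json_column(raw):
    """Decode a JSON-formatted cell; leave NULLs and decoded values untouched."""
    if raw is None or not isinstance(raw, str):
        return raw
    return json.loads(raw)


sample = "[[1,2],[3]]"  # illustrative JSON text for an array(array(integer))
nested = decode_json_column(sample)
```

JSON brackets delimit each inner array unambiguously, so the nesting survives the round trip.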

### Breaking change in 3.30.0

Prior to 3.30.0, PyAthena attempted to infer Python types for scalar values inside
complex types using heuristics (e.g. `"123"` → `123`). Starting with 3.30.0, values
inside complex types are **kept as strings** unless `result_set_type_hints` is provided.
This change avoids silent misconversion but means existing code that relied on the
heuristic behavior may see string values where it previously saw integers or floats.

To restore typed conversion, pass `result_set_type_hints` with the appropriate type
signatures for the affected columns.
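
When migrating many call sites, it may help to keep the hints in one place. The sketch below is one possible pattern, not part of PyAthena: `TYPE_HINTS` and `execute_typed` are hypothetical names, and the only library behavior it relies on is the `result_set_type_hints` keyword of `cursor.execute` described above. Pass a per-query dict if you would rather not send hints for columns that a query does not return.

```python
# Hypothetical registry of DDL signatures for columns that hold complex types.
TYPE_HINTS = {
    "col_array": "array(integer)",
    "col_map": "map(integer, integer)",
    "col_struct": "row(a integer, b integer)",
}


def execute_typed(cursor, sql, hints=None, **kwargs):
    """Call cursor.execute, always forwarding result_set_type_hints."""
    return cursor.execute(
        sql,
        result_set_type_hints=TYPE_HINTS if hints is None else hints,
        **kwargs,
    )
```

Existing calls such as `cursor.execute(sql)` can then be switched to `execute_typed(cursor, sql)` one at a time.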

## Environment variables

Support [Boto3 environment variables](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#using-environment-variables).
