Clarify that S3 download happens during execute(), not during fetch

laughingman7743 · claude · laughingman7743 · commit ea55fbcb25f2 · 2026-02-21T13:43:35.000+09:00
Explain why as_pandas/as_arrow/as_polars don't need await: the S3
download is wrapped in asyncio.to_thread inside execute(), so data
is already in memory by the time fetch/as_* methods are called.
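
A minimal sketch of the pattern described above, using only the standard
library. EagerCursorSketch and its _download helper are hypothetical
stand-ins for illustration, not PyAthena's actual classes or internals;
the sleep only simulates a blocking S3 download.

```python
import asyncio
import time


class EagerCursorSketch:
    """Hypothetical stand-in for an eager aio cursor (not PyAthena code)."""

    def __init__(self):
        self._rows = []
        self._pos = 0

    def _download(self, query):
        # Stand-in for the blocking S3 download (CSV or Parquet).
        time.sleep(0.01)
        return [(1, "a"), (2, "b"), (3, "c")]

    async def execute(self, query):
        # The blocking download runs in a worker thread, so the event
        # loop stays free; all rows are in memory once this returns.
        self._rows = await asyncio.to_thread(self._download, query)
        self._pos = 0
        return self

    def fetchone(self):
        # Plain (non-async) method: it only reads from memory.
        if self._pos >= len(self._rows):
            return None
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def fetchall(self):
        # Returns the rows not yet consumed, again without any I/O.
        rows = self._rows[self._pos:]
        self._pos = len(self._rows)
        return rows


async def main():
    cursor = EagerCursorSketch()
    await cursor.execute("SELECT * FROM many_rows")  # download happens here
    first = cursor.fetchone()   # no await: in-memory read
    rest = cursor.fetchall()    # no await
    return first, rest


first, rest = asyncio.run(main())
```

Because the only awaited step is execute(), the fetch methods can stay
ordinary functions; a cursor that reads lazily would instead need async
fetch methods.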

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/docs/aio.md b/docs/aio.md
@@ -177,24 +177,27 @@ Native asyncio versions are available for all cursor types:
 
 ### Fetch behavior
 
-Most aio cursors load all result data eagerly during `execute()` (via `asyncio.to_thread`),
-so `fetchone()`, `fetchmany()`, and `fetchall()` are synchronous (in-memory only):
+For **AioPandasCursor**, **AioArrowCursor**, and **AioPolarsCursor**, the S3 download
+(CSV or Parquet) happens inside `execute()`, wrapped in `asyncio.to_thread()`.
+By the time `execute()` returns, all data is already loaded into memory.
+Therefore `fetchone()`, `fetchmany()`, `fetchall()`, `as_pandas()`, `as_arrow()`, and `as_polars()`
+are synchronous (in-memory only) and do not need `await`:
 
 ```python
-# Pandas, Arrow, Polars — fetch is sync (data already loaded)
-await cursor.execute("SELECT * FROM many_rows")
-row = cursor.fetchone()        # No await needed
-rows = cursor.fetchall()       # No await needed
-df = cursor.as_pandas()        # No await needed
+# Pandas, Arrow, Polars — S3 download completes during execute()
+await cursor.execute("SELECT * FROM many_rows")  # Downloads data here
+row = cursor.fetchone()        # No await — data already in memory
+rows = cursor.fetchall()       # No await
+df = cursor.as_pandas()        # No await
 ```
 
 The exceptions are **AioCursor** and **AioS3FSCursor**, which stream rows lazily from S3.
-Their fetch methods require `await`:
+Their fetch methods perform I/O and require `await`:
 
 ```python
-# AioCursor, AioS3FSCursor — fetch is async (reads from S3)
+# AioCursor, AioS3FSCursor — fetch reads from S3 lazily
 await cursor.execute("SELECT * FROM many_rows")
-row = await cursor.fetchone()    # Await required
+row = await cursor.fetchone()    # Await required — reads from S3
 rows = await cursor.fetchall()   # Await required
 ```
 
diff --git a/docs/arrow.md b/docs/arrow.md
@@ -489,8 +489,9 @@ AioArrowCursor is a native asyncio cursor that returns results as Apache Arrow T
 Unlike AsyncArrowCursor which uses `concurrent.futures`, this cursor uses
 `asyncio.to_thread()` for result set creation, keeping the event loop free.
 
-Since the result set is loaded eagerly during `execute()`, fetch methods, `as_arrow()`,
-and `as_polars()` are synchronous (in-memory only) and do not need `await`.
+The S3 download (CSV or Parquet) happens inside `execute()`, wrapped in `asyncio.to_thread()`.
+By the time `execute()` returns, all data is already loaded into memory.
+Therefore fetch methods, `as_arrow()`, and `as_polars()` are synchronous and do not need `await`.
 
 ```python
 from pyathena import aconnect
diff --git a/docs/pandas.md b/docs/pandas.md
@@ -776,8 +776,9 @@ AioPandasCursor is a native asyncio cursor that returns results as pandas DataFr
 Unlike AsyncPandasCursor which uses `concurrent.futures`, this cursor uses
 `asyncio.to_thread()` for result set creation, keeping the event loop free.
 
-Since the result set is loaded eagerly during `execute()`, fetch methods and `as_pandas()`
-are synchronous (in-memory only) and do not need `await`.
+The S3 download (CSV or Parquet) happens inside `execute()`, wrapped in `asyncio.to_thread()`.
+By the time `execute()` returns, all data is already loaded into memory.
+Therefore fetch methods and `as_pandas()` are synchronous and do not need `await`.
 
 ```python
 from pyathena import aconnect
diff --git a/docs/polars.md b/docs/polars.md
@@ -583,8 +583,9 @@ AioPolarsCursor is a native asyncio cursor that returns results as Polars DataFr
 Unlike AsyncPolarsCursor which uses `concurrent.futures`, this cursor uses
 `asyncio.to_thread()` for result set creation, keeping the event loop free.
 
-Since the result set is loaded eagerly during `execute()`, fetch methods, `as_polars()`,
-and `as_arrow()` are synchronous (in-memory only) and do not need `await`.
+The S3 download (CSV or Parquet) happens inside `execute()`, wrapped in `asyncio.to_thread()`.
+By the time `execute()` returns, all data is already loaded into memory.
+Therefore fetch methods, `as_polars()`, and `as_arrow()` are synchronous and do not need `await`.
 
 ```python
 from pyathena import aconnect