Ability to choose return types of awswrangler.athena.read_sql_query #425
Comments
@MosheVai Do you think it makes sense to include this functionality in the library? As far as I understand, pandas already has this use case covered. Even if you don't use it down the stream you can just use The only benefit I could see would be streaming it line by line instead of reading the complete data at once. WDYT?
It does seem like a logical solution to this issue. Another reason to include this functionality is large datasets.
Do you have any other response type than
We should not focus on fetching or delivering big result sets without Pandas/PyArrow. AWS Data Wrangler has a clear intention to rely on these projects to handle big amounts of data with reasonable performance. A common second scenario we could focus on is tiny results, where we can just reach out to the Athena API through get_query_results(), skipping the result files on S3 and the whole pandas/pyarrow layer. @MosheVai @maxispeicher what do you think? @MosheVai does this second scenario help you?
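A minimal sketch of the "tiny results" idea, not part of the library: it turns one page of a get_query_results() response into one dict per row. The function name `rows_as_dicts` and the sample data are hypothetical. The sketch assumes the response shape returned by the boto3 Athena client (`ResultSet.Rows[].Data[].VarCharValue` plus `ResultSetMetadata.ColumnInfo`), that the first row of the first page of a SELECT result repeats the column names, and that a null cell arrives as an empty dict.

```python
from typing import Any, Dict, Iterator, List


def rows_as_dicts(response: Dict[str, Any], skip_header: bool = True) -> Iterator[Dict[str, Any]]:
    """Yield one dict per row from a single get_query_results() response page."""
    result_set = response["ResultSet"]
    names: List[str] = [col["Name"] for col in result_set["ResultSetMetadata"]["ColumnInfo"]]
    rows = result_set["Rows"]
    if skip_header:
        # Assumption: the first row of the first page holds the column names.
        rows = rows[1:]
    for row in rows:
        # Every value is delivered as a string; a missing key is treated as null.
        values = [cell.get("VarCharValue") for cell in row["Data"]]
        yield dict(zip(names, values))


# Hand-written stand-in for a response page, so the sketch runs without AWS access.
sample_response = {
    "ResultSet": {
        "ResultSetMetadata": {
            "ColumnInfo": [
                {"Name": "id", "Type": "bigint"},
                {"Name": "name", "Type": "varchar"},
            ]
        },
        "Rows": [
            {"Data": [{"VarCharValue": "id"}, {"VarCharValue": "name"}]},
            {"Data": [{"VarCharValue": "1"}, {"VarCharValue": "alice"}]},
            {"Data": [{"VarCharValue": "2"}, {}]},
        ],
    }
}

records = list(rows_as_dicts(sample_response))
```

With a real client, the response would come from a call such as `boto3.client("athena").get_query_results(QueryExecutionId=...)`, and later pages would need `skip_header=False`. All values stay strings here; casting is a separate concern.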
I think Edit: But if this should be included, I agree that this should probably go to a separate function.
You are right, we would need to cast everything based on the column types provided under I feel like it should be resolved inside the pandas project.
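To illustrate what such casting could look like, here is a rough sketch that maps an Athena type name to a Python converter. The names `CASTERS` and `cast_value` are hypothetical, the list of type names is an assumption and certainly incomplete, and the timestamp handling assumes values shaped like `2020-01-02 03:04:05.000`. Unknown types are left as strings.

```python
from datetime import date, datetime
from decimal import Decimal
from typing import Any, Callable, Dict, Optional

# Assumed mapping from Athena type names to converters; extend as needed.
CASTERS: Dict[str, Callable[[str], Any]] = {
    "tinyint": int,
    "smallint": int,
    "integer": int,
    "bigint": int,
    "float": float,
    "double": float,
    "decimal": Decimal,
    "boolean": lambda raw: raw.lower() == "true",
    "date": date.fromisoformat,
    "timestamp": datetime.fromisoformat,
}


def cast_value(raw: Optional[str], athena_type: str) -> Any:
    """Convert one string cell to a native Python value based on its column type."""
    if raw is None:
        return None  # nulls pass through unchanged
    caster = CASTERS.get(athena_type.lower())
    # Fall back to the raw string for types this sketch does not know about.
    return caster(raw) if caster is not None else raw
```

For example, `cast_value("42", "bigint")` gives an `int` and `cast_value("abc", "varchar")` returns the string untouched. Complex types such as arrays, maps and structs would need their own parsing, which is part of why this is harder than it first looks.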
Is your feature request related to a problem? Please describe.
I am using `read_sql_query` to get data from Athena. However, I don't use pandas downstream. Because of this, I have to convert the `pd.DataFrame` to a Python list and convert all the relevant pandas/numpy types (`pd.Timestamp`, `np.double`, etc.).
Describe the solution you'd like
Add a kwarg `return_type: str` that accepts one of a list of possible values. Maybe one of the options could return an `Iterable[Dict[str, Any]]` where each `dict` corresponds to a row?
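For context, the manual conversion described above can be sketched roughly as follows. This is only an illustration of the workaround, not a proposed implementation of `return_type`. The helper names `to_native` and `native_records` are hypothetical, and the code relies on duck typing instead of importing pandas or numpy: it assumes `pd.Timestamp` exposes `to_pydatetime()` and that numpy scalars expose `item()`.

```python
from typing import Any, Dict, Iterable, Iterator


def to_native(value: Any) -> Any:
    """Convert a pandas/numpy scalar to a plain Python value where possible."""
    if hasattr(value, "to_pydatetime"):  # e.g. pd.Timestamp
        return value.to_pydatetime()
    if hasattr(value, "item"):  # e.g. numpy scalars such as np.float64
        return value.item()
    return value  # already a native Python object


def native_records(records: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield row dicts whose values have been converted with to_native()."""
    for record in records:
        yield {key: to_native(val) for key, val in record.items()}
```

With pandas installed, the input could be `df.to_dict(orient="records")`, so the whole conversion becomes `list(native_records(df.to_dict(orient="records")))`. Missing values (NaN, NaT) are not handled in this sketch.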