@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Dec 5, 2019

This PR adds PyArrow 0.15 support back, with better documentation and error messages.
The messages look as follows:

Warning when pyarrow>=0.15 and pyspark<3.0 but ARROW_PRE_0_15_IPC_FORMAT is not set:

WARNING:root:'ARROW_PRE_0_15_IPC_FORMAT' environment variable was not set. It is required 
to set this environment variable to '1' if you use pyarrow>=0.15 and pyspark<3.0. Koalas will 
set it for you but it does not work if there is a Spark context already launched.
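
Because the warning says the automatic setting has no effect once a Spark context is running, the variable has to be in place before Spark starts, on both the driver and the executors. A minimal sketch of one way to do that follows. The helper name `arrow_compat_settings` is hypothetical and not part of Koalas; `spark.executorEnv.*` is the Spark configuration family named in the error messages of this PR.

```python
import os

ENV_VAR = "ARROW_PRE_0_15_IPC_FORMAT"


def arrow_compat_settings(environ=None):
    """Set the variable for the driver process and return executor-side conf.

    Hypothetical helper for illustration only. It must be called before any
    Spark context is launched, otherwise the setting is not picked up.
    """
    environ = os.environ if environ is None else environ
    # Driver side: plain environment variable in the current process.
    environ[ENV_VAR] = "1"
    # Executor side: Spark forwards spark.executorEnv.<NAME> to executors.
    return {"spark.executorEnv." + ENV_VAR: "1"}


# Possible usage with PySpark (shown as comments, not executed here):
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder
#   for key, value in arrow_compat_settings().items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
```

This only applies to the pyarrow>=0.15 with pyspark<3.0 combination; in any other combination the variable should stay unset.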

Exceptions when ARROW_PRE_0_15_IPC_FORMAT is set but not needed (any combination other than pyarrow>=0.15 with pyspark<3.0):

RuntimeError: Please explicitly unset 'ARROW_PRE_0_15_IPC_FORMAT' environment variable
in both driver and executor sides. Check your spark.executorEnv.*, spark.yarn.appMasterEnv.*,
spark.mesos.driverEnv.* and spark.kubernetes.driverEnv.* configurations. It is required to set
this environment variable only when you use pyarrow>=0.15 and pyspark<3.0.

RuntimeError: Please explicitly unset 'ARROW_PRE_0_15_IPC_FORMAT' environment variable in
both driver and executor sides. It is required to set this environment variable only when you use
pyarrow>=0.15 and pyspark<3.0.

Resolves #1109

@codecov-io

codecov-io commented Dec 5, 2019

Codecov Report

Merging #1110 into master will increase coverage by 0.03%.
The diff coverage is 80.00%.

@@            Coverage Diff             @@
##           master    #1110      +/-   ##
==========================================
+ Coverage   95.01%   95.05%   +0.03%     
==========================================
  Files          34       34              
  Lines        7944     7962      +18     
==========================================
+ Hits         7548     7568      +20     
+ Misses        396      394       -2     
| Impacted Files | Coverage Δ |
|---|---|
| databricks/koalas/__init__.py | 92.30% <71.42%> (-1.31%) ⬇️ |
| databricks/koalas/utils.py | 96.92% <87.50%> (+1.71%) ⬆️ |
| databricks/koalas/base.py | 97.57% <0.00%> (+0.06%) ⬆️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b251bac...4d0c17d.

@HyukjinKwon HyukjinKwon force-pushed the pyarrow-0.15 branch 2 times, most recently from 2ec5dea to c776eeb Compare April 8, 2020 04:12
@databricks databricks deleted a comment from softagram-bot Apr 8, 2020
@HyukjinKwon HyukjinKwon changed the title Add PyArrow 0.15 support back Add PyArrow 0.15 support back, and test PyArrow 0.16 in CI Apr 8, 2020
Collaborator

@ueshin ueshin left a comment

Otherwise, LGTM.

@HyukjinKwon
Member Author

Thanks for the detailed review, @ueshin!

Collaborator

@ueshin ueshin left a comment

LGTM.

@ueshin
Collaborator

ueshin commented Apr 9, 2020

Thanks! Merging.

@ueshin ueshin merged commit 05b9324 into databricks:master Apr 9, 2020
@HyukjinKwon HyukjinKwon deleted the pyarrow-0.15 branch September 11, 2020 07:51
Development

Successfully merging this pull request may close these issues.

Re-enable PyArrow 0.15 support back

3 participants