@WeichenXu123
Member

@WeichenXu123 WeichenXu123 commented Sep 20, 2019

Add the missing rename method for Koalas DataFrame.

Some limitations:

  • In-place operation is not supported.
  • The mapper function must include a return type hint, such as:
def f1(x) -> int:
    return x*10
  • When renaming index labels, a SparkException may be raised instead of a KeyError. (Discussion: could we catch the nested KeyError and re-throw it?)
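
To illustrate the return type hint requirement, here is a minimal sketch of how a mapper's annotated return type could be read with the standard library. The helper name mapper_return_type is hypothetical and is not taken from this PR; the actual Koalas implementation may inspect the function differently.

```python
from typing import Callable, get_type_hints


def mapper_return_type(mapper: Callable) -> type:
    """Return the annotated return type of `mapper` (hypothetical helper).

    Raises ValueError when the mapper has no return type hint, mirroring
    the limitation described above.
    """
    hints = get_type_hints(mapper)
    if "return" not in hints:
        raise ValueError("mapper function must declare a return type hint")
    return hints["return"]


def f1(x) -> int:
    return x * 10


def f2(x):  # no return annotation, so it would be rejected
    return x * 10


print(mapper_return_type(f1))  # <class 'int'>
```

Calling mapper_return_type(f2) raises ValueError, which is the kind of early failure a user would see when the hint is missing.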

@WeichenXu123
Member Author

One thing we need to discuss:
When renaming index labels, we should let the user specify the output label type, because in Koalas the underlying Spark DataFrame transforms the index labels via a pandas UDF, which requires a declared return type.
Currently, I force the return type to be the same as the old index label type.
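
A small plain-Python sketch of why the output label type matters: a mapper may return labels of a different type than the original ones, in which case forcing the old label type would not fit. The helper labels_keep_type and the two mappers are hypothetical names used only for illustration; no Spark or Koalas API is involved.

```python
def labels_keep_type(mapper, labels):
    """Check whether every renamed label has the same type as its original."""
    return all(type(mapper(label)) is type(label) for label in labels)


def times_ten(x) -> int:
    # int labels stay int, so reusing the old label type is safe here
    return x * 10


def as_text(x) -> str:
    # int labels become str, so the old label type no longer matches
    return "row_" + str(x)


old_labels = [0, 1, 2]
print(labels_keep_type(times_ten, old_labels))  # True
print(labels_keep_type(as_text, old_labels))    # False
```

The second case is the one where letting the user specify the output label type would be needed.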

@codecov-io

codecov-io commented Sep 20, 2019

Codecov Report

Merging #806 into master will decrease coverage by 0.24%.
The diff coverage is 81.25%.

@@            Coverage Diff             @@
##           master     #806      +/-   ##
==========================================
- Coverage   94.29%   94.05%   -0.25%     
==========================================
  Files          32       32              
  Lines        5911     5988      +77     
==========================================
+ Hits         5574     5632      +58     
- Misses        337      356      +19

Impacted Files                        Coverage Δ
databricks/koalas/missing/frame.py    100% <ø> (ø) ⬆️
databricks/koalas/frame.py            95.95% <81.25%> (-0.87%) ⬇️
databricks/koalas/__init__.py         82.5% <0%> (-2.5%) ⬇️
databricks/conftest.py                95.34% <0%> (-2.33%) ⬇️
databricks/koalas/generic.py          94.94% <0%> (-0.51%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9186870...e1839fe.

@WeichenXu123 WeichenXu123 changed the title from "[WIP] Add missing method rename for koalas dataframe" to "Add missing method rename for koalas dataframe" Sep 23, 2019
@WeichenXu123
Member Author

@HyukjinKwon @ueshin Ready for review. I left some discussion points in the PR description. Once they are resolved, I will add edge-case tests for them.

@WeichenXu123 WeichenXu123 requested review from HyukjinKwon and ueshin and removed request for HyukjinKwon September 23, 2019 14:11
@HyukjinKwon
Member

@ueshin I discussed this offline and did a rough review. Let me leave it to you; otherwise I will review it the day after tomorrow (after my vacation).

Collaborator

@ueshin ueshin left a comment

@WeichenXu123 Thanks for working on this! I left some comments.

Btw, pandas supports renaming both index and columns at the same time.

>>> df
   A  B
0  1  4
1  2  5
2  3  6
>>> df.rename(index={0: 'x', 1: 'y', 2: 'z'}, columns={'A': 'a', 'B': 'b'})
   a  b
x  1  4
y  2  5
z  3  6

Do you have a plan to support this?

@WeichenXu123
Member Author

@ueshin Updated the code to support renaming index and columns at the same time
(although the level argument is shared by both, which looks a little odd).

@WeichenXu123
Member Author

Why did Codecov fail? Travis passed.

@HyukjinKwon
Member

Collaborator

@ueshin ueshin left a comment

I left some nits. Otherwise, LGTM.

@softagram-bot

Softagram Impact Report for pull/806 (head commit: 3cd9394)

@HyukjinKwon HyukjinKwon merged commit 2f3e894 into databricks:master Sep 30, 2019
@HyukjinKwon
Member

Thanks @WeichenXu123

@WeichenXu123 WeichenXu123 deleted the add_df_rename branch October 3, 2019 15:15