@HyukjinKwon
Member

@HyukjinKwon HyukjinKwon commented Mar 17, 2020

This PR proposes to add the axis parameter to ks.concat(...), so that objects can also be concatenated along the columns (axis=1).

import databricks.koalas as ks

df1 = ks.DataFrame([['a', 1], ['b', 2]], columns=['letter', 'number'])
df2 = ks.DataFrame([['bird', 'polly'], ['monkey', 'george']], columns=['animal', 'name'])
ks.concat([df1, df2], axis=1)
  letter  number  animal    name
0      a       1    bird   polly
1      b       2  monkey  george

Resolves #625, Closes #1009.

@HyukjinKwon HyukjinKwon force-pushed the concat-axis=1 branch 2 times, most recently from c8b73b2 to 4fc4f38 Compare March 18, 2020 01:13
@codecov-io

codecov-io commented Mar 18, 2020

Codecov Report

Merging #1349 into master will decrease coverage by 0.13%.
The diff coverage is 97.01%.

@@            Coverage Diff             @@
##           master    #1349      +/-   ##
==========================================
- Coverage   95.23%   95.09%   -0.14%     
==========================================
  Files          34       34              
  Lines        7576     7998     +422     
==========================================
+ Hits         7215     7606     +391     
- Misses        361      392      +31
Impacted Files                          Coverage Δ
databricks/koalas/namespace.py          88.67% <100%> (+0.91%) ⬆️
databricks/koalas/frame.py              96.74% <94.44%> (+0.02%) ⬆️
databricks/koalas/groupby.py            91.24% <0%> (-0.23%) ⬇️
databricks/koalas/missing/frame.py      100% <0%> (ø) ⬆️
databricks/koalas/missing/indexes.py    100% <0%> (ø) ⬆️
databricks/koalas/missing/series.py     100% <0%> (ø) ⬆️
databricks/koalas/base.py               97.4% <0%> (+0.09%) ⬆️
databricks/koalas/indexes.py            96.98% <0%> (+0.21%) ⬆️

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update da3740d...184fdc0.

Collaborator

@ueshin ueshin left a comment

LGTM.

@HyukjinKwon HyukjinKwon force-pushed the concat-axis=1 branch 4 times, most recently from b1d0547 to 97a7718 Compare March 20, 2020 00:39
@HyukjinKwon HyukjinKwon force-pushed the concat-axis=1 branch 2 times, most recently from cfd51ee to 0ef9bad Compare March 20, 2020 02:14
@HyukjinKwon
Member Author

In the latest commits, I addressed some corner cases, such as concatenating objects whose columns use a single-level index with objects whose columns use a multi-level index, and fixed some nits.

(
[kdf3[("X", "A")].rename("ABC"), kdf3[("X", "B")]],
[pdf3[("X", "A")].rename("ABC"), pdf3[("X", "B")]],
),
Member Author

These two test cases were added to verify single-level index vs multi-index in the columns.

ValueError,
r"Labels have to be unique; however, got duplicated labels \['A'\].",
lambda: ks.concat([kdf.A, kdf4.A], join="inner", axis=1),
)
Member Author

This test was also added to verify the duplicated column label case.
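
For readers following along, here is a minimal pure-Python sketch of the kind of uniqueness check the test above expects. It is not the Koalas implementation: the helper name check_unique_labels is hypothetical, and only the error message text is taken from the regular expression in the test.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def check_unique_labels(labels: Iterable[Hashable]) -> None:
    """Raise ValueError if any label appears more than once.

    Hypothetical helper for illustration only; the message mirrors
    the one matched by the test's regular expression.
    """
    counts = Counter(labels)
    # Keep first-seen order so the message is deterministic.
    duplicated: List[Hashable] = [label for label, n in counts.items() if n > 1]
    if duplicated:
        raise ValueError(
            f"Labels have to be unique; however, got duplicated labels {duplicated}."
        )


# Two series both named "A" would end up as two columns labeled "A".
try:
    check_unique_labels(["A", "A"])
except ValueError as exc:
    print(exc)
```

Rejecting the duplicates up front, rather than silently renaming one of the columns, keeps the behavior explicit for the caller.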

@HyukjinKwon HyukjinKwon force-pushed the concat-axis=1 branch 2 times, most recently from d18c4a2 to 64e9d7f Compare March 20, 2020 03:00
pdf5 = pd.DataFrame({"A": [0, 2, 4], "B": [1, 3, 5]}, index=[1, 2, 3])
pdf6 = pd.DataFrame({"C": [1, 2, 3]}, index=[1, 3, 5])
kdf5 = ks.from_pandas(pdf5)
kdf6 = ks.from_pandas(pdf6)
Member Author

Added another test case.
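
The two frames above share only part of their index (labels 1 and 3), so they are a natural fixture for checking index alignment under axis=1. The assertions themselves are not shown here, so the following is only a rough pure-Python sketch of what inner and outer alignment mean for this data. It assumes frames are modeled as plain dicts, uses None for missing values, and sorts the resulting index; concat_columns is a hypothetical name, not a Koalas or pandas function.

```python
from typing import Any, Dict, List

# A toy "frame": column name -> {index label -> value}.
Frame = Dict[str, Dict[int, Any]]


def concat_columns(frames: List[Frame], join: str = "outer") -> Frame:
    """Column-wise concatenation of toy frames, aligned on index labels."""
    if not frames:
        raise ValueError("No objects to concatenate")
    if join not in ("inner", "outer"):
        raise ValueError("join must be 'inner' or 'outer'")

    # Collect the index labels that each frame carries.
    index_sets = [
        set().union(*(column.keys() for column in frame.values()))
        for frame in frames
    ]
    if join == "inner":
        labels = set.intersection(*index_sets)  # labels present in every frame
    else:
        labels = set.union(*index_sets)  # labels present in any frame
    index = sorted(labels)

    result: Frame = {}
    for frame in frames:
        for name, column in frame.items():
            if name in result:
                raise ValueError(f"duplicated column label {name!r}")
            # Missing labels are filled with None in this sketch.
            result[name] = {label: column.get(label) for label in index}
    return result


# Same data as pdf5 / pdf6 above.
frame5: Frame = {"A": {1: 0, 2: 2, 3: 4}, "B": {1: 1, 2: 3, 3: 5}}
frame6: Frame = {"C": {1: 1, 3: 2, 5: 3}}

inner = concat_columns([frame5, frame6], join="inner")
outer = concat_columns([frame5, frame6], join="outer")
print(sorted(inner["A"]))  # index labels kept by the inner join
print(sorted(outer["A"]))  # index labels kept by the outer join
```

With this data the inner join keeps labels 1 and 3, while the outer join keeps 1, 2, 3 and 5 and fills the gaps.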

@HyukjinKwon
Member Author

Merged!

@HyukjinKwon HyukjinKwon merged commit 34ac6f6 into databricks:master Mar 20, 2020
@HyukjinKwon HyukjinKwon deleted the concat-axis=1 branch September 11, 2020 07:52

Development

Successfully merging this pull request may close these issues.

concat(axis=1) currently is not supported

4 participants