@itholic
Contributor

@itholic itholic commented Feb 18, 2020

Implement Series.combine_first

  • basic example
>>> import numpy as np
>>> import pandas as pd
>>> import databricks.koalas as ks
>>> s1 = ks.Series([1, np.nan])
>>> s2 = ks.Series([3, 4])
>>> s1.combine_first(s2)
0    1.0
1    4.0
Name: 0, dtype: float64
  • MultiIndex
>>> midx1 = pd.MultiIndex([['lama', 'cow', 'falcon', 'koala'],
...                        ['speed', 'weight', 'length', 'power']],
...                       [[0, 3, 1, 1, 1, 2, 2, 2],
...                        [0, 2, 0, 3, 2, 0, 1, 3]])
>>> midx2 = pd.MultiIndex([['lama', 'cow', 'falcon'],
...                        ['speed', 'weight', 'length']],
...                       [[0, 0, 0, 1, 1, 1, 2, 2, 2],
...                        [0, 1, 2, 0, 1, 2, 0, 1, 2]])
>>> kser1 = ks.Series([45, 200, 1.2, 30, 250, 1.5, 320, 1], index=midx1)
>>> kser2 = ks.Series([-45, 200, -1.2, 30, -250, 1.5, 320, 1, -0.3], index=midx2)
>>> kser1
lama    speed      45.0
koala   length    200.0
cow     speed       1.2
        power      30.0
        length    250.0
falcon  speed       1.5
        weight    320.0
        power       1.0
Name: 0, dtype: float64
>>> kser2
lama    speed     -45.0
        weight    200.0
        length     -1.2
cow     speed      30.0
        weight   -250.0
        length      1.5
falcon  speed     320.0
        weight      1.0
        length     -0.3
Name: 0, dtype: float64

>>> kser1.combine_first(kser2)
cow     length    250.0
        power      30.0
        speed       1.2
        weight   -250.0
falcon  length     -0.3
        power       1.0
        speed       1.5
        weight    320.0
koala   length    200.0
lama    length     -1.2
        speed      45.0
        weight    200.0
Name: 0, dtype: float64
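
The behaviour shown in the two examples above can be sketched in plain Python, with dicts standing in for an indexed Series. This is only an illustration of the semantics, not the Koalas implementation in this PR: the function name `combine_first_sketch` and the helper `is_missing` are hypothetical, and sorting the union of keys is an assumption chosen to mirror the ordering of the MultiIndex output above.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def combine_first_sketch(left, right):
    """Keep left's value for each key; fall back to right when it is missing."""
    result = {}
    # Walk the union of both "indexes"; sorted to mirror the output above.
    for key in sorted(set(left) | set(right)):
        value = left.get(key)
        if is_missing(value):
            # Keys absent from both sides cannot occur here, but NaN is the
            # fallback if right holds no value either.
            value = right.get(key, math.nan)
        result[key] = value
    return result


# Basic example: the NaN at position 1 is filled from the second series.
s1 = {0: 1.0, 1: math.nan}
s2 = {0: 3.0, 1: 4.0}
basic = combine_first_sketch(s1, s2)
print(basic)

# Tuple keys play the role of MultiIndex entries.
k1 = {("lama", "speed"): 45.0, ("koala", "length"): 200.0}
k2 = {("lama", "speed"): -45.0, ("lama", "weight"): 200.0}
multi = combine_first_sketch(k1, k2)
print(multi)
```

In the tuple-key case the value for `("lama", "speed")` comes from the first mapping, while `("lama", "weight")` exists only in the second and is carried over unchanged.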

@itholic
Contributor Author

itholic commented Feb 18, 2020

I considered putting Series.combine and Series.combine_first in a single PR, but their implementations turned out to be far more different than I expected, so I separated them.
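
For context, pandas documents `Series.combine` as applying a caller-supplied function to each pair of values over the union of the two indexes, whereas `combine_first` only fills missing values. A minimal dict-based sketch of that element-wise behaviour follows; the name `combine_sketch` is hypothetical, the sorted key order is an assumption, and none of this reflects how Koalas implements either method.

```python
import math


def combine_sketch(left, right, func, fill_value=math.nan):
    """Apply func to each pair of values over the union of keys.

    fill_value stands in for a key that is absent from one side.
    """
    result = {}
    for key in sorted(set(left) | set(right)):
        result[key] = func(left.get(key, fill_value), right.get(key, fill_value))
    return result


s1 = {"falcon": 330.0, "eagle": 160.0}
s2 = {"falcon": 345.0, "eagle": 200.0, "duck": 30.0}

# "duck" is missing from s1, so fill_value=0 is compared against 30.0.
highest = combine_sketch(s1, s2, max, fill_value=0)
print(highest)
```

The caller chooses `func` here, which is the main difference from `combine_first`, where the rule (prefer the first non-missing value) is fixed.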

@codecov-io

codecov-io commented Mar 1, 2020

Codecov Report

Merging #1290 into master will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master    #1290      +/-   ##
==========================================
+ Coverage   95.25%   95.26%   +0.01%     
==========================================
  Files          34       34              
  Lines        7541     7559      +18     
==========================================
+ Hits         7183     7201      +18     
  Misses        358      358              
Impacted Files                        Coverage Δ
databricks/koalas/missing/series.py   100.00% <ø> (ø)
databricks/koalas/series.py           96.86% <100.00%> (+0.07%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 45c325b...c4fb5d0.

@HyukjinKwon
Member

I think you should rebase and sync to the current master, @itholic.

@HyukjinKwon
Member

Looks fine otherwise.

@HyukjinKwon HyukjinKwon merged commit d4012b6 into databricks:master Mar 25, 2020
@itholic itholic deleted the s_combine_first branch March 25, 2020 13:57