@HyukjinKwon HyukjinKwon commented May 7, 2019

This PR aims to bring Koalas' Series.unique in line with pandas' Series.unique. After this PR:
Pandas

>>> import pandas as pd
>>> pd.DataFrame({"Person": ['a', 'b', 'b', 'c']})
  Person
0      a
1      b
2      b
3      c
>>> df = pd.DataFrame({"Person": ['a', 'b', 'b', 'c']})
>>> df.unique()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 4376, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'unique'
>>> df.Person.unique()
array(['a', 'b', 'c'], dtype=object)

Koalas

>>> import databricks.koalas as ks
>>> ks.DataFrame({"Person": ['a', 'b', 'b', 'c']})
  Person
0      a
1      b
2      b
3      c
>>> df = ks.DataFrame({"Person": ['a', 'b', 'b', 'c']})
>>> df.unique()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/hyukjin.kwon/workspace/forked/koalas/databricks/koalas/frame.py", line 1528, in __getattr__
    return Series(self._sdf.__getattr__(key), self, self._metadata.index_info)
  File "/Users/hyukjin.kwon/workspace/forked/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 1296, in __getattr__
AttributeError: 'DataFrame' object has no attribute 'unique'
>>> df.Person.unique()
0    c
1    b
2    a
Name: Person, dtype: object
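
The two outputs above differ in more than type: pandas returns a NumPy array in order of first appearance, while the Koalas Series in this example came back as c, b, a. A test that compares the two would therefore likely need to ignore ordering. A minimal sketch of such a comparison, assuming the unique values are hashable and have already been collected into plain Python iterables; `same_unique_values` is a hypothetical helper for illustration, not part of pandas or Koalas:

```python
from collections import Counter
from typing import Hashable, Iterable


def same_unique_values(left: Iterable[Hashable], right: Iterable[Hashable]) -> bool:
    """Return True if both iterables hold the same values, ignoring order.

    Hypothetical helper for illustration only. Multiplicity is compared too,
    so a duplicate on either side (a result that is not actually unique)
    makes the comparison fail.
    """
    return Counter(left) == Counter(right)


# Values copied from the pandas and Koalas outputs above.
pandas_result = ["a", "b", "c"]  # order of first appearance
koalas_result = ["c", "b", "a"]  # order as returned in the example

print(same_unique_values(pandas_result, koalas_result))
```

With real objects, the Koalas result would first need to be collected locally, for example via `to_pandas()` on the returned Series (assuming that method is available in the Koalas version in use), before being passed to the helper.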

Resolves #233

@HyukjinKwon HyukjinKwon requested review from rxin and ueshin May 7, 2019 06:36
codecov-io commented May 7, 2019

Codecov Report

Merging #249 into master will increase coverage by 0.26%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master     #249      +/-   ##
==========================================
+ Coverage   92.17%   92.44%   +0.26%     
==========================================
  Files          35       35              
  Lines        3158     3203      +45     
==========================================
+ Hits         2911     2961      +50     
+ Misses        247      242       -5

Impacted Files                               Coverage Δ
databricks/koalas/frame.py                   92.87% <ø> (+0.92%) ⬆️
databricks/koalas/series.py                  91.69% <100%> (+0.41%) ⬆️
databricks/koalas/tests/test_dataframe.py    100% <0%> (ø) ⬆️
databricks/koalas/tests/test_series.py       100% <0%> (ø) ⬆️
databricks/koalas/tests/test_utils.py        100% <0%> (ø) ⬆️
databricks/koalas/utils.py                   100% <0%> (ø) ⬆️
databricks/koalas/namespace.py               90.34% <0%> (+0.41%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update a3e5160...19cb1a1.

-------
Returns the unique values as a Series.
See Examples section.
Contributor

nit: I don't think you need this, since the Examples section is literally one line below. I'm going to merge this and remove this line.

@rxin rxin merged commit b5e08b5 into databricks:master May 8, 2019
@HyukjinKwon HyukjinKwon deleted the unique-series branch November 6, 2019 02:22
Development

Successfully merging this pull request may close these issues.

unique function is broken
