Skip to content

Conversation

@beobest2
Copy link
Contributor

@beobest2 beobest2 commented Jan 31, 2020

Resolve #1248

Since the Python interpreter calls __repr__ if __str__ is not implemented,
the output will be identical to pandas using the implemented __repr__ of series.py.

>>> ks = ks.Series([1, 3, 5, np.nan, 6, 8])
>>> print(ks)
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
Name: 0, dtype: float64
>>> print(str(ks))
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
Name: 0, dtype: float64
>>> print(repr(ks))
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
Name: 0, dtype: float64
>>> ps = pd.Series([1, 3, 5, np.nan, 6, 8])
>>> print(ps)
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
>>> print(str(ps))
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
>>> print(repr(ps))
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64

@codecov-io
Copy link

codecov-io commented Jan 31, 2020

Codecov Report

Merging #1250 into master will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##           master   #1250      +/-   ##
=========================================
+ Coverage   95.09%   95.1%   +0.01%     
=========================================
  Files          35      35              
  Lines        7154    7152       -2     
=========================================
- Hits         6803    6802       -1     
+ Misses        351     350       -1
Impacted Files Coverage Δ
databricks/koalas/series.py 96.37% <ø> (+0.13%) ⬆️
databricks/koalas/frame.py 96.83% <0%> (-0.01%) ⬇️
databricks/koalas/ml.py 96.29% <0%> (+0.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update dfd3bd1...ba559bb. Read the comment docs.

Copy link
Collaborator

@ueshin ueshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Actually I don't remember why we use the string.
@HyukjinKwon do you know the reason?

@itholic
Copy link
Contributor

itholic commented Feb 1, 2020

LGTM, too if there is no proper reason we should use the existing behavior.

@itholic
Copy link
Contributor

itholic commented Feb 1, 2020

nit: we better avoid ks as a variable name of Series since there is a high probability of overwriting the Koalas package name, ks. (I recommend kser instead, generally)

@beobest2
Copy link
Contributor Author

beobest2 commented Feb 1, 2020

nit: we better avoid ks as a variable name of Series since there is a high probability of overwriting the Koalas package name, ks. (I recommend kser instead, generally)

Thanks for the good tip :)

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine given that it's consistent with DataFrame. I don't know if there's the particular reason either.

One concern might be it triggers an actual job when, for instance, the string representation is expected in an error message .. but I suspect it's fine for now.

@HyukjinKwon HyukjinKwon merged commit c9eedb2 into databricks:master Feb 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Output of print ks.Series

5 participants