Fixed #38419 - BUG: set_index screws up the dtypes on empty DataFrames #38430
Hello @jordi-crespo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2020-12-13 12:05:16 UTC
{"a": Series(dtype="datetime64[ns]"), "b": Series(dtype="int64"), "c": []}
)

if df.empty:
we don't need this. There's only one frame and we know it's empty.
Although feel free to add
assert df.empty is True
df = DataFrame(columns=list(df.columns.values))

expected = df.set_index(["a", "b"])
assert (df.loc[:, ["a", "b"]].dtypes == expected.index.to_frame().dtypes).all()
use tm.assert_frame_equal
Sorry, but absolutely not; this is too complicated.
Construct expected ahead of time and just do
tm.assert_frame_equal(result, expected)
Make the construction as explicit as possible. The point here is to make it as easy as possible to see what the test does.
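One possible reading of this advice is to write the expected dtypes out literally rather than deriving them from the frame under test. The sketch below assumes the goal is to check the index level dtypes after `set_index`, and that the installed pandas already includes the fix for #38419; the variable names are illustrative and not taken from the PR.

```python
import numpy as np
import pandas as pd
import pandas._testing as tm

# Empty frame with explicitly typed columns, as in the diff under review.
df = pd.DataFrame(
    {"a": pd.Series(dtype="datetime64[ns]"), "b": pd.Series(dtype="int64"), "c": []}
)

# Dtypes of the index levels after moving "a" and "b" into the index.
result = df.set_index(["a", "b"]).index.to_frame().dtypes

# Expected value spelled out literally (an illustrative choice), so a reader
# can see at a glance what the test asserts.
expected = pd.Series({"a": np.dtype("datetime64[ns]"), "b": np.dtype("int64")})

tm.assert_series_equal(result, expected)
```

Writing `expected` by hand keeps the assertion independent of the code path being tested, which is the point of constructing it ahead of time.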
You want to do something like this:
import pandas as pd
import pandas._testing as tm

df1 = pd.DataFrame({'a': pd.Series(dtype='datetime64[ns]'), 'b': pd.Series(dtype='int64'), 'c': []})
df2 = df1.set_index(['a', 'b'])

result = df2.dtypes
expected = df1[['c']].dtypes  # double square brackets to slice to a DataFrame, not to a Series
tm.assert_series_equal(result, expected)
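The snippet above compares the dtypes of the column left over after `set_index`. Since the issue title concerns the dtypes that end up in the index, a variant that inspects the index levels may also be useful. The following is a sketch only: the test function name is hypothetical, and it assumes a pandas version that includes the fix for #38419.

```python
import pandas as pd
import pandas._testing as tm


def test_set_index_empty_frame_keeps_level_dtypes():
    # Hypothetical test name; the frame mirrors the one used in the review.
    df1 = pd.DataFrame(
        {"a": pd.Series(dtype="datetime64[ns]"), "b": pd.Series(dtype="int64"), "c": []}
    )
    df2 = df1.set_index(["a", "b"])

    # Turn the index levels back into columns and read their dtypes.
    result = df2.index.to_frame().dtypes
    # The original columns carry the dtypes the index levels should keep.
    expected = df1[["a", "b"]].dtypes
    tm.assert_series_equal(result, expected)


# Calling the function directly raises AssertionError if the dtypes differ.
test_set_index_empty_frame_keeps_level_dtypes()
```

Here `expected` is derived from the input frame rather than written out by hand, which keeps the test short at the cost of being slightly less explicit.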
Please check
thanks @jordi-crespo
Fixes #38419 - BUG: set_index screws up the dtypes on empty DataFrames