Fixed #38419 - BUG: set_index screws up the dtypes on empty DataFrames #38430

Merged (1 commit) on Dec 13, 2020

jordi-crespo
Contributor

Fixes #38419 - BUG: set_index screws up the dtypes on empty DataFrames

  • test_set_index_empty_dataframe: tests added / passed

pep8speaks commented Dec 12, 2020

Hello @jordi-crespo! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-12-13 12:05:16 UTC

{"a": Series(dtype="datetime64[ns]"), "b": Series(dtype="int64"), "c": []}
)

if df.empty:
Member

We don't need this. There's only one frame and we know it's empty.

Member

Although feel free to add

assert df.empty is True

df = DataFrame(columns=list(df.columns.values))

expected = df.set_index(["a", "b"])
assert (df.loc[:, ["a", "b"]].dtypes == expected.index.to_frame().dtypes).all()
Member

use tm.assert_frame_equal

Member

Sorry, but absolutely not; this is too complicated.
Construct expected ahead of time and just do

tm.assert_frame_equal(result, expected)

Member

Make the construction as explicit as possible. The point here is to make it as easy as possible to see what the test does.
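
One way to read this advice is to build `expected` by hand, with the index levels and their dtypes spelled out, and then compare whole frames. The sketch below is an illustration only and is not the test that was merged. It uses the public `pandas.testing` module rather than `pandas._testing`, and it assumes a pandas version that already includes this fix (1.3 or later).

```python
import pandas as pd
import pandas.testing as tm

# Empty frame with explicitly typed columns, as in the diff under review.
df = pd.DataFrame(
    {
        "a": pd.Series(dtype="datetime64[ns]"),
        "b": pd.Series(dtype="int64"),
        "c": [],
    }
)
assert df.empty

result = df.set_index(["a", "b"])

# Expected frame built ahead of time: an empty MultiIndex whose levels
# carry the original column dtypes, plus the remaining column "c".
expected_index = pd.MultiIndex.from_arrays(
    [
        pd.DatetimeIndex([], dtype="datetime64[ns]"),
        pd.Index([], dtype="int64"),
    ],
    names=["a", "b"],
)
expected = pd.DataFrame({"c": []}, index=expected_index)

tm.assert_frame_equal(result, expected)
```

Because `expected` never passes through `set_index`, a dtype regression in `set_index` cannot cancel itself out on both sides of the comparison.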

Member

You want to do something like this

In [21]: import pandas as pd
    ...: import pandas._testing as tm
    ...: 
    ...: df1 = pd.DataFrame({'a': pd.Series(dtype='datetime64[ns]'), 'b': pd.Series(dtype='int64'), 'c': []})
    ...: df2 = df1.set_index(['a', 'b'])
    ...: 
    ...: result = df2.dtypes
    ...: expected = df1[['c']].dtypes # double square brackets to slice to a DataFrame not to a Series 
    ...: tm.assert_series_equal(result, expected)
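
The snippet above compares the dtypes of the column that remains after `set_index`. The diff under review instead compared the dtypes of the key columns with those of the resulting index levels, so a complementary check in the same style might look like the following. This is a sketch under the assumption that the installed pandas includes this fix; it uses the public `pandas.testing` module, and it is not necessarily the test that was merged.

```python
import pandas as pd
import pandas.testing as tm

df1 = pd.DataFrame(
    {
        "a": pd.Series(dtype="datetime64[ns]"),
        "b": pd.Series(dtype="int64"),
        "c": [],
    }
)
df2 = df1.set_index(["a", "b"])

# Turn the index levels back into columns and compare their dtypes
# with the dtypes of the original key columns.
result = df2.index.to_frame().dtypes
expected = df1[["a", "b"]].dtypes
tm.assert_series_equal(result, expected)
```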

Contributor Author

jordi-crespo commented Dec 13, 2020

Please check

jreback added the Index (Related to the Index class or subclasses) and Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) labels Dec 13, 2020
jreback added this to the 1.3 milestone Dec 13, 2020
jreback merged commit e2dec8d into pandas-dev:master Dec 13, 2020
Contributor

jreback commented Dec 13, 2020

thanks @jordi-crespo

luckyvs1 pushed a commit to luckyvs1/pandas that referenced this pull request Jan 20, 2021
4 participants