BUG: DataFrame.merge(suffixes=) does not respect None #24782

Closed
simonjayhawkins opened this issue Jan 15, 2019 · 3 comments · Fixed by #24819
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode

@simonjayhawkins
Member

>>> import pandas as pd
>>> print(pd.__version__)
0.24.0rc1+6.gabfe72d7c.dirty
>>> from pandas import DataFrame
>>> df = DataFrame([1])
>>> df
   0
0  1
>>>
>>> df.columns
RangeIndex(start=0, stop=1, step=1)
>>>
>>> result = df.merge(df, left_index=True, right_index=True, suffixes=(None,'_dup'))
>>> result
   0None  0_dup
0      1      1
>>>
>>> result.columns
Index(['0None', '0_dup'], dtype='object')
>>>
>>> expected = result.rename(columns = {'0None':0})
>>> expected
   0  0_dup
0  1      1
>>>
>>> expected.columns
Index([0, '0_dup'], dtype='object')

Specifying an empty string instead changes the dtype of the column label: the integer 0 becomes the string '0'.

>>> result = df.merge(df, left_index=True, right_index=True, suffixes=('','_dup'))
>>> result
   0  0_dup
0  1      1
>>> result.columns
Index(['0', '0_dup'], dtype='object')
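
One way to get the expected labels without relying on how `suffixes` treats `None` is to rename the overlapping columns of the right frame before merging, so no suffix is applied at all. The following is only a sketch: the helper name `merge_on_index_keep_left_labels` is hypothetical (not part of pandas), and it assumes an index-on-index merge as in the report above.

```python
import pandas as pd


def merge_on_index_keep_left_labels(left, right, right_suffix="_dup"):
    """Hypothetical helper: index-on-index merge that leaves left labels untouched."""
    left_labels = set(left.columns)
    # Only columns present in both frames would collide in the merge.
    overlap = [c for c in right.columns if c in left_labels]
    # Rename the colliding right-hand columns up front, so merge never
    # needs to append a suffix to either side.
    renamed = right.rename(columns={c: f"{c}{right_suffix}" for c in overlap})
    return left.merge(renamed, left_index=True, right_index=True)


df = pd.DataFrame([1])
result = merge_on_index_keep_left_labels(df, df)
print(list(result.columns))  # [0, '0_dup']
```

The left label stays the integer 0 because it is never passed through string concatenation.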
@WillAyd
Member

WillAyd commented Jan 15, 2019

Thanks for the report. Makes sense to have a more robust handling of None here - PRs are always welcome

@mroeschke mroeschke added Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Jan 15, 2019
@charlesdong1991
Member

I am on it!

@simonjayhawkins
Member Author

Thanks @charlesdong1991

albertoueda added a commit to albertoueda/pyterrier that referenced this issue May 12, 2021
There is a bug in pandas < 0.25.0 that adds the string "None" as a suffix to column names on merging. That impacts pipelines such as those using `CombSumTransformer`, which fail when looking up columns such as "score" because the column has become "scoreNone" instead.

Issue describing the error: 
pandas-dev/pandas#24782

```python
>>> result = df.merge(df, left_index=True, right_index=True, suffixes=(None,'_dup'))
>>> result
   0None  0_dup
0      1      1
>>> result.columns
Index(['0None', '0_dup'], dtype='object')
```

Other recent versions are also an option (the current release is 1.2.4); I'm using 1.0.1 and it looks fine for now.
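
A quick way to check whether an installed pandas is affected is to run the merge from the issue and look for the stray "0None" label. This is a minimal sketch; the function name `merge_respects_none_suffix` is made up for illustration and is not part of pandas or pyterrier.

```python
import pandas as pd


def merge_respects_none_suffix():
    """Hypothetical probe: True if a None suffix leaves the column label alone."""
    df = pd.DataFrame([1])
    merged = df.merge(
        df, left_index=True, right_index=True, suffixes=(None, "_dup")
    )
    # Affected versions concatenate the label with str(None), giving '0None'.
    return "0None" not in merged.columns


if not merge_respects_none_suffix():
    print("This pandas appends 'None' as a suffix on merge; consider upgrading.")
```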