BUG: concat with empty frame upcasts float32 -> float64 #15525

Closed
adbull opened this issue Feb 27, 2017 · 5 comments
Labels
Bug; Dtype Conversions (Unexpected or buggy dtype conversions); Duplicate Report (Duplicate issue or pull request)

Comments

@adbull
Contributor

adbull commented Feb 27, 2017

Code Sample, a copy-pastable example if possible

>>> import pandas as pd
>>> x = pd.DataFrame([[0]], dtype='float32')
>>> pd.concat([x, x.loc[0:]]).dtypes
0    float32
dtype: object
>>> pd.concat([x, x.loc[1:]]).dtypes
0    float64
dtype: object

Problem description

Calling concat() on two float32 DataFrames, one of which is empty, will upcast the data to float64, wasting memory.
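
To put a rough number on the memory cost, here is a minimal sketch comparing the footprint of the same data stored as float32 and as float64. The frame shape (1000 rows by 4 columns) is an arbitrary choice for illustration and does not come from the report.

```python
import numpy as np
import pandas as pd

# Illustrative data only: 1000 rows x 4 columns of float32 zeros.
x = pd.DataFrame(np.zeros((1000, 4), dtype='float32'))

# Bytes used by the column data, ignoring the index.
bytes32 = x.memory_usage(index=False).sum()
bytes64 = x.astype('float64').memory_usage(index=False).sum()

print(bytes32, bytes64)  # 16000 32000
```

An unintended upcast to float64 doubles the memory used by the affected columns.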

Expected Output

0    float32
dtype: object

0    float32
dtype: object
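
A possible workaround, sketched under the assumption that the caller can simply leave empty frames out of the concatenation. `concat_skip_empty` is a hypothetical helper written for this example; it is not part of pandas.

```python
import pandas as pd


def concat_skip_empty(frames):
    """Concatenate frames, leaving out empty ones so they cannot affect dtypes.

    Hypothetical helper for illustration; not a pandas API.
    """
    frames = list(frames)
    if not frames:
        raise ValueError("No frames to concatenate")
    non_empty = [f for f in frames if len(f) > 0]
    if not non_empty:
        # Every input is empty: return a copy of the first so its dtypes survive.
        return frames[0].copy()
    return pd.concat(non_empty)


x = pd.DataFrame([[0]], dtype='float32')
result = concat_skip_empty([x, x.loc[1:]])
print(result.dtypes)
```

Because the empty frame never reaches `pd.concat`, the result keeps the float32 dtype of the non-empty input.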

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.19.2
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.11.3
scipy: 0.18.1
statsmodels: 0.6.1
xarray: 0.9.1
IPython: 4.2.0
sphinx: 1.5.1
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.6
lxml: 3.7.2
bs4: 4.5.3
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: None
jinja2: 2.8.1
boto: 2.45.0
pandas_datareader: None

@jreback
Contributor

jreback commented Feb 27, 2017

This is expected. Empty frames are not considered when determining the result dtype, as there are some performance considerations. You can certainly take a look and see if there is a potential fix, but this is not a bug. Marking as won't fix.

@jreback jreback closed this as completed Feb 27, 2017
@jreback jreback added Dtype Conversions Unexpected or buggy dtype conversions Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Feb 27, 2017
@jreback jreback added this to the won't fix milestone Feb 27, 2017
@jreback
Contributor

jreback commented Feb 27, 2017

Hmm, on second thought, we actually have quite a few issues where we dealt with this:

#12411, #12045, #11594

though these were pretty much all datetime-related.

@jreback jreback reopened this Feb 27, 2017
@jreback jreback added Bug Difficulty Intermediate and removed Reshaping Concat, Merge/Join, Stack/Unstack, Explode labels Feb 27, 2017
@jreback jreback modified the milestones: Next Major Release, won't fix Feb 27, 2017
@jreback
Contributor

jreback commented Feb 27, 2017

Of course, community PRs to fix this would be really helpful :>

@jaehoonhwang
Contributor

Hi, I'm new to contributing to pandas.
Can I help with this bug, and if so, can you point me in the right direction?

@jreback
Contributor

jreback commented Mar 3, 2017

Actually, this is a dupe of #13247; there is a pull request, #13337, that can be adapted (it is basically done and just needs a little polishing). Happy to have you take that over.

@jreback jreback closed this as completed Mar 3, 2017
@jreback jreback added the Duplicate Report Duplicate issue or pull request label Mar 3, 2017