to_dense does not preserve dtype in SparseArray #10648

Closed
ebolyen opened this issue Jul 21, 2015 · 3 comments
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions Sparse Sparse Data Type
Milestone

Comments


ebolyen commented Jul 21, 2015

This isn't a huge deal, but it seems a little odd:

In [1]: import pandas as pd

In [2]: a = pd.SparseArray([True, False, False, False, True], fill_value=False, dtype=bool)

In [3]: a
Out[3]: 
[True, False, False, False, True]
Fill: False
IntIndex
Indices: array([0, 4], dtype=int32)

In [4]: a.dtype
Out[4]: dtype('bool')

In [5]: d = a.to_dense()

In [6]: d
Out[6]: array([ 1.,  0.,  0.,  0.,  1.])

In [7]: d.dtype
Out[7]: dtype('float64')

I would have expected d to retain the dtype of bool. I can cast down, but I am still wasting 7 bytes per element in the process.
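To make the expected behaviour concrete, here is a minimal NumPy-only sketch of a dense reconstruction that keeps the dtype of the stored values. It is not the pandas implementation: the helper name `to_dense_preserving_dtype` and its parameters are hypothetical, chosen only to mirror the sparse values, indices, and fill value shown in the session above.

```python
import numpy as np


def to_dense_preserving_dtype(sp_values, sp_indices, length, fill_value):
    """Hypothetical helper (not a pandas API): densify without upcasting."""
    sp_values = np.asarray(sp_values)
    sp_indices = np.asarray(sp_indices, dtype=np.intp)
    # Allocate the output in the dtype of the stored values, not float64.
    dense = np.full(length, fill_value, dtype=sp_values.dtype)
    # Scatter the explicitly stored values into their positions.
    dense[sp_indices] = sp_values
    return dense


# Mirrors the report: True at positions 0 and 4, fill value False.
d = to_dense_preserving_dtype([True, True], [0, 4], length=5, fill_value=False)
print(d, d.dtype, d.itemsize)

# The workaround mentioned above: casting the float64 result back down.
# It recovers the dtype, but only after the 8-byte-per-element array exists.
upcast = np.array([1.0, 0.0, 0.0, 0.0, 1.0])
print(upcast.astype(bool).dtype, upcast.itemsize)
```

Allocating with the values' own dtype avoids the intermediate float64 array entirely, which is where the extra 7 bytes per element would otherwise go.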

Author

ebolyen commented Jul 21, 2015

INSTALLED VERSIONS

commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-57-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.7
Cython: 0.22.1
numpy: 1.9.2
scipy: 0.15.1
statsmodels: None
IPython: 3.2.0
sphinx: 1.2.2
patsy: None
dateutil: 2.4.2
pytz: 2015.4
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.4.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None

@jreback jreback added Bug Dtype Conversions Unexpected or buggy dtype conversions Sparse Sparse Data Type labels Jul 22, 2015
@jreback jreback added this to the Next Major Release milestone Jul 22, 2015
Contributor

jreback commented Jul 22, 2015

Probably a bug. Sparse is not really well tested and doesn't have a lot of dev support.
Pull requests are welcome.

@jreback jreback changed the title to_dense does not preserve dtype in SparseArray to_dense does not preserve dtype in SparseArray Jul 22, 2015
Author

ebolyen commented Jul 22, 2015

I may be speaking too soon, but after skimming the code for this, it looks pretty simple to fix. I'll work on a PR!

ebolyen added a commit to ebolyen/pandas that referenced this issue Jul 22, 2015
Also fixes values and get_values.

closes pandas-dev#10648
@jreback jreback modified the milestones: 0.17.0, Next Major Release Jul 24, 2015
@jreback jreback modified the milestones: Next Major Release, 0.17.0 Sep 2, 2015
@jreback jreback modified the milestones: 0.18.1, Next Major Release Apr 3, 2016
@jreback jreback closed this as completed in 2d13410 Apr 3, 2016