Empty subtypes of Index return their type, rather than Index #10599
@@ -938,6 +938,10 @@ def test_difference(self):
        self.assertEqual(len(result), 0)
        self.assertEqual(result.name, first.name)

        # empty difference for subtypes
        result = self.periodIndex.difference(self.periodIndex)
        self.assertIsInstance(result, pd.PeriodIndex)
use tm.assert_index_equal(...) here (and directly construct the expected)
        # GH 10596 - empty difference retains index's type

        result = idx.difference(idx)
        self.assertIsInstance(result, type(idx))
so the way to do this generically is to define a create_empty method, analogous to create_index, for each subclass; it returns an empty version of what create_index returns.
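A rough sketch of that suggestion, using a toy stand-in rather than pandas itself. `ToyIndex`, `ToyPeriodIndex`, `IndexChecks` and `identical` are hypothetical names made up for illustration; only `create_index` and `create_empty` come from the comment above. Each test class supplies both factories, and the shared test compares the result against a directly constructed empty expected value.

```python
import unittest


class ToyIndex:
    """Hypothetical stand-in for an index class (not the pandas Index)."""

    def __init__(self, values=(), name=None):
        self.values = tuple(values)
        self.name = name

    def difference(self, other):
        kept = [v for v in self.values if v not in other.values]
        # Construct via type(self) so a subclass returns its own class,
        # and pass the name along so metadata survives.
        return type(self)(kept, name=self.name)

    def identical(self, other):
        # Compare class, values and metadata, not just values.
        return (type(self) is type(other)
                and self.values == other.values
                and self.name == other.name)


class ToyPeriodIndex(ToyIndex):
    """Hypothetical subclass, standing in for something like PeriodIndex."""


class IndexChecks:
    """Shared checks; each subclass provides the two factory methods."""

    def create_index(self):
        raise NotImplementedError

    def create_empty(self):
        raise NotImplementedError

    def test_empty_difference(self):
        idx = self.create_index()
        result = idx.difference(idx)
        expected = self.create_empty()
        self.assertTrue(result.identical(expected))


class TestToyIndex(IndexChecks, unittest.TestCase):
    def create_index(self):
        return ToyIndex([1, 2, 3], name="a")

    def create_empty(self):
        return ToyIndex([], name="a")


class TestToyPeriodIndex(IndexChecks, unittest.TestCase):
    def create_index(self):
        return ToyPeriodIndex([1, 2, 3], name="p")

    def create_empty(self):
        return ToyPeriodIndex([], name="p")
```

With this layout, covering another subclass only needs a new test class with its own two factories; the shared check is inherited unchanged.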
@jreback I'm having some issues with comparing empty Indexes. For Timedelta, Datetime & Categorical, I get the equivalent of this:
AssertionError: TimedeltaIndex([], dtype='timedelta64[ns]', freq='D') != TimedeltaIndex([], dtype='timedelta64[ns]', freq='D')
I'd love to take a look at the wider problem in the future; in the meantime, is there a way of getting this fix in? Is there a strong reason you don't like the type test rather than the direct comparison?
no, use |
just asserting the type papers over the metadata propagation, so it just creates a new bug rather than fixing the existing one.
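To illustrate the point with a toy model (not pandas code; `ToyIndex`, `ToyPeriodIndex`, `identical` and both `*_empty_like` helpers are hypothetical names invented for this sketch): a result can have the right class and still have lost its metadata, and a type-only assertion would not notice.

```python
class ToyIndex:
    """Hypothetical stand-in for an index class (not the pandas Index)."""

    def __init__(self, values=(), name=None):
        self.values = tuple(values)
        self.name = name

    def identical(self, other):
        # Full comparison: class, values and metadata.
        return (type(self) is type(other)
                and self.values == other.values
                and self.name == other.name)


class ToyPeriodIndex(ToyIndex):
    """Hypothetical subclass, standing in for something like PeriodIndex."""


def buggy_empty_like(idx):
    # Returns the right class but silently drops the name.
    return type(idx)()


def fixed_empty_like(idx):
    # Returns the right class and carries the name across.
    return type(idx)(name=idx.name)


idx = ToyPeriodIndex([1, 2, 3], name="when")
expected = ToyPeriodIndex([], name="when")

buggy = buggy_empty_like(idx)
type_check_passes = isinstance(buggy, ToyPeriodIndex)  # passes despite the bug
full_check_passes = buggy.identical(expected)          # catches the lost name
```

Here the type-only check accepts the buggy result, while the comparison against a directly constructed expected value rejects it and accepts `fixed_empty_like(idx)`.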
767f716 to d1e478f
Great @jreback, that should be good to go. Let me know if I've missed anything. Cheers
pls add a release note in whatsnew/v0.17.0, squash to a single commit, and ping when green.
d64c90b to beab262
I have a PyTables test failure - does anyone have any idea what this might be? I've dug around, but I'm not getting anywhere fast, and I have zero PyTables experience and no installation.
ERROR: test_to_hdf_with_object_column_names (pandas.io.tests.test_pytables.TestHDFStore)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/travis/build/pydata/pandas/pandas/io/tests/test_pytables.py", line 4678, in test_to_hdf_with_object_column_names
df.to_hdf(path, 'df', format='table', data_columns=True)
File "/home/travis/build/pydata/pandas/pandas/core/generic.py", line 920, in to_hdf
return pytables.to_hdf(path_or_buf, key, self, **kwargs)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 269, in to_hdf
f(store)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 264, in <lambda>
f = lambda store: store.put(key, value, **kwargs)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 826, in put
self._write_to_group(key, value, append=append, **kwargs)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 1275, in _write_to_group
s.write(obj=value, append=append, complib=complib, **kwargs)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 3798, in write
**kwargs)
File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 3404, in create_axes
axis=axis
File "/home/travis/build/pydata/pandas/pandas/core/frame.py", line 2522, in reindex_axis
fill_value=fill_value)
File "/home/travis/build/pydata/pandas/pandas/core/generic.py", line 1852, in reindex_axis
limit=limit)
File "/home/travis/build/pydata/pandas/pandas/core/index.py", line 3062, in reindex
if not is_categorical_dtype(target) and not target.is_unique:
File "properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:39442)
File "/home/travis/build/pydata/pandas/pandas/core/index.py", line 744, in is_unique
return self._engine.is_unique
File "index.pyx", line 213, in pandas.index.IndexEngine.is_unique.__get__ (pandas/index.c:4392)
File "index.pyx", line 248, in pandas.index.IndexEngine._do_unique_check (pandas/index.c:4874)
File "index.pyx", line 261, in pandas.index.IndexEngine._ensure_mapping_populated (pandas/index.c:5047)
File "index.pyx", line 267, in pandas.index.IndexEngine.initialize (pandas/index.c:5135)
File "hashtable.pyx", line 703, in pandas.hashtable.PyObjectHashTable.map_locations (pandas/hashtable.c:11242)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'double'
what are the columns on the df?
The columns are either strings or categoricals - that's the line this fails on.
types_should_run = [tm.makeStringIndex, tm.makeCategoricalIndex]
...
for index in types_should_run:
    df = DataFrame(np.random.randn(10, 2), columns=index(2))
    with ensure_clean_path(self.path) as path:
        df.to_hdf(path, 'df', format='table', data_columns=True)
        result = pd.read_hdf(path, 'df', where="index = [{0}]".format(df.index[0]))
        assert(len(result))
@@ -344,7 +344,7 @@ Bug Fixes
- Bug in ``ExcelReader`` when worksheet is empty (:issue:`6403`)
- Bug in ``Table.select_column`` where name is not preserved (:issue:`10392`)
- Bug in ``offsets.generate_range`` where ``start`` and ``end`` have finer precision than ``offset`` (:issue:`9907`)

- Bug in ``Subclasses of Index with no values returned Index objects rather than their own classes, in some cases`` (:issue:`10596`)
use the double backticks around Index only here (otherwise you are quoting the entire string).
Resolves #10596