Empty subtypes of Index return their type, rather than Index #10599

max-sixty · 2015-07-16T00:25:13Z

Resolves #10596

jreback · 2015-07-16T00:27:00Z

pandas/tests/test_index.py

@@ -938,6 +938,10 @@ def test_difference(self):
        self.assertEqual(len(result), 0)
        self.assertEqual(result.name, first.name)

+        # empty difference for subtypes
+        result = self.periodIndex.difference(self.periodIndex)
+        self.assertIsInstance(result, pd.PeriodIndex)


use tm.assert_index_equal(...) here (and directly construct the expected)
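
A minimal sketch of that suggestion. It assumes a current pandas, where the helper is exposed publicly as `pandas.testing.assert_index_equal` (the 2015 test suite reaches it through the `tm` alias); the index values, the `M` frequency and the `months` name are illustrative and not taken from the PR:

```python
import pandas as pd

# Illustrative index; the PR's test uses self.periodIndex instead.
idx = pd.period_range("2015-01", periods=3, freq="M", name="months")

result = idx.difference(idx)

# Construct the expected empty index directly, including its metadata.
expected = pd.PeriodIndex([], freq="M", name="months")

# Compares class, dtype (which carries the frequency) and name,
# so it is stricter than an isinstance check on its own.
pd.testing.assert_index_equal(result, expected)
```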

jreback · 2015-07-16T17:00:48Z

pandas/tests/test_index.py

+            # GH 10596 - empty difference retains index's type
+
+            result = idx.difference(idx)
+            self.assertIsInstance(result, type(idx))


so the way to do this generically is to define a create_empty method, analogous to create_index, for each subclass, which returns an empty version of create_index.
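
One way this could look, as a rough sketch only. The `Base` mixin, the test class names and the test method name are hypothetical and do not mirror the actual layout of pandas/tests/test_index.py; `create_empty` is the helper proposed in the comment above:

```python
import unittest

import pandas as pd


class Base:
    """Hypothetical shared checks; each subclass supplies its own indexes."""

    def create_index(self):
        raise NotImplementedError

    def create_empty(self):
        raise NotImplementedError

    def test_empty_difference_retains_type(self):
        idx = self.create_index()
        result = idx.difference(idx)
        # Compare against a directly constructed empty index of the subclass.
        self.assertTrue(result.equals(self.create_empty()))
        self.assertIsInstance(result, type(idx))


class TestPeriodIndex(Base, unittest.TestCase):
    def create_index(self):
        return pd.period_range("2015-01", periods=5, freq="M")

    def create_empty(self):
        # Same construction as create_index, but with zero periods.
        return pd.period_range("2015-01", periods=0, freq="M")


class TestTimedeltaIndex(Base, unittest.TestCase):
    def create_index(self):
        return pd.timedelta_range("1 days", periods=5, freq="D")

    def create_empty(self):
        return pd.timedelta_range("1 days", periods=0, freq="D")
```

Because `Base` does not inherit from `unittest.TestCase`, the shared test only runs through the concrete subclasses.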

max-sixty · 2015-07-17T19:27:44Z

@jreback I'm having some issues with comparing empty Indexes. For Timedelta, Datetime & Categorical, I get the equivalent of this:

AssertionError: TimedeltaIndex([], dtype='timedelta64[ns]', freq='D') != TimedeltaIndex([], dtype='timedelta64[ns]', freq='D')

I'd love to take a look at the wider problem in the future; in the meantime is there a way of getting this fix in? Is there a strong reason you don't like the type test rather than the direct comparison?

jreback · 2015-07-17T19:56:07Z

no, use idx1.equals(idx2)
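
A small sketch of the difference, assuming a current pandas. One plausible reading of the AssertionError above (an assumption, not confirmed in this thread) is that `==` on two indexes compares element-wise, so two empty indexes produce an empty boolean array rather than a single True/False:

```python
import pandas as pd

# Two empty TimedeltaIndex objects built in different ways (illustrative).
a = pd.timedelta_range("1 days", periods=3, freq="D")[:0]
b = pd.timedelta_range("1 days", periods=0, freq="D")

# Element-wise comparison: the result is an array with one entry per element,
# so for empty indexes it has no entries at all.
elementwise = a == b
print(len(elementwise))

# equals() answers the question for the whole index with a single bool.
print(a.equals(b))
```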

jreback · 2015-07-17T19:57:21Z

just asserting the type papers over the metadata propagation, so it just creates a new bug rather than fixing the existing one.
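
To illustrate the point with a hedged sketch: the `lossy` index below is a made-up stand-in for a faulty result that has the right class but has dropped its name. It is not output from the PR's code, and `pandas.testing.assert_index_equal` is used here as the strict check because it also compares names:

```python
import pandas as pd

idx = pd.period_range("2015-01", periods=3, freq="M", name="months")
expected = idx[:0]  # empty, same class, same name

# Hypothetical faulty result: correct class, but the name was lost.
lossy = pd.PeriodIndex([], freq="M")

# A type-only assertion is satisfied by the faulty result.
type_check_passes = isinstance(lossy, type(idx))

# A full comparison also looks at the name and rejects it.
try:
    pd.testing.assert_index_equal(lossy, expected)
    full_check_passes = True
except AssertionError:
    full_check_passes = False

print(type_check_passes, full_check_passes)
```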

max-sixty · 2015-07-17T23:42:21Z

Great @jreback, that should be good to go. Let me know if I've missed anything. Cheers

jreback · 2015-07-17T23:53:43Z

pls add a release note in whatsnew/v0.17.0

squash to a single commit

ping when green.

max-sixty · 2015-07-19T00:44:27Z

I have a failing PyTables test - does anyone have any idea what this might be? I've dug around but I'm not getting anywhere fast, and I have zero PyTables experience and no installation.

ERROR: test_to_hdf_with_object_column_names (pandas.io.tests.test_pytables.TestHDFStore)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/pydata/pandas/pandas/io/tests/test_pytables.py", line 4678, in test_to_hdf_with_object_column_names
    df.to_hdf(path, 'df', format='table', data_columns=True)
  File "/home/travis/build/pydata/pandas/pandas/core/generic.py", line 920, in to_hdf
    return pytables.to_hdf(path_or_buf, key, self, **kwargs)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 269, in to_hdf
    f(store)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 264, in <lambda>
    f = lambda store: store.put(key, value, **kwargs)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 826, in put
    self._write_to_group(key, value, append=append, **kwargs)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 1275, in _write_to_group
    s.write(obj=value, append=append, complib=complib, **kwargs)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 3798, in write
    **kwargs)
  File "/home/travis/build/pydata/pandas/pandas/io/pytables.py", line 3404, in create_axes
    axis=axis
  File "/home/travis/build/pydata/pandas/pandas/core/frame.py", line 2522, in reindex_axis
    fill_value=fill_value)
  File "/home/travis/build/pydata/pandas/pandas/core/generic.py", line 1852, in reindex_axis
    limit=limit)
  File "/home/travis/build/pydata/pandas/pandas/core/index.py", line 3062, in reindex
    if not is_categorical_dtype(target) and not target.is_unique:
  File "properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:39442)
  File "/home/travis/build/pydata/pandas/pandas/core/index.py", line 744, in is_unique
    return self._engine.is_unique
  File "index.pyx", line 213, in pandas.index.IndexEngine.is_unique.__get__ (pandas/index.c:4392)
  File "index.pyx", line 248, in pandas.index.IndexEngine._do_unique_check (pandas/index.c:4874)
  File "index.pyx", line 261, in pandas.index.IndexEngine._ensure_mapping_populated (pandas/index.c:5047)
  File "index.pyx", line 267, in pandas.index.IndexEngine.initialize (pandas/index.c:5135)
  File "hashtable.pyx", line 703, in pandas.hashtable.PyObjectHashTable.map_locations (pandas/hashtable.c:11242)
ValueError: Buffer dtype mismatch, expected 'Python object' but got 'double'

jreback · 2015-07-19T01:11:33Z

what are the columns on the df
(this is the point of the test - you can only have string columns, not numbers and such)

max-sixty · 2015-07-19T01:28:16Z

The columns are either strings or categoricals - that's the line this fails on.
If you can cut through to what this issue is, that'd be awesome. Otherwise I can try to install PyTables and debug from the bottom.

        types_should_run = [ tm.makeStringIndex, tm.makeCategoricalIndex ]
...
        for index in types_should_run:
            df = DataFrame(np.random.randn(10, 2), columns=index(2))
            with ensure_clean_path(self.path) as path:
                df.to_hdf(path, 'df', format='table', data_columns=True)
                result = pd.read_hdf(path, 'df', where="index = [{0}]".format(df.index[0]))
                assert(len(result))

jreback · 2015-07-20T12:33:48Z

doc/source/whatsnew/v0.17.0.txt

@@ -344,7 +344,7 @@ Bug Fixes
 - Bug in ``ExcelReader`` when worksheet is empty (:issue:`6403`)
 - Bug in ``Table.select_column`` where name is not preserved (:issue:`10392`)
 - Bug in ``offsets.generate_range`` where ``start`` and ``end`` have finer precision than ``offset`` (:issue:`9907`)
-
+- Bug in ``Subclasses of Index with no values returned Index objects rather than their own classes, in some cases`` (:issue:`10596`)


use the double backticks around Index only here (otherwise you are quoting the entire string).

jreback added the Indexing, Period, and Compat labels Jul 16, 2015

jreback added this to the 0.17.0 milestone Jul 16, 2015

jreback reviewed Jul 16, 2015

max-sixty force-pushed the master branch from 19581a7 to 4f9d01a Compare July 16, 2015 16:47

jreback reviewed Jul 16, 2015

max-sixty force-pushed the master branch from dd33a1d to c8ffc37 Compare July 17, 2015 19:10

max-sixty force-pushed the master branch 3 times, most recently from 767f716 to d1e478f Compare July 17, 2015 23:41

max-sixty force-pushed the master branch 3 times, most recently from d64c90b to beab262 Compare July 18, 2015 22:41

max-sixty force-pushed the master branch from beab262 to 4e6afb6 Compare July 19, 2015 01:34

max-sixty mentioned this pull request Jul 19, 2015

Drop & insert on subtypes of index return their subtypes #10620

Closed

jreback reviewed Jul 20, 2015

max-sixty closed this Jul 24, 2015

max-sixty force-pushed the master branch from 4e6afb6 to ebea3a3 Compare July 24, 2015 07:34

max-sixty mentioned this pull request Jul 28, 2015

Empty subtypes of Index return their type, rather than Index #10687

Closed

jreback mentioned this pull request Jul 28, 2015

Methods on an PeriodIndex that return an empty set don't return a PeriodIndex object #10596

Closed
