Dataframe rename issue. #4403

halleygithub · 2013-07-30T03:03:33Z

I just upgraded from 0.11 to 0.12 and hit a DataFrame rename error caused by the upgrade. (It worked well in 0.11.)

>>> df4
                 TClose      RT    TExg
STK_ID RPT_Date                        
600809 20130331   22.02  0.0454  0.0422

>>> df5
                 STK_ID  RPT_Date STK_Name  TClose
STK_ID RPT_Date                                   
600809 20120930  600809  20120930     山西汾酒   38.05
       20121231  600809  20121231     山西汾酒   41.66
       20130331  600809  20130331     山西汾酒   30.01

>>> k=pd.merge(df4, df5, how='inner', left_index=True, right_index=True)
>>> k
                 TClose_x      RT    TExg  STK_ID  RPT_Date STK_Name  TClose_y
STK_ID RPT_Date                                                               
600809 20130331     22.02  0.0454  0.0422  600809  20130331     山西汾酒     30.01

>>> k.rename(columns={'TClose_x':'TClose', 'TClose_y':'QT_Close'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "d:\Python27\lib\site-packages\pandas\core\base.py", line 40, in __repr__
    return str(self)
  File "d:\Python27\lib\site-packages\pandas\core\base.py", line 20, in __str__
    return self.__bytes__()
  File "d:\Python27\lib\site-packages\pandas\core\base.py", line 32, in __bytes__
    return self.__unicode__().encode(encoding, 'replace')
  File "d:\Python27\lib\site-packages\pandas\core\frame.py", line 668, in __unicode__
    self.to_string(buf=buf)
  File "d:\Python27\lib\site-packages\pandas\core\frame.py", line 1556, in to_string
    formatter.to_string()
  File "d:\Python27\lib\site-packages\pandas\core\format.py", line 294, in to_string
    strcols = self._to_str_columns()
  File "d:\Python27\lib\site-packages\pandas\core\format.py", line 239, in _to_str_columns
    str_columns = self._get_formatted_column_labels()
  File "d:\Python27\lib\site-packages\pandas\core\format.py", line 435, in _get_formatted_column_labels
    dtypes = self.frame.dtypes
  File "d:\Python27\lib\site-packages\pandas\core\frame.py", line 1696, in dtypes
    return self.apply(lambda x: x.dtype)
  File "d:\Python27\lib\site-packages\pandas\core\frame.py", line 4416, in apply
    return self._apply_standard(f, axis)
  File "d:\Python27\lib\site-packages\pandas\core\frame.py", line 4491, in _apply_standard
    raise e
TypeError: ("'NoneType' object is not iterable", u'occurred at index TExg')

>>> df4.dtypes
TClose    float64
RT        float64
TExg      float64
dtype: object

>>> df5.dtypes
STK_ID       object
RPT_Date     object
STK_Name     object
TClose      float64
dtype: object
>>>


jreback · 2013-07-30T14:34:45Z

can you supply a reproducible example for these initial frames (e.g. a function which creates them exactly)

e.g. something that can be evaled to create them, because we need to reproduce the unicode characters
(this is a unicode error), it just happens to show up in the dtype printing

DataFrame([['foo',1.0....])

cpcloud · 2013-07-30T21:21:19Z

i think that's a possibly spurious raise there...it should probably be a bare raise since NoneType not being iterable is not informative

cpcloud · 2013-07-30T21:55:17Z

i can repro this using the above frames

@halleygithub please supply some code to create the above frames.

there's a bug in icol or BlockManager.iget

cpcloud · 2013-07-30T22:00:00Z

ahh duplicate TExg block somehow...

cpcloud · 2013-07-30T22:00:32Z

we really need to remove that raise e there that's only way i was able to figure out this was in internals

jreback · 2013-07-30T22:02:31Z

no that raise is correct

just str(df)

cpcloud · 2013-07-30T22:09:47Z

huh? the raise doesn't show the correct location of the exception because it catches everything

here's part of the traceback

/home/phillip/Documents/code/py/pandas/pandas/core/frame.pyc in dtypes(self)
   1685     @property
   1686     def dtypes(self):
-> 1687         return self.apply(lambda x: x.dtype)
   1688
   1689     def convert_objects(self, convert_dates=True, convert_numeric=False, copy=True):

/home/phillip/Documents/code/py/pandas/pandas/core/frame.pyc in apply(self, func, axis, broadcast, raw, args, **kwds)
   4397                     return self._apply_raw(f, axis)
   4398                 else:
-> 4399                     return self._apply_standard(f, axis)
   4400             else:
   4401                 return self._apply_broadcast(f, axis)

/home/phillip/Documents/code/py/pandas/pandas/core/frame.pyc in _apply_standard(self, func, axis, ignore_failures)
   4472                     # no k defined yet
   4473                     pass
-> 4474                 raise e
   4475
   4476

TypeError: ("'NoneType' object is not iterable", u'occurred at index TExg')

this doesn't tell me anything about the location of the raise except that it was somewhere in looping thru series_gen

only when i removed e did the full traceback show up

maybe there's a way to show that without removing the e...

how would it be different anyway? would the possibly caught NameError / UnboundLocalError be raised instead?
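A minimal sketch of the point about the bare raise, using hypothetical stand-in functions (`compute_dtype`, `apply_with_context`) rather than the real pandas internals. Under Python 2.7, as used in this thread, `raise e` restarts the traceback at the re-raise, which is why only `_apply_standard` showed up. A bare `raise` re-raises the active exception with its original traceback, and the index context can still be appended to `args` first. Note that in Python 3 the traceback is attached to the exception object, so the difference is much less visible there.

```python
import traceback


def compute_dtype(column):
    # Hypothetical stand-in for the internals call that failed in the thread.
    for _ in column:  # raises TypeError when column is None
        pass
    return "float64"


def apply_with_context(columns):
    # Loop over named columns and add the column name to any failure.
    for name, column in columns.items():
        try:
            compute_dtype(column)
        except Exception as exc:
            exc.args = exc.args + ("occurred at index %s" % name,)
            raise  # bare raise: keeps the traceback of the original failure


def frames_in_traceback():
    # Return the function names recorded in the traceback, outermost first,
    # together with the augmented exception args.
    try:
        apply_with_context({"RT": [0.0454], "TExg": None})
    except TypeError as exc:
        tb = traceback.extract_tb(exc.__traceback__)
        return [frame.name for frame in tb], exc.args


names, args = frames_in_traceback()
print(names[-1])  # innermost frame, i.e. where the error really happened
print(args)
```

The innermost recorded frame is the function that actually failed, not the loop that re-raised, while the "occurred at index" hint is still present in the message.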

cpcloud · 2013-07-30T22:11:51Z

In [4]: df4 = DataFrame({'TClose': [22.02], 'RT': [0.0454], 'TExg': [0.0422]}, index=MultiIndex.from_tuples([(600809, 20130331)], names=['STK_ID', 'RPT_Date']))

In [5]: df5 = DataFrame({'STK_ID': [600809] * 3, 'RPT_Date': [20120930,20121231,20130331], 'STK_Name': [u'饡驦', u'饡驦', u'饡驦'], 'TClose': [38.05, 41.66, 30.01]},index=MultiIndex.from_tuples([(600809, 20120930), (600809, 20121231),(600809,20130331)], names=['STK_ID', 'RPT_Date']))

In [6]: k = merge(df4,df5,how='inner',left_index=True,right_index=True)

different characters but same error results.
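For anyone who wants to check their own install, the steps above can be put into one standalone script. This is a sketch, assuming a current pandas with Python 3 (so no `u''` prefixes) and using the stock name from the original report. On a release that includes the fix, the rename and the repr should both complete without the TypeError.

```python
import pandas as pd

names = ["STK_ID", "RPT_Date"]

# Left frame: a single (STK_ID, RPT_Date) row.
index4 = pd.MultiIndex.from_tuples([(600809, 20130331)], names=names)
df4 = pd.DataFrame(
    {"TClose": [22.02], "RT": [0.0454], "TExg": [0.0422]},
    index=index4,
)

# Right frame: three report dates for the same stock.
index5 = pd.MultiIndex.from_tuples(
    [(600809, 20120930), (600809, 20121231), (600809, 20130331)],
    names=names,
)
df5 = pd.DataFrame(
    {
        "STK_ID": [600809] * 3,
        "RPT_Date": [20120930, 20121231, 20130331],
        "STK_Name": ["山西汾酒"] * 3,
        "TClose": [38.05, 41.66, 30.01],
    },
    index=index5,
)

# Index-on-index inner merge; the shared TClose column gets _x / _y suffixes.
k = pd.merge(df4, df5, how="inner", left_index=True, right_index=True)

# The step that failed in 0.12: rename, then render the result.
renamed = k.rename(columns={"TClose_x": "TClose", "TClose_y": "QT_Close"})
print(renamed)
print(renamed.dtypes)
```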

cpcloud · 2013-07-30T22:12:54Z

curiously if you type store k then restart ipython, type store -r k and then

k.rename(columns={'TClose_x':'TClose'})

the error does not show up 😠
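IPython's `%store` persists variables by pickling them, so the same experiment can be tried outside IPython with a plain pickle round trip. A sketch with a small hypothetical stand-in for `k`; that the round trip rebuilds the frame's internal layout, and that this is why the error disappears, is only an assumption here.

```python
import pickle

import pandas as pd

# Hypothetical stand-in for the merged frame k from the session above.
k = pd.DataFrame({"TClose_x": [22.02], "TExg": [0.0422], "TClose_y": [30.01]})

# Serialize and restore, roughly what `store k` / `store -r k` do.
restored = pickle.loads(pickle.dumps(k))

# The restored frame holds the same data as the original ...
pd.testing.assert_frame_equal(k, restored)

# ... and the rename is then applied to the restored copy.
renamed = restored.rename(columns={"TClose_x": "TClose"})
print(renamed)
```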

jreback · 2013-07-30T22:13:11Z

I think there is a pr out there to take out the e

but regardless the apply hits the error but its really in the construction

can u post your creation example?

cpcloud · 2013-07-30T22:14:33Z

it's there

cpcloud · 2013-07-30T22:21:44Z

this seems fishy

ipdb> self.items
Index([u'RT', u'TClose', u'TExg', u'RPT_Date', u'STK_ID', u'STK_Name', u'TClose_y'], dtype=object)
ipdb> self.blocks
[ObjectBlock: [TExg], 1 x 1, dtype object, IntBlock: [RT, TClose], 2 x 1, dtype int64, FloatBlock: [RT, TClose, TExg, TClose_y], 4 x 1, dtype float64]

where is RPT_Date in the blocks?

jreback · 2013-07-31T01:06:46Z

@halleygithub thanks for the report
turned out to be a very subtle issue

halleygithub · 2013-07-31T02:09:53Z

I attached the cPickle dump file of (df4, df5) here: http://ajqznkugcw.l25.yunpan.cn/lk/QnPqhJRCMdspq

So if you want, you can download it and take a look.

It seems that the issue is solved. So how can I resolve my problem? Can I get the latest daily development builds of the pandas Windows binaries from http://pandas.pydata.org/pandas-build/dev/ ?

My application hit several issues after upgrading, and I need to test them one by one. Thanks.

halleygithub · 2013-07-31T04:36:17Z

OK. I manually revised merge.py and got things running. Still hoping for binary builds. Thanks.

jreback · 2013-07-31T10:57:35Z

Great
periodically check back for the dev builds

smcinerney · 2013-08-15T23:09:28Z

Can you please add this as a known-issue in the 0.12 whatsnew? along with the DeprecationWarnings?

cpcloud · 2013-08-15T23:14:24Z

We can add it in the dev docs, but i'm pretty sure things are "frozen" for 0.12 stuff

cpcloud · 2013-08-15T23:14:35Z

Would you like to submit a pull request?

jtratner · 2013-08-15T23:16:19Z

@cpcloud Could pandas do a point release? (maybe before @jreback's Series refactor).

cpcloud · 2013-08-15T23:17:58Z

possibly, although i'm not sure if we ever came to a consensus there

cc @y-p since he suggested it on the dev mailing list a little before 0.12 came out

i'm 👍 on doing a point release

@wesm ?

wesm · 2013-08-15T23:21:50Z

What's the status of master? Do we need to create a maintenance branch and start backporting bug fixes?

jreback · 2013-08-15T23:23:02Z

this is fixed in 0.12 IIRC

jreback · 2013-08-15T23:24:07Z

actually master at this point is ok if u really wanted to release

smcinerney · 2013-08-15T23:35:33Z

merge() is broken in the 0.12 macports release I got yesterday

cpcloud · 2013-08-15T23:35:59Z

Can you be a bit more specific than just "broken"? Please open an issue if you can.

jtratner · 2013-08-15T23:44:13Z

@jreback this is not fixed in 0.12. checkout of v0.12.0 and running this (on OSX) still causes the failure described above.

smcinerney · 2013-08-15T23:44:38Z

I'm saying that this issue 4403 (merge breaks on indexing) is still in the 0.12 release on macports. People will hit this and at minimum it needs to be documented as a known-issue in the whatsnew, or some such. I had to manually apply the changes from pull request 4410.

jreback · 2013-08-16T01:51:55Z

@jtratner I stand corrected, this was fixed early in 0.13
but this is still actually pretty hard to reproduce
you have to do very specific things to create it

IMHO this is not worth a 0.12.1 at this point
lets figure out a timeline for 0.13

smcinerney · 2013-08-16T02:03:03Z

@jreback, does it not occur on (any?) df merge with a non-unique index?

Separate to the timeline for merging the fix, I'm suggesting this be noted in the 0.12 whatsnew.

jreback · 2013-08-16T02:10:14Z

@smcinerney no, this only occurs after a merge with a non-unique index where u then rename the merged result

I am not averse to posting something in the docs. though I have found that people usually just ask on so, mailing list or post an issue

since everyone is now aware I think we can respond pretty easily

(even issues that have really big and bold warnings are often ignored in the docs :)

jreback mentioned this issue Jul 31, 2013

BUG: Fix an issue in merging blocks where the resulting DataFrame had partially set _ref_locs #4410

Merged

jreback closed this as completed in #4410 Jul 31, 2013

jtratner mentioned this issue Aug 15, 2013

ENH/API: Keep original traceback in DataFrame.apply #4549

Merged
