
TST: test failures on 3.4/windows for timezones #7420


Closed
jreback opened this issue Jun 10, 2014 · 17 comments · Fixed by #7478
Labels
Testing pandas testing functions or related to the test suite Timezones Timezone data dtype
Milestone

Comments

@jreback
Contributor

jreback commented Jun 10, 2014

 ======================================================================
ERROR: test_string_index_alias_tz_aware (pandas.tseries.tests.test_timezones.TestTimeZoneSupportDateutil)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 561, in test_string_index_alias_tz_aware
    self.assertAlmostEqual(result, ts[2])
  File "C:\Python34-64\lib\unittest\case.py", line 818, in assertAlmostEqual
    if first == second:
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\core\generic.py", line 692, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

======================================================================
FAIL: test_series_frame_tz_convert (pandas.tseries.tests.test_timezones.TestTimeZones)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 870, in test_series_frame_tz_convert
    assert_frame_equal(result, expected.T)
  File "c:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\util\testing.py", line 578, in assert_frame_equal
    assert col in right
AssertionError

----------------------------------------------------------------------
Ran 7330 tests in 533.475s

FAILED (SKIP=220, errors=1, failures=1)
@jreback jreback added this to the 0.14.1 milestone Jun 10, 2014
@jreback
Contributor Author

jreback commented Jun 10, 2014

cc @dbew

good news is all other versions are ok!

@jreback
Contributor Author

jreback commented Jun 10, 2014

This is on Python 3.4, both the 32-bit and 64-bit builds.

@dbew
Contributor

dbew commented Jun 11, 2014

Thanks. We're getting there.

Just looking at the test code, I'm surprised by the error for test_string_index_alias_tz_aware. The test should be comparing two floats, so I don't understand why it's getting a Series.

I can't compile pandas on python 3.4 on windows right now - I'm getting linker errors. I've tried fiddling around with the settings but no luck yet. I've been using anaconda + mingw so far but googling suggests that the platform sdk would be better. I can't get that set up at work though. I'll try and find some time when I get home this evening.

Btw, I've been doing the coding for this dateutil integration at work and the company I work for would like to have their contribution recognized in the release notes. I'm aware that this might not be the place to ask but I couldn't find a better place on github. Do you know if that's possible?

@jreback
Contributor Author

jreback commented Jun 11, 2014

Here's a copy of the scripts I use on Windows. I found installing the Windows SDK was easier and more straightforward than mingw. https://github.com/pydata/pandas/tree/master/scripts/windows_builder

@jreback
Contributor Author

jreback commented Jun 11, 2014

@dbew I don't have a problem with a company mention (and you too!). Something like 'contributed by David Bew and blah blah....'.

@jreback
Contributor Author

jreback commented Jun 11, 2014

here's some debugging info

======================================================================
FAIL: test_convert_datetime_list (pandas.tseries.tests.test_timezones.TestTimeZoneSupportPytz)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 696, in test_convert_datetime_list
    self.assertTrue(dr.equals(dr2))
AssertionError: False is not true

======================================================================
FAIL: test_utc_box_timestamp_and_localize (pandas.tseries.tests.test_timezones.TestTimeZoneSupportPytz)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 357, in test_utc_box_timestamp_and_localize
    self.assert_('EDT' in repr(rng_eastern[0].tzinfo) or 'tzfile' in repr(rng_eastern[0].tzinfo))
AssertionError: False is not true

======================================================================
FAIL: test_with_tz (pandas.tseries.tests.test_timezones.TestTimeZoneSupportPytz)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 407, in test_with_tz
    self.assertIs(central[0].tz, comp)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\util\testing.py", line 96, in assertIs
    assert a is b, "%s: %r is not %r" % (msg.format(a,b), a, b)
AssertionError: : <DstTzInfo 'US/Central' CDT-1 day, 19:00:00 DST> is not <DstTzInfo 'US/Central' CST-1 day, 18:00:00 STD>

======================================================================
FAIL: test_series_frame_tz_convert (pandas.tseries.tests.test_timezones.TestTimeZones)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py", line 870, in test_series_frame_tz_convert
    assert_frame_equal(result, expected.T)
  File "C:\Users\Jeff Reback\Documents\GitHub\pandas\build\lib.win-amd64-3.4\pandas\util\testing.py", line 578, in assert_frame_equal
    assert col in right
AssertionError

----------------------------------------------------------------------
Ran 126 tests in 0.931s

FAILED (SKIP=1, failures=4)

C:\Users\Jeff Reback\Documents\GitHub\pandas>c:\python34-64\Scripts\nosetests.exe build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py --pdb --pdb-failure
....................................................S...> c:\python34-64\lib\unittest\case.py(651)assertTrue()
-> raise self.failureException(msg)
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(696)test_convert_datetime_list()
-> self.assertTrue(dr.equals(dr2))
(Pdb) l
691
692         def test_convert_datetime_list(self):
693             dr = date_range('2012-06-02', periods=10, tz=self.tzstr('US/Eastern'))
694
695             dr2 = DatetimeIndex(list(dr), name='foo')
696  ->         self.assertTrue(dr.equals(dr2))
697             self.assertEqual(dr.tz, dr2.tz)
698             self.assertEqual(dr2.name, 'foo')
699
700         def test_frame_from_records_utc(self):
701             rec = {'datum': 1.5,
(Pdb) p df
*** NameError: name 'df' is not defined
(Pdb) p dr
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-06-02 00:00:00-05:00, ..., 2012-06-11 00:00:00-05:00]
Length: 10, Freq: D, Timezone: US/Eastern
(Pdb) p dr2
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-06-02 01:00:00-05:00, ..., 2012-06-11 01:00:00-05:00]
Length: 10, Freq: None, Timezone: US/Eastern
(Pdb) df2.inferred_freq
*** NameError: name 'df2' is not defined
(Pdb) dr2.inferred_freq
'D'
(Pdb) c
F...............................................> c:\python34-64\lib\unittest\case.py(651)assertTrue()
-> raise self.failureException(msg)
(Pdb) u
> c:\python34-64\lib\unittest\case.py(1287)deprecated_func()
-> return original_func(*args, **kwargs)
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(357)test_utc_box_timestamp_and_localize()
-> self.assert_('EDT' in repr(rng_eastern[0].tzinfo) or 'tzfile' in repr(rng_eastern[0].tzinfo))
(Pdb) l
352             # right tzinfo
353             rng = date_range('3/13/2012', '3/14/2012', freq='H', tz='utc')
354             rng_eastern = rng.tz_convert(self.tzstr('US/Eastern'))
355             # test not valid for dateutil timezones.
356             # self.assertIn('EDT', repr(rng_eastern[0].tzinfo))
357  ->         self.assert_('EDT' in repr(rng_eastern[0].tzinfo) or 'tzfile' in repr(rng_eastern[0].tzinfo))
358
359         def test_timestamp_tz_convert(self):
360             strdates = ['1/1/2012', '3/1/2012', '4/1/2012']
361             idx = DatetimeIndex(strdates, tz=self.tzstr('US/Eastern'))
362
(Pdb) p repr(rng_eastern[0].tzinfo)
"<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>"
(Pdb) p repr(rng_eastern[0].tzinfo)
"<DstTzInfo 'US/Eastern' EST-1 day, 19:00:00 STD>"
(Pdb) c
F..> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\util\testing.py(96)assertIs()
-> assert a is b, "%s: %r is not %r" % (msg.format(a,b), a, b)
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(407)test_with_tz()
-> self.assertIs(central[0].tz, comp)
(Pdb) u
> c:\python34-64\lib\unittest\case.py(574)run()
-> testMethod()
(Pdb) d
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(407)test_with_tz()
-> self.assertIs(central[0].tz, comp)
(Pdb) l
402
403             # normalized
404             central = dr.tz_convert(tz)
405             self.assertIs(central.tz, tz)
406             comp = self.localize(tz, central[0].to_pydatetime().replace(tzinfo=None)).tzinfo
407  ->         self.assertIs(central[0].tz, comp)
408
409             # compare vs a localized tz
410             comp = self.localize(tz, dr[0].to_pydatetime().replace(tzinfo=None)).tzinfo
411             self.assertIs(central[0].tz, comp)
412
(Pdb) p centra[0].tz
*** NameError: name 'centra' is not defined
(Pdb) p central[0].tz
<DstTzInfo 'US/Central' CDT-1 day, 19:00:00 DST>
(Pdb) p comp
<DstTzInfo 'US/Central' CST-1 day, 18:00:00 STD>
(Pdb) c
F.............> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\util\testing.py(578)assert_frame_equal()
-> assert col in right
(Pdb) u
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(870)test_series_frame_tz_convert()
-> assert_frame_equal(result, expected.T)
(Pdb) u
> c:\python34-64\lib\unittest\case.py(574)run()
-> testMethod()
(Pdb) d
> c:\users\jeff reback\documents\github\pandas\build\lib.win-amd64-3.4\pandas\tseries\tests\test_timezones.py(870)test_series_frame_tz_convert()
-> assert_frame_equal(result, expected.T)
(Pdb) l
865             assert_frame_equal(result, expected)
866
867             df = df.T
868             result = df.tz_convert('Europe/Berlin', axis=1)
869             self.assertEqual(result.columns.tz.zone, 'Europe/Berlin')
870  ->         assert_frame_equal(result, expected.T)
871
872             # can't convert tz-naive
873             rng = date_range('1/1/2011', periods=200, freq='D')
874             ts = Series(1, index=rng)
875             assertRaisesRegexp(TypeError, "Cannot convert tz-naive", ts.tz_convert, 'US/Eastern')
(Pdb) p result
   2011-01-01 06:00:00+02:00  2011-01-02 06:00:00+02:00  \
a                          1                          1

   2011-01-03 06:00:00+02:00  2011-01-04 06:00:00+02:00  \
a                          1                          1

   2011-01-05 06:00:00+02:00  2011-01-06 06:00:00+02:00  \
a                          1                          1

   2011-01-07 06:00:00+02:00  2011-01-08 06:00:00+02:00  \
a                          1                          1

   2011-01-09 06:00:00+02:00  2011-01-10 06:00:00+02:00  \
a                          1                          1

             ...              2011-07-10 06:00:00+01:00  \
a            ...                                      1

   2011-07-11 06:00:00+01:00  2011-07-12 06:00:00+01:00  \
a                          1                          1

   2011-07-13 06:00:00+01:00  2011-07-14 06:00:00+01:00  \
a                          1                          1

   2011-07-15 06:00:00+01:00  2011-07-16 06:00:00+01:00  \
a                          1                          1

   2011-07-17 06:00:00+01:00  2011-07-18 06:00:00+01:00  \
a                          1                          1

   2011-07-19 06:00:00+01:00
a                          1

[1 rows x 200 columns]
(Pdb) p expected.T
   2011-01-01 06:00:00+02:00  2011-01-02 06:00:00+02:00  \
a                          1                          1

   2011-01-03 06:00:00+02:00  2011-01-04 06:00:00+02:00  \
a                          1                          1

   2011-01-05 06:00:00+02:00  2011-01-06 06:00:00+02:00  \
a                          1                          1

   2011-01-07 06:00:00+02:00  2011-01-08 06:00:00+02:00  \
a                          1                          1

   2011-01-09 06:00:00+02:00  2011-01-10 06:00:00+02:00  \
a                          1                          1

             ...              2011-07-10 06:00:00+01:00  \
a            ...                                      1

   2011-07-11 06:00:00+01:00  2011-07-12 06:00:00+01:00  \
a                          1                          1

   2011-07-13 06:00:00+01:00  2011-07-14 06:00:00+01:00  \
a                          1                          1

   2011-07-15 06:00:00+01:00  2011-07-16 06:00:00+01:00  \
a                          1                          1

   2011-07-17 06:00:00+01:00  2011-07-18 06:00:00+01:00  \
a                          1                          1

   2011-07-19 06:00:00+01:00
a                          1

[1 rows x 200 columns]
(Pdb)

@dbew
Contributor

dbew commented Jun 12, 2014

How come there are four failures this time and test_string_index_alias_tz_aware isn't among them?

I made some progress with py3.4 - I can install cython successfully - but didn't quite get pandas building. I'll try again this evening.

@jreback
Contributor Author

jreback commented Jun 12, 2014

those other ones might be spurious.....

@jreback
Contributor Author

jreback commented Jun 14, 2014

@dbew ?

@dbew
Contributor

dbew commented Jun 16, 2014

Now have pandas compiled on python3.4 on windows. This is a bit weird.

If I run nosetests for the whole package then I get the original two failures. If I run the tests for just the test files in pandas/tseries/tests or even for the single file pandas/tseries/tests/test_timezones.py then I get 4 completely different test failures.

None of the tests fail if I run them one at a time with nose, and working through the original two failures in ipython I don't see any problems with the results. As far as I can tell, none of these tests should be failing.

I'll keep digging but if you have any ideas of what could be wrong I'd be pleased to hear them!

@jreback
Contributor Author

jreback commented Jun 16, 2014

hmm...maybe some sort of global/module state is being modified in the tests?

@dbew
Contributor

dbew commented Jun 16, 2014

I guess it must be - but why is that only a problem on python 3.4 on windows?

If there was a modification to the pandas global state which is breaking the tests, I'd expect that to be a problem across windows versions (and on linux, for that matter).

@jreback
Contributor Author

jreback commented Jun 16, 2014

@dbew agreed this is odd. But then again it's possible dateutil is doing something internally that is 'broken' in 3.4 in the specific cases we are testing. Not sure.

3.4 DOES break some other very subtle things (which might technically have been done incorrectly in prior versions).

@dbew
Contributor

dbew commented Jun 16, 2014

Ok, like I said, I'll keep digging and see if I can replicate this in a smaller example.

@dbew
Contributor

dbew commented Jun 16, 2014

I now have a nice small example that fails in ipython!

from pandas import date_range, DatetimeIndex
dr = date_range('2012-06-02', periods=10, tz='dateutil/US/Eastern')
dr2 = DatetimeIndex(list(dr), name='foo')
dr.equals(dr2)
OUT: True
dr = date_range('2012-06-02', periods=10, tz='US/Eastern')
dr2 = DatetimeIndex(list(dr), name='foo')
dr.equals(dr2)
OUT: False

@jreback
Contributor Author

jreback commented Jun 16, 2014

hmm, the 2nd just fails for a trivial reason, e.g. the frequency is not inferred (but if you look at dr2.inferred_freq it is correct). The frequency is not always inferred on creation, as that is only done on demand, though I think in this case it should be; I don't really remember.

@dbew
Contributor

dbew commented Jun 16, 2014

I've found the problem. I'll submit a PR shortly.

The issue was cache collisions between pytz and dateutil timezones in the transition/utc_offset caches in pandas.tslib.

I've modified the cache key so that it distinguishes between the two and added a test to ensure that this is the case. I think the intermittent nature was due to the order of use - e.g. using pytz before dateutil was ok because of the nature of the differences between them.
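
To illustrate the kind of collision described above, here is a minimal sketch in plain Python. It is not the pandas.tslib code: the two zone classes, the helper names, and the key formats are all hypothetical stand-ins, chosen only to show how a cache keyed on the zone name alone can hand one library's cached data to the other, and how including the library in the key avoids that.

```python
# Hypothetical stand-ins for timezone objects from two different libraries.
# Both describe "US/Eastern" but would store transition data differently.
class PytzLikeZone:
    def __init__(self, name):
        self.zone = name


class DateutilLikeZone:
    def __init__(self, name):
        self._filename = name


def zone_name(tz):
    # Illustrative helper: read the zone name from whichever attribute exists.
    return tz.zone if hasattr(tz, "zone") else tz._filename


def name_only_key(tz):
    # Collides: both libraries yield the same key for the same zone name.
    return zone_name(tz)


def library_aware_key(tz):
    # Distinguishes the two libraries by including the class name in the key.
    return (type(tz).__name__, zone_name(tz))


def compute_transitions(tz):
    # Placeholder for a computation whose result depends on the library.
    return "transitions computed from " + type(tz).__name__


def get_transitions(tz, cache, key_func):
    # Compute on first use, then serve every later lookup from the cache.
    key = key_func(tz)
    if key not in cache:
        cache[key] = compute_transitions(tz)
    return cache[key]


pytz_tz = PytzLikeZone("US/Eastern")
dateutil_tz = DateutilLikeZone("US/Eastern")

# Name-only key: whichever library is used first fills the single cache slot.
shared = {}
first = get_transitions(dateutil_tz, shared, name_only_key)
second = get_transitions(pytz_tz, shared, name_only_key)
collided = first == second  # the pytz-like lookup received the dateutil-like entry

# Library-aware key: each object gets its own cache entry.
separate = {}
a = get_transitions(dateutil_tz, separate, library_aware_key)
b = get_transitions(pytz_tz, separate, library_aware_key)
distinct = a != b
```

In this toy model either order of use collides; in the thread, only one order (dateutil before pytz) produced visibly wrong results, which would be consistent with failures that depend on which tests ran earlier in the same process.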
