UnicodeDecodeError (INTERNALERROR) when doctests contain Unicode #628

Closed
pytestbot opened this issue Nov 7, 2014 · 10 comments · Fixed by #1318
Labels
plugin: doctests related to the doctests builtin plugin type: bug problem that needs to be addressed

Comments

@pytestbot
Contributor

Originally reported by: Jason R. Coombs (BitBucket: jaraco, GitHub: jaraco)


Consider this test:

def foo():
    """
    >>> name = 'с'
    'anything'
    """

Note that this isn't the Latin letter 'c' but the Cyrillic letter that sounds like 's'.

Save it as mod.py, then invoke pytest --doctest-modules. It will fail with this output:

> py.test --doctest-modules
============================= test session starts =============================

platform win32 -- Python 3.4.2 -- py-1.4.26 -- pytest-2.6.4
collected 1 items

mod.py
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\main.py", line 84, in wrap_session
INTERNALERROR>     doit(config, session)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\main.py", line 122, in _main
INTERNALERROR>     config.hook.pytest_runtestloop(session=session)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 413, in __call__
INTERNALERROR>     return self._docall(methods, kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 424, in _docall
INTERNALERROR>     res = mc.execute()
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 315, in execute
INTERNALERROR>     res = method(**kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\main.py", line 142, in pytest_runtestloop
INTERNALERROR>     item.config.hook.pytest_runtest_protocol(item=item, nextitem=nextitem)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 413, in __call__
INTERNALERROR>     return self._docall(methods, kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 424, in _docall
INTERNALERROR>     res = mc.execute()
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 315, in execute
INTERNALERROR>     res = method(**kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\runner.py", line 65, in pytest_runtest_protocol
INTERNALERROR>     runtestprotocol(item, nextitem=nextitem)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\runner.py", line 75, in runtestprotocol
INTERNALERROR>     reports.append(call_and_report(item, "call", log))
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\runner.py", line 111, in call_and_report
INTERNALERROR>     report = hook.pytest_runtest_makereport(item=item, call=call)

INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\main.py", line 167, in call_matching_hooks
INTERNALERROR>     return hookmethod.pcall(plugins, **kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 417, in pcall
INTERNALERROR>     return self._docall(methods, kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 424, in _docall
INTERNALERROR>     res = mc.execute()
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\core.py", line 315, in execute
INTERNALERROR>     res = method(**kwargs)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\runner.py", line 214, in pytest_runtest_makereport
INTERNALERROR>     longrepr = item.repr_failure(excinfo)
INTERNALERROR>   File "c:\python\lib\site-packages\pytest-2.6.4-py3.4.egg\_pytest\doctest.py", line 62, in repr_failure
INTERNALERROR>     filelines = py.path.local(filename).readlines(cr=0)
INTERNALERROR>   File "c:\python\lib\site-packages\py-1.4.26-py3.4.egg\py\_path\common.py", line 141, in readlines
INTERNALERROR>     content = self.read('rU')
INTERNALERROR>   File "c:\python\lib\site-packages\py-1.4.26-py3.4.egg\py\_path\common.py", line 135, in read
INTERNALERROR>     return f.read()
INTERNALERROR>   File "c:\python\lib\encodings\cp1252.py", line 23, in decode
INTERNALERROR>     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
INTERNALERROR> UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 36: character maps to <undefined>

==============================  in 0.30 seconds ===============================
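The last line of the traceback can be checked at the byte level without pytest. The following is a minimal sketch, not part of the original report: the helper name `decode_error` is made up for illustration, and it assumes the module was saved as UTF-8, which is consistent with the 0x81 byte in the error message.

```python
# The Cyrillic letter from the doctest, encoded the way the module is assumed
# to be saved on disk (UTF-8).
cyrillic_es = "\u0441"  # CYRILLIC SMALL LETTER ES
utf8_bytes = cyrillic_es.encode("utf-8")
print(utf8_bytes)


def decode_error(data, encoding):
    """Return the UnicodeDecodeError raised while decoding, or None on success.

    Hypothetical helper, introduced only for this illustration.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc
    return None


# cp1252 has no mapping for byte 0x81, so decoding the UTF-8 bytes fails.
err = decode_error(utf8_bytes, "cp1252")
print(err)

# Decoding with the encoding that was actually used succeeds.
print(decode_error(utf8_bytes, "utf-8"))
```

The failing byte is the second byte of the two-byte UTF-8 sequence for the letter, which matches the byte value named in the traceback.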

@pytestbot
Contributor Author

Original comment by Jason R. Coombs (BitBucket: jaraco, GitHub: jaraco):


The test that elicited this error has been around for a long time in the irc library. I'm pretty sure the test worked at one time, but I even rolled back to pytest 2.5.2, and it still failed (though more silently). I suspect the issue lies in py.

@pytestbot
Contributor Author

Original comment by Jason R. Coombs (BitBucket: jaraco, GitHub: jaraco):


Aha! The error only occurs when the doctest fails. If the doctest is passing, the error doesn't occur.

@pytestbot pytestbot added the type: bug problem that needs to be addressed label Jun 15, 2015
@pfctdayelise pfctdayelise added plugin: doctests related to the doctests builtin plugin and removed type: bug problem that needs to be addressed labels Jul 25, 2015
@pfctdayelise
Contributor

Related to #710?

@pfctdayelise pfctdayelise added the type: bug problem that needs to be addressed label Jul 25, 2015
@RonnyPfannschmidt
Member

Re-notifying @jaraco

@jaraco
Contributor

jaraco commented Jul 26, 2015

I don't think it is related. #710 is about comparing matching and non-matching results, whereas this issue occurs when a Unicode character that is not in the file system encoding (or maybe another encoding) appears in a failing doctest. I haven't tried this on a non-Windows system, so I'll do that now to see if it behaves differently or if I can identify which encoding is relevant.

@jaraco
Contributor

jaraco commented Jul 26, 2015

On OS X, the test fails as expected:

$ py.test test.py --doctest-modules
====================================== test session starts =======================================
platform darwin -- Python 3.4.3 -- py-1.4.30 -- pytest-2.7.2
rootdir: /Users/jaraco/Dropbox/code/public/pytest, inifile: tox.ini
collected 1 items 

test.py F

============================================ FAILURES ============================================
_______________________________________ [doctest] test.foo _______________________________________
002     """
003     >>> name = 'с'
Expected:
    'anything'
Got nothing

/Users/jaraco/Dropbox/code/public/pytest/issue628/test.py:3: DocTestFailure
==================================== 1 failed in 0.08 seconds ====================================

On that system, getfilesystemencoding returns 'utf-8'.
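To compare systems, it may help to print the encodings Python reports side by side. This is a small diagnostic sketch; the function name `report_encodings` is an assumption made for illustration and is not part of pytest or py.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings that commonly matter when reading text files.

    Hypothetical helper, introduced only for this illustration.
    """
    return {
        # Used to encode and decode file names.
        "filesystem": sys.getfilesystemencoding(),
        # Default for text-mode open() when no encoding argument is given.
        "preferred": locale.getpreferredencoding(False),
        # Default for str.encode() / bytes.decode().
        "default": sys.getdefaultencoding(),
    }


for name, value in report_encodings().items():
    print("{}: {}".format(name, value))
```

The "preferred" value is the one to watch, since a text-mode `open()` without an explicit `encoding` falls back to it by default.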

@jaraco
Contributor

jaraco commented Jul 26, 2015

The following standalone Python script replicates the error message when run on Windows, or on any other system where locale.getpreferredencoding(False) == 'cp1252' (see the documentation for open):

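some_unicode = 'с'

# Write the character to disk as UTF-8 bytes.
with open('file.txt', 'wb') as binary_file:
    binary_file.write(some_unicode.encode('utf-8'))

# Read it back in text mode with no explicit encoding, so the locale default is used.
with open('file.txt', 'r') as text_file:
    result = text_file.read()

# Under cp1252 the read above raises UnicodeDecodeError before this line is reached.
assert some_unicode == result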
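The script above only fails where the locale default is cp1252. A variant that passes the encoding explicitly should behave the same on any platform. This is a sketch under that assumption; `read_with` is a name made up for illustration, and a temporary file stands in for file.txt.

```python
import os
import tempfile

some_unicode = "\u0441"  # the same Cyrillic letter as in the script above


def read_with(encoding):
    """Write the character as UTF-8, then read it back using `encoding`.

    Hypothetical helper, introduced only for this illustration.
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "wb") as binary_file:
            binary_file.write(some_unicode.encode("utf-8"))
        with open(path, "r", encoding=encoding) as text_file:
            return text_file.read()
    finally:
        os.remove(path)


# Forcing cp1252 simulates the Windows default on any system.
try:
    read_with("cp1252")
    failed = False
except UnicodeDecodeError:
    failed = True

print("cp1252 read failed:", failed)
print("utf-8 read round-trips:", read_with("utf-8") == some_unicode)
```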

@jaraco
Contributor

jaraco commented Jul 26, 2015

It seems the issue is that something in the doctest runner is saving the output to a file using UTF-8 encoding, but then when it attempts to read that file, it defaults to locale.getpreferredencoding. The patch applied in my fork (referenced above) corrects the issue (though I suspect it may not work on Python 2 and probably deserves more review).
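One way to avoid depending on the locale when re-reading a Python source file is to let the standard library detect the file's declared encoding. The sketch below is an illustration only and is not necessarily what the patch in the fork or the eventual fix does. The function name `read_source_lines` is hypothetical, and `tokenize.open` exists only on Python 3, so the Python 2 concern above still applies.

```python
import os
import tempfile
import tokenize


def read_source_lines(filename):
    """Read a Python source file using its declared encoding (PEP 263).

    tokenize.open() falls back to UTF-8 when no coding cookie or BOM is
    present, regardless of the locale's preferred encoding.
    Hypothetical helper, introduced only for this illustration.
    """
    with tokenize.open(filename) as source_file:
        return source_file.read().splitlines()


# A module like the one in the report, saved as UTF-8.
source = (
    "def foo():\n"
    '    """\n'
    "    >>> name = '\u0441'\n"
    "    'anything'\n"
    '    """\n'
)

fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as module_file:
    module_file.write(source.encode("utf-8"))

try:
    lines = read_source_lines(path)
finally:
    os.remove(path)

print(lines[2])
```

Passing `encoding='utf-8'` to `open()` would be a simpler alternative, at the cost of ignoring any encoding the source file declares for itself.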

@RonnyPfannschmidt
Member

@jaraco Should one of us pick it up? I have @nicoddemus in mind.

@jaraco
Contributor

jaraco commented Jan 1, 2016

@RonnyPfannschmidt Yes, please - if possible. I don't have any actions or ideas pending beyond what I've mentioned above.
