-
Notifications
You must be signed in to change notification settings - Fork 107
Type hints #61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type hints #61
Conversation
@@ -7,16 +7,19 @@ | |||
import re | |||
import six | |||
from six import moves | |||
from typing import AnyStr, Optional, Iterable, Tuple, Union, Sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get off annoying bot, these imports are used. You're not aware of PEP-484.
Current coverage is 94.07%
|
def read_bom(data): | ||
# type: (bytes) -> Tuple[str, bytes] | ||
r"""Read the byte order mark in the text, if present, and | ||
return the encoding represented by the BOM and the BOM. | ||
|
||
If no BOM can be detected, ``(None, None)`` is returned. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nitpick - this is not reflected by the type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah; currently mypy assumes that None is a valid value for any type, so Tuple[str, bytes]
allows None, None
. I recall using Union[Tuple[str, bytes], Tuple[None, None]]
in some other place, but I'm not sure it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, didn't know that, thanks! mypy behavior makes sense for tuple, although not that I would expect it in general :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They have docs for this: http://mypy.readthedocs.io/en/latest/planned_features.html. Let me try Union again :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it returns a weird error for this line if annotation is changed to Union:
bom_enc, bom = read_bom(html_body_str)
w3lib/encoding.py: note: In function "html_to_unicode":
w3lib/encoding.py:274: error: 'Union[Tuple[builtins.str, builtins.bytes], Tuple[void, void]]' object is not iterable
It looks related to Union handling, not to Tuple[None, None] handling because it also fails if Tuple[str, str] is used as a second option for Union.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So mypy does not see that all union "kinds" here are iterable, that's a pity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
relevant mypy issue: python/mypy#1575
thanks @lopuhin for the catch
@@ -0,0 +1,25 @@ | |||
# -*- coding: utf-8 -*- | |||
""" | |||
Which string type to use? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
if we want to avoid the runtime dependency, we could use a hack like:
it'd make the imports kinda ugly, tho. |
FTR: I'll continue this PR soon. |
FTR: There was a lot of new development in past 2 years on typing. Now type annotations from third-party libraries can be even more useful, as mypy implemented support for https://www.python.org/dev/peps/pep-0561/ - it is possible to check user code against annotations of third-party libraries. |
Bumping to close outdated PR. |
I've tried to add PEP-484 type hints in this PR (checked using mypy). They showed how tricky are our string types :) For each string there are several options:
There are several interesting points revealed by these type checks, e.g. that
html_body_declared_encoding
only worked in Python 3 because stdlibencodings.normalize_encoding
accept binary values (I'm not sure this stdlib module is even documented).Update: this PR is rather useless for the end users because of python/mypy#1190; it also adds
typing
runtime dependency which could be good to avoid. Switch to stub files?