Skip to content

Add type annotations #883

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 40 commits into from
Feb 17, 2021
Merged

Add type annotations #883

merged 40 commits into from
Feb 17, 2021

Conversation

isac322
Copy link
Contributor

@isac322 isac322 commented Nov 18, 2020

Pull Request Checklist

Thank you for taking the time to improve Arrow! Before submitting your pull request, please check all appropriate boxes:

  • 🧪 Added tests for changed code.
  • 🛠️ All tests pass when run locally (run tox or make test to find out!).
  • 🧹 All linting checks pass when run locally (run tox -e lint or make lint to find out!).
  • 📚 Updated documentation for changed code.
  • ⏩ Code is up-to-date with the master branch.

If you have any questions about your code changes or any of the points above, please submit your questions along with the pull request and we will try our best to help!

Description of Changes

  • Added type annotation to all functions and methods.
  • Add runtime dependency typing_extensions for backward compatibility with 3.6 and 3.7
  • Add mypy git pre-commit hook
  • Add mypy configuration file (setup.cfg)
  • Refactor some logics for better type annotation
    • I will create comments for these

Please review this with my comments on PR. I have questions.

Copy link
Member

@jadchaar jadchaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small comments on first glance of a few files.

Copy link
Member

@jadchaar jadchaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was finally able to get through all the files and left a final round of feedback.

Happy holidays to you @isac322 and I hope you can address the comments after the holidays or whenever you get a free moment :).

arrow/arrow.py Outdated

@property
def tzinfo(self):
def tzinfo(self) -> Optional[dt_tzinfo]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once the setter is removed, we should be able to change the return type from Optional[dt_tzinfo] to dt_tzinfo since Arrow objects are never naive.

arrow/arrow.py Outdated
Comment on lines 1604 to 1606
@overload
def __rsub__(self, other: Any) -> timedelta:
pass # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this overload? Isn't it identical to the original __rsub__ definition below?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally I want to express NotImplemented path in type annotation. but currently python typing does not support this feature. (python/mypy#4791) And I found related work so I remove all overloads except __sub__.

arrow/arrow.py Outdated
Comment on lines 1632 to 1637
def __gt__(self, other: "Arrow") -> bool:
pass # pragma: no cover

@overload
def __gt__(self, other: dt_datetime) -> bool:
pass # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these overloads be combined:

    @overload
    def __gt__(self, other: Union[dt_datetime, "Arrow"]) -> bool:
        pass  # pragma: no cover

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. #883 (comment)

arrow/arrow.py Outdated
Comment on lines 1661 to 1667
@overload
def __lt__(self, other: "Arrow") -> bool:
pass # pragma: no cover

@overload
def __lt__(self, other: dt_datetime) -> bool:
pass # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Can these overloads be combined using a Union for the other type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. #883 (comment)

arrow/arrow.py Outdated
Comment on lines 1677 to 1682
def __le__(self, other: "Arrow") -> bool:
pass # pragma: no cover

@overload
def __le__(self, other: dt_datetime) -> bool:
pass # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. Do let me know if these can or cannot be combined :)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here. #883 (comment)

@slavaatsig
Copy link

Why not support PEP-563?

@jadchaar
Copy link
Member

jadchaar commented Jan 6, 2021

Why not support PEP-563?

Doesn't PEP-563 only support Python 3.7+? We must support 3.6+.

@jadchaar
Copy link
Member

Hey @isac322, any progress on addressing feedback? We really appreciate your work here, and would love to get this merged soon! This is the last big puzzle piece for version 1.0 :).

@isac322
Copy link
Contributor Author

isac322 commented Jan 13, 2021

@jadchaar Sorry for late reply. I'm currently quite busy for my marriage and company reorganization. Can I check this on last week of January?

@jadchaar
Copy link
Member

@jadchaar Sorry for late reply. I'm currently quite busy for my marriage and company reorganization. Can I check this on last week of January?

No problem at all buddy, take your time! Also, congratulations and best of luck with the marriage/reorganization :).

Copy link
Collaborator

@systemcatch systemcatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @isac322 thanks for coming back and implementing the suggested changes. It's a big piece of work.

I think this PR is ready to be merged and any minor issues can be fixed afterwards.

@jadchaar , @krisfremen time to merge this?

@isac322
Copy link
Contributor Author

isac322 commented Feb 13, 2021

@jadchaar @systemcatch Could you please mark conversation as resolved if you satisfied on changes for future work? Because there are many conversation, it's quite hard to distinguish TODOs 😢.

@systemcatch
Copy link
Collaborator

I've gone through and marked many conversations as resolved, the few remaining ones need Jad's input.

@jadchaar
Copy link
Member

I've gone through and marked many conversations as resolved, the few remaining ones need Jad's input.

Fantastic and thanks for the hard work @isac322, we really do appreciate it. I will give this a good final look either today or tomorrow!

@jadchaar jadchaar self-requested a review February 14, 2021 07:05
Copy link
Member

@jadchaar jadchaar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I had a few final comments though.

Thanks for your patience @isac322! I have been trying to integrate typing into my own projects recently, but your typing skillset is much more advanced so it took some time for me to understand it :).


return cls._FORMAT_RE.sub(lambda m: cls._format_token(dt, m.group(0)), fmt)
# FIXME: _format_token() is nullable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isac322 is this FIXME good to be removed?


def _format_token(self, dt, token):
def _format_token(self, dt: datetime, token: Optional[str]) -> Optional[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what case can _format_token return None?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jadchaar If token is None or any str that does not match with if conditions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these FIXMEs, if left unresolved, cause issues for users in regular use? When will the nullable issues you mark with the FIXME label cause problems?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be. but because I didn't change actual behavior of formatter.py, this PR will not introduce new error to formatter.py.
BTW, I can make token as str because Match.group() always return AnyStr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can assure that token always match one branch in _format_token, I have to narrow down type of token to some Literal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I run the code with the raise ValueError at the end of _format_token, the only test that fails in the test suite is test_nonsense in test_formatter.py:

====================================================================================================================================================== FAILURES ======================================================================================================================================================
_______________________________________________________________________________________________________________________________________ TestFormatterFormatToken.test_nonsense _______________________________________________________________________________________________________________________________________

self = <tests.test_formatter.TestFormatterFormatToken object at 0x1034c1d00>

    def test_nonsense(self):
        dt = datetime(2012, 1, 1, 11)
>       assert self.formatter._format_token(dt, None) is None

tests/test_formatter.py:166: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <arrow.formatter.DateTimeFormatter object at 0x1025cf670>, dt = datetime.datetime(2012, 1, 1, 11, 0), token = None

    def _format_token(self, dt: datetime, token: Optional[str]) -> Optional[str]:
    
        if token and token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    
        if token == "YYYY":
            return self.locale.year_full(dt.year)
        if token == "YY":
            return self.locale.year_abbreviation(dt.year)
    
        if token == "MMMM":
            return self.locale.month_name(dt.month)
        if token == "MMM":
            return self.locale.month_abbreviation(dt.month)
        if token == "MM":
            return f"{dt.month:02d}"
        if token == "M":
            return f"{dt.month}"
    
        if token == "DDDD":
            return f"{dt.timetuple().tm_yday:03d}"
        if token == "DDD":
            return f"{dt.timetuple().tm_yday}"
        if token == "DD":
            return f"{dt.day:02d}"
        if token == "D":
            return f"{dt.day}"
    
        if token == "Do":
            return self.locale.ordinal_number(dt.day)
    
        if token == "dddd":
            return self.locale.day_name(dt.isoweekday())
        if token == "ddd":
            return self.locale.day_abbreviation(dt.isoweekday())
        if token == "d":
            return f"{dt.isoweekday()}"
    
        if token == "HH":
            return f"{dt.hour:02d}"
        if token == "H":
            return f"{dt.hour}"
        if token == "hh":
            return f"{dt.hour if 0 < dt.hour < 13 else abs(dt.hour - 12):02d}"
        if token == "h":
            return f"{dt.hour if 0 < dt.hour < 13 else abs(dt.hour - 12)}"
    
        if token == "mm":
            return f"{dt.minute:02d}"
        if token == "m":
            return f"{dt.minute}"
    
        if token == "ss":
            return f"{dt.second:02d}"
        if token == "s":
            return f"{dt.second}"
    
        if token == "SSSSSS":
            return f"{dt.microsecond:06d}"
        if token == "SSSSS":
            return f"{dt.microsecond // 10:05d}"
        if token == "SSSS":
            return f"{dt.microsecond // 100:04d}"
        if token == "SSS":
            return f"{dt.microsecond // 1000:03d}"
        if token == "SS":
            return f"{dt.microsecond // 10000:02d}"
        if token == "S":
            return f"{dt.microsecond // 100000}"
    
        if token == "X":
            return f"{dt.timestamp()}"
    
        if token == "x":
            return f"{dt.timestamp() * 1_000_000:.0f}"
    
        if token == "ZZZ":
            return dt.tzname()
    
        if token in ["ZZ", "Z"]:
            separator = ":" if token == "ZZ" else ""
            tz = dateutil_tz.tzutc() if dt.tzinfo is None else dt.tzinfo
            # `dt` must be aware object. Otherwise, this line will raise AttributeError
            # https://github.com/arrow-py/arrow/pull/883#discussion_r529866834
            # datetime awareness: https://docs.python.org/3/library/datetime.html#aware-and-naive-objects
            total_minutes = int(cast(timedelta, tz.utcoffset(dt)).total_seconds() / 60)
    
            sign = "+" if total_minutes >= 0 else "-"
            total_minutes = abs(total_minutes)
            hour, minute = divmod(total_minutes, 60)
    
            return f"{sign}{hour:02d}{separator}{minute:02d}"
    
        if token in ("a", "A"):
            return self.locale.meridian(dt.hour, token)
    
        if token == "W":
            year, week, day = dt.isocalendar()
            return f"{year}-W{week:02d}-{day}"
    
    
>       raise ValueError("FOO BAR")
E       ValueError: FOO BAR

arrow/formatter.py:150: ValueError

I don't know when sub would ever pass None to _format_token though... this test case is directly forcing None into the _format_token function.


@isac322 I think a possible solution here could be to simply put a final case in the fall through to just return the empty string. This will take care of all these cases that do not match any of the conditionals in a sane way. It will also allow us to make the return type str rather than Optional[str]. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jadchaar Sounds good! I will add final return statement and remove Optional.

Copy link
Contributor Author

@isac322 isac322 Feb 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jadchaar mypy still complains about Optional.

arrow/formatter.py:125: error: Incompatible return value type (got "Optional[str]", expected "str")
arrow/formatter.py:142: error: Incompatible return value type (got "Optional[str]", expected "str")

First one is return dt.tzname() case.
I know that Arrow always store non naive datetime now. but because DateTimeFormatter is public, someone could pass naive datetime object to DateTimeFormatter.format() and in that case DateTimeFormatter._format_token() will return None.

Second one is return self.locale.meridian(dt.hour, token).

But because those issues are known issue (I only annotate type for those cases), I think we can just leave FIXME comment and cast it as str for now. I want to minimize behavior changing in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given Isac's latest findings, you think this is good to merge @systemcatch? We might want to consider creating some issues to address these two FIXMEs going forward (might require some significant refactoring) if possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jadchaar I created #918 to track these 2 bugs. We should merge this in now (exciting!).

@@ -337,7 +504,8 @@ def _parse_token(self, token, value, parts):
parts["year"] = 1900 + value if value > 68 else 2000 + value

elif token in ["MMMM", "MMM"]:
parts["month"] = self.locale.month_number(value.lower())
# FIXME: month_number() is nullable
parts["month"] = self.locale.month_number(value.lower()) # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isac322 Is this FIXME good to be removed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jadchaar Hmmmm. I'm not sure Locale.month_number() always return str.

@isac322
Copy link
Contributor Author

isac322 commented Feb 15, 2021

@jadchaar @systemcatch It was a long journey, but it was fun. Thanks for your work!

@maroux
Copy link
Contributor

maroux commented Feb 26, 2021

This is missing py.typed support which means tools like mypy can't check arrow usage. I created #925 for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants