TYPING: type hints for core.indexing #27527

simonjayhawkins · 2019-07-22T22:46:31Z

Any is added in a separate commit in case we can't come to an agreement ;)

jbrockmendel · 2019-07-22T23:07:52Z

pandas/core/indexing.py

        """ convert a range argument """
        return list(key)

-    def _convert_scalar_indexer(self, key, axis: int):
+    def _convert_scalar_indexer(self, key: Any, axis: int) -> Any:


this isn't restricted to scalar key?

monkeytype --disable-type-rewriting gives

def _convert_scalar_indexer(
    self,
    key: Optional[Union[float64, List[str], str, ndarray, bytes, RangeIndex, List[Tuple[str, str]], Tuple[slice, slice], Dict[Tuple[str, str], int], Timedelta, bool, Tuple[str, int], Float64Index, int, Tuple[slice, slice, str], TimedeltaIndex, Tuple[int64, int64, int64, int64], Dict[str, int], Int64Index, int32, List[int], Index, float, List[Any], time, Tuple[int, int], Series, Set[str], int64, Interval, Tuple[slice, slice, List[str]], Tuple[slice, int], DatetimeIndex, SparseArray, Tuple[slice, List[int]], Timestamp, List[Timestamp], datetime64, Set[Tuple[str, str]], MultiIndex, datetime, Tuple[str, str], Tuple[slice, List[str]], NaTType, List[bool]]],
    axis: int
) -> Optional[Union[float64, List[str], str, ndarray, bytes, RangeIndex, List[Tuple[str, str]], Tuple[slice, slice], Dict[Tuple[str, str], int], Timedelta, bool, Tuple[str, int], Float64Index, int, Tuple[slice, slice, str], TimedeltaIndex, Tuple[int64, int64, int64, int64], Dict[str, int], Int64Index, int32, List[int], Index, float, List[Any], time, Tuple[int, int], Series, Set[str], int64, Interval, Tuple[slice, slice, List[str]], Tuple[slice, int], DatetimeIndex, SparseArray, Tuple[slice, List[int]], Timestamp, List[Timestamp], datetime64, Set[Tuple[str, str]], MultiIndex, datetime, Tuple[str, str], Tuple[slice, List[str]], NaTType, List[bool]]]: ...

I could be wrong, but I assume that test functions that raise are not included in the traces. There wouldn't be much point otherwise.

So no way to restrict this further? Any just doesn't really provide the reader/developer with any more insight into how the function is supposed to work.

So no way to restrict this further?

probably. but not time well spent until more call sites and more imports are typed.

jbrockmendel · 2019-07-23T22:31:11Z

pandas/core/indexing.py

 import warnings

 import numpy as np
+from numpy import ndarray


I think outside of cython we generally avoid this

sure. will change.

WillAyd · 2019-07-23T23:05:40Z

pandas/core/indexing.py

@@ -88,7 +94,7 @@ class _IndexSlice:
           B1   10   11
    """

-    def __getitem__(self, arg):
+    def __getitem__(self, arg: Any) -> Any:


Do these at least not have to be Hashable and/or some type of Mapping / Sequence?

same.

Not prepared to justify each instance of Any. so will just remove them to streamline the process since that seems to be your preference.

Isn't it possible to just reveal_type these and then keep adding to a Union?

There are probably not as many calls to __getitem__ from within the codebase as from the tests. but, yes, that seems a reasonable approach.
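A minimal sketch of that approach, assuming a hypothetical `IndexKey` alias and `get_item` function (neither is part of pandas). `reveal_type` is only understood by the type checker, so it sits behind `TYPE_CHECKING` and never executes; each type it reports at a call site would then be added to the `Union`.

```python
from typing import TYPE_CHECKING, List, Tuple, Union

# Hypothetical alias, grown one member at a time as call sites are inspected.
IndexKey = Union[int, str, slice, Tuple[str, str], List[int]]


def get_item(key: IndexKey) -> str:
    """Hypothetical stand-in for __getitem__; reports the kind of key received."""
    if TYPE_CHECKING:
        # Seen only by the type checker; this branch is skipped at runtime.
        reveal_type(key)
    return type(key).__name__
```

The alias gives reviewers one place to see which key types are currently known, and it can be tightened later without touching each signature.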

WillAyd · 2019-07-23T23:07:21Z

pandas/core/indexing.py

        """ convert a range argument """
        return list(key)

-    def _convert_scalar_indexer(self, key, axis: int):
+    def _convert_scalar_indexer(self, key: Any, axis: int) -> Any:


So no way to restrict this further? Any just doesn't really provide the reader/developer with any more insight into how the function is supposed to work.

jreback · 2019-07-24T00:19:55Z

i wonder if there is a decorator that we could use in tests to turn on reveal_type and show a summary
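I'm not aware of a ready-made decorator for this. Since `reveal_type` is a static-checker construct, a test-time decorator would have to record runtime types instead, similar in spirit to the MonkeyType traces mentioned above. A rough sketch, where `record_types`, `OBSERVED` and `convert` are all hypothetical names for illustration:

```python
import functools
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, Set

# Hypothetical registry: function name -> parameter name -> observed type names.
OBSERVED: DefaultDict[str, Dict[str, Set[str]]] = defaultdict(dict)


def record_types(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record the runtime type of each argument the wrapped function receives."""
    names = func.__code__.co_varnames[: func.__code__.co_argcount]

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        seen = OBSERVED[func.__name__]
        bound = dict(zip(names, args))  # positional arguments by parameter name
        bound.update(kwargs)
        for name, value in bound.items():
            seen.setdefault(name, set()).add(type(value).__name__)
        return func(*args, **kwargs)

    return wrapper


@record_types
def convert(key, axis):
    return key


convert(1, axis=0)
convert("a", 0)
convert(slice(None), axis=0)

# Summary of every type seen per parameter during the "test run".
summary = {name: sorted(types) for name, types in OBSERVED["convert"].items()}
```

Here `summary` collects the observed type names per parameter, which could be printed at the end of a test session.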

simonjayhawkins · 2019-07-24T00:34:02Z

pandas/core/indexing.py

-                key = tuple([key] + [slice(None)] * (len(labels.levels) - 1))
+                list_items = [key]  # type: List[Union[slice,str]]
+                list_items += [slice(None)] * (len(labels.levels) - 1)
+                key = tuple(list_items)


not happy changing code. the original is perfectly valid. will probably revert this and add a type: ignore and wait for a mypy update.

@WillAyd should we add --warn-unused-ignores to the CI, or warn_unused_ignores = True to the ini file?

or happy to just run ad-hoc periodically.

Can we not reuse key and just call it something else?

the issue is that you can't add a list of slices to a list of strings in an inline expression.

Ah ok that's strange. Let me open an issue on Mypy tracker and we can ref that in ignore comment here

Actually I think the slice is a red herring. This fails with just int and str as well:

values: Union[List[int], List[str]] = [1] + [""]
# error: List item 0 has incompatible type "str"; expected "int"

values: List[Union[int, str]] = [1] + [""]
# error: Incompatible types in assignment (expression has type "List[int]", variable has type "List[Union[int, str]]")
# note: "List" is invariant -- see http://mypy.readthedocs.io/en/latest/common_issues.html#variance
# note: Consider using "Sequence" instead, which is covariant
# error: List item 0 has incompatible type "str"; expected "int"

So I think

see python/mypy#5492

Yea good find. Can just add that as comment and ignore (suggested approach from mypy for dealing with these types of things)
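For reference, a small sketch of the two options discussed in this thread, using int and str rather than slice and str. The variable names are hypothetical, and the claim that the first form satisfies mypy is an assumption based on the diff above rather than something verified against every mypy version.

```python
from typing import List, Union

# Option 1: give the first operand the wider element type up front, then
# extend it, so nothing is ever inferred as List[int].
items: List[Union[int, str]] = [1]
items += [""]

# Option 2: keep the original one-line expression and silence the checker,
# pointing at the upstream report (python/mypy#5492).
values: List[Union[int, str]] = [1] + [""]  # type: ignore
```

Both produce the same list at runtime; the difference is only whether the code or the checker gives way.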

WillAyd

looks good one question

WillAyd · 2019-07-24T15:18:05Z

pandas/core/indexing.py

@@ -860,7 +867,7 @@ def _handle_lowerdim_multi_index_axis0(self, tup: Tuple):
        axis = self.axis or 0
        try:
            # fast path for series or for tup devoid of slices
-            return self._get_label(tup, axis=axis)
+            return self._get_label(tup, axis=axis)  # type: ignore


What error was this throwing? And one below. Sorry if missed in diff

axis needs to be int but is type Axis.

probably not yet got the return type for self.obj._get_axis_number(axis) L112. This is defined in another module so would rather keep additions to single modules for now.

could add a cast, but that lets you lie to the type checker so can be more dangerous than ignoring for now.

not sure whether mypy will figure this out once more type hints are added.

as long as we keep using --warn-unused-ignores we should know if these become redundant and can be removed. otherwise will probably need to add cast in the future.
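To illustrate the concern about cast, a minimal sketch with hypothetical names (`Axis` and `get_label` here are stand-ins, not the pandas definitions). `typing.cast` returns its argument unchanged at runtime, so the checker is told the value is an int while it is still a string.

```python
from typing import Union, cast

Axis = Union[str, int]  # hypothetical alias; the pandas definition may differ


def get_label(axis: int) -> str:
    """Hypothetical stand-in for a method that requires an integer axis."""
    return "axis {}".format(axis)


axis: Axis = "index"

# cast() does nothing at runtime: the checker now believes this is an int,
# but the value is still the string "index".
lied = cast(int, axis)
result = get_label(lied)
```

With a type: ignore comment the mismatch stays visible in the source, whereas the cast hides it from both the checker and the reader.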

How hard would it be to just type that return then? Understood in a different module but reason I ask is I think there is buggy axis behavior when using strings that mypy may detect for us if we don't ignore.

To illustrate, these go down different code paths though result is equivalent

In [5]: df = pd.DataFrame(np.ones((2,2)), index=pd.MultiIndex.from_tuples((('a', 'b'), ('c', 'd'))))

In [5]: df.loc[("a", "b")]
Out[8]:
0    1.0
1    1.0
Name: (a, b), dtype: float64

In [9]: df.loc(axis="index")[("a", "b")]
Out[9]:
0    1.0
1    1.0
Name: (a, b), dtype: float64
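As a rough idea of what typing that return could look like, here is a standalone sketch. `Axis`, `_AXIS_NUMBERS` and `get_axis_number` are hypothetical illustrations, not the pandas implementation; the point is only that an explicit `-> int` lets callers pass the result on without an ignore.

```python
from typing import Dict, Union

Axis = Union[str, int]  # hypothetical alias for illustration

# Hypothetical lookup table mapping every accepted spelling to a position.
_AXIS_NUMBERS: Dict[Axis, int] = {0: 0, "index": 0, "rows": 0, 1: 1, "columns": 1}


def get_axis_number(axis: Axis) -> int:
    """Normalize a string or integer axis to its integer position."""
    try:
        return _AXIS_NUMBERS[axis]
    except KeyError:
        raise ValueError("No axis named {!r}".format(axis))
```

Once the normalizing function is annotated this way, a string axis that skips normalization would show up as a checker error at the call site.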

How hard would it be to just type that return then? Understood in a different module but reason I ask is I think there is buggy axis behavior when using strings that mypy may detect for us if we don't ignore.

I accept that we probably have different motivations here.

my motivation is to clean up the io.formats code. there is a lot of code duplication there, and i believe buggy code there as well.

so my view is that to support the refactoring activity, it's better to get, say 80%, of each of the imports to formats typed, than to get 100% of just a few.

so i'd prefer "adding # type: ignore comments to silence errors you don’t want to fix now." https://mypy.readthedocs.io/en/latest/existing_code.html

OK sure. I'm just hesitant that ignore and Any mean we won't ever review in earnest, but sounds like you have a plan here so that doesn't apply

I'm just hesitant that ignore and Any mean we won't ever review in earnest

I'm sure that in time we will want to increase the strictness of the mypy checks. so hopefully this won't apply.

also, i'm using MonkeyType to annotate with the actual types passed and returned, rather than what we think they should be. This itself is quickly revealing refactoring/cleanup opportunities.

I guess we could categorize bugs into three categories for the purpose of adding type hints.

1. bugs introduced by changing code just to appease the type checker. hence i would prefer ignore to cast, and cast to code changes.

2. bugs in the codebase. adding type hints is going to help enormously.

3. bugs that will be added in the future. Again here, i have the opinion that getting, say 80%, of the codebase done quickly is more likely to be a benefit than perfection on the parts that are done.

WillAyd · 2019-07-24T17:11:08Z

Can you merge master? Just pushed in change to remove ABCs from _typing so should run CI again to double check

simonjayhawkins · 2019-07-24T17:12:56Z

sure

jreback · 2019-07-25T17:17:48Z

I am with @WillAyd here about the type ignore's. I am afraid that we are never going to revisit these; we did this for a while with flake8 and it was a disaster; so I would either: actually define the type or type somewhat less until this can be fixed.

simonjayhawkins · 2019-07-28T10:14:09Z

I am with @WillAyd here about the type ignore's. I am afraid that we are never going to revisit these; we did this for a while with flake8 and it was a disaster; so I would either: actually define the type or type somewhat less until this can be fixed.

fair enough. but this defeats the purpose of why I was adding type hints... to make the types of the function parameters and return types known to other functions.

simonjayhawkins added 2 commits July 22, 2019 23:24

TYPING: type hints for core.indexing

8135214

add Any types autogenerated by MonkeyType

06824a1

simonjayhawkins added the Indexing and Typing labels Jul 22, 2019

jbrockmendel reviewed Jul 22, 2019


simonjayhawkins added 2 commits July 23, 2019 00:20

Merge remote-tracking branch 'upstream/master' into typing-core-indexing

d5ae393

Merge remote-tracking branch 'upstream/master' into typing-core-indexing

79657d7

jbrockmendel reviewed Jul 23, 2019


ndarray -> np.ndarray

e084af4

WillAyd reviewed Jul 23, 2019


remove Any and cast

f76132c

simonjayhawkins commented Jul 24, 2019


simonjayhawkins added 4 commits July 24, 2019 02:40

resolve merge conflicts

450d68b

add ignore to mypy errors from rebase

7ea090e

add ignore for List concatenation problem

e19292d

lint error

6a4c26b

WillAyd reviewed Jul 24, 2019


WillAyd approved these changes Jul 24, 2019


Merge remote-tracking branch 'upstream/master' into typing-core-indexing

3797992

jreback added this to the 1.0 milestone Jul 25, 2019

simonjayhawkins closed this Jul 28, 2019

simonjayhawkins mentioned this pull request Jul 29, 2019

TYPING: add some type hints to core.generic #27646

Closed
