Sort modules in dependency cycles based on dependencies #1530

JukkaL · 2016-05-13T12:11:03Z

Topo sort modules within dependency cycles to get a more consistent build ordering and fewer bogus errors due to ordering issues. Specifically, each import would have a priority (high/low).
Topo sort modules in a cycle by considering high-priority dependencies first. We can use low-priority dependencies as secondary sort criteria.

We'd assign a high priority in these cases:

from m import ... at module top level (not within a function) so that one of the imported names is not a module
Imported module referenced at module top level outside import statements (independent of the kind of import)

Exceptions:

Anything within if False: is ignored and can't affect ordering.
References to a module within type annotations are ignored. (Optimally, we'd only ignore string-literal-quoted references and comment annotations, but we don't have that information yet.)

Intuitively, we want to make build ordering approximate the order in which modules are initialized at runtime.

Example:

from util import helper_func   # high, at top level
import model  # high, used at top level (as base class)
import logger  # low, use only within function

class Person(model.Model):
    def check(self) -> None:
        from validators import validate    # low, within function

def log(s: str) -> None:
    logger.log(s)

Expected benefits:

A base class will usually be type checked before a derived class (if defined in different modules), so attributes and decorated methods in the base class would have inferred types when type checking the derived class.
Less likely to access an imported type alias that hasn't been processed yet, but only if type alias is imported using from m import ....
Order of checking modules within a cycle would be more predictable. This is useful, as ordering changes may result in "Cannot determine type" errors that seem to come out of nowhere.
Cycles caused by mypy-only imports within if False: blocks won't affect the ordering of a build.

This would partially solve #481. The rules above still don't cover all interesting problems but they should give a clear improvement for a lot of code.

This could perhaps be implemented as a new pass in process_stale_scc, where we'd sort the strongly connected component before the main semantic analysis pass.

The text was updated successfully, but these errors were encountered:

gvanrossum · 2016-06-10T01:05:28Z

Test case (reduced from a real-world example):

File x.py:

class Base:
    pass
def foo() -> int:
    import y
    return y.Sub.attr

File y.py:

import x
class Sub(x.Base):
    attr = 0

This passes when run as mypy x.py but fails with mypy y.py:

y.py:1: note: In module imported here:
x.py: note: In function "foo":
x.py:5: error: Cannot determine type of 'attr'

gvanrossum · 2016-06-22T02:49:57Z

FWIW, that test case is not easily solved with the suggested heuristic. The heuristic suggests processing x.py first, y.py second (so that the base class is processed before the subclass), but if you do that it will always give that error. :-(

JukkaL · 2016-06-22T09:42:19Z

Yeah, it wouldn't help here. However, the order of processing would be more stable so at least the error would be less likely to appear after an unrelated change.

There are additional things we could do instead, but it's not clear whether they are worth the extra complexity.

One approach that I think we've discussed before is to run full type checking first on module top levels and __init__ methods and type check function and method bodies in a separate pass. This wouldn't help with non-__init__ methods that define attributes, but maybe this is rare enough that the need to annotate those won't feel like a burden. Also, this may complicate other things such as more fine-grained incremental checking, but not necessarily. The motivation should be clear: we'd infer types for most things that are visible outside the module before we type check most things that need those types, so this should fix many of the "Cannot determine type of x" errors.

Another idea it to have another light-weight attribute type inference pass that runs before main type inference that can only deal with simple cases, such as these:

class A:
    x = 0   # type of `x` is clearly `int`

    def __init__(self, y: str) -> None:
        self.y = y   # again this is simple

Some potential heuristics we might use in the light-weight pass:

Only look at class bodies and __init__ methods. Skip other method and function bodies. This way this could be pretty fast.
There must only be a single assignment to an attribute in the initialization scope, so we don't need to worry about partial types.
Assignment must not be nested within a statement such as if or a loop, as these can be tricky.
Initializer expression has a simple type. For example, it can be a literal or refer to a local variable with an explicit (not inferred) type that doesn't get assigned to. It can also be a call to a function/class with a simple (non-generic) return type.

The motivation is that if this would infer types for, say, 2/3 of attributes (not a proper estimate) before type checking, the issue would be much less severe and the cases where a type annotation would be required would often be non-trivial so arguably the annotations may improve clarity.

gvanrossum · 2016-06-22T15:20:47Z

OK, then I'll go ahead and submit a PR for what I have. The rules I've implemented so far:

top-level from M import ... gets priority HIGH
top-level import M gets priority MEDIUM
all imports inside a function get priority LOW

I don't yet use information about how the imported variable gets used yet; I'm not sure that information is easily available before the semantic analysis starts.

I need a new example though -- something that actually fails due to an import cycle and where the proposed algorithm improves things.

gvanrossum · 2016-06-23T03:27:16Z

I ran this over our internal server repo and had to silence a few errors. A light-weight type checker for class attributes would definitely have helped -- all errors were about class attributes with a simple int or str literal as value. So I think I'd like to tackle that next. (The other repo was a red herring.)

This thwarts certain kinds of errors due to circular dependencies (see #1530).

gvanrossum · 2016-06-24T02:17:12Z

I'm working on some code to infer types from simple literal assignments here: https://github.com/python/mypy/tree/class_attrs

@rwbarton you might want to follow along. There are some tests failing; so far I've identified some failures in the weak-mode tests and some others due to tests for other features relying on undefined types. I've shut up the weak-mode test failures by simply skipping this feature when any weak options are set. I'll shut up the others by making the expressions slightly more complex. We'll see what else fails.

UPDATE: More failures due to the rendering of variables with known types being different (no '*' following the name).

rwbarton · 2016-06-24T13:13:39Z

I'm working on some code to infer types from simple literal assignments here: https://github.com/python/mypy/tree/class_attrs

Yes, this looks right to me.

…forms (#1736) This partially addresses #1530 (but does not fully fix it). The new algorithm always processes x before y, while the old one gives an error when y is processed before x: ``` x.py:4: note: In module imported here: y.py: note: In class "Sub": y.py:3: error: Cannot determine type of 'attr' ``` I am working on additional heuristics in https://github.com/python/mypy/tree/class_attrs.

This thwarts certain kinds of errors due to circular dependencies (see #1530).

Set the type of `x = 0` and similar in the semantic analyzer. This thwarts certain kinds of errors due to circular dependencies (see #1530). Skip this business if weak_opts is non-empty. Disable lightweight type check if build target < TYPE_CHECK. Only do the lightweight type inferencing thing outside functions.

JukkaL added the feature label May 13, 2016

JukkaL added this to the 0.4.x milestone May 13, 2016

gvanrossum mentioned this issue Jun 22, 2016

Make import cycles more predictable by prioritizing different import forms #1736

Merged

gvanrossum pushed a commit that referenced this issue Jun 24, 2016

Set the type of x = 0 and similar in the semantic analyzer.

12af55b

This thwarts certain kinds of errors due to circular dependencies (see #1530).

gvanrossum closed this as completed in #1736 Jun 27, 2016

gvanrossum pushed a commit that referenced this issue Jun 27, 2016

Set the type of x = 0 and similar in the semantic analyzer.

106e8ea

This thwarts certain kinds of errors due to circular dependencies (see #1530).

gvanrossum mentioned this issue Jun 27, 2016

Infer variable types from simple literals in semanal.py #1756

Merged

gvanrossum pushed a commit that referenced this issue Jun 30, 2016

Set the type of x = 0 and similar in the semantic analyzer.

6beec4d

This thwarts certain kinds of errors due to circular dependencies (see #1530).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sort modules in dependency cycles based on dependencies #1530

Sort modules in dependency cycles based on dependencies #1530

JukkaL commented May 13, 2016

gvanrossum commented Jun 10, 2016 •

edited

Loading

gvanrossum commented Jun 22, 2016 •

edited

Loading

JukkaL commented Jun 22, 2016

gvanrossum commented Jun 22, 2016

gvanrossum commented Jun 23, 2016

gvanrossum commented Jun 24, 2016 •

edited

Loading

rwbarton commented Jun 24, 2016

Sort modules in dependency cycles based on dependencies #1530

Sort modules in dependency cycles based on dependencies #1530

Comments

JukkaL commented May 13, 2016

gvanrossum commented Jun 10, 2016 • edited Loading

gvanrossum commented Jun 22, 2016 • edited Loading

JukkaL commented Jun 22, 2016

gvanrossum commented Jun 22, 2016

gvanrossum commented Jun 23, 2016

gvanrossum commented Jun 24, 2016 • edited Loading

rwbarton commented Jun 24, 2016

gvanrossum commented Jun 10, 2016 •

edited

Loading

gvanrossum commented Jun 22, 2016 •

edited

Loading

gvanrossum commented Jun 24, 2016 •

edited

Loading