Skip to content

Traverse module ancestors when traversing reachable graph nodes during dmypy update #18906

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

sterliakov
Copy link
Collaborator

@sterliakov sterliakov commented Apr 10, 2025

Fixes #18396. Fixes #17652. Hopefully fixes #15486 (but not enough info to reproduce the original problem).

See discussion in #18396. This PR forces collecting all ancestors of all modules during dep graph traversal in incremental update.

Ancestors are included in load_graph, which means not traversing them during update results in some modules being erroneously treated as deleted:

mypy/mypy/build.py

Lines 3141 to 3146 in a4e79ea

for dep in st.ancestors + dependencies + st.suppressed:
ignored = dep in st.suppressed_set and dep not in entry_points
if ignored and dep not in added:
manager.missing_modules.add(dep)
elif dep not in graph:
try:

@sterliakov sterliakov marked this pull request as ready for review April 11, 2025 00:04

This comment has been minimized.

Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is a right thing to do. Few comments:

  • This design is quite counter-intuitive, it took me a while to scroll down to the very bottom of fine_grained_increment_follow_imports() where problem happens. Could you please add a comment at the point where seen set is first created there, explaining that it will be updated by various method calls to mark modules that are still reachable by following imports (and after all delete the rest). Also please update the docstrings of two methods that do modify seen (right now those docstrings are actively misleading, it looks like the only point of modifying seen is to avoid infinite loops during traversal).
  • To help you create a self-contained test, this assert is usually hit when the same file is processed twice under different module ids. For example after kicking out an a/__init__.py, a file a/b.py can be assigned a module id b, while before it appeared as a.b. But please don't spend more than few hours on this. It is OK to merge without a test, if it is too hard to write.
  • I am quite sure this will fix dmypy: INTERNAL ERROR: AssertionError in _add_error_info() #15486 as well (that also appears on unmodified re-run with --follow-imports=normal)

Copy link
Collaborator

@svalentin svalentin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sterliakov for your debugging and for the fix!
This seems correct to me as well. Only one other thing I would check is if new_files is correct as well. (though I would guess it is correct from your testing)

If writing a unit test is hard, maybe an integration test is easier. Those are in test-data/unit/daemon.test. See the other tests in there for the format. testDaemonQuickstart is an example of a test that changes files and runs dmypy multiple times.

@sterliakov
Copy link
Collaborator Author

sterliakov commented Apr 11, 2025

Okay, I think the test is possible to write, thank you for suggestions! So far I know that the following is enough to repro:

import imageio

1 + 'a'

(consistently crashes on the third run)

Now I only need to figure out what's so special about imageio:) I'll try to cook something today.

@svalentin
Copy link
Collaborator

Oh, awesome that you narrowed it down to just 1 import!

While it might be good to have a test, since that makes it clear how the bug happened, I also think we can merge this as is without a test, and the test in a separate PR, if it's taking you a lot of time. I don't believe this change would make things worse.

@sterliakov
Copy link
Collaborator Author

Okay, may I write an awkward, slow test depending on stdlib stub?

The following crashes on master and passes with my patch:

[case testDaemonImportAncestors]
$ dmypy run test.py
Daemon started
test.py:2: error: Unsupported operand types for + ("int" and "str")  [operator]
Found 1 error in 1 file (checked 1 source file)
== Return code: 1
$ dmypy run test.py
test.py:2: error: Unsupported operand types for + ("int" and "str")  [operator]
Found 1 error in 1 file (checked 1 source file)
== Return code: 1
$ dmypy run test.py
test.py:2: error: Unsupported operand types for + ("int" and "str")  [operator]
Found 1 error in 1 file (checked 1 source file)
== Return code: 1
[file test.py]
from xml.etree.ElementTree import Element
1 + 'a'

If yes, I'll be back in ~4 h and submit all these changes:)

@ilevkivskyi
Copy link
Member

@sterliakov I think this is fine.

@sterliakov
Copy link
Collaborator Author

sterliakov commented Apr 11, 2025

Only one other thing I would check is if new_files is correct as well.

Looks like it is, new_files and seen are always updated at the same time, and this change doesn't really affect how seen is collected - only what dependencies are traversed for every entry.

I think this should be ready for another review round!

Copy link
Contributor

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

@ilevkivskyi ilevkivskyi merged commit 1ba23f1 into python:master Apr 11, 2025
18 checks passed
@arthurbrenno
Copy link

heroes

@svalentin
Copy link
Collaborator

@sterliakov is the hero, he found and fixed the bug! :) Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants