@msullivan
Collaborator

This works by reworking IR generation to proceed one SCC (strongly
connected component) at a time and writing out caches of serialized IR
information so that we can generate code that calls into a module
without compiling the module in full. A mypy plugin is used to ensure
cache validity by checking that a hash of the metadata matches and that
all of the generated source is present and matches.

Closes mypyc/mypyc#682.
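
To illustrate the SCC-at-a-time ordering described above, here is a minimal sketch. It is not the code from this PR. It assumes a hypothetical `graph` that maps each module name to the modules it imports, and it uses Tarjan's algorithm to group import cycles; the function name and the shape of `graph` are illustrative assumptions.

```python
from typing import Dict, List, Set


def strongly_connected_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Group modules into SCCs using Tarjan's algorithm.

    `graph` maps a module to the modules it imports (hypothetical input).
    SCCs are returned with dependencies before the SCCs that import them.
    """
    index: Dict[str, int] = {}
    lowlink: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    result: List[List[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in graph.get(node, []):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            # `node` is the root of an SCC: pop its members off the stack.
            scc: List[str] = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                scc.append(member)
                if member == node:
                    break
            result.append(sorted(scc))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return result
```

Because edges point from a module to its imports, each SCC is emitted only after the SCCs it depends on. A driver could then generate IR for one SCC at a time in this order, or load it from a cache.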

@msullivan msullivan force-pushed the incremental branch 3 times, most recently from 63544d0 to 31226be Compare November 6, 2019 01:10
@msullivan
Collaborator Author

Excitingly, I cannot reproduce the Windows CI failure on my local Windows machine.

@msullivan msullivan force-pushed the incremental branch 8 times, most recently from 15ee998 to 91c616a Compare November 6, 2019 08:00
@msullivan
Collaborator Author

I think I have a bead on the Windows issues, and I think they can be hacked around in the test harness. I don't think this is blocking the review.

Member

@ilevkivskyi ilevkivskyi left a comment

Not really a review, just a comment about writing cache on Windows.

f.write(contents)
if old_contents != encoded_contents:
    with open(path, 'wb') as f:
        f.write(encoded_contents)
Member

I am not sure if this is relevant, but a long time ago we had problems on Windows with incremental mode (see for example #3215). This is why we have some non-trivial logic in and around FilesystemMetadataStore.write(). Maybe you can use the same here?
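
For illustration, here is a sketch of the general "write to a temporary file, then rename into place" pattern, combined with the write-only-if-changed check from the diff above. It is an assumption about the kind of logic meant here, not a copy of FilesystemMetadataStore.write(); the helper name `write_if_changed` is hypothetical.

```python
import os
import tempfile


def write_if_changed(path: str, contents: str) -> bool:
    """Write `contents` to `path` only if it differs; return True if written.

    Hypothetical helper: the new data goes to a temporary file in the same
    directory and is then moved into place with os.replace(), so readers
    never observe a partially written file.
    """
    encoded = contents.encode('utf-8')
    try:
        with open(path, 'rb') as f:
            old = f.read()
    except OSError:
        old = None  # Missing or unreadable file: treat as changed.
    if old == encoded:
        return False

    directory = os.path.dirname(path) or '.'
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory,
                                    prefix=os.path.basename(path) + '.')
    try:
        with os.fdopen(fd, 'wb') as f:
            f.write(encoded)
        os.replace(tmp_path, path)
    except OSError:
        # Clean up the temporary file if the write or rename failed.
        try:
            os.remove(tmp_path)
        except OSError:
            pass
        raise
    return True
```

Skipping the write when the contents are unchanged also leaves the file's modification time alone, which matters for build tools that decide what to recompile from timestamps.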

@msullivan
Collaborator Author

I switched the run tests to use setuptools, and that seems to fix it. setuptools changes how files are copied into place with --inplace, which I think is what made the difference.

@msullivan msullivan force-pushed the incremental branch 2 times, most recently from ca3e1a6 to 122e53e Compare November 8, 2019 21:54
Collaborator

@JukkaL JukkaL left a comment

Did a quick pass -- looks good! Left a few comments; I'll continue my review tomorrow.

self.mark = False


class MypycPlugin(Plugin):
Collaborator

Maybe move this to a separate module, since a mypy plugin seems like its own thing.

Collaborator Author

The plugin is pretty closely tied into other logic in this module, since what it does is enforce the caching and group settings from the rest of the driver logic, so I would prefer to keep it in the same module as that code. (Splitting it out in the most obvious way would introduce a cycle, though that is avoidable without much pain.)

I do think that it will be worth splitting and maybe renaming emitmodule to separate GroupGenerator from the rest. My inclination is to do that in a follow-up PR to keep down the diff noise for this diff and to avoid needing to rebase some follow-up code I've already written, though I could do it now too if you think it's important.

Collaborator

Sounds reasonable.

Collaborator

@JukkaL JukkaL left a comment

Looks great! Only left some comments about additional documentation and possible tests. Incremental compilation will be a game-changer for making mypyc suitable for bigger projects.

Have you tried incremental compilation when compiling mypy/mypyc?

if compute_hash(meta_json) != ir_data['meta_hash']:
    return None

# Check that all of the source files are present and as expected
Collaborator

Maybe elaborate a bit about when they may not be present or as expected (i.e. is this a normal condition or has something been corrupted).

Collaborator Author

Will do. The main situation is when the user deletes build/ without deleting .mypy_cache, which is a sort of corruption but is also reasonable to expect to happen.
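
A rough sketch of that check, for illustration only. It assumes the cache records a hash for each generated source file; the function name `generated_sources_valid`, the `expected_hashes` mapping, and the choice of SHA-256 are assumptions, not the PR's actual code.

```python
import hashlib
import os
from typing import Dict


def generated_sources_valid(build_dir: str, expected_hashes: Dict[str, str]) -> bool:
    """Return True only if every recorded source file exists and matches.

    `expected_hashes` maps a path relative to `build_dir` to the hex digest
    recorded when the file was generated (hypothetical cache layout).
    """
    for rel_path, expected in expected_hashes.items():
        full_path = os.path.join(build_dir, rel_path)
        try:
            with open(full_path, 'rb') as f:
                data = f.read()
        except OSError:
            # For example, build/ was deleted while .mypy_cache was kept.
            return False
        if hashlib.sha256(data).hexdigest() != expected:
            return False
    return True
```

A missing file and a modified file are handled the same way: the cache entry is treated as invalid, so the module gets recompiled and stale IR is never trusted.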

@msullivan msullivan merged commit 8d562e2 into master Nov 12, 2019
@msullivan msullivan deleted the incremental branch November 12, 2019 20:24
Development

Successfully merging this pull request may close these issues.

Separate and incremental compilation
