-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
bpo-45949: Pure Python freeze module for cross builds (GH-29899) #29899
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Where would the frozen importlib files come from? Are you adding those back to git? (Otherwise I don't see how this helps.) |
|
Cross builds don't just need the same version -- they need to use the same commit, unless we're in beta or beyond. I'm not sure what's the point of ever using freeze_module.py for non-cross builds. Is it just so that freeze_module.py gets exercised regularly? |
Good point, I'll update the documentation.
Yes, you are correct. It is not necessary to use |
The main point of the PR is to get rid of This patch removes the dependency. Users will be able to cross build a 3.11 interpreter with a 3.11 host Python (once byte code and API has stabilized in beta). |
How about committing the importlib bootstrap related files to git and generate the rest with pure python version on both standard and cross builds provided the bytecode is same, this would simplify things a lot ? |
In the past we had the importlib bootstrap files in git. The approach has the downside that it increases the size of git history a lot. Really a lot a lot. The files are fairly large, about 350k in total. They also must be regenerated every time the source Python files are updated or any aspect of the byte code generator and optimizer is changed. Even a tiny change results in a large diff, often affecting all lines of the header files. |
Yes, it will increase the size of git history but does anyone clones the repo in full ?, for builds most would do a shallow clone.
For diff, these files can be marked as linguist-generated then github would suppress them in PRs |
I have a full clone and usually everybody else with a local checkout has a full clone as well. |
It opens a great possibility to introduce a vulnerability into the Python codebase. Huge (especially suppressed) diffs are hard to analyze so they allow alterations of a generated file that will go unnoticed. In addition, GitHub is just a part of a development conveyor. Before it there ara local non-Github tools (git diff/gitk, TortoiseGit, Sublime Merge, and so on) that do not support
Actually this is the very purpose of Git as a distributed version control system, to distribute full clones and work with them offline. |
If generated files were to be added to the repo then it would have been verified on CI that the generated files and files generated on CI are equal. |
You would also have to make sure that non of the validating tools rely on the generated files. |
If bots are/will be involved, then I agree and retract my note on vulnerability planting. In fact, I am aware that bootstrapping in a bare environment with no predecessor (so _bootstrap_python minimal interpreter is introduced) is a can of worms that force hardly bearable compromises. |
Indeed, maintainability of the diffs is still a question. Even with suppression, unanalysable black boxes carried along with actual proposed changed is not a good idea. |
7d9cda1
to
dce68d0
Compare
I have added more comments to explain the bootstrap process. Sorry for the squash merge + force push. I ran into merge conflicts and eventually gave up. Merge was less work. |
f7e07fd
to
2a44284
Compare
454514c
to
55e1527
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM.
I've left a few minor comments. The only matter of consequence is about make rule dependencies in the cross-compiling case, as well as running make regen-frozen
as one of the steps for a cross-compiled build.
I'm approving the PR under the assumption that you'll resolve my concerns before merging. 🙂
Makefile.pre.in
Outdated
# We manually freeze getpath.py rather than through freeze_modules | ||
Python/frozen_modules/getpath.h: Modules/getpath.py $(FREEZE_MODULE_C) | ||
$(FREEZE_MODULE_BOOTSTRAP) getpath $(srcdir)/Modules/getpath.py Python/frozen_modules/getpath.h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of "manually", can this be generated now? (added to Tools/scripts/freeze_modules.py)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getpath is not like the others. It would need special handling in Tools/scripts/freeze_modules.py
, too. Let's tackle the getpath target in another PR. I only moved existing block to a different location so it is closer to other early bootstrap blocks.
reference count > 1. Marshal adds the `FLAG_REF` flag and creates a | ||
reference `hashtable`. | ||
""" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a short note here (or even a TODO) about the relationship with Tools/freeze/freeze.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
and sorry for the delay in getting you a review! |
Use `_bootstrap_python` interpreter and pure Python implementation of `freeze_module` to generate frozen byte code files. Only importlib bootstrap files are generated with `Programs/_freeze_module`. This simplifies cross building, as the build system no longer needs a `_freeze_module` binary. A standard Python installation with same version is sufficient. Signed-off-by: Christian Heimes <[email protected]>
55e1527
to
4fd065f
Compare
- Fix typo - Remove confusing comment - Improve dependency handling for freezing
4fd065f
to
0d328a9
Compare
@ericsnowcurrently Are you satisfied with the current state of the patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Use
_bootstrap_python
interpreter and pure Python implementation offreeze_module
to generate frozen byte code files. Only importlibbootstrap files are generated with
Programs/_freeze_module
.This simplifies cross building, as the build system no longer needs a
_freeze_module
binary. A standard Python installation with sameversion is sufficient.
Signed-off-by: Christian Heimes [email protected]
https://bugs.python.org/issue45949