Build has a circular dependency on tomli. #430

Closed
jameshilliard opened this issue Jan 10, 2022 · 31 comments

Comments

@jameshilliard
Contributor

It seems we have a circular dependency issue with build preventing bootstrapping a pep517 toolchain if using the latest version of tomli.

I'm seeing the following dependency cycles here preventing me from using the latest tomli with build:

build -> pep517 -> tomli -> build(needed for building pep517 packages like tomli)
build -> tomli -> build(needed for building pep517 packages like tomli)

I think the solution is to probably have both pep517 and build vendor tomli, although maybe only pep517 needs to vendor it.
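
As a rough illustration of the cycles listed above, the following sketch feeds them to the standard-library `graphlib` module (Python 3.9+) and reports one cycle. The `BUILD_REQUIRES` mapping and the `find_cycle` helper are hypothetical names introduced here; the edges only restate the report and are not read from any project's metadata.

```python
from graphlib import CycleError, TopologicalSorter

# Build-time prerequisites as described in the report above (illustrative only).
# Each key needs every package in its set to be available first.
BUILD_REQUIRES = {
    "build": {"pep517", "tomli"},
    "pep517": {"tomli"},
    "tomli": {"build"},  # tomli is a PEP 517 package, so building it needs a frontend
}


def find_cycle(graph):
    """Return one dependency cycle as a list of names, or None if the graph is acyclic."""
    try:
        # static_order() is lazy, so consume it to force the cycle check.
        list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # The second element of exc.args is the list of nodes forming the cycle;
        # its first and last entries are the same node.
        return exc.args[1]
    return None


if __name__ == "__main__":
    print(find_cycle(BUILD_REQUIRES))
```

Any ordering-based build system hits the same wall: there is no package in this graph that can be built first.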

@layday
Member

layday commented Jan 10, 2022

And how do you build pep517 without build?

@jameshilliard
Contributor Author

And how do you build pep517 without build?

Well at the moment the latest pep517 release does have a distutils based setup.py but yeah that's also going to be a problem.

@layday
Member

layday commented Jan 10, 2022

There are several ways we (or you) could approach this, including:

  • Vendoring all the things in all the places
  • Creating a bootstrapping meta-package that all of build/flit/etc. depend on a la https://github.com/FFY00/python-bootstrap
  • Creating a bootstrapping backend that can build foundational dependencies - flit is best positioned to assume this role
  • Reverting to using setup.py install (not happening)

Happy to hear other ideas.

@jameshilliard
Contributor Author

  • Vendoring all the things in all the places

Might be the best option.

Creating a bootstrapping meta-package that all of build/flit/etc. depend on a la https://github.com/FFY00/python-bootstrap

That approach is problematic since the submodule deps don't seem to be namespace vendored so they would potentially conflict with the normal versions of those packages.

Creating a bootstrapping backend that can build foundational dependencies - flit is best positioned to assume this role

That's duplicating the entry points, we really shouldn't need to use 2 different pep517 frontends here IMO.

Reverting to using setup.py install (not happening)

What's the issue with the setuptools shim? That's exactly what build is using and should coexist with pyproject.toml the best from what I've seen.

@layday
Member

layday commented Jan 10, 2022

That approach is problematic since the submodule deps don't seem to be namespace vendored so they would potentially conflict with the normal versions of those packages.

You will not be deploying the submodule deps - they're only there for bootstrapping.

What's the issue with the setuptools shim? That's exactly what build is using and should coexist with pyproject.toml the best from what I've seen.

The issue is that setup.py install is deprecated.

@jameshilliard
Contributor Author

You will not be deploying the submodule deps - they're only there for bootstrapping.

Eh? It seems to be installing the submodule deps into the current site-packages is it not? We don't want to be installing host toolchain packages like build twice into the same place as that can cause tricky to debug conflicts.

The issue is that setup.py install is deprecated.

Seems like an issue that could be fixed by keeping it around for bootstrapping purposes, at least until someone finds a better solution.

@layday
Member

layday commented Jan 10, 2022

Eh? It seems to be installing the submodule deps into the current site-packages is it not? We don't want to be installing host toolchain packages like build twice into the same place as that can cause tricky to debug conflicts.

Right, that's why I said meta-package - you would not have a separate package for build or the build package would be a no-op and depend on the meta-package.

@jameshilliard
Contributor Author

you would not have a separate package for build or the build package would be a no-op and depend on the meta-package.

So if we would do that we would end up with an unwieldy meta-package that would prevent us from managing/installing any of the bundled meta-package dependencies(which are really fully separate python packages/not properly vendored) separately. It doesn't sound like a great option to me.

@FFY00
Member

FFY00 commented Jan 10, 2022

That approach is problematic since the submodule deps don't seem to be namespace vendored so they would potentially conflict with the normal versions of those packages.

The idea is that you build bootstrapping versions of these packages and then use them to build the proper versions you want to distribute. You surely need to do the same with GCC for eg., this is a perfectly standard bootstrapping procedure. It would be great to be able to avoid it, but we are stuck with this for now.

@jameshilliard
Contributor Author

The idea is that you build bootstrapping versions of these packages and then use them to build the proper versions you want to distribute.

We're a source based distro, this means the version used for bootstrapping is inherently the version we distribute(we distribute only source code effectively, no prebuilt deb/rpm equivalents).

You surely need to do the same with GCC for eg., this is a perfectly standard bootstrapping procedure.

We have infrastructure that builds a complete host toolchain from source using minimal host dependencies, it is somewhat complex though(multi-stage gcc build). I think gentoo does something similar.

It would be great to be able to avoid it, but we are stuck with this for now.

I mean, there are obviously ways it can be avoided(like using existing setuptools tooling for bootstrapping), so we're not stuck with this bootstrapping problem for technical reasons so much as policy it seems.

@gaborbernat
Contributor

I mean, there are obviously ways it can be avoided(like using existing setuptools tooling for bootstrapping), so we're not stuck with this bootstrapping problem for technical reasons so much as policy it seems.

This would imply that:

  • any library/tool that is participating in the bootstrapping would need to always use setuptools, and nothing else. We cannot make this guarantee as we don't control all those libraries/tools.
  • setuptools commits to not drop support for the setup.py interface, which again doesn't seem to be the case, as described here https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html by one of the maintainers.

@layday
Member

layday commented Jan 10, 2022

I think it's sensible that all bootstrapping dependencies should use the same backend; all of them do currently reside in the PyPA (correction: tomli is separate) so they are under our (collective) control. I would hope that it's not too difficult to establish consensus on a bootstrapping backend.

To clarify, I'm not suggesting we should revert to setup.py but to help packagers by keeping the closure minimal.

@FFY00
Member

FFY00 commented Jan 10, 2022

We're a source based distro, this means the version used for bootstrapping is inherently the version we distribute(we distribute only source code effectively, no prebuilt deb/rpm equivalents).

That seems like a limitation from your part, and you should have ways around this if you package other software that needs bootstrapping. I have designed https://github.com/FFY00/python-bootstrap to be compatible with standard bootstrapping procedures, namely, it should at the very least be able to be used in the same way as the GCC bootstrapping procedure, which itself should basically be compatible with everyone because essentially everyone needs to do it anyway. This will result in a more complex bootstrapping procedure than you are used to for Python, which is undesirable, but it should be compatible at least.

We have infrastructure that builds a complete host toolchain from source using minimal host dependencies, it is somewhat complex though(multi-stage gcc build). I think gentoo does something similar.

You can do something similar here, what I am proposing is essentially a multi-stage build process, which you may split or not into different steps via your tooling.

I mean, there are obviously ways it can be avoided(like using existing setuptools tooling for bootstrapping), so we're not stuck with this bootstrapping problem for technical reasons so much as policy it seems.

No, they aren't, at least long term. This is not a policy issue, direct setup.py invocations are deprecated and will eventually be removed, the reasoning for that is technical and not really relevant here.

@pfmoore
Member

pfmoore commented Jan 10, 2022

all of them do currently reside in the PyPA (correction: tomli is separate) so they are under our (collective) control.

To be clear here, PyPA doesn't have the power to dictate what individual projects do, although we can clearly advise/encourage.

@layday
Member

layday commented Jan 10, 2022

The immediate next sentence says "consensus".

@jameshilliard
Contributor Author

  • any library/tool that is participating in the bootstrapping would need to always use setuptools, and nothing else. We cannot make this guarantee as we don't control all those libraries/tools.

Which would be a good idea initially during the migration period...I'm not saying it would be forever, just that it provides the cleanest migration path at the moment. Aren't basically all build dependencies official pypa projects(other than tomli which is small and easy to vendor)?

setuptools commits to not drop support for the setup.py interface, which again doesn't seem to be the case, as described here https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html by one of the maintainers.

I mean, providing a setup.py backwards compatibility shim temporarily is different from requiring setup.py for an install...

I think it's sensible that all bootstrapping dependencies should use the same backend

Yeah, that would be helpful, if flit_core becomes that for the entire dependency tree it would be more manageable I guess, although we're still special casing a frontend for it but I suppose it could be worse.

That seems like a limitation from your part, and you should have ways around this if you package other software that needs bootstrapping.

It's somewhat language specific. Most toolchains don't have circular dependency issues this bad though.

I have designed https://github.com/FFY00/python-bootstrap to be compatible with standard bootstrapping procedures, namely, it should at the very least be able to be used in the same way as the GCC bootstrapping procedure, which itself should basically be compatible with everyone because essentially everyone needs to do it anyway. This will result in a more complex bootstrapping procedure than you are used to for Python, which is undesirable, but it should be compatible at least.

It's very different from how we do GCC bootstrapping from what I can tell, which is an extremely complex multi-stage process and it's also very GCC specific.

You can do something similar here, what I am proposing is essentially a multi-stage build process, which you may split or not into different steps via your tooling.

Which is a huge pain(also potentially slows down builds a lot) and totally unnecessary from a technical point of view here.

No, they aren't, at least long term. This is not a policy issue, direct setup.py invocations are deprecated and will eventually be removed, the reasoning for that is technical and not really relevant here.

I'm talking about for the short term...for migration specifically. I guess for long term if everything uses flit_core and that has fully integrated build+install support then that's at least a viable option to avoid a complex multi-stage process.

@jameshilliard
Contributor Author

jameshilliard commented Jan 10, 2022

My suggestion for this was to add a python -m flit_core.wheel CLI entry point that can roughly be used in place of build for bootstrapping, to make wheels of installer, tomli, pep517 and suchlike low-level packages. So bootstrapping would go something like this:

  • Install flit_core using its bootstrap install script (Add bootstrap install script for flit_core flit#481) - i.e. it makes a wheel of itself, then that is unpacked to site-packages.

  • Build a wheel of installer using python -m flit_core.wheel, then use installer from source to install that.

  • For each low-level package using Flit (tomli, pep517, maybe packaging in the future, ...)

    • Build a wheel with python -m flit_core.wheel
    • Install using python -m installer
  • Use build from source to build a wheel of itself, install with installer
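
The steps above can be sketched as an ordered plan. This is only an illustration under stated assumptions: `python -m flit_core.wheel` is the CLI proposed in this thread, `<wheel>` is a placeholder for the file each build step produces, the flit_core self-install step carries no command because the thread does not name the script, and the `--no-isolation` flag on the final build step is an assumption (an isolated build would need to fetch dependencies that are not installed yet). The function `bootstrap_plan` is a hypothetical name.

```python
import sys

# Low-level packages that use Flit, per the list above.
FLIT_PACKAGES = ["tomli", "pep517"]


def bootstrap_plan(python=sys.executable, flit_packages=FLIT_PACKAGES):
    """Return the bootstrap as an ordered list of (description, argv) steps.

    argv is None where the thread gives no concrete command.
    "<wheel>" stands for whatever wheel file the preceding build step produced.
    """
    # Both commands are meant to be run inside the source tree of the package in question.
    build_wheel = [python, "-m", "flit_core.wheel"]
    install_wheel = [python, "-m", "installer", "<wheel>"]

    steps = [
        ("flit_core: self-install via its bootstrap script", None),
        ("installer: build wheel", build_wheel),
        ("installer: install, running installer from its source tree", install_wheel),
    ]
    for name in flit_packages:
        steps.append((f"{name}: build wheel", build_wheel))
        steps.append((f"{name}: install", install_wheel))
    # build is run from its own source tree to produce its own wheel.
    steps.append(("build: build wheel from source",
                  [python, "-m", "build", "--wheel", "--no-isolation"]))
    steps.append(("build: install", install_wheel))
    return steps


if __name__ == "__main__":
    for description, argv in bootstrap_plan():
        print(description, "->", " ".join(argv) if argv else "(see flit#481)")
```

Each step only relies on packages installed by earlier steps, which is the property a one-package-at-a-time build system needs.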

Ok, so now that I think I have a better understanding of the situation I think this is probably the best approach as it seems like it can potentially avoid the circular dependency issues.

I think it would be helpful here for there to be a python -m flit_core.install option as well in addition to python -m flit_core.wheel so this can be staged properly and generalized properly at least for all the PEP-517 toolchain bootstrap dependencies rather than special casing the flit_core build+install process.

@FFY00
Member

FFY00 commented Jan 10, 2022

It's very different from how we do GCC bootstrapping from what I can tell, which is an extremely complex multi-stage process and it's also very GCC specific.

I guess your mileage might vary, but I maintain 6 GCC toolchains in Arch Linux, so I do have some experience there, and I would describe this process as following the same model overall. Similarly to GCC, we are building a bootstrapping version of the required tooling and then using that to build the actual tooling we want to distribute. GCC is even more complex as it requires building a libc with the bootstrapping version and then rebuilding itself against the libc that was built.

Which is a huge pain(also potentially slows down builds a lot) and totally unnecessary from a technical point of view here.

I do not disagree that it's a huge pain. You can make whatever technical review of this that you want, but if your solution includes opening bugs in the upstreams asking for them to change to fit your model, then you need to accept that that might not happen. I understand that you have experience with packaging and bootstrapping, and respect that, but so do we, and unless we've failed to properly consider your specific use-case (which after our extensive discussion, I assure you I did not), you need to respect that we are the ones maintaining the software and after considering all the use-cases we need to support, maintenance costs, and social costs, we have made our call.

@gaborbernat
Contributor

Which would be a good idea initially during the migration period...I'm not saying it would be forever, just that it provides the cleanest migration path at the moment. Aren't basically all build dependencies official pypa projects(other than tomli which is small and easy to vendor)?

Just to reiterate:

To be clear here, PyPA doesn't have the power to dictate what individual projects do, although we can clearly advise/encourage.

The organization has no control over what any of those projects do. We as a collective can advise towards some consensus, which individual projects can disagree with and disregard. Ultimately, what each project does is up to its respective maintainer(s). You can try to force through a PEP that makes it standard that tools/libraries in the chain must not use a backend other than setuptools/flit, but that's only useful to the point where you can tell them they're not following the standard; it still can't force them to change.

@FFY00
Member

FFY00 commented Jan 10, 2022

I think it would be helpful here for there to be a python -m flit_core.install option as well in addition to python -m flit_core.wheel so this can be staged properly and generalized properly at least for all the PEP-517 toolchain bootstrap dependencies rather than special casing the flit_core build+install process.

There are several issues with this proposal, mainly from maintenance and social costs, so I doubt it will happen. FWIW, I did come into Python packaging with the same background as you, trying to solve the same problem as you, and having learned from that experience, I underestimated a lot of things, mainly social costs, which worked against me. It seems to me that you are making the same mistake here, so I would recommend that you reconsider the weight you assign to other people's positions and try to work with them instead of trying to change their minds. This might be difficult at first, but after getting to know the other people in the community a little bit better, you will get a better understanding of their technical background and knowledge, which will help 😅

@takluyver
Member

I think it would be helpful here for there to be a python -m flit_core.install option

I don't really see this happening. The scope of flit_core is to be a backend in the sense of PEP 517, and to be more or less as minimal as practical. It's easy to add a very basic CLI interface python -m flit_core.wheel to things it can already do (build wheels), but it currently has no code to do installations.

To approach it another way, adding python -m flit_core.install means either:

  • Adding install functionality with limited scope (e.g. pure Python libraries with no scripts). In this case, you want to transition to using installer as soon as possible for the full installation functionality. I've already proposed how to do this with an install script distributed alongside flit_core but not as part of it.
  • Adding full install functionality, i.e. vendoring or reimplementing installer, plus a CLI to control where stuff gets installed (see recent discussions on installer about that). This is fundamentally at odds with the limited scope of flit_core.
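
To make the "limited scope" option concrete, here is a minimal sketch of what installing a pure-Python wheel with no scripts amounts to: unpacking the archive into site-packages. It is not flit_core's or installer's code. The function name `install_pure_wheel` is hypothetical, and the sketch deliberately skips everything a full installer handles (console-script entry points, `.data` directories, rewriting `RECORD`, bytecode compilation).

```python
import sysconfig
import zipfile
from pathlib import Path


def install_pure_wheel(wheel_path, target_dir=None):
    """Unpack a pure-Python wheel into target_dir (default: this interpreter's purelib)."""
    target = Path(target_dir or sysconfig.get_path("purelib"))
    with zipfile.ZipFile(wheel_path) as wheel:
        # A top-level "*.data" directory holds scripts, headers, and other files
        # that belong outside site-packages, so a plain unpack is not enough.
        if any(name.split("/")[0].endswith(".data") for name in wheel.namelist()):
            raise ValueError(f"{wheel_path} has a .data directory; use a full installer")
        wheel.extractall(target)
    return target
```

The gap between this and a complete installer is the scope concern raised above: anything beyond a plain unpack means reimplementing what installer already does.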

@jameshilliard
Contributor Author

  • Adding install functionality with limited scope (e.g. pure Python libraries with no scripts).

Yeah, I think that would be what would make the most sense, since most of the logic there would be needed for the flit_core self install anyways.

In this case, you want to transition to using installer as soon as possible for the full installation functionality. I've already proposed how to do this with an install script distributed alongside flit_core but not as part of it.

Yeah, the idea would be that we would only use flit_core's install feature for installing build and installer along with their deps, just what's needed to bootstrap the PEP-517 toolchain. At that point we would switch to using build+installer for builds/installs. It's just difficult to make use of anything in that toolchain until the toolchain is fully installed, due to dependency ordering limitations of our build system.

Adding full install functionality, i.e. vendoring or reimplementing installer, plus a CLI to control where stuff gets installed (see recent discussions on installer about that). This is fundamentally at odds with the limited scope of flit_core.

Just enough install functionality to get the full PEP-517 build+install toolchain brought up would be sufficient; it's at least reasonably straightforward to implement an integration then, from what I can tell.

@jameshilliard
Contributor Author

Similarly to GCC, we are building a bootstrapping version of the required tooling and then using that to build the actual tooling we want to distribute. GCC is even more complex as it requires building a libc with the bootstrapping version and then rebuilding itself against the libc that was built.

We are a source based distro so we distribute source to do the multi-stage build, basically we distribute the bootstrapping version in source form and the normal version in source form as well(integrated with tooling that makes it compile+install automatically).

I understand that you have experience with packaging and bootstrapping, and respect that, but so do we, and unless we've failed to properly consider your specific use-case (which after our extensive discussion, I assure you I did not), you need to respect that we are the ones maintaining the software and after considering all the use-cases we need to support, maintenance costs, and social costs, we have made our call.

I think our tooling is probably a bit different from how you understand it, which is normal unless you've worked with it directly, as it's somewhat unique. I'm an expert with it and I still often have to review the build system code to understand exactly how things work.

The organization has no control over what any of those projects do. We as a collective can advise towards some consensus, which individual projects can disagree with and disregard. Ultimately, what each project does is up to its respective maintainer(s).

For build system tooling I tend to think a more centralized approach (like that of mesonbuild) often works better, as things integrate more cleanly that way. That definitely explains at least why things are a bit messy here, as cross-project coordination is often difficult.

FWIW, I did come into Python packaging with the same background as you, trying to solve the same problem as you, and having learned from that experience, I underestimated a lot of things, mainly social costs, which worked against me. It seems to me that you are making the same mistake here, so I would recommend that you reconsider the weight you assign to other people's positions and try to work with them instead of trying to change their minds. This might be difficult at first, but after getting to know the other people in the community a little bit better, you will get a better understanding of their technical background and knowledge, which will help 😅

Yeah, I know I'm coming more from a systems integration background rather than a language specific build tool development background and haven't followed the history all that closely here. Sorry if I'm coming off a bit hostile, my general approach is just to go back and forth questioning stuff to figure out why things are being done a certain way and what the viable solutions are. If I'm questioning an approach it generally just means I'm having trouble seeing a good way to implement it in the context of our integration requirements.

@takluyver
Member

the idea would be that we would only use flit_core's install feature for installing build and installer along with their deps

build already installs a script (python-build), so you probably want to use installer by that point. installer has minimal dependencies, so you can use it to install everything except flit_core. I'm not seeing a good reason to add another install interface inside flit_core.

@jameshilliard
Contributor Author

build already installs a script (python-build), so you probably want to use installer by that point

Ah, yeah I think that would be workable.

I'm not seeing a good reason to add another install interface inside flit_core.

So yeah as long as flit_core can build wheels and install itself then I think I can break the dependency cycle by having installer build itself by invoking the flit_core.wheel build and then installing itself(flit_core will have to be fully installed for installer to be able to depend on it though).

@takluyver
Member

OK, thanks. Then I think we have a path forwards with two small additions to flit_core:

@layday
Member

layday commented Jan 10, 2022

I don't see how any of this is better than doing what @FFY00 suggested with the 'shadow' package as demonstrated by https://github.com/FFY00/python-bootstrap. At the end of the day you're still gonna have to run some part of the stack from path, which currently appears to be flit and the installer.

@jameshilliard
Contributor Author

  • The ability to invoke python -m flit_core.wheel at the command line to build a wheel (for packages which use Flit)

Yeah, and maybe make it so that this can be used by flit_core for its own build(and then have the install script just install from that build) for consistency in build/install staging.

I don't see how any of this is better than doing what @FFY00 suggested with the 'shadow' package as demonstrated by https://github.com/FFY00/python-bootstrap.

It's easier this way due to fairly strict dependency ordering requirements on our side.

The python-bootstrap project is inserting all build and installer projects and their deps into the python path simultaneously in a single stage essentially. This approach fundamentally breaks the dependency graph strict ordering requirements of our tooling which basically only lets us build+install packages one at a time.

We can path hack the current package being built+installed to use its own in tree source for that build+install but we can't path hack it in combination with another package that has yet to be built+installed.

At the end of the day you're still gonna have to run some part of the stack from path, which currently appears to be the installer.

We're only running installer from path during the installer build+install process though in this case(flit_core.wheel would run as a normally installed package from site-packages), we should be able to handle that since we can path hack the current package build+install.
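
The "path hack" for the package currently being built can be sketched as follows: run a module with that package's own source directory prepended to `PYTHONPATH`, so it is importable before it has been installed. This is a generic illustration, not the distro's tooling. The helper name `run_module_from_source` is hypothetical, and which directory to pass (for installer, presumably the directory in its source tree that contains the `installer` package) is an assumption.

```python
import os
import subprocess
import sys


def run_module_from_source(module, src_dir, args=(), cwd=None):
    """Run `python -m module` with src_dir prepended to the import path."""
    env = dict(os.environ)
    existing = env.get("PYTHONPATH")
    # Put the in-tree source first so it wins over anything already installed.
    env["PYTHONPATH"] = os.pathsep.join([str(src_dir)] + ([existing] if existing else []))
    return subprocess.run(
        [sys.executable, "-m", module, *args],
        env=env,
        cwd=cwd,
        check=True,           # fail loudly if the module exits non-zero
        capture_output=True,
        text=True,
    )
```

Only one source tree is added per invocation, which matches the constraint described above: the current package can be run from its tree, but every other dependency must already be installed normally.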

@takluyver
Member

@layday I'm not sure that it is better, just a different option. The python-bootstrap route means downloading the source of several key pieces of packaging infrastructure at once, and (possibly, depending on how the downstream packaging works) putting them all in one big 'python packaging basics' downstream package. That's more convenient - just one thing to handle as a special case, and then everything else is normal. But it's somewhat messy, since you need to somehow get the source of several separate upstream projects together, outside your normal dependency mechanisms.

The approach I'm talking about is more work for downstreams, because there are several packages that need to be built as somewhat special cases. But it means that each individual upstream project can be packaged one at a time, and translated 1:1 to downstream packages.

and maybe make it so that [flit_core.wheel] can be used by flit_core for its own build(and then have the install script just install from that build)

I think that's workable, yup.
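
The one-at-a-time packaging described above depends on the bootstrap graph being acyclic once flit_core can build wheels and install itself. A small sketch with the standard-library `graphlib` module (Python 3.9+) shows how a downstream could derive the build order. The edges in `BOOTSTRAP_REQUIRES` are assumptions chosen to mirror this discussion, not data taken from any project's metadata, and `package_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Assumed bootstrap-time prerequisites (illustrative only).
BOOTSTRAP_REQUIRES = {
    "flit_core": set(),                             # installs itself
    "installer": {"flit_core"},                     # wheel built with flit_core.wheel
    "tomli": {"flit_core", "installer"},
    "pep517": {"flit_core", "installer", "tomli"},
    "build": {"installer", "pep517", "tomli"},
}


def package_order(graph):
    """Return an order in which every package comes after its prerequisites."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(package_order(BOOTSTRAP_REQUIRES))
    # → ['flit_core', 'installer', 'tomli', 'pep517', 'build']
```

With these edges the order is fully determined, so each upstream project maps to exactly one downstream package built in sequence.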

@jameshilliard
Contributor Author

The python-bootstrap route means downloading the source of several key pieces of packaging infrastructure at once, and (possibly, depending on how the downstream packaging works) putting them all in one big 'python packaging basics' downstream package. That's more convenient - just one thing to handle as a special case, and then everything else is normal.

With how our infrastructure is set up the python-bootstrap route appears to be the most difficult to integrate option, mostly since it requires violating various assumptions regarding dependency ordering for us.

The approach I'm talking about is more work for downstreams, because there are several packages that need to be built as somewhat special cases. But it means that each individual upstream project can be packaged one at a time, and translated 1:1 to downstream packages.

For us the flit_core.wheel builder approach seems to be much easier/less work and more maintainable since it at least can be made to fit our infrastructure dependency ordering expectations, even though it requires some special casing(we do regularly handle special cases like this at least).

I think that's workable, yup.

Yeah, and the reason for syncing the build+install stages with our infrastructure build+install stages like that is so that we trigger any potential wheel build errors before installation. It makes debugging issues easier if we can trigger errors as early as possible.

@layday
Member

layday commented Jan 27, 2022

It doesn't look like there's anything actionable for build here, please continue discussion over at flit.

@layday layday closed this as completed Jan 27, 2022
6 participants