run pip check only once for PythonBundle #3432

Merged
merged 5 commits into from
May 23, 2025

Conversation

Flamefire
Contributor

(created using eb --new-pr)

We have 2 checks in PythonPackage:

  • pip check
  • pip list -> Check for "0.0.0" versions

In PythonBundle those are run for every extension after the build of the whole EC even though running it once is enough because the result will always be the same.
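For readers unfamiliar with the second check, here is a minimal sketch of the idea. It is not the code from this PR: the helper name `find_zero_versions` is made up, and it assumes the package list comes from `pip list --format=json`, which prints a JSON array of objects with `name` and `version` keys.

```python
import json


def find_zero_versions(pip_list_json, unversioned_packages=()):
    """Return sorted names of packages reported as version 0.0.0.

    Hypothetical helper: names listed in unversioned_packages are tolerated.
    """
    allowed = {name.lower() for name in unversioned_packages}
    pkgs = json.loads(pip_list_json)
    return sorted(
        pkg["name"] for pkg in pkgs
        if pkg["version"] == "0.0.0" and pkg["name"].lower() not in allowed
    )


# Made-up sample of what `pip list --format=json` could print
sample = json.dumps([
    {"name": "flit", "version": "3.9.0"},
    {"name": "broken", "version": "0.0.0"},
    {"name": "legacy", "version": "0.0.0"},
])

print(find_zero_versions(sample, unversioned_packages=["legacy"]))
```

Because the installed package set is the same no matter which extension asks, running this once per bundle gives the same answer as running it once per extension.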

This PR uses the following logic:

sanity_pip_check should be set at the top level of PythonBundle, not for individual extensions. Currently, the check is run if any extension has it enabled, so enabling or disabling it for individual extensions does not make sense. PythonBundle passes its value to every extension as a default, so a deprecation warning is added in case an extension changes it.

Similar reasoning applies to unversioned_packages: only a single value for the whole bundle is useful, so it should be set at the top level. For some backwards compatibility during the deprecation period, the union of all those values is used in the check.
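The union behaviour could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `merge_unversioned_packages` is a hypothetical name, and treating "extension adds names the bundle does not list" as the trigger for the deprecation warning is a guess at the rule.

```python
def merge_unversioned_packages(bundle_value, ext_values):
    """Combine the bundle-level list with per-extension lists.

    bundle_value: iterable of package names set at the top level (or None)
    ext_values: mapping of extension name -> iterable of names (or None)
    Returns (merged set, sorted extension names that should trigger a deprecation warning).
    """
    top_level = set(bundle_value or ())
    merged = set(top_level)
    deprecated = []
    for ext_name, value in ext_values.items():
        extra = set(value or ()) - top_level
        if extra:
            # Extension-specific values are still honoured, but flagged
            deprecated.append(ext_name)
            merged |= extra
    return merged, sorted(deprecated)


merged, deprecated = merge_unversioned_packages(
    ["legacy"],
    {"ext_a": None, "ext_b": ["legacy", "oldpkg"]},
)
print(sorted(merged), deprecated)
```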

PythonPackage no longer does the pip checks if it is an extension and the parent EC (e.g. PythonBundle) has a value set for sanity_pip_check.

PythonBundle does the pip check if it or any of its extensions has requested it, and issues a deprecation warning if the values differ.
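As a rough sketch of that decision (hypothetical function name and data shapes, not the actual easyblock code):

```python
def resolve_pip_check(bundle_value, ext_values):
    """Decide whether the bundle runs the pip check exactly once.

    bundle_value: sanity_pip_check as set at the top level of the bundle
    ext_values: mapping of extension name -> sanity_pip_check value for that extension
    Returns (run_check, names of extensions whose value differs from the bundle's).
    """
    differing = sorted(name for name, value in ext_values.items() if value != bundle_value)
    run_check = bool(bundle_value) or any(ext_values.values())
    return run_check, differing


# One extension enables the check although the bundle does not:
# the check still runs once, and that extension would get a deprecation warning.
print(resolve_pip_check(False, {"ext_a": False, "ext_b": True}))
```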

Refactoring

To make this possible some refactoring was required.
This makes the diff look large, although it is mostly moved code. An explanation follows to help navigate the changes:

  • run_pip_check is moved out of sanity_check_step of PythonPackage such that it can be used by PythonBundle
  • This required moving the dependent method det_installed_python_packages out of the class too. The original PythonPackage.get_installed_python_packages needs to stay for backwards compatibility, which prevents giving the same name to the free function. Maybe in EB 5 we can remove it and use get_installed_python_packages for the global function? The det_ prefix was chosen by analogy with det_py_libdirs.
  • PythonBundle.sanity_check_step now requires python_cmd to be available, which was only set in prepare_step, and that step is skipped with --sanity-check-only. So prepare_python is factored out of prepare_step, similar to PythonPackage.
  • There was a mismatch in the code that detects the Python command to use, although I see no reason for that. I factored out find_python_cmd from PythonPackage.prepare_python and call it from PythonBundle. I left the check for a loaded Python module in PythonBundle as I don't know the reason for that check. IMO it should be either in both or in neither.
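The "free function plus backwards-compatible method" pattern from the second bullet can be illustrated like this. All names here (`det_installed_packages`, `Package`) and the "name version" input format are invented for the example; the real signatures in the easyblocks may differ.

```python
def det_installed_packages(list_output, names_only=True):
    """Free function: parse lines of the form 'name version'.

    Assumes every non-empty line has at least two whitespace-separated fields.
    """
    pkgs = [tuple(line.split()[:2]) for line in list_output.splitlines() if line.strip()]
    return [name for name, _ in pkgs] if names_only else pkgs


class Package:
    """Stand-in for an easyblock class that used to own the logic."""

    def __init__(self, list_output):
        self.list_output = list_output

    def get_installed_packages(self, names_only=True):
        """Kept so existing callers do not break; only delegates to the free function."""
        return det_installed_packages(self.list_output, names_only=names_only)


pkg = Package("flit 3.9.0\nMako 1.2.4\n")
print(pkg.get_installed_packages())
print(det_installed_packages(pkg.list_output, names_only=False))
```

Other classes (here, a bundle) can then call the free function directly without needing an instance of the original class.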

Fixes #3418

I overwrite _sanity_check_step_extensions now for this. This also ensures that the extensions are initialized. Related PR: easybuilders/easybuild-framework#4620

@Flamefire Flamefire changed the title from "Single pip check 5.x" to "[5.0x] run pip check only once for PythonBundle" Sep 4, 2024
@boegel boegel added this to the 5.0 milestone Sep 7, 2024
@boegel boegel added the bug fix label Sep 25, 2024
@boegel
Member

boegel commented Oct 8, 2024

@Flamefire Can you look into fixing the merge conflicts?

I'm keen on getting this merged soon, but there's a lot of code shuffling going on here that makes the review a bit tough...

@Flamefire Flamefire force-pushed the single-pip-check-5.x branch 2 times, most recently from 19789d6 to f78c6f6 Compare October 9, 2024 14:23
@Flamefire
Contributor Author

Ok, the merge conflict mostly originated from the addition of a max-Python version. I added that to the moved code.

I split up the change into one commit that should only be a refactoring without any effective changes, followed by the actual change(s).

While doing the refactoring I noticed some weirdness with specifying the required Python version in ECs using the system Python dependency:

  • If only req_py_majver is set, the minor version is set to the minor version of the Python in use, which doesn't make sense
  • The check for the max version almost always fails when the minor version is missing

I fixed both in separate commits to avoid having to test this code again.
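To make the two problems concrete, here is one possible way to treat a missing minor version. This is an assumed semantics for illustration only, not necessarily what the fix commits implement, and the function and parameter names are hypothetical.

```python
def python_version_ok(found, req_maj=None, req_min=None, max_maj=None, max_min=None):
    """Check a (major, minor) tuple against optional lower and upper bounds.

    Assumption: a missing minor version constrains only the major version,
    instead of being filled in from the Python that happens to be in use.
    """
    found_maj, found_min = found
    if req_maj is not None:
        # No required minor -> any minor of the required major is acceptable
        lower = (req_maj, req_min if req_min is not None else 0)
        if (found_maj, found_min) < lower:
            return False
    if max_maj is not None:
        if max_min is None:
            # No max minor -> only the major version is bounded
            if found_maj > max_maj:
                return False
        elif (found_maj, found_min) > (max_maj, max_min):
            return False
    return True


print(python_version_ok((3, 11), req_maj=3))
print(python_version_ok((3, 11), max_maj=3))
print(python_version_ok((3, 11), max_maj=3, max_min=10))
```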

I can split this into 3 PRs though if preferred (refactoring, pip-check, pyver fixes)

@Flamefire Flamefire force-pushed the single-pip-check-5.x branch 3 times, most recently from cde58b6 to 286d365 Compare October 10, 2024 08:32
@Flamefire
Contributor Author

Flamefire commented Oct 10, 2024

I copied the refactoring to #3475 for easier review

Both tested with a random selection of recent-ish PythonBundle and PythonPackage ECs

@Flamefire
Contributor Author

Test report by @Flamefire

Overview of tested easyconfigs (in order)

  • SUCCESS Flask-3.0.3-GCCcore-13.3.0.eb
  • SUCCESS flit-3.9.0-GCCcore-13.2.0.eb
  • SUCCESS Mako-1.2.4-GCCcore-13.2.0.eb
  • SUCCESS pytest-workflow-2.1.0-GCCcore-13.3.0.eb
  • SUCCESS poetry-1.8.3-GCCcore-13.3.0.eb
  • SUCCESS pydantic-2.6.4-GCCcore-13.2.0.eb
  • SUCCESS scikit-build-0.17.6-GCCcore-13.2.0.eb
  • SUCCESS Z3-4.13.0-GCCcore-13.2.0.eb
  • SUCCESS lxml-5.3.0-GCCcore-13.3.0.eb
  • SUCCESS cryptography-41.0.5-GCCcore-13.2.0.eb
  • SUCCESS Cython-3.0.10-GCCcore-13.2.0.eb
  • SUCCESS Pillow-10.2.0-GCCcore-13.2.0.eb
  • SUCCESS archspec-0.2.2-GCCcore-13.2.0.eb

Build succeeded for 13 out of 13 (13 easyconfigs in total)
i7139 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor, Python 3.8.17
See https://gist.github.com/Flamefire/a4a6e8292ac9c977a15360f7828db932 for a full test report.

@Micket
Contributor

Micket commented Oct 10, 2024

This needs rebasing on #3475 now

@Flamefire
Contributor Author

This needs rebasing on #3475 now

Done and split the pyver fix commit into #3478

@boegel boegel modified the milestones: 5.0.0, release after 5.0.0 Mar 18, 2025
@boegel boegel changed the title from "[5.0x] run pip check only once for PythonBundle" to "run pip check only once for PythonBundle" Mar 19, 2025
@boegel boegel changed the base branch from 5.0.x to develop March 19, 2025 11:11
@boegel
Member

boegel commented Mar 19, 2025

@Flamefire I changed the target branch of this PR from 5.0.x to develop. You should synchronize your PR branch with the current develop branch (which has received a massive update after the release of EasyBuild v5.0.0, see #3670).

@Micket
Contributor

Micket commented Apr 22, 2025

Should I look into getting this PR merged, or #3428?
Sorry, I lost track of this before the 5.0 release.

@Flamefire Flamefire force-pushed the single-pip-check-5.x branch from dc2bb92 to c227e72 Compare April 23, 2025 07:02
@Flamefire
Contributor Author

Should I look into getting this PR merged, or #3428? Sorry, I lost track of this before the 5.0 release.

This used to be for 5.x while the other was for develop. I updated both and compared them so I could combine outstanding changes.
So this is the most current version now and I closed the other one.

@boegel
Member

boegel commented Apr 23, 2025

Should I look into getting this PR merged, or #3428? Sorry, I lost track of this before the 5.0 release.

There's a lot of extra code in here, and since this doesn't require any backwards-incompatible changes, we opted not to block the release of EasyBuild 5.0.0 over this.

We should get back to this indeed though, I would definitely like to see this fixed, but given the easyblocks this is touching, we'll need to tread carefully here...

@Flamefire Flamefire force-pushed the single-pip-check-5.x branch from c227e72 to 0177e4b Compare May 12, 2025 10:26
boegel added 3 commits May 22, 2025 16:06
…age easyblock + add dedicated unit test for it
…improve deprecation warnings if sanity_pip_check and unversioned_packages are set for specific extensions in PythonBundle
@boegel boegel force-pushed the single-pip-check-5.x branch from 54cd746 to 12febcc Compare May 22, 2025 19:32
@boegel
Member

boegel commented May 22, 2025

Test report by @boegel

Overview of tested easyconfigs (in order)

  • SUCCESS Python-3.10.8-GCCcore-12.2.0.eb
  • SUCCESS Python-bundle-PyPI-2024.06-GCCcore-13.3.0.eb
  • SUCCESS SciPy-bundle-2023.11-gfbf-2023b.eb
  • SUCCESS numexpr-2.9.0-foss-2023a.eb
  • SUCCESS cryptography-41.0.1-GCCcore-12.3.0.eb
  • SUCCESS matplotlib-3.9.2-gfbf-2024a.eb

Build succeeded for 6 out of 6 (6 easyconfigs in total)
node3625.doduo.os - Linux RHEL 9.4, x86_64, AMD EPYC 7552 48-Core Processor (zen2), Python 3.9.18
See https://gist.github.com/boegel/b8c8ecfd5f3b7be340379bef46fa1619 for a full test report.

Member

@boegel boegel left a comment

I've thoroughly reviewed this, reworked it a bit, added tests for the standalone functions that have been lifted out of the PythonPackage easyblock, and tested it.

Finally good to go now, thanks for the effort @Flamefire!

@boegel boegel merged commit e11c05c into easybuilders:develop May 23, 2025
17 checks passed
msg = ('Package %s in unversioned_packages was not found in the installed packages. '
'Check that the name from `python -m pip list` is used which may be different '
'than the module name.' % unversioned_package)
msg = f"Package '{unversioned_package}' in unversioned_packages was not found in "
Contributor Author

@boegel What was wrong with the original syntax? I find the repetition of the variable name overly verbose, and IMO it introduces a new source of errors. It seems PEP 8 even recommends the original style, although I have only found a secondary source:

PEP 8 recommends the use of implicit string concatenation within parentheses
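To illustrate the two styles being compared, here is a small sketch. The first form uses the original message from the diff above; the second only mimics the `msg = ...` / `msg += ...` style with the same wording, since the full replacement text is not shown here. The package name is a placeholder.

```python
unversioned_package = "example-pkg"  # placeholder name for the illustration

# Style 1: adjacent string literals inside parentheses are joined at compile time,
# so the % formatting applies to the whole message
msg_implicit = ("Package %s in unversioned_packages was not found in the installed packages. "
                "Check that the name from `python -m pip list` is used which may be different "
                "than the module name." % unversioned_package)

# Style 2: an f-string followed by repeated `msg +=` lines; the variable name is
# repeated on every line, and a typo in one of them would only fail at runtime
msg_repeated = f"Package {unversioned_package} in unversioned_packages was not found in "
msg_repeated += "the installed packages. "
msg_repeated += "Check that the name from `python -m pip list` is used which may be different "
msg_repeated += "than the module name."

print(msg_implicit == msg_repeated)
```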

@@ -1161,6 +1167,9 @@ def sanity_check_step(self, *args, **kwargs):
# If the main easyblock (e.g. PythonBundle) defines the variable
# we trust it does the pip check if requested and checks for mismatches
sanity_pip_check = False
msg = "Sanity 'pip check' disabled for {self.name} extension, "
Contributor Author

@boegel Related to the above: if we don't define msg as a single-use variable but put the string inside the function call, we get line continuation for free.
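A sketch of what that could look like, using the standard `warnings` module instead of the framework's logging so the example is self-contained. The extension name and the second half of the message are placeholders, not the wording from the PR.

```python
import warnings

name = "example-ext"  # placeholder extension name

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # The call's parentheses already allow the literals to span several lines,
    # so no intermediate `msg` variable is needed
    warnings.warn(f"Sanity 'pip check' disabled for {name} extension, "
                  "placeholder for the rest of the message",
                  DeprecationWarning)

print(caught[0].message)
```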

Projects
No open projects
Status: Nice-to-have
Development

Successfully merging this pull request may close these issues.

PythonBundle should perform 1 single pip check instead of each python package repeating it
3 participants