{tools}[foss/2024a] PyTorch v2.6.0, parameterized v0.9.0, optree v0.14.1, ... #22824
…GCCcore-13.3.0.eb, optree-0.14.1-GCCcore-13.3.0.eb, pytest-rerunfailures-15.0-GCCcore-13.3.0.eb, tlparse-0.3.37-GCCcore-13.3.0.eb, pytest-subtests-0.13.1-GCCcore-13.3.0.eb, pytest-shard-0.1.2-GCCcore-13.3.0.eb and patches: PyTorch-2.6.0_add-checkfunctionexists-include.patch, PyTorch-2.6.0_avoid_caffe2_test_cpp_jit.patch, PyTorch-2.6.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch, PyTorch-2.6.0_disable-gcc12-warnings.patch, PyTorch-2.6.0_disable_tests_which_need_network_download.patch, PyTorch-2.6.0_fix-accuracy-issues-in-linalg_solve.patch, PyTorch-2.6.0_fix-distributed-tests-without-gpus.patch, PyTorch-2.6.0_fix-edge-case-causing-test_trigger_bisect_on_error-failure.patch, PyTorch-2.6.0_fix-ExcTests.test_trigger_on_error.patch, PyTorch-2.6.0_fix-flaky-test_aot_export_with_torch_cond.patch, PyTorch-2.6.0_fix-inductor-device-interface.patch, PyTorch-2.6.0_fix-server-in-test_control_plane.patch, PyTorch-2.6.0_fix-skip-decorators.patch, PyTorch-2.6.0_fix-sympy-1.13-compat.patch, PyTorch-2.6.0_fix-test_autograd_cpp_node_saved_float.patch, PyTorch-2.6.0_fix-test_linear_with_embedding.patch, PyTorch-2.6.0_fix-test_linear_with_in_out_buffer-without-mkl.patch, PyTorch-2.6.0_fix-test_public_bindings.patch, PyTorch-2.6.0_fix-test_unbacked_bindings_for_divisible_u_symint.patch, PyTorch-2.6.0_fix-vsx-vector-shift-functions.patch, PyTorch-2.6.0_fix-xnnpack-float16-convert.patch, PyTorch-2.6.0_increase-tolerance-test_aotdispatch-matmul.patch, PyTorch-2.6.0_increase-tolerance-test_quick-baddbmm.patch, PyTorch-2.6.0_increase-tolerance-test_vmap_autograd_grad.patch, PyTorch-2.6.0_no-cuda-stubs-rpath.patch, PyTorch-2.6.0_remove-test_slice_with_floordiv.patch, PyTorch-2.6.0_skip-diff-test-on-ppc.patch, PyTorch-2.6.0_skip-test_checkpoint_wrapper_parity-on-cpu.patch, PyTorch-2.6.0_skip-test_init_from_local_shards.patch, PyTorch-2.6.0_skip-test_jvp_linalg_det_singular.patch, PyTorch-2.6.0_skip-test-requiring-MKL.patch, PyTorch-2.6.0_skip-test_segfault.patch, 
PyTorch-2.6.0_skip-tests-without-fbgemm.patch
Updated software
|
|
@boegelbot please test @ jsc-zen3 |
|
@boegel: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... Details- notification for comment with ID 2858907011 processed Message to humans: this is just bookkeeping information for me, |
|
There might be another problem I haven't seen before: |
|
Test report by @boegelbot |
|
Test report by @Flamefire |
|
Test report by @lexming |
|
Test report by @lexming |
|
@lexming I see an unusually large number of failed tests. Do you have any details on that? As for the blocking error:
I'll look into the easyblock and relax the check if I cannot fix the issue. This test is a kind of sanity check on whether the collected results look complete and reasonable. Hence the hard failure: it forces someone to look into the problem instead of silently missing e.g. a large number of test results. |
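To illustrate the idea of such a completeness check, here is a minimal sketch. It is not the easyblock's actual code: `SuiteResult`, `IncompleteResultsError`, `check_results_complete`, and the suite names in the usage note are all hypothetical, and the only assumption is that the runner reports which suites failed while the detailed reports list the individual failures.

```python
from dataclasses import dataclass


class IncompleteResultsError(RuntimeError):
    """Raised when the parsed test results look incomplete."""


@dataclass
class SuiteResult:
    # Hypothetical per-suite summary; the field names are assumptions.
    name: str
    failures: int = 0
    errors: int = 0


def check_results_complete(suites_failed_in_log, parsed_results):
    """Raise if a suite marked as failed in the log has no parsed failure.

    suites_failed_in_log: names of suites the test runner reported as failed.
    parsed_results: SuiteResult objects built from the detailed reports.
    """
    parsed = {result.name: result for result in parsed_results}
    # A failed suite is "unexplained" if it is missing from the parsed
    # reports or if no failure or error was recorded for it.
    unexplained = sorted(
        name for name in suites_failed_in_log
        if name not in parsed
        or parsed[name].failures + parsed[name].errors == 0
    )
    if unexplained:
        raise IncompleteResultsError(
            "Failed suites without any recorded failing test: "
            + ", ".join(unexplained)
        )
    return True
```

With results parsed for `test_nn` and `test_ops` only, a log that also lists `test_jit` as failed would raise instead of quietly under-counting failures.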
|
I got the following error: Any ideas about what the issue is? |
|
Looks similar to the results of @lexming |
|
Here's the log file: easybuild-PyTorch-2.6.0-20250516.130225.QgNtC.log.gz |
|
@tardigradus In your log I see: This is a pytest bug for which I added a patch to our existing module: #22602 @lexming Probably the same for you looking at the similar failures
This should be fixed with easybuilders/easybuild-easyblocks#3723 |
|
I have reinstalled pytest as suggested, but now fails with What am I doing wrong? |
You need to start of the develop branch. IIRC the PR containing that patch isn't in a release yet, or it is in a release newer than the one you are using. |
Thanks for the hints. I have tried downloading the patch, but then I get errors about other patch files being missing. So perhaps I should "start of(f?) the develop branch", but what exactly does that mean? Is it covered in the documentation? If so, could you please let me know which part? |
|
Basically: all PRs might rely on files/changes that are not in any release yet. So if you don't use "something" with those changes you'll see issues. Otherwise install the current develop branch: https://docs.easybuild.io/installation-alternative/?h=install+easybuild+develop#installation-of-latest-development-version
The next best option that might work is to run easybuild with
In both cases: new easyconfigs might require new easyblocks, so this might fail anyway. It is better to install all 3 repos (easyconfigs, easyblocks, framework) from the develop branch as described in the doc above. |
|
Thanks for the clarification. As the notes from the 2025.05.21 conference call indicate that the next release is nigh, I'll probably just wait for that and see how I get on. |
|
I have upgraded to EB 5.1.0. However, PyTorch 2.6.0 has a strict dependency on Would a solution be to create |
|
patch file PyTorch-2.6.0_fix-cpuinfo-bug-with-smt.patch is missing. |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
Test report by @Flamefire |
|
This build fails if |
No we don't. Why/How does it fail? It works for me (see above test reports) |
|
@boegelbot please test @ jsc-zen3 |
|
@lexming: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de PR test command '
Test results coming soon (I hope)... Details- notification for comment with ID 3125910886 processed Message to humans: this is just bookkeeping information for me, |
|
Test report by @lexming |
|
Test report by @boegelbot |
lexming
left a comment
This is gold
|
Merging, thanks for the hard work @Flamefire ! |
|
easybuilders/easybuild-easyblocks#3803 is also relevant, as it avoids unnecessary failures. I'm updating that while testing PT 2.7 |
attaching the eb logs, we seem to be way over the 16 baseline for errors allowed |
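For readers unfamiliar with the idea of a failure baseline, a rough sketch of such a comparison follows. The function name, its return shape, and the message wording are hypothetical. Only the value 16 comes from the comment above, and this is not how the easyblock itself is written.

```python
def check_failure_budget(failed_tests, max_failed=16):
    """Return (ok, message) for a list of failed test names.

    max_failed=16 mirrors the baseline mentioned in the comment above;
    the function name and return shape are hypothetical.
    """
    count = len(failed_tests)
    ok = count <= max_failed
    verdict = "within" if ok else "over"
    message = f"{count} failed tests, {verdict} the allowed maximum of {max_failed}"
    return ok, message
```

A build with exactly 16 failed tests would still pass this check, while anything above that would be reported as over the allowed maximum.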
|
It fails first because it misses a failed test. That should be inside One issue needs a rebuild of a dependent module: #22602 Then there are multiple What I'd try (on this machine) is running the test with the pypi package:
|
|
@arielzn It's hard to keep track of discussions like this in merged PRs, can we open an issue on this instead? |
(created using eb --new-pr)