{ai}[foss/2022b] PyTorch v2.1.2 w/ CUDA 12.0.0 #20155

Flamefire · 2024-03-19T15:53:39Z

(created using eb --new-pr)

Requires:

{tools}[GCCcore/10.3.0 - 14.2.0] unittest-xml-reporting v3.1.0, lxml v5.3.0, libxslt v1.1.42 #22205

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb

casparvl · 2024-03-21T08:59:55Z

Test report by @casparvl
FAILED
Build succeeded for 3 out of 4 (1 easyconfigs in total)
gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 545.23.08, Python 3.6.8
See https://gist.github.com/casparvl/d118e91d83550334542ae6a2ee0a536e for a full test report.

Flamefire · 2024-03-21T09:56:48Z

Here and in #20156 test_cpp_extensions_aot_ninja fails and the related one too. But not due a test failure but some actual error. Can you check the log?

casparvl · 2024-03-21T18:23:50Z

Hm, the log contains a lot, it's a bit hard to read, but I think this is the relevant part:

Error log

Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
[1/1] /scratch-nvme/1/casparl/generic/software/CUDA/12.0.0/bin/nvcc  -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/
torch/include -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/scratch-nvme/1/
casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/include/TH -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3
.10/site-packages/torch/include/THC -I/scratch-nvme/1/casparl/generic/software/CUDA/12.0.0/include -I/gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-202
2b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/self_compiler_include_dirs_test -I/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/include/python3.10 -c -c /gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/torch_library.cu -o /gpfs/nvm
e1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/build/temp.linux-x86_64-cpython-310/torch_library.o -D__CUDA
_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi10
17"' -DTORCH_EXTENSION_NAME=torch_library -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_80,code=sm_80 -ccbin gcc -std=c++17
FAILED: /gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/build/temp.linux-x86_64-cpython-310/torch_library.o
/scratch-nvme/1/casparl/generic/software/CUDA/12.0.0/bin/nvcc  -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/
include -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/include/TH -I/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/si
te-packages/torch/include/THC -I/scratch-nvme/1/casparl/generic/software/CUDA/12.0.0/include -I/gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUD
A-12.0.0/pytorch-v2.1.2/test/cpp_extensions/self_compiler_include_dirs_test -I/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/include/python3.10 -c -c /gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/torch_library.cu -o /gpfs/nvme1/1/c
asparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/build/temp.linux-x86_64-cpython-310/torch_library.o -D__CUDA_NO_HA
LF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O2 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1017"' -
DTORCH_EXTENSION_NAME=torch_library -D_GLIBCXX_USE_CXX11_ABI=1 -gencode=arch=compute_80,code=sm_80 -ccbin gcc -std=c++17
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h: In function typename pybind11::detail::type_caster<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&):
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:120: error: expected template-name before < toke
n
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                        ^
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:120: error: expected identifier before < token
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before >
 token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                           ^
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before )
 token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2100, in _run_ninja_build
    subprocess.run(
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/subprocess.py", line 526, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/gpfs/nvme1/1/casparl/ebbuildpath/PyTorch/2.1.2/foss-2022b-CUDA-12.0.0/pytorch-v2.1.2/test/cpp_extensions/setup.py", line 90, in <module>
    setup(
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/__init__.py", line 87, in setup
    return distutils.core.setup(**attrs)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
    return run_commands(dist)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_c
ommands
    dist.run_commands()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_c
ommands
    self.run_command(cmd)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 992, in run_c
ommand
    cmd_obj.run()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/command/install.py", line 68, in run
    return orig.install.run(self)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/install.py", line 698, in run
    self.run_command('build')
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_command
    self.distribution.run_command(command)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 992, in run_c
ommand
    cmd_obj.run()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/command/build.py", line 24, in run
    super().run()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132,
 in run
    self.run_command(cmd_name)
File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 319, in run_co
mmand
    self.distribution.run_command(command)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/dist.py", line 1217, in run_command
    super().run_command(command)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 992, in run_command
    cmd_obj.run()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 79, in run
    _build_ext.run(self)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 186, in r
un
    _build_ext.build_ext.run(self)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line
346, in run
    self.build_extensions()
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 873, in build_extensions
    build_ext.build_extensions(self)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/Cython/Distutils/old_build_ext.py", line 195, in b
uild_extensions
    _build_ext.build_ext.build_extensions(self)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line
466, in build_extensions
    self._build_extensions_serial()
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line
492, in _build_extensions_serial
    self.build_extension(ext)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 202, in bui
ld_extension
    _build_ext.build_extension(self, ext)
  File "/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line
547, in build_extension
    objects = self.compiler.compile(
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 686, in unix_wrap_ninja_com
pile
    _write_ninja_file_and_compile_objects(
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_
and_compile_objects
    _run_ninja_build(
   File "/scratch-nvme/1/casparl/ebtmpdir/eb-ukx4l8ka/tmpglj5n990/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2116, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
test_cpp_extensions_aot_ninja 1/1 failed!

casparvl · 2024-03-21T18:25:35Z

Error in test_cpp_extensions looks similar btw:

Error log:

/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h: In function typename pybind11::detail::type_caster
<typename pybind11::detail::intrinsic_type<T>::type>::cast_op_type<T> pybind11::detail::cast_op(make_caster<T>&):
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:120: error: expected template-name before < toke
n
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                        ^
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:120: error: expected identifier before < token
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:123: error: expected primary-expression before >
 token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                           ^
/scratch-nvme/1/casparl/generic/software/pybind11/2.10.3-GCCcore-12.2.0/include/pybind11/detail/../cast.h:45:126: error: expected primary-expression before )
 token
   45 |     return caster.operator typename make_caster<T>::template cast_op_type<T>();
      |                                                                                                                              ^
error: command '/scratch-nvme/1/casparl/generic/software/CUDA/12.0.0/bin/nvcc' failed with exit code 1
test_cpp_extensions_aot_no_ninja 1/1 failed!
Running test_cpp_extensions_jit 1/1 ... [2024-03-21 08:38:32.382254]
Executing ['/scratch-nvme/1/casparl/generic/software/Python/3.10.8-GCCcore-12.2.0/bin/python', '-bb', 'test_cpp_extensions_jit.py', '--shard-id=0', '--num-sh
ards=1', '-v', '-vv', '-rfEX', '-p', 'no:xdist', '--use-pytest', '--reruns=2'] ... [2024-03-21 08:38:32.382937]

Expand the folded group to see the log file of test_cpp_extensions_jit 1/1

Flamefire · 2024-03-21T18:26:37Z

Yep that is a known issue: Reinstall your pybind11 with the latest EC

casparvl · 2024-03-22T09:25:19Z

Great, will do! Sorry, there are so many fixes that I often can't keep up and don't always rebuild stuff XD I'll send a new test report after the pybind11 rebuild.

Flamefire · 2024-03-22T09:47:44Z

Great, will do! Sorry, there are so many fixes that I often can't keep up and don't always rebuild stuff XD I'll send a new test report after the pybind11 rebuild.

Yeah I know that is annoying, but we can't do much better than updating the existing EC(s) for such major bugs. It came up recently with someone else too so I remembered it.
Note that you can always try to search parts of the error in this repo or grep the local checkout. IIRC the patch contains the relevant part of the error.

Side note: This is actually a good reason to run the PyTorch test suite and investigate errors: Our pybind11 version isn't (wasn't) compatible with this PyTorch version which would make it less usable as this error is likely to pop up in user code using this module.

casparvl · 2024-03-22T10:01:14Z

Ok, I rebuild pybind11, it's now rebuilding this PR. Now we have to practice patience again ;-)

sassy-crick · 2024-03-22T22:46:36Z

Test report sassy-crick:
SUCCESS
Xeon(R) Platinum 8358, A100 GPU, Red Hat Enterprise Linux release 8.8 (Ootpa)
See here for a full test report

casparvl · 2024-03-23T09:33:40Z

Test report by @casparvl
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 545.23.08, Python 3.6.8
See https://gist.github.com/casparvl/0a676dbccf3d4c9f2580582ad65d5e25 for a full test report.

Flamefire · 2024-03-23T09:58:35Z

@casparvl Looks similar to #20156 so increasing the allowed failures to 10 might be enough

casparvl · 2024-03-24T12:23:36Z

Failures are the same for

test_nn
test_jit_profiling
test_jit_legacy
distributed/fsdp/test_fsdp_core
test_jit
between this one and add patches to fix test issues for PyTorch 2.1.2 with foss/2023a + CUDA 12.1.1 #20156.

The only new one was a failure in distributed/test_c10d_nccl, which seems to have resulted in a hang:

____________________________________________ ProcessGroupNCCLTest.test_nccl_watchdog_cudagraph _____________________________________________
Traceback (most recent call last):
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-0svq1wxg/tmpqiveorku/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 506, in wrapper
    self._join_processes(fn)
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-0svq1wxg/tmpqiveorku/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 725, in _join_processes
    self._check_return_codes(elapsed_time)
  File "/scratch-nvme/1/casparl/ebtmpdir/eb-0svq1wxg/tmpqiveorku/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py", line 780, in _check_return_codes
    raise RuntimeError(
RuntimeError: Process 0 terminated or timed out after 300.0367577075958 seconds
----------------------------------------------------------- Captured stdout call -----------------------------------------------------------
Timing out after 300 seconds and killing subprocesses.
----------------------------------------------------------- Captured stdout call -----------------------------------------------------------
Timing out after 300 seconds and killing subprocesses.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/scratch-nvme/1/casparl/ebtmpdir/eb-0svq1wxg/tmpqiveorku/lib/python3.10/site-packages/torch/testing/_internal/common_distributed.py:718: KeyboardInterrupt
(to show a full traceback on KeyboardInterrupt use --full-trace)
======================================================= 2 rerun in 897.82s (0:14:57) =======================================================

casparvl · 2024-03-24T12:25:33Z

@boegelbot please test @ generoso
CORE_CNT=16

boegelbot · 2024-03-24T12:30:20Z

@casparvl: Request for testing this PR well received on login1

PR test command 'EB_PR=20155 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs /opt/software/slurm/bin/sbatch --job-name test_PR_20155 --ntasks="16" ~/boegelbot/eb_from_pr_upload_generoso.sh' executed!

exit code: 0
output:

Submitted batch job 13198

Test results coming soon (I hope)...

Details

- notification for comment with ID 2016793002 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2024-03-24T23:12:27Z

Test report by @boegelbot
SUCCESS
Build succeeded for 2 out of 2 (1 easyconfigs in total)
cnx5 - Linux Rocky Linux 8.5, x86_64, Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz (haswell), Python 3.6.8
See https://gist.github.com/boegelbot/499e262317097dc60188ace77f7f957e for a full test report.

casparvl · 2024-03-26T10:18:11Z

@boegelbot please test @ jsc-zen3
CORE_CNT=16

boegelbot · 2024-03-26T10:20:08Z

@casparvl: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20155 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20155 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 3860

Test results coming soon (I hope)...

Details

- notification for comment with ID 2020037044 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2024-03-26T17:05:51Z

Test report by @boegelbot
SUCCESS
Build succeeded for 3 out of 3 (1 easyconfigs in total)
jsczen3c1.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/f2bf0146d4d6cf62ee135ddc832e422f for a full test report.

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb

casparvl · 2024-04-02T09:36:32Z

Test report by @casparvl
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 545.23.08, Python 3.6.8
See https://gist.github.com/casparvl/3376bb4371939aa03f6eacf1856727b7 for a full test report.

Flamefire · 2024-04-02T10:22:12Z

Test report by @casparvl FAILED Build succeeded for 0 out of 1 (1 easyconfigs in total) gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 545.23.08, Python 3.6.8 See https://gist.github.com/casparvl/3376bb4371939aa03f6eacf1856727b7 for a full test report.

You are missing the patches from #19666 which are in develop

casparvl · 2024-04-02T14:31:38Z

Ah, let me sync your branch with develop - I'm assuming you won't mind... :)

casparvl · 2024-04-02T14:34:14Z

Ok, rebuild started succesfully now. Test reporting should be there somewhere tonight. I'll trigger one more rebuild on one of the test clusters for good measure. Should be good to go afterwards...

casparvl · 2024-04-02T14:34:24Z

@boegelbot please test @ jsc-zen3
CORE_CNT=16

boegelbot · 2024-04-02T14:40:08Z

@casparvl: Request for testing this PR well received on jsczen3l1.int.jsc-zen3.fz-juelich.de

PR test command 'if [[ develop != 'develop' ]]; then EB_BRANCH=develop ./easybuild_develop.sh 2> /dev/null 1>&2; EB_PREFIX=/home/boegelbot/easybuild/develop source init_env_easybuild_develop.sh; fi; EB_PR=20155 EB_ARGS= EB_CONTAINER= EB_REPO=easybuild-easyconfigs EB_BRANCH=develop /opt/software/slurm/bin/sbatch --job-name test_PR_20155 --ntasks="16" ~/boegelbot/eb_from_pr_upload_jsc-zen3.sh' executed!

exit code: 0
output:

Submitted batch job 3909

Test results coming soon (I hope)...

Details

- notification for comment with ID 2032212011 processed

Message to humans: this is just bookkeeping information for me,
it is of no use to you (unless you think I have a bug, which I don't).

boegelbot · 2024-04-02T21:14:29Z

Test report by @boegelbot
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
jsczen3c2.int.jsc-zen3.fz-juelich.de - Linux Rocky Linux 9.3, x86_64, AMD EPYC-Milan Processor (zen3), Python 3.9.18
See https://gist.github.com/boegelbot/ef331e090522687eafe998b223c49185 for a full test report.

casparvl · 2024-04-03T05:22:16Z

Test report by @casparvl
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
gcn6.local.snellius.surf.nl - Linux RHEL 8.6, x86_64, Intel(R) Xeon(R) Platinum 8360Y CPU @ 2.40GHz, 4 x NVIDIA NVIDIA A100-SXM4-40GB, 545.23.08, Python 3.6.8
See https://gist.github.com/casparvl/5830985b8a671f1c06e9d11f62593b9d for a full test report.

akesandgren · 2024-04-29T12:07:21Z

Test report by @akesandgren
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
b-cn1603.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 545.29.06, Python 3.10.12
See https://gist.github.com/akesandgren/581ba5cfbd45762c316a059be80e91ac for a full test report.

Flamefire · 2024-12-06T19:52:30Z

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
i8009 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 555.42.06, Python 3.8.17
See https://gist.github.com/Flamefire/a76061566d142c8ca24acc3a14f1bcce for a full test report.

Flamefire · 2024-12-09T07:04:20Z

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
i8009 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 555.42.06, Python 3.8.17
See https://gist.github.com/Flamefire/5dc93cf8d43773f7fbd08e99328f1fb7 for a full test report.

Flamefire · 2024-12-11T20:50:33Z

Test report by @Flamefire
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
i8002 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7352 24-Core Processor (zen2), 8 x NVIDIA NVIDIA A100-SXM4-40GB, 555.42.06, Python 3.8.17
See https://gist.github.com/Flamefire/b1f252151626692f4c2140d0c61676f4 for a full test report.

github-actions · 2025-02-21T12:53:41Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index bce1b68aa7..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,32 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
@@ -170,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -177,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 65dfced170..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,36 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
@@ -224,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

Diff against PyTorch-2.1.2-foss-2023a.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index a79f709480..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,35 +165,40 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
+buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
     '': [
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Flamefire · 2025-02-22T00:02:32Z

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3633
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
i7015 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.8.17
See https://gist.github.com/Flamefire/36fe9053d119f7dbc839ac7f4248cce1 for a full test report.

github-actions · 2025-02-24T07:48:24Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index bce1b68aa7..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,32 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
@@ -170,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -177,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 65dfced170..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,36 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
@@ -224,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

Diff against PyTorch-2.1.2-foss-2023a.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index a79f709480..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,35 +165,40 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 use_pip = True
+buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
     '': [
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Flamefire · 2025-02-28T17:41:59Z

Test report by @Flamefire
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3633
SUCCESS
Build succeeded for 1 out of 1 (1 easyconfigs in total)
i7185 - Linux Rocky Linux 8.9 (Green Obsidian), x86_64, AMD EPYC 7702 64-Core Processor (zen2), Python 3.8.17
See https://gist.github.com/Flamefire/c67402f6ec692d9bebc2694be88a6685 for a full test report.

github-actions · 2025-03-20T11:16:34Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,70 +121,83 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -166,24 +214,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,34 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,38 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -223,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

github-actions · 2025-03-20T13:32:39Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,70 +121,83 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -166,24 +214,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,34 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,38 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -223,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

github-actions · 2025-03-20T16:00:30Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,70 +121,83 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -166,24 +214,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,34 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,38 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -223,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

github-actions · 2025-03-20T18:41:10Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,70 +121,83 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -166,24 +214,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,34 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -169,6 +214,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +231,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..d28afe075c 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,38 +165,39 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
+use_pip = True
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
 
 excluded_tests = {
@@ -223,10 +233,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

github-actions · 2025-03-21T07:12:19Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,68 +121,80 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -166,24 +213,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,32 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -169,6 +213,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +230,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,36 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -223,10 +232,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

github-actions · 2025-03-21T07:16:15Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,68 +121,80 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -166,24 +213,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,32 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -169,6 +213,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +230,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,36 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -223,10 +232,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

…es: PyTorch-2.1.2_add-cuda-skip-markers.patch, PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch, PyTorch-2.1.2_fix-device-mesh-check.patch, PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch, PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch, PyTorch-2.1.2_fix-test_memory_profiler.patch, PyTorch-2.1.2_fix-test_torchinductor-rounding.patch, PyTorch-2.1.2_fix-vsx-vector-abs.patch, PyTorch-2.1.2_fix-vsx-vector-div.patch, PyTorch-2.1.2_fix-with_temp_dir-decorator.patch, PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch, PyTorch-2.1.2_relax-cuda-tolerances.patch, PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch, PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch, PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch, PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch, PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch

github-actions · 2025-03-21T09:15:10Z

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Diff against PyTorch-2.3.0-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 308397336a..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.3.0-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,16 +1,18 @@
 name = 'PyTorch'
-version = '2.3.0'
+version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
 patches = [
     'PyTorch-1.7.0_disable-dev-shm-test.patch',
+    'PyTorch-1.11.1_skip-test_init_from_local_shards.patch',
     'PyTorch-1.12.1_add-hypothesis-suppression.patch',
     'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch',
     'PyTorch-1.12.1_fix-TestTorch.test_to.patch',
@@ -22,34 +24,53 @@ patches = [
     'PyTorch-1.13.1_skip-tests-without-fbgemm.patch',
     'PyTorch-2.0.1_avoid-test_quantization-failures.patch',
     'PyTorch-2.0.1_fix-skip-decorators.patch',
+    'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch',
     'PyTorch-2.0.1_fix-vsx-loadu.patch',
+    'PyTorch-2.0.1_no-cuda-stubs-rpath.patch',
     'PyTorch-2.0.1_skip-failing-gradtest.patch',
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
+    'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
+    'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
+    'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
+    'PyTorch-2.1.0_fix-validationError-output-test.patch',
     'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch',
     'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch',
+    'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch',
     'PyTorch-2.1.0_remove-test-requiring-online-access.patch',
     'PyTorch-2.1.0_skip-diff-test-on-ppc.patch',
     'PyTorch-2.1.0_skip-dynamo-test_predispatch.patch',
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
+    'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
+    'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
+    'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
+    'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
+    'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
+    'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
-    'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch',
-    'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch',
-    'PyTorch-2.3.0_skip-test_init_from_local_shards.patch',
-    'PyTorch-2.3.0_no-cuda-stubs-rpath.patch',
-    'PyTorch-2.3.0_disable-gcc12-warning.patch',
-    'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch',
-    'PyTorch-2.3.0_disable_tests_which_need_network_download.patch',
-    'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch',
-    'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch',
-    'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch',
-    'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch',
-    'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch',
 ]
 checksums = [
-    {'pytorch-v2.3.0.tar.gz': '69579513b26261bbab32e13b7efc99ad287fcf3103087f2d4fdf1adacd25316f'},
+    {'pytorch-v2.1.2.tar.gz': '85effbcce037bffa290aea775c9a4bad5f769cb229583450c40055501ee1acd7'},
     {'PyTorch-1.7.0_disable-dev-shm-test.patch': '622cb1eaeadc06e13128a862d9946bcc1f1edd3d02b259c56a9aecc4d5406b8a'},
+    {'PyTorch-1.11.1_skip-test_init_from_local_shards.patch':
+     '4aeb1b0bc863d4801b0095cbce69f8794066748f0df27c6aaaf729c5ecba04b7'},
     {'PyTorch-1.12.1_add-hypothesis-suppression.patch':
      'e71ffb94ebe69f580fa70e0de84017058325fdff944866d6bd03463626edc32c'},
     {'PyTorch-1.12.1_fix-test_cpp_extensions_jit.patch':
@@ -69,16 +90,30 @@ checksums = [
     {'PyTorch-2.0.1_avoid-test_quantization-failures.patch':
      '02e3f47e4ed1d7d6077e26f1ae50073dc2b20426269930b505f4aefe5d2f33cd'},
     {'PyTorch-2.0.1_fix-skip-decorators.patch': '2039012cef45446065e1a2097839fe20bb29fe3c1dcc926c3695ebf29832e920'},
+    {'PyTorch-2.0.1_fix-ub-in-inductor-codegen.patch':
+     '1b37194f55ae678f3657b8728dfb896c18ffe8babe90987ce468c4fa9274f357'},
     {'PyTorch-2.0.1_fix-vsx-loadu.patch': 'a0ffa61da2d47c6acd09aaf6d4791e527d8919a6f4f1aa7ed38454cdcadb1f72'},
+    {'PyTorch-2.0.1_no-cuda-stubs-rpath.patch': '8902e58a762240f24cdbf0182e99ccdfc2a93492869352fcb4ca0ec7e407f83a'},
     {'PyTorch-2.0.1_skip-failing-gradtest.patch': '8030bdec6ba49b057ab232d19a7f1a5e542e47e2ec340653a246ec9ed59f8bc1'},
     {'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch':
      '7047862abc1abaff62954da59700f36d4f39fcf83167a638183b1b7f8fec78ae'},
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
+    {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
+    {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
+     'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
+    {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
+     '84bb51a719abc677031a7a3dfe4382ff098b0cbd8b39b8bed2a7fa03f80ac1e9'},
+    {'PyTorch-2.1.0_fix-validationError-output-test.patch':
+     '7eba0942afb121ed92fac30d1529447d892a89eb3d53c565f8e9d480e95f692b'},
     {'PyTorch-2.1.0_fix-vsx-vector-shift-functions.patch':
      '3793b4b878be1abe7791efcbd534774b87862cfe7dc4774ca8729b6cabb39e7e'},
     {'PyTorch-2.1.0_increase-tolerance-functorch-test_vmapvjpvjp.patch':
      'aef38adf1210d0c5455e91d7c7a9d9e5caad3ae568301e0ba9fc204309438e7b'},
+    {'PyTorch-2.1.0_remove-sparse-csr-nnz-overflow-test.patch':
+     '0ac36411e76506b3354c85a8a1260987f66af947ee52ffc64230aee1fa02ea8b'},
     {'PyTorch-2.1.0_remove-test-requiring-online-access.patch':
      '35184b8c5a1b10f79e511cc25db3b8a5585a5d58b5d1aa25dd3d250200b14fd7'},
     {'PyTorch-2.1.0_skip-diff-test-on-ppc.patch': '394157dbe565ffcbc1821cd63d05930957412156cc01e949ef3d3524176a1dda'},
@@ -86,68 +121,80 @@ checksums = [
      '6298daf9ddaa8542850eee9ea005f28594ab65b1f87af43d8aeca1579a8c4354'},
     {'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch':
      '5229ca88a71db7667a90ddc0b809b2c817698bd6e9c5aaabd73d3173cf9b99fe'},
+    {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
+     '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
+    {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
+    {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
+     'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
+    {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
+     '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
+    {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
+     'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
+    {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
+    {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
-    {'PyTorch-2.3.0_disable_test_linear_package_if_no_half_types_are_available.patch':
-     '23416f2d9d5226695ec3fbea0671e3650c655c19deefd3f0f8ddab5afa50f485'},
-    {'PyTorch-2.3.0_disable_DataType_dependent_test_if_tensorboard_is_not_available.patch':
-     '0dcbdfde6752c3ff54c5376f521b4a742167669feb7f0f1d4e1d4d55f72b664f'},
-    {'PyTorch-2.3.0_skip-test_init_from_local_shards.patch':
-     '90ed9c2870f57ee6dc032d00873a37e2217a2b92a13035ded1c25ad5306455f2'},
-    {'PyTorch-2.3.0_no-cuda-stubs-rpath.patch':
-     '7ba26824b5def7379cff02ae821a080698e6affea0da45bc846e9ecb89939cb1'},
-    {'PyTorch-2.3.0_disable-gcc12-warning.patch':
-     'a8a624e1a2a5f4c82610173e50bd0f853e49bd5621b432f5aac689f9f6eb1514'},
-    {'PyTorch-2.3.0_fix-test_extension_backend-without-vectorization.patch':
-     '36aa2d5ba175be17f4e996f4fb2d544fe477d4a0bd0644cd59a85063779afc8e'},
-    {'PyTorch-2.3.0_disable_tests_which_need_network_download.patch':
-     'b7fd1a5135dfd4098cdc054182f7bf84a23ac98462a00477712182b5442da855'},
-    {'PyTorch-2.3.0_avoid_caffe2_test_cpp_jit.patch':
-     '041adcd91d994b8c2ab57d227f081cd57e572c157117b37171e1eb8eb576f8fc'},
-    {'PyTorch-2.3.0_fix_missing_masked_load_for_int_type.patch':
-     'aa6ff764f3f7bf84372a8a257fe1b4ae6dc4b9744ad35f0f9015f2696c62a41e'},
-    {'PyTorch-2.3.0_skip_test_var_mean_differentiable.patch':
-     '9703fd0f1fca8916f6d79d83e9a7efe8e3f717362a5fdaa8f5d9da90d0c75018'},
-    {'PyTorch-2.3.0_skip_test_sdpa_nn_functional_scaled_dot_product_attention_cpu.patch':
-     '7955f2655db3da18606574fdcbc5990be24098f49ad1db5e86ea756ea1cc506f'},
-    {'PyTorch-2.3.0_fix-mkldnn-avx512-f32-bias.patch':
-     'ee07d21c3ac7aeb0bd0e39507b18a417b9125284a529102929c4b5c6727c2976'},
 ]
 
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
-    ('tlparse', '0.3.5'),
-    ('optree', '0.13.0'),
     ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -166,24 +213,33 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
-        # This test is expected to fail when run in their CI, but won't in our case.
-        # It just checks for a "CI" env variable
-        'test_ci_sanity_check_fail',
-        # This fails consistently and is disabled upstream
-        # See https://github.com/pytorch/pytorch/issues/100152 and
-        # https://github.com/pytorch/pytorch/pull/124712
-        'test_cpp_extensions_open_device_registration',
-
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
-local_test_opts = '--continue-through-error --pipe-logs --verbose %(excluded_tests)s'
-runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py ' + local_test_opts
+runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-through-error  --verbose %(excluded_tests)s'
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 6
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023b.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index b4b25bd33e..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023b.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -1,11 +1,12 @@
 name = 'PyTorch'
 version = '2.1.2'
+versionsuffix = '-CUDA-%(cudaver)s'
 
 homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023b'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -30,6 +31,7 @@ patches = [
     'PyTorch-2.0.1_skip-test_shuffle_reproducibility.patch',
     'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch',
     'PyTorch-2.1.0_disable-gcc12-warning.patch',
+    'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch',
     'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch',
     'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch',
     'PyTorch-2.1.0_fix-validationError-output-test.patch',
@@ -42,13 +44,26 @@ patches = [
     'PyTorch-2.1.0_skip-test_jvp_linalg_det_singular.patch',
     'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch',
     'PyTorch-2.1.0_skip-test_wrap_bad.patch',
+    'PyTorch-2.1.2_add-cuda-skip-markers.patch',
+    'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
+    'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
+    'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
+    'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch',
+    'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch',
+    'PyTorch-2.1.2_relax-cuda-tolerances.patch',
+    'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
+    'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
+    'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -85,6 +100,8 @@ checksums = [
     {'PyTorch-2.0.1_skip-tests-skipped-in-subprocess.patch':
      '166c134573a95230e39b9ea09ece3ad8072f39d370c9a88fb2a1e24f6aaac2b5'},
     {'PyTorch-2.1.0_disable-gcc12-warning.patch': 'c858b8db0010f41005dc06f9a50768d0d3dc2d2d499ccbdd5faf8a518869a421'},
+    {'PyTorch-2.1.0_disable-cudnn-tf32-for-too-strict-tests.patch':
+     'd895018ebdfd46e65d9f7645444a3b4c5bbfe3d533a08db559a04be34e01e478'},
     {'PyTorch-2.1.0_fix-bufferoverflow-in-oneDNN.patch':
      'b15b1291a3c37bf6a4982cfbb3483f693acb46a67bc0912b383fd98baf540ccf'},
     {'PyTorch-2.1.0_fix-test_numpy_torch_operators.patch':
@@ -107,17 +124,40 @@ checksums = [
     {'PyTorch-2.1.0_skip-test_linear_fp32-without-MKL.patch':
      '5dcc79883b6e3ec0a281a8e110db5e0a5880de843bb05653589891f16473ead5'},
     {'PyTorch-2.1.0_skip-test_wrap_bad.patch': 'b8583125ee94e553b6f77c4ab4bfa812b89416175dc7e9b7390919f3b485cb63'},
+    {'PyTorch-2.1.2_add-cuda-skip-markers.patch': 'd007d6d0cdb533e7d01f503e9055218760123a67c1841c57585385144be18c9a'},
+    {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
+     'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
+    {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
+    {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
+     'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
     {'PyTorch-2.1.2_fix-vsx-vector-div.patch': '11f497a6892eb49b249a15320e4218e0d7ac8ae4ce67de39e4a018a064ca1acc'},
+    {'PyTorch-2.1.2_fix-with_temp_dir-decorator.patch':
+     '90bd001e034095329277d70c6facc4026b4ce6d7f8b8d6aa81c0176eeb462eb1'},
+    {'PyTorch-2.1.2_fix-wrong-device-mesh-size-in-tests.patch':
+     '07a5e4233d02fb6348872838f4d69573c777899c6f0ea4e39ae23c08660d41e5'},
+    {'PyTorch-2.1.2_relax-cuda-tolerances.patch': '554ad09787f61080fafdb84216e711e32327aa357e2a9c40bb428eb6503dee6e'},
+    {'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch':
+     'e6a1efe3d127fcbf4723476a7a1c01cfcf2ccb16d1fb250f478192623e8b6a15'},
     {'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch':
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
+    {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
+     '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
+    {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
+     '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -125,32 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.27.6'),
-    ('hypothesis', '6.90.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
-    ('pytest-rerunfailures', '14.0'),
+    ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.5'),
-    ('Python-bundle-PyPI', '2023.10'),
-    ('protobuf', '25.3'),
-    ('protobuf-python', '4.25.3'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.11'),
-    ('PyYAML', '6.0.1'),
-    ('MPFR', '4.2.1'),
-    ('GMP', '6.3.0'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
+    ('PyYAML', '6.0'),
+    ('MPFR', '4.2.0'),
+    ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.2.0'),
-    ('expecttest', '0.2.1'),
-    ('networkx', '3.2.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.13.0',),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -169,6 +213,16 @@ excluded_tests = {
         # intermittent failures on various systems
         # See https://github.com/easybuilders/easybuild-easyconfigs/issues/17712
         'distributed/rpc/test_tensorpipe_agent',
+        # Broken test, can't ever succeed, see https://github.com/pytorch/pytorch/issues/122184
+        'distributed/tensor/parallel/test_tp_random_state',
+        # failures on OmniPath systems, which don't support some optional InfiniBand features
+        # See https://github.com/pytorch/tensorpipe/issues/413
+        'distributed/pipeline/sync/skip/test_gpipe',
+        'distributed/pipeline/sync/skip/test_leak',
+        'distributed/pipeline/sync/test_bugs',
+        'distributed/pipeline/sync/test_inplace',
+        'distributed/pipeline/sync/test_pipe',
+        'distributed/pipeline/sync/test_transparency',
     ]
 }
 
@@ -176,8 +230,16 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 
 # Especially test_quantization has a few corner cases that are triggered by the random input values,
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
+# test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
+
+# The readelf sanity check command can be taken out once the TestRPATH test from
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
+local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
+sanity_check_commands = [
+    "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,
+]
 
 tests = ['PyTorch-check-cpp-extension.py']

Diff against PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb

diff --git a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
index 6432bd1932..8c6cce35e4 100644
--- a/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2023a-CUDA-12.1.1.eb
+++ b/easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb
@@ -6,7 +6,7 @@ homepage = 'https://pytorch.org/'
 description = """Tensors and Dynamic neural networks in Python with strong GPU acceleration.
 PyTorch is a deep learning framework that puts Python first."""
 
-toolchain = {'name': 'foss', 'version': '2023a'}
+toolchain = {'name': 'foss', 'version': '2022b'}
 
 source_urls = [GITHUB_RELEASE]
 sources = ['%(namelower)s-v%(version)s.tar.gz']
@@ -47,9 +47,12 @@ patches = [
     'PyTorch-2.1.2_add-cuda-skip-markers.patch',
     'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch',
     'PyTorch-2.1.2_fix-device-mesh-check.patch',
+    'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch',
     'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch',
+    'PyTorch-2.1.2_fix-test_cuda-non-x86.patch',
     'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch',
     'PyTorch-2.1.2_fix-test_memory_profiler.patch',
+    'PyTorch-2.1.2_fix-test_parallelize_api.patch',
     'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch',
     'PyTorch-2.1.2_fix-vsx-vector-abs.patch',
     'PyTorch-2.1.2_fix-vsx-vector-div.patch',
@@ -59,8 +62,8 @@ patches = [
     'PyTorch-2.1.2_remove-nccl-backend-default-without-gpus.patch',
     'PyTorch-2.1.2_skip-cpu_repro-test-without-vectorization.patch',
     'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch',
-    'PyTorch-2.1.2_skip-memory-leak-test.patch',
     'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch',
+    'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch',
     'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch',
 ]
 checksums = [
@@ -125,12 +128,17 @@ checksums = [
     {'PyTorch-2.1.2_fix-conj-mismatch-test-failures.patch':
      'c164357efa4ce88095376e590ba508fc1daa87161e1e59544eda56daac7f2847'},
     {'PyTorch-2.1.2_fix-device-mesh-check.patch': 'c0efc288bf3d9a9a3c8bbd2691348a589a2677ea43880a8c987db91c8de4806b'},
+    {'PyTorch-2.1.2_fix-fsdp-tp-integration-test.patch':
+     'f583532c59f35f36998851957d501b3ac8c883884efd61bbaa308db55cb6bdcd'},
     {'PyTorch-2.1.2_fix-locale-issue-in-nvrtcCompileProgram.patch':
      'f7adafb4e4d3b724b93237a259797b6ed6f535f83be0e34a7b759c71c6a8ddf2'},
+    {'PyTorch-2.1.2_fix-test_cuda-non-x86.patch': '1ed76fcc87e6c50606ac286487292a3d534707068c94af74c3a5de8153fa2c2c'},
     {'PyTorch-2.1.2_fix-test_extension_backend-without-vectorization.patch':
      'cd1455495886a7d6b2d30d48736eb0103fded21e2e36de6baac719b9c52a1c92'},
     {'PyTorch-2.1.2_fix-test_memory_profiler.patch':
      '30b0c9355636c0ab3dedae02399789053825dc3835b4d7dac6e696767772b1ce'},
+    {'PyTorch-2.1.2_fix-test_parallelize_api.patch':
+     'f8387a1693af344099c806981ca38df1306d7f4847d7d44713306338384b1cfd'},
     {'PyTorch-2.1.2_fix-test_torchinductor-rounding.patch':
      'a0ef99192ee2ad1509c78a8377023d5be2b5fddb16f84063b7c9a0b53d979090'},
     {'PyTorch-2.1.2_fix-vsx-vector-abs.patch': 'd67d32407faed7dc1dbab4bba0e2f7de36c3db04560ced35c94caf8d84ade886'},
@@ -146,9 +154,10 @@ checksums = [
      '7ace835af60c58d9e0754a34c19d4b9a0c3a531f19e5d0eba8e2e49206eaa7eb'},
     {'PyTorch-2.1.2_skip-failing-test_dtensor_ops-subtests.patch':
      '6cf711bf26518550903b09ed4431de9319791e79d61aab065785d6608fd5cc88'},
-    {'PyTorch-2.1.2_skip-memory-leak-test.patch': '8d9841208e8a00a498295018aead380c360cf56e500ef23ca740adb5b36de142'},
     {'PyTorch-2.1.2_skip-test_fsdp_tp_checkpoint_integration.patch':
      '943ee92f5fd518f608a59e43fe426b9bb45d7e7ad0ba04639e516db2d61fa57d'},
+    {'PyTorch-2.1.2_skip-xfailing-test_dtensor_ops.patch':
+     '7f5befddcb006b6ab5377de6ee3c29df375c5f8ef5e42b998d35113585b983f3'},
     {'PyTorch-2.1.2_workaround_dynamo_failure_without_nnpack.patch':
      'fb96eefabf394617bbb3fbd3a7a7c1aa5991b3836edc2e5d2a30e708bfe49ba1'},
 ]
@@ -156,36 +165,36 @@ checksums = [
 osdependencies = [OS_PKG_IBVERBS_DEV]
 
 builddependencies = [
-    ('CMake', '3.26.3'),
-    ('hypothesis', '6.82.0'),
+    ('CMake', '3.24.3'),
+    ('hypothesis', '6.68.2'),
     # For tests
     ('pytest-flakefinder', '1.1.0'),
     ('pytest-rerunfailures', '12.0'),
     ('pytest-shard', '0.1.2'),
+    ('unittest-xml-reporting', '3.1.0'),
 ]
 
 dependencies = [
-    ('CUDA', '12.1.1', '', SYSTEM),
-    ('cuDNN', '8.9.2.26', '-CUDA-%(cudaver)s', SYSTEM),
-    ('magma', '2.7.2', '-CUDA-%(cudaver)s'),
-    ('NCCL', '2.18.3', '-CUDA-%(cudaver)s'),
+    ('CUDA', '12.0.0', '', SYSTEM),
+    ('cuDNN', '8.8.0.121', '-CUDA-%(cudaver)s', SYSTEM),
+    ('magma', '2.7.1', '-CUDA-%(cudaver)s'),
+    ('NCCL', '2.16.2', '-CUDA-%(cudaver)s'),
     ('Ninja', '1.11.1'),  # Required for JIT compilation of C++ extensions
-    ('Python', '3.11.3'),
-    ('Python-bundle-PyPI', '2023.06'),
-    ('protobuf', '24.0'),
-    ('protobuf-python', '4.24.0'),
-    ('pybind11', '2.11.1'),
-    ('SciPy-bundle', '2023.07'),
+    ('Python', '3.10.8'),
+    ('protobuf', '23.0'),
+    ('protobuf-python', '4.23.0'),
+    ('pybind11', '2.10.3'),
+    ('SciPy-bundle', '2023.02'),
     ('PyYAML', '6.0'),
     ('MPFR', '4.2.0'),
     ('GMP', '6.2.1'),
     ('numactl', '2.0.16'),
-    ('FFmpeg', '6.0'),
-    ('Pillow', '10.0.0'),
-    ('expecttest', '0.1.5'),
-    ('networkx', '3.1'),
+    ('FFmpeg', '5.1.2'),
+    ('Pillow', '9.4.0'),
+    ('expecttest', '0.1.3'),
+    ('networkx', '3.0'),
     ('sympy', '1.12'),
-    ('Z3', '4.12.2'),
+    ('Z3', '4.12.2', '-Python-%(pyver)s'),
 ]
 
 buildcmd = '%(python)s setup.py build'  # Run the (long) build in the build step
@@ -223,10 +232,10 @@ runtest = 'cd test && PYTHONUNBUFFERED=1 %(python)s run_test.py --continue-throu
 # those cannot be easily avoided, see https://github.com/pytorch/pytorch/issues/107030
 # test_nn is also prone to spurious failures: https://github.com/pytorch/pytorch/issues/118294
 # So allow a low number of tests to fail as the tests "usually" succeed
-max_failed_tests = 2
+max_failed_tests = 10
 
 # The readelf sanity check command can be taken out once the TestRPATH test from
-# https://github.com/pytorch/pytorch/pull/109493 is accepted, since it is then checked as part of the PyTorch test suite
+# https://github.com/pytorch/pytorch/pull/122318 is accepted, since it is then checked as part of the PyTorch test suite
 local_libcaffe2 = "$EBROOTPYTORCH/lib/python%%(pyshortver)s/site-packages/torch/lib/libcaffe2_nvrtc.%s" % SHLIB_EXT
 sanity_check_commands = [
     "readelf -d %s | egrep 'RPATH|RUNPATH' | grep -v stubs" % local_libcaffe2,

akesandgren · 2025-10-13T23:05:26Z

Test report by @akesandgren
Using easyblocks from PR(s) easybuilders/easybuild-easyblocks#3803
FAILED
Build succeeded for 0 out of 1 (1 easyconfigs in total)
b-cn1611.hpc2n.umu.se - Linux Ubuntu 22.04, x86_64, AMD EPYC 7313 16-Core Processor, 1 x NVIDIA NVIDIA A100 80GB PCIe, 555.58.02, Python 3.10.12
See https://gist.github.com/akesandgren/ec0d3a143d99d137e8cd4fe85507667b for a full test report.

Flamefire · 2025-10-14T06:40:54Z

@akesandgren Hm, multiple segfaults. Maybe try #20520 on the same machine which uses another NCCL version

Flamefire · 2025-10-16T08:34:44Z

Superseded by #20520 which uses the correct NCCL for PyTorch

casparvl reviewed Mar 19, 2024

View reviewed changes

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb Outdated Show resolved Hide resolved

Flamefire commented Mar 27, 2024

View reviewed changes

easybuild/easyconfigs/p/PyTorch/PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb Outdated Show resolved Hide resolved

Flamefire force-pushed the 20240319165333_new_pr_PyTorch212 branch from 5564421 to 8bb0d57 Compare April 16, 2024 10:33

Flamefire mentioned this pull request Feb 24, 2025

Use unittest XML files to parse PyTorch test results easybuilders/easybuild-easyblocks#3633

Merged

3 tasks

Flamefire force-pushed the 20240319165333_new_pr_PyTorch212 branch from ead558a to 7f8e2b5 Compare March 21, 2025 07:15

github-actions bot added the change label Mar 21, 2025

Flamefire added 4 commits March 21, 2025 10:14

Add patch to skip some subtests of test_dtensor_ops

e97aea2

Fix test failures on Power9 and systems with 6 GPUs

1db83e7

Add unittest-xml-reporting

8ba9179

Flamefire force-pushed the 20240319165333_new_pr_PyTorch212 branch from 7f8e2b5 to 8ba9179 Compare March 21, 2025 09:14

github-actions bot removed the change label Mar 21, 2025

Thyre added the 2022b label Aug 18, 2025

akesandgren self-assigned this Oct 13, 2025

Flamefire closed this Oct 16, 2025

Flamefire deleted the 20240319165333_new_pr_PyTorch212 branch October 16, 2025 08:34

{ai}[foss/2022b] PyTorch v2.1.2 w/ CUDA 12.0.0 #20155

{ai}[foss/2022b] PyTorch v2.1.2 w/ CUDA 12.0.0 #20155

Uh oh!

Conversation

Flamefire commented Mar 19, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

casparvl commented Mar 21, 2024

Uh oh!

Flamefire commented Mar 21, 2024

Uh oh!

casparvl commented Mar 21, 2024

Uh oh!

casparvl commented Mar 21, 2024

Uh oh!

Flamefire commented Mar 21, 2024

Uh oh!

casparvl commented Mar 22, 2024

Uh oh!

Flamefire commented Mar 22, 2024

Uh oh!

casparvl commented Mar 22, 2024

Uh oh!

sassy-crick commented Mar 22, 2024

Uh oh!

casparvl commented Mar 23, 2024

Uh oh!

Flamefire commented Mar 23, 2024

Uh oh!

casparvl commented Mar 24, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

casparvl commented Mar 24, 2024

Uh oh!

boegelbot commented Mar 24, 2024

Uh oh!

boegelbot commented Mar 24, 2024

Uh oh!

casparvl commented Mar 26, 2024

Uh oh!

boegelbot commented Mar 26, 2024

Uh oh!

boegelbot commented Mar 26, 2024

Uh oh!

Uh oh!

casparvl commented Apr 2, 2024

Uh oh!

Flamefire commented Apr 2, 2024

Uh oh!

casparvl commented Apr 2, 2024

Uh oh!

casparvl commented Apr 2, 2024

Uh oh!

casparvl commented Apr 2, 2024

Uh oh!

boegelbot commented Apr 2, 2024

Uh oh!

boegelbot commented Apr 2, 2024

Uh oh!

casparvl commented Apr 3, 2024

Uh oh!

akesandgren commented Apr 29, 2024

Uh oh!

Flamefire commented Dec 6, 2024

Uh oh!

Flamefire commented Dec 9, 2024

Uh oh!

Flamefire commented Dec 11, 2024

Uh oh!

github-actions bot commented Feb 21, 2025

Updated software PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb

Uh oh!

Flamefire commented Feb 22, 2025

Uh oh!

github-actions bot commented Feb 24, 2025

Updated software PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb

Uh oh!

Flamefire commented Feb 28, 2025

Uh oh!

Flamefire commented Mar 19, 2024 •

edited

Loading

casparvl commented Mar 24, 2024 •

edited

Loading

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`

Updated software `PyTorch-2.1.2-foss-2022b-CUDA-12.0.0.eb`