@stigrj
Contributor

@stigrj stigrj commented Dec 17, 2021

For whatever reason, this change in the cuda.py easyblock triggered an error in the hipSYCL build.

Before this easyblock change the hipSYCL CMake installation reports

-- Set runtime path of "/cluster/software/hipSYCL/0.9.1-gcccuda-2020b/lib/hipSYCL/librt-backend-cuda.so" to ""

which allows it to pick up the correct runtime CUDA library, i.e. the CUDAcore "stub" library when building on a machine without a CUDA driver, but a proper CUDA lib when running on a CUDA machine. After said easyblock change the hipSYCL installation instead reports

-- Set runtime path of "/cluster/software/hipSYCL/0.9.1-gcccuda-2020b/lib/hipSYCL/librt-backend-cuda.so" to "/cluster/software/CUDAcore/11.1.1/lib64/stubs"

which fixes the runtime path to the stub library, even if a proper CUDA lib is present on the machine.
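To spot this in a build log without reading it by hand, the install messages can be scanned for runtime paths that contain a stubs directory. The following is a minimal sketch, not part of the proposed change: the function name `find_stub_rpaths` is hypothetical, and it assumes the log lines have exactly the `-- Set runtime path of "..." to "..."` shape quoted above, with multiple entries separated by colons.

```python
import re

# Matches CMake install-log lines of the form quoted in this thread:
#   -- Set runtime path of "<file>" to "<rpath>"
RUNTIME_PATH_RE = re.compile(
    r'^-- Set runtime path of "(?P<lib>[^"]+)" to "(?P<rpath>[^"]*)"$'
)


def find_stub_rpaths(log_text):
    """Return (library, rpath entry) pairs whose runtime path has a 'stubs' directory."""
    hits = []
    for line in log_text.splitlines():
        match = RUNTIME_PATH_RE.match(line.strip())
        if not match:
            continue
        # Assumption: several rpath entries would be colon-separated.
        for entry in match.group("rpath").split(":"):
            # Flag entries with a path component named exactly 'stubs'.
            if "stubs" in entry.split("/"):
                hits.append((match.group("lib"), entry))
    return hits
```

An empty result for the hipSYCL log would correspond to the behavior before the easyblock change; a non-empty result corresponds to the second log line above.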

The proposed change sets the runtime path back to empty ("") so that the expected behavior is retained. I am not sure whether the original correct behavior was a case of "two wrongs make a right", but at least the -DWITH_INSTALL_RPATH_USE_LINK_PATH=OFF option seems appropriate here.

@boegel boegel added the bug fix label Dec 17, 2021
@boegel boegel added this to the next release (4.5.2?) milestone Dec 17, 2021
@boegel
Member

boegel commented Jan 9, 2022

@stigrj I doubt this problem is related to the changes in easybuilders/easybuild-easyblocks#2561; it is more likely related to the changes in easybuilders/easybuild-easyblocks#2373, which ensure that the stubs library is considered before the libcuda.so.1 in /lib64 (which is exactly what you don't want here).

It seems like the -DWITH_INSTALL_RPATH_USE_LINK_PATH=OFF that you added doesn't make any difference at all though. I still see a line like this in the log file (on a CentOS 7 system that has /lib64/libcuda.so.1 installed in the OS):

-- Set runtime path of "/software/hipSYCL/0.9.1-gcccuda-2020b/lib/hipSYCL/librt-backend-cuda.so" to "/software/CUDAcore/11.1.1/lib64/stubs"

I also see this:

CMake Warning:
  Manually-specified variables were not used by the project:

    CMAKE_Fortran_COMPILER
    CMAKE_Fortran_FLAGS
    WITH_INSTALL_RPATH_USE_LINK_PATH

So the added configure option is not used at all?
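A quick way to confirm whether a configure option was ignored is to pull the variable names out of that warning. This is only a sketch under the assumption that the warning looks like the excerpt above (a header line, a blank line, then one indented name per line, ended by a blank line); `unused_cmake_variables` is a hypothetical helper name.

```python
HEADER = "Manually-specified variables were not used by the project"


def unused_cmake_variables(log_text):
    """Collect names listed under CMake's 'variables were not used' warning."""
    names = []
    in_block = False
    block_has_names = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if HEADER in stripped:
            # Start of a warning block; the names follow after a blank line.
            in_block, block_has_names = True, False
            continue
        if not in_block:
            continue
        if not stripped:
            # A blank line after at least one name ends the block.
            if block_has_names:
                in_block = False
            continue
        names.append(stripped)
        block_has_names = True
    return names
```

With the log above, checking `"WITH_INSTALL_RPATH_USE_LINK_PATH" in unused_cmake_variables(log_text)` would show that the option never reached the project.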

I was also going to propose adding a sanity check command that inspects the output of ldd lib/hipSYCL/librt-backend-cuda.so to ensure that the CUDA stubs library is not picked up, but that's tricky: if there's no libcuda.so.1 installed in the OS, then you'll probably always get the linking to stubs (since there's no other option then)?
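For illustration, such a check could look roughly like the sketch below. It is not an EasyBuild API: the function names are hypothetical, and it assumes ldd prints lines of the usual `name => /path (address)` or `name => not found` form.

```python
import subprocess


def resolved_libcuda_paths(ldd_output):
    """Return the paths ldd reports for libcuda; None marks 'not found'."""
    paths = []
    for line in ldd_output.splitlines():
        name, sep, rest = line.strip().partition(" => ")
        if not sep or not name.startswith("libcuda.so"):
            continue
        if rest.startswith("not found"):
            # The loader could not resolve the library at all.
            paths.append(None)
            continue
        fields = rest.split()
        if fields:
            paths.append(fields[0])
    return paths


def links_to_cuda_stubs(ldd_output):
    """True if a resolved libcuda path lies in a directory named 'stubs'."""
    return any(
        path is not None and "stubs" in path.split("/")[:-1]
        for path in resolved_libcuda_paths(ldd_output)
    )


def check_library(path):
    """Run ldd on a shared library (ldd must be on PATH) and apply the check."""
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=True
    )
    return links_to_cuda_stubs(result.stdout)
```

The caveat above still applies: on a system without libcuda.so.1 in the OS, a result of True may be unavoidable, so the check is only meaningful where a real driver library is installed.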

@boegel boegel changed the title from "Fix RPATH issue in hipSYCL" to "avoid that path to CUDA stubs library is linked via RPATH to hipSYCL's librt-backend-cuda.so library" Jan 9, 2022
@stigrj
Contributor Author

stigrj commented Jan 10, 2022

Thanks, @boegel. I tested this successfully on two different HPC systems a month ago, but now that I try to reproduce it I don't see the behavior I remember. I need some time to load the problem back into mental cache.

@stigrj
Contributor Author

stigrj commented Jan 10, 2022

Ok, there's definitely a typo, should be -DCMAKE_INSTALL_RPATH_USE_LINK_PATH (not -DWITH_INSTALL_...), but I'm still not reproducing the correct behavior...
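For reference, the corrected spelling would appear in the easyconfig along the lines of the fragment below. This is a hypothetical excerpt (easyconfigs use Python syntax), shown only to make the flag name concrete; as noted, it was not confirmed to restore the empty runtime path on its own.

```python
# Hypothetical easyconfig excerpt: only the corrected flag name comes from
# this thread. CMAKE_INSTALL_RPATH_USE_LINK_PATH is a standard CMake variable,
# unlike the misspelled WITH_INSTALL_RPATH_USE_LINK_PATH, which CMake reported
# as unused.
configopts = '-DCMAKE_INSTALL_RPATH_USE_LINK_PATH=OFF '
```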

@stigrj
Contributor Author

stigrj commented Mar 3, 2022

A working patch for this issue is presented in #15074.

@boegel boegel added this to the release after 4.8.0 milestone Jul 6, 2023
@boegel boegel modified the milestones: 4.8.1, release after 4.8.1 Sep 9, 2023
@boegel boegel modified the milestones: 4.9.1, release after 4.9.1 Apr 3, 2024
@boegel boegel modified the milestones: 4.9.2, release after 4.9.2 Jun 8, 2024
@boegel boegel modified the milestones: 4.9.3, release after 4.9.3 Sep 11, 2024
@boegel boegel modified the milestones: release after 4.9.4, release after 5.0.0 Mar 18, 2025
@boegel boegel modified the milestones: next release (5.1.0), 5.x May 23, 2025
@boegel
Member

boegel commented Jun 16, 2025

Closing this since gcccuda/2020b is no longer supported, see https://docs.easybuild.io/policies/toolchains

Sorry for not getting back to this @stigrj

If this is still relevant, please consider opening a new pull request using a more recent toolchain

@boegel boegel closed this Jun 16, 2025