CUDA Toolkit 12.4.0 tuple incompatibility #3690
Comments
Just to confirm your suspicion that this affects cross-platform builds, getting the same errors on Linux with GCC 13:
^ one such error |
I have the same issue when building the latest OpenCV 4 from source with CUDA 12.4, cuDNN 9 and GCC 13 on Fedora 39.
|
Having the same issue when building the latest OpenCV 4 from source on Windows 11. |
I agree, this should be fixable the way you describe it. However:
Here one instance is compiled with
This seems to work for the case mentioned above. I am not sure, however, whether this will give correct results in all cases. Maybe someone can give some feedback, or has ideas on how this could be solved more elegantly? |
Alternatively, Thrust's tuple_size and tuple_element can be specialized for the cudev zip pointer types:
// placed at the end of modules/cudev/include/opencv2/cudev/ptr2d/zip.hpp, in the global namespace
_LIBCUDACXX_BEGIN_NAMESPACE_STD
template<class Ptr0, class Ptr1>
struct tuple_size<cv::cudev::ZipPtr<tuple<Ptr0, Ptr1>>> : tuple_size<tuple<Ptr0, Ptr1>> {};
template<class Ptr0, class Ptr1, class Ptr2>
struct tuple_size<cv::cudev::ZipPtr<tuple<Ptr0, Ptr1, Ptr2>>> : tuple_size<tuple<Ptr0, Ptr1, Ptr2>> {};
template<class Ptr0, class Ptr1, class Ptr2, class Ptr3>
struct tuple_size<cv::cudev::ZipPtr<tuple<Ptr0, Ptr1, Ptr2, Ptr3>>> : tuple_size<tuple<Ptr0, Ptr1, Ptr2, Ptr3>> {};
template<class Ptr0, class Ptr1>
struct tuple_size<cv::cudev::ZipPtrSz<tuple<Ptr0, Ptr1>>> : tuple_size<tuple<Ptr0, Ptr1>> {};
template<class Ptr0, class Ptr1, class Ptr2>
struct tuple_size<cv::cudev::ZipPtrSz<tuple<Ptr0, Ptr1, Ptr2>>> : tuple_size<tuple<Ptr0, Ptr1, Ptr2>> {};
template<class Ptr0, class Ptr1, class Ptr2, class Ptr3>
struct tuple_size<cv::cudev::ZipPtrSz<tuple<Ptr0, Ptr1, Ptr2, Ptr3>>> : tuple_size<tuple<Ptr0, Ptr1, Ptr2, Ptr3>> {};
template<size_t N, class Ptr0, class Ptr1>
struct tuple_element<N, cv::cudev::ZipPtr<tuple<Ptr0, Ptr1>>> : tuple_element<N, tuple<Ptr0, Ptr1>> {};
template<size_t N, class Ptr0, class Ptr1, class Ptr2>
struct tuple_element<N, cv::cudev::ZipPtr<tuple<Ptr0, Ptr1, Ptr2>>> : tuple_element<N, tuple<Ptr0, Ptr1, Ptr2>> {};
template<size_t N, class Ptr0, class Ptr1, class Ptr2, class Ptr3>
struct tuple_element<N, cv::cudev::ZipPtr<tuple<Ptr0, Ptr1, Ptr2, Ptr3>>> : tuple_element<N, tuple<Ptr0, Ptr1, Ptr2, Ptr3>> {};
_LIBCUDACXX_END_NAMESPACE_STD
Thrust does this for backwards compatibility with the old style of tuples as well. It also appears that In addition to the parameter packing changes mentioned above, I've successfully compiled OpenCV using this method. |
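The same technique can be sketched with the standard library alone, which may make it easier to see what the specializations above achieve. This is only an illustration under stated assumptions: `Zip` and `arityOf` are hypothetical names, `std::tuple` stands in for Thrust's tuple, and having the wrapper derive from the tuple is a choice made for this sketch.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Hypothetical wrapper (not an OpenCV type): a distinct type that wraps a
// tuple, playing the role that ZipPtr plays in the snippet above.
template <class Tuple>
struct Zip : Tuple {};

// Without these specializations, std::tuple_size<Zip<...>> is an incomplete
// type, so any generic code that queries it fails to compile.
namespace std {
template <class... Ts>
struct tuple_size<Zip<tuple<Ts...>>> : tuple_size<tuple<Ts...>> {};

template <size_t N, class... Ts>
struct tuple_element<N, Zip<tuple<Ts...>>> : tuple_element<N, tuple<Ts...>> {};
}  // namespace std

// Generic helper that only relies on the tuple traits.
template <class T>
constexpr std::size_t arityOf() { return std::tuple_size<T>::value; }

using ZipPair = Zip<std::tuple<int, float>>;

static_assert(arityOf<ZipPair>() == 2, "wrapper reports the wrapped tuple's size");
static_assert(std::is_same<std::tuple_element<1, ZipPair>::type, float>::value,
              "wrapper reports the wrapped tuple's element types");
```

One parameter pack covers every arity here, whereas the snippet above spells out the two-, three- and four-pointer cases separately; whether a pack-based form is acceptable for the real cudev types is not something this sketch establishes.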
Also limit cuda interaction to ABI_X86_64. Bug: opencv/opencv_contrib#3690 Signed-off-by: Paul Zander <[email protected]> Closes: #36020 Signed-off-by: Joonas Niilola <[email protected]>
I am one of the maintainers of the cccl libraries at NVIDIA. We recently updated our old This has been fixed after this issue was raised here. There are different potential ways of working around this issue in the near / mid term:
|
How do I replace it? Pull and run cmake? Which cmake parameters should I use? |
You could use CPM like:
|
Well... I still don't quite get it. Do we have a solution already? Has anyone built cccl and used it to replace the default one installed with CUDA Toolkit 12.4? Thanks |
I was able to build the library using CUDA Toolkit 12.3.2 in my environment (through vcpkg). This is one way to work around it. Also, the cccl fixes mentioned above seem to be going into v2.4.0. |
CUDA Toolkit 12.5 still has the bug. |
Do you have any additional information as to how to use these commands correctly to patch the CMake files in OpenCV? I have tried adding them to the CMakeLists.txt without much success so far. |
Added CUDA 12.4+ support #3744
Tries to fix #3690 for CUDA 12.4+
Related patch to main repo: opencv/opencv#25658
Changes:
- Added branches to support new variadic implementation of thrust::tuple
- Added branch with std::array instead of std::tuple in split-merge and grid operations. The new branch got rid of namespace clash: cv::cuda in OpenCV and ::cuda in CUDA standard library (injected by Thrust).

Old tuple branches are preserved for compatibility with old code and CUDA versions before 12.4.

### Pull Request Readiness Checklist
See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request
- [x] I agree to contribute to the project under Apache 2 License.
- [x] To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
- [x] The PR is proposed to the proper branch
- [x] There is a reference to the original bug report and related work
- [ ] There is accuracy test, performance test and test data in opencv_extra repository, if applicable
      Patch to opencv_extra has the same branch name.
- [ ] The feature is well documented and sample code can be built with the project CMake
I have the same issue when building OpenCV 4.8.0 with CUDA 12.4.0 on Ubuntu 22.04 LTS |
@Cupcc either use CUDA < 12.4 or the latest commits from the 4.x branch which address this issue. |
Hi, @asmorkalov. I just tried CUDA 12.5 with cuDNN 9.2.0, and OpenCV 4.10.0 built successfully. ^_^ |
I have the same issue when building OpenCV 5.x with CUDA 12.6.0 on Win11 |
See #3773 |
Unfortunately, @miscco's answer does not actually answer your question, which has also caused confusion, since the error still persists. You should install the lit Python package by typing |
System information (version)
Detailed description
OpenCV with CUDA support cannot be built using CUDA Toolkit 12.4.0.
While CUDA Toolkit 12.3.2 uses thrust version 2.2.0 (https://docs.nvidia.com/cuda/archive/12.3.2/cuda-toolkit-release-notes/index.html), CUDA Toolkit 12.4.0 updates to thrust version 2.3.1 (https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html). In thrust version 2.3.0, the tuple implementation was replaced with a standard tuple implementation (NVIDIA/cccl#262). Notably, this changes the definition from a 10-parameter template to a variadic template. So instead of a tuple of n items being padded out with 10 - n null types to always have 10 template parameters, it now has only n template parameters. As a result, the function templates in cudev that spell out 10 template parameters per tuple are no longer viable for tuples whose size is not 10.
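The mismatch can be reproduced in miniature without CUDA. The sketch below is an illustration only: `NullType`, `PaddedTuple`, `VariadicTuple`, `fixedArity`, `packed` and `acceptsFixed` are hypothetical names rather than Thrust or OpenCV identifiers, and three slots stand in for ten to keep it short.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical padding type, standing in for the null type of the old tuple.
struct NullType {};

// Old-style tuple: a fixed number of slots (3 here, 10 in the real code);
// unused slots are padded with NullType through default arguments.
template <class T0 = NullType, class T1 = NullType, class T2 = NullType>
struct PaddedTuple {};

// New-style tuple: exactly one template parameter per element.
template <class... Ts>
struct VariadicTuple {};

// Written against the fixed shape: deduction succeeds only if the tuple type
// has exactly three template arguments.
template <template <class...> class Tuple, class T0, class T1, class T2>
constexpr std::size_t fixedArity(const Tuple<T0, T1, T2>&) { return 3; }

// Written with a parameter pack: deduction succeeds for any argument count.
template <template <class...> class Tuple, class... Ts>
constexpr std::size_t packed(const Tuple<Ts...>&) { return sizeof...(Ts); }

// Compile-time probe: is fixedArity callable with a T?
template <class T>
constexpr auto acceptsFixed(int)
    -> decltype(void(fixedArity(std::declval<const T&>())), true) { return true; }
template <class T>
constexpr bool acceptsFixed(...) { return false; }

// A two-element padded tuple still has three arguments, so it matches.
static_assert(acceptsFixed<PaddedTuple<int, float>>(0), "padded tuple matches");
// A two-element variadic tuple has two arguments, so deduction fails.
static_assert(!acceptsFixed<VariadicTuple<int, float>>(0), "variadic tuple does not match");
// Only a variadic tuple whose size equals the slot count still matches.
static_assert(acceptsFixed<VariadicTuple<int, float, char>>(0), "full-size tuple matches");

// The pack-based signature accepts both shapes.
static_assert(packed(PaddedTuple<int, float>{}) == 3, "pack sees the padding too");
static_assert(packed(VariadicTuple<int, float>{}) == 2, "pack sees only real elements");
```

Note that the pack deduced from the padded tuple includes the padding type, so pack-based code that must also serve the old definition would presumably need to account for it.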
An example of one such function template that's no longer viable is cv::cudev::blockReduce:
opencv_contrib/modules/cudev/include/opencv2/cudev/block/reduce.hpp, lines 68 to 81 in 6b5142f
An example of an error I encounter:
The first candidate but nonviable function template shown in the error message is the one linked above, which was viable and selected in previous CUDA Toolkit versions.
I think that all templates specifying 10 template parameters per tuple can be updated to work with the new tuple definition by replacing each set of 10 template parameters with a parameter pack. I think this should still be compatible with the old tuple definition as well. For example, I think this would be a viable implementation of cv::cudev::blockReduce:
Steps to reproduce
Attempt to build cudev using CUDA Toolkit 12.4.0. I suspect that this error will be observed with any combination of OpenCV version, OS, platform, and compiler (that are modern enough to not encounter some other error first).
Issue submission checklist
forum.opencv.org, Stack Overflow, etc and have not found any solution