Try to Fix #3725: cudaarithm: fix the compile failure of CUDA 12. #3726

sdy623 · 2024-04-22T15:46:07Z

A slight API change to nppiMeanStdDevGetBufferHostSize_8u_C1R and nppiMeanStdDevGetBufferHostSize_32f_C1R in the NPP library shipped with CUDA 12 caused #3725. I will try to fix this. In the NPP header nppi_statistics_functions.h, the type of the second parameter changed from int * to size_t *, so bufSize in reductions.cpp has to be size_t instead of int.

nppi_statistics_functions.h 5392:5408

NppStatus 
nppiMeanStdDevGetBufferHostSize_32f_C1R_Ctx(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);
/** 
 * Buffer size for \ref nppiMean_StdDev_32f_C1R.
 * 
 * For common parameter descriptions, see \ref CommonMeanStdDevGetBufferHostSizeParameters.
 *
 */
NppStatus 
nppiMeanStdDevGetBufferHostSize_32f_C1R(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */);

/** 
 * Buffer size for \ref nppiMean_StdDev_8u_C1MR.
 * 
 * For common parameter descriptions, see \ref CommonMeanStdDevGetBufferHostSizeParameters.
 *
 */
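One way to absorb this kind of signature change is to pick the buffer-size type from CUDA_VERSION at compile time. The following is only a minimal sketch under stated assumptions: NppBufferSize, fakeGetBufferHostSize and queryBufferSize are hypothetical names used for illustration (they are not NPP or OpenCV identifiers), the 12040 threshold is the value discussed later in this thread, and CUDA_VERSION is defined locally only so the snippet compiles without cuda.h.

```cpp
#include <cstddef>

// Assumption: when cuda.h is not included, pretend we are targeting CUDA 12.4.
#ifndef CUDA_VERSION
#define CUDA_VERSION 12040
#endif

// Hypothetical alias for the type that hpBufferSize points to on this toolkit.
#if CUDA_VERSION >= 12040
using NppBufferSize = std::size_t;
#else
using NppBufferSize = int;
#endif

// Hypothetical stand-in for nppiMeanStdDevGetBufferHostSize_8u_C1R.
// The real function is declared in nppi_statistics_functions.h and returns NppStatus.
inline int fakeGetBufferHostSize(int width, int height, NppBufferSize* hpBufferSize)
{
    *hpBufferSize = static_cast<NppBufferSize>(width) * static_cast<NppBufferSize>(height);
    return 0; // stands in for a success status
}

// The caller declares bufSize with the alias, so the same source compiles
// against both the old (int *) and the new (size_t *) signature.
inline NppBufferSize queryBufferSize(int width, int height)
{
    NppBufferSize bufSize = 0;
    fakeGetBufferHostSize(width, height, &bufSize);
    return bufSize;
}
```

With this shape, only the alias depends on the toolkit version; the call sites stay unchanged.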

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

I agree to contribute to the project under Apache 2 License.
To the best of my knowledge, the proposed patch is not based on a code under GPL or another license that is incompatible with OpenCV
The PR is proposed to the proper branch
There is a reference to the original bug report and related work
There is accuracy test, performance test and test data in opencv_extra repository, if applicable
Patch to opencv_extra has the same branch name.
The feature is well documented and sample code can be built with the project CMake

opencv-alalek · 2024-04-22T16:28:54Z

/cc @cudawarped

modules/cudaarithm/src/reductions.cpp

sdy623

I am pretty sure it was introduced in CUDA 12.4 (12040).

Is it worth including an assert before it is used because BufferPool.getBuffer() expects an int? e.g.

CV_Assert(bufSize <= std::numeric_limits<int>::max());
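The same check can be sketched outside OpenCV as a small narrowing helper. This is an illustration only: fitsInInt and checkedBufferCols are hypothetical names, and the sketch throws a standard exception where the suggestion above would use CV_Assert.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: true if a size_t buffer size can be represented as int.
// The int maximum is cast to size_t first so the comparison is unsigned on both sides.
inline bool fitsInInt(std::size_t value)
{
    return value <= static_cast<std::size_t>(std::numeric_limits<int>::max());
}

// Hypothetical helper: narrow to int, failing loudly instead of truncating silently.
inline int checkedBufferCols(std::size_t bufSize)
{
    if (!fitsInInt(bufSize))
        throw std::overflow_error("NPP buffer size does not fit in int");
    return static_cast<int>(bufSize);
}
```

The explicit static_cast after the check is also what silences a size_t to int conversion warning, because the narrowing is then intentional and guarded.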

What about the next call to get buffer size?

Have you tested this on 12.4? I think that it will still fail because of the other bug so I am not sure if it will pass any CI tests built against CUDA 12.4.

I checked the definition of CUDA_VERSION in CUDA 12, which is 12.XX.XX instead of an int number, so I edited CMakeLists.txt to add a new definition, CUDA_12_OR_HIGHER, and fixed the compilation of reductions.cpp. However, the build then hit other errors that still need to be solved:

C:\opencv-bld\opencv_contrib\modules\cudev\include\opencv2\cudev\grid\detail/reduce.hpp(379): error: no instance of overloaded function "cv::cudev::blockReduce" matches the argument list
            argument types are: (cuda::std::__4::tuple<volatile int *, volatile int *>, cuda::std::__4::tuple<int &, int &>, int, cuda::std::__4::tuple<cv::cudev::minimum<int>, cv::cudev::maximum<int>>)
              blockReduce<BLOCK_SIZE>(smem_tuple(sminval, smaxval), tie(mymin, mymax), tid, make_tuple(minOp, maxOp))

cudawarped · 2024-04-23T05:04:12Z

I checked the definition of CUDA_VERSION in CUDA 12, which is 12.XX.XX instead of an int number

@sdy623

I think you are confusing CMake generation and compilation. Adding that definition, which is still for the incorrect version of CUDA, to the CMake file is unnecessary.

Version Info

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\cuda.h

> /**
>  * CUDA API version number
>  */
> #define CUDA_VERSION 12040

Function definitions

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.3\include\nppi_statistics_functions.h

NppStatus 
nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, int * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);

C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include\nppi_statistics_functions.h

NppStatus 
nppiMeanStdDevGetBufferHostSize_8u_C1R_Ctx(NppiSize oSizeROI, size_t * hpBufferSize/* host pointer */, NppStreamContext nppStreamCtx);

sdy623 · 2024-04-23T05:35:04Z

I removed the CMake edit and found that the correct CUDA_VERSION in CUDA 12.4 is 12040. I also found a way to check cuda.h without installing the toolkit: open cuda_12.4.1_551.78_windows.exe with 7-Zip, go to cuda_12.4.1_551.78_windows.exe\cuda_cudart\cudart\include\cuda.h, and look for the macro definition:

/**
 * CUDA API version number
 */
#define CUDA_VERSION 12040
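For reference, CUDA_VERSION packs the toolkit version as 1000 * major + 10 * minor, which is why 12.4 shows up as 12040 rather than a dotted string. A minimal sketch of decoding it follows; cudaMajor and cudaMinor are hypothetical helper names, not part of the CUDA headers.

```cpp
// Hypothetical helpers that decode a CUDA_VERSION-style integer.
// Encoding: 1000 * major + 10 * minor, e.g. 12040 means 12.4.
constexpr int cudaMajor(int version) { return version / 1000; }
constexpr int cudaMinor(int version) { return (version % 1000) / 10; }

// Compile-time checks on the value quoted in this thread.
static_assert(cudaMajor(12040) == 12, "12040 decodes to major version 12");
static_assert(cudaMinor(12040) == 4, "12040 decodes to minor version 4");
```

A preprocessor guard such as `#if CUDA_VERSION >= 12040` therefore selects CUDA 12.4 and newer.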

cudawarped · 2024-04-23T07:17:25Z

I found a way to check cuda.h without installing the toolkit: open cuda_12.4.1_551.78_windows.exe with 7-Zip, go to cuda_12.4.1_551.78_windows.exe\cuda_cudart\cudart\include\cuda.h, and look for the macro definition

Whilst this is a completely valid way to check the header, I would advise you to install CUDA 12.4 when submitting a PR which fixes something that it breaks.

If you do that you will realize that your conversions from size_t to int are throwing warnings

warning C4267: 'argument': conversion from 'size_t' to 'int', possible loss of data

which you should address when using the new buffer size type as I mentioned above. e.g.
```
CV_Assert(bufSize <= std::numeric_limits<int>::max());
GpuMat buf = pool.getBuffer(1, static_cast<int>(bufSize), gsrc.type());
```
Because of the existing bug I mentioned before (CUDA Toolkit 12.4.0 tuple incompatibility, #3690), you will be unable to build the cudaarithm module even with this fix, i.e.

cmake --build . --target opencv_cudaarithm

still fails.

If I was authoring this PR I would install both CUDA 12.3 and 12.4 and check that this builds on both without errors.

modules/cudaarithm/src/reductions.cpp

A slight API change in NPP's nppiMeanStdDevGetBufferHostSize_8u_C1R: the type of bufSize is size_t instead of int in CUDA 12.4.x.

cudawarped · 2024-04-23T18:21:59Z

@opencv-alalek this resolves the issue mentioned but still results in build errors because cudaarithm uses cudev (#3690). I suggest this is revisited when cudev can be built against CUDA 12.4.

jiapei100 · 2024-04-28T22:29:16Z

I still have an issue #3728

cudawarped · 2024-04-29T04:33:33Z

I still have an issue #3728

@jiapei100 Your issue is related to cudev (#3690) not cudaarithm.

johnnynunez · 2024-05-09T22:21:06Z

great job! thank you

asmorkalov

The patch looks reasonable. I'm looking at #3690 to see whether we can do something on the OpenCV side.

opencv-alalek added the category: cuda label Apr 22, 2024

cudawarped reviewed Apr 22, 2024

sdy623 force-pushed the 4.x branch from 391ea10 to 69e3537 Compare April 23, 2024 02:46

sdy623 commented Apr 23, 2024

sdy623 force-pushed the 4.x branch 2 times, most recently from 39073a2 to 34bd538 Compare April 23, 2024 05:27

sdy623 force-pushed the 4.x branch from 34bd538 to 8478d82 Compare April 23, 2024 07:26

cudawarped reviewed Apr 23, 2024

cudaarithm: fix the compile failure of CUDA 12.4.x.

4e766a0

A slight API change in NPP's nppiMeanStdDevGetBufferHostSize_8u_C1R: the type of bufSize is size_t instead of int in CUDA 12.4.x.

sdy623 force-pushed the 4.x branch from 8478d82 to 4e766a0 Compare April 23, 2024 12:46

asmorkalov approved these changes May 22, 2024

asmorkalov merged commit 9358ad2 into opencv:4.x May 22, 2024

mshabunin mentioned this pull request Jun 14, 2024

Merge 4.x -> 5.x #3758

Merged
