CUDA: Hessian computation is not bit-exact #2587

Closed
4 tasks done
tomoaki0705 opened this issue Jul 4, 2020 · 0 comments

System information (version)
  • OpenCV => recent 3.4 ( 49497d8 ) + contrib 3.4 branch ( d3ea23c )
  • Operating System / Platform => Jetson (Aarch64)
  • Compiler => GCC 5.4.0 (Jetson TX1, TX2), 7.5.0 (Jetson Nano, Xavier)
Detailed description
  • Several parameter combinations of the test CUDA_Features2D/CUDA_SURF.Detector_Masked fail on Jetson
[  FAILED  ] 32 tests, listed below:
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/0, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/1, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/2, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/3, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/4, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/5, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/6, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/7, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/8, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/9, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/10, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/11, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/12, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/13, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/14, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/15, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/16, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/17, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/18, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/19, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/20, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/21, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/22, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/23, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(3), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/24, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/25, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/26, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/27, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(2), SURF_Extended(true), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/28, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/29, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(false), SURF_Upright(true))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/30, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(false))
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/31, where GetParam() = (SURF_HessianThreshold(500), SURF_Octaves(4), SURF_OctaveLayers(3), SURF_Extended(true), SURF_Upright(true))
  • Each individual failure looks like the following
[ RUN      ] CUDA_Features2D/CUDA_SURF.Detector_Masked/0, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false))
/home/nvidia/opencv_contrib/modules/xfeatures2d/test/test_surf.cuda.cpp:133: Failure
Expected equality of these values:
  keypoints_gold.size()
    Which is: 696
  keypoints.size()
    Which is: 697
[  FAILED  ] CUDA_Features2D/CUDA_SURF.Detector_Masked/0, where GetParam() = (SURF_HessianThreshold(100), SURF_Octaves(3), SURF_OctaveLayers(2), SURF_Extended(false), SURF_Upright(false)) (31 ms)
  • This error message indicates that the number of detected keypoints differs between the CPU and the GPU.
  • The Hessian computation uses floating-point operations, so its result is not bit-exact

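{
    // Box-filter (Haar-like) responses approximating the second derivatives
    float dx = calcHaarPattern( sum_ptr, Dx , 3 );
    float dy = calcHaarPattern( sum_ptr, Dy , 3 );
    float dxy = calcHaarPattern( sum_ptr, Dxy, 4 );
    sum_ptr += sampleStep;
    // Approximate Hessian determinant and trace, in single precision
    det_ptr[j] = dx*dy - 0.81f*dxy*dxy;
    trace_ptr[j] = dx + dy;
}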
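
To illustrate why a last-bit difference in the determinant can change the keypoint count (696 vs 697 above), here is a minimal sketch. It is not OpenCV code: `countAboveThreshold`, `bumpOneUlp` and the sample values are hypothetical, and the strict "greater than" comparison against the threshold is an assumption made for the illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not an OpenCV function): counts the responses that
// pass a strict "greater than" threshold test.
std::size_t countAboveThreshold(const std::vector<float>& responses, float threshold)
{
    std::size_t count = 0;
    for (float r : responses)
        if (r > threshold)
            ++count;
    return count;
}

// Returns the next representable float above v, i.e. v plus one ULP.
// This stands in for a rounding difference between two implementations.
float bumpOneUlp(float v)
{
    return std::nextafter(v, std::numeric_limits<float>::infinity());
}
```

With the responses {99, 100, 250} and a threshold of 100, one value passes. If the middle value is replaced by `bumpOneUlp(100.0f)`, two values pass. A response that sits exactly at the threshold is therefore accepted or rejected depending on a single rounding step, and a larger threshold only helps if fewer responses happen to fall that close to it.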

Steps to reproduce
cmake -DWITH_CUDA=ON -DOPENCV_EXTRA_MODULES_PATH=<opencv_contrib>/modules <opencv_source_dir>
make all
./bin/opencv_test_xfeatures2d --gtest_filter=CUDA_Features2D/CUDA_SURF.Detector_Masked*
How to fix
  • Making the Hessian computation bit-exact is not realistic
  • The alternative is to make the test tolerate rounding error
  • This corresponds to increasing the HessianThreshold here

testing::Values(SURF_HessianThreshold(100.0), SURF_HessianThreshold(500.0), SURF_HessianThreshold(1000.0)),

  • The test already hard-codes the thresholds 100, 500 and 1000
  • The error message above indicates that thresholds 100 and 500 require a bit-exact Hessian computation.
  • I'd like to put these hard-coded values under an #ifdef, so that the test keeps the bit-exact thresholds on Intel but not on other platforms (such as Jetson)
  • I'll send a PR shortly
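
A rough sketch of what such an #ifdef could look like is given below. It is only an illustration of the idea, not the actual patch: the helper name `surfHessianThresholdsForTest`, the choice of architecture macros, and the decision to keep only the threshold 1000 on non-Intel platforms are all assumptions.

```cpp
#include <vector>

// Hypothetical helper, not part of the OpenCV test suite: selects the
// Hessian thresholds to test depending on the target architecture.
std::vector<double> surfHessianThresholdsForTest()
{
#if defined(__x86_64__) || defined(_M_X64) || defined(__i386__) || defined(_M_IX86)
    // Intel/x86: keep all thresholds, including those that need
    // bit-exact agreement between the CPU and CUDA results.
    return {100.0, 500.0, 1000.0};
#else
    // Other platforms (for example Aarch64 on Jetson): keep only the
    // threshold that did not fail in the report above.
    return {1000.0};
#endif
}
```

In a value-parameterized gtest, a container like this would normally be passed with `testing::ValuesIn` after converting each element to the parameter type, rather than with `testing::Values`.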
Issue submission checklist
  • I report the issue, it's not a question
  • I checked the problem with documentation, FAQ, open issues,
    answers.opencv.org, Stack Overflow, etc and have not found solution
  • I updated to latest OpenCV version and the issue is still there
  • There is reproducer code and related data files: videos, images, onnx, etc