Improve sparse pyrLK optical flow #5771

Merged
merged 1 commit into from
Jan 4, 2016

Contributor

@dtmoodie dtmoodie commented Dec 8, 2015

Modified sparse pyrlk optical flow to allow input of an image pyramid, which thus allows caching of image pyramids on successive calls.

  • Can significantly reduce number of pyramid calculation calls.

Added uchar support for 1 and 4 channel images.

  • Very important for lowering memory usage in large images

Preliminary 3 channel support, unsigned short and int support.

  • The datatypes work; however, the accuracy tests fail. So the CUDA code exists for them, but the C++ code won't let you get that far yet.
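To illustrate the pyramid-caching idea in the description, here is a minimal host-side sketch. It is not the code from this PR and does not use the OpenCV API: `Image`, `buildPyramid`, `downsample` and `PyramidCache` are hypothetical names, and the 2x2 box average merely stands in for a real pyrDown. The point is only that each frame's pyramid is computed once and then reused as the "previous" pyramid on the next call.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical single-channel image type, for illustration only.
struct Image {
    int rows = 0, cols = 0;
    std::vector<float> data;  // row-major, rows * cols elements
};

using Pyramid = std::vector<Image>;

static int g_pyramidBuilds = 0;  // counts how often a pyramid is computed

// Halve the resolution by averaging each 2x2 block (a stand-in for pyrDown).
Image downsample(const Image& src) {
    Image dst;
    dst.rows = src.rows / 2;
    dst.cols = src.cols / 2;
    dst.data.resize(static_cast<std::size_t>(dst.rows) * dst.cols);
    for (int y = 0; y < dst.rows; ++y) {
        for (int x = 0; x < dst.cols; ++x) {
            const float* r0 = &src.data[static_cast<std::size_t>(2 * y) * src.cols + 2 * x];
            const float* r1 = r0 + src.cols;
            dst.data[static_cast<std::size_t>(y) * dst.cols + x] =
                0.25f * (r0[0] + r0[1] + r1[0] + r1[1]);
        }
    }
    return dst;
}

// Level 0 is the input image; each further level is half the size.
Pyramid buildPyramid(const Image& base, int maxLevel) {
    ++g_pyramidBuilds;
    Pyramid pyr;
    pyr.push_back(base);
    for (int level = 1; level <= maxLevel; ++level)
        pyr.push_back(downsample(pyr.back()));
    return pyr;
}

// Keeps the pyramid of the most recent frame so that it can serve as the
// "previous" pyramid on the next call instead of being rebuilt.
class PyramidCache {
public:
    explicit PyramidCache(int maxLevel) : maxLevel_(maxLevel) {}

    // Returns true once both a previous and a current pyramid are available.
    bool push(const Image& frame) {
        prev_ = std::move(curr_);
        curr_ = buildPyramid(frame, maxLevel_);
        return !prev_.empty();
    }

    const Pyramid& prev() const { return prev_; }
    const Pyramid& curr() const { return curr_; }

private:
    int maxLevel_;
    Pyramid prev_, curr_;
};
```

With this arrangement, tracking across N frames builds N pyramids, whereas rebuilding both pyramids for every frame pair would build 2(N-1).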

float x = prevPt.x + xBase + 0.5f;
float y = prevPt.y + yBase + 0.5f;

I_patch[i][j] = ToFloat<T>(I(x, y));
Contributor

It should be I(y, x), here and below. PtrStepSz uses (y, x) coordinates order.
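A small sketch of the indexing convention this comment refers to. `StridedView` is a hypothetical stand-in written for this example, not the real PtrStepSz; it only models a pitched 2D view whose call operator takes the row first and the column second.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a pitched 2D view; only the (row, column)
// indexing convention is modelled here.
template <typename T>
struct StridedView {
    T* data;
    std::size_t step;  // distance between consecutive rows, in bytes
    int rows, cols;

    // Row index first, column index second: view(y, x).
    T& operator()(int y, int x) const {
        unsigned char* row =
            reinterpret_cast<unsigned char*>(data) + static_cast<std::size_t>(y) * step;
        return reinterpret_cast<T*>(row)[x];
    }
};
```

On a 2x3 image, `view(1, 2)` is the last pixel, while the swapped call `view(2, 1)` would address a row that does not exist. On non-square images the swap therefore reads the wrong pixel or goes out of bounds.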

@vpisarev
Contributor

vpisarev commented Dec 9, 2015

@Jet47, since you are looking at it anyway, let me assign it to you

@dtmoodie
Contributor Author

dtmoodie commented Dec 9, 2015

Thanks for reviewing this! I believe I've addressed all of your comments except for conversion to [0,1] range. It's not clear if you're referring to the 3 channel non texture fetch code path or the int32_t textures.
It would appear that CUDA doesn't support normalized float reads for int32_t, which is why I used a different filter method. I suppose for this datatype I could use the non-texture code path, but it's a currently disabled datatype because pyrDown doesn't support int32_t.
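For context on the read modes discussed here: in CUDA, the normalized-float read mode is available only for 8- and 16-bit integer texels, and linear filtering requires the texture to return floating-point data, which is why 32-bit integer textures are left with element-type reads and point filtering. The following host-side sketch emulates the value mapping for unsigned texels. `readNormalized` and `readElement` are hypothetical helpers for illustration, not CUDA functions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Host-side emulation (an assumption, not device code) of a
// normalized-float texture read for unsigned 8- and 16-bit texels:
// the full integer range is mapped onto [0, 1].
template <typename T>
float readNormalized(T texel) {
    static_assert(std::numeric_limits<T>::is_integer &&
                      !std::numeric_limits<T>::is_signed && sizeof(T) <= 2,
                  "only unsigned 8- and 16-bit texels are modelled");
    return static_cast<float>(texel) /
           static_cast<float>(std::numeric_limits<T>::max());
}

// Element-type reads return the stored value unchanged.
template <typename T>
T readElement(T texel) {
    return texel;
}
```

So an 8-bit texel of 255 is seen as 1.0f through a normalized read, whereas the same value stored in a float or int32_t texture is seen as 255.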

texture<ushort4, cudaTextureType2D, cudaReadModeNormalizedFloat> tex_I16UC4(false, cudaFilterModeLinear, cudaAddressModeClamp);

texture<int, cudaTextureType2D, cudaReadModeElementType> tex_I32S(false, cudaFilterModePoint, cudaAddressModeClamp);
texture<int4, cudaTextureType2D, cudaReadModeElementType> tex_I32SC4(false, cudaFilterModePoint, cudaAddressModeClamp);
Contributor

I think you can remove textures that are not used (ushort, int and int4).

@vinograd47
Contributor

I've got the following compilation errors on my machine:

pyrlk.cu(211): error: explicit type is missing ("int" assumed)

pyrlk.cu(211): error: no suitable conversion function from "int4" to "int" exists

pyrlk.cu(212): error: expression must have class type

pyrlk.cu(212): error: expression must have class type

pyrlk.cu(212): error: expression must have class type

pyrlk.cu(212): error: expression must have class type

pyrlk.cu(348): error: explicit type is missing ("int" assumed)

pyrlk.cu(348): error: no suitable conversion function from "int4" to "int" exists

pyrlk.cu(349): error: expression must have class type

pyrlk.cu(349): error: expression must have class type

pyrlk.cu(349): error: expression must have class type

pyrlk.cu(349): error: expression must have class type

pyrlk.cu(625): error: expected a ">"

pyrlk.cu(818): error: template parameter "PATCH_X" may not be redeclared in this scope

pyrlk.cu(818): error: template parameter "PATCH_Y" may not be redeclared in this scope

pyrlk.cu(818): error: template parameter "T" may not be redeclared in this scope

pyrlk.cu(627): error: expected a "," or ">"

pyrlk.cu(625): error: default argument not at end of parameter list

pyrlk.cu(838): error: type name is not allowed

pyrlk.cu(838): error: invalid combination of type specifiers

pyrlk.cu(862): warning: parsing restarts here after previous syntax error

pyrlk.cu(864): error: this declaration has no storage class or type specifier

pyrlk.cu(864): error: declaration is incompatible with "void cv::cuda::checkCudaError(cudaError_t, const char *, int, const char *)"
opencv/modules/core/include/opencv2/core/cuda/common.hpp(66): here

pyrlk.cu(864): error: expected a ")"

pyrlk.cu(866): error: expected a declaration

pyrlk.cu(868): error: expected a declaration

pyrlk.cu(871): error: a template argument list is not allowed in a declaration of a primary template

pyrlk.cu(903): error: a template argument list is not allowed in a declaration of a primary template

pyrlk.cu(936): error: a template argument list is not allowed in a declaration of a primary template

pyrlk.cu(1118): error: expression preceding parentheses of apparent call must have (pointer-to-) function type

pyrlk.cu(1118): error: identifier "c_winSize_x" is undefined

pyrlk.cu(1119): error: expression preceding parentheses of apparent call must have (pointer-to-) function type

pyrlk.cu(1119): error: identifier "c_winSize_y" is undefined

pyrlk.cu(1122): error: expression preceding parentheses of apparent call must have (pointer-to-) function type

pyrlk.cu(1122): error: identifier "c_halfWin_x" is undefined

pyrlk.cu(1123): error: expression preceding parentheses of apparent call must have (pointer-to-) function type

pyrlk.cu(1123): error: identifier "c_halfWin_y" is undefined

pyrlk.cu(1125): error: expression preceding parentheses of apparent call must have (pointer-to-) function type

pyrlk.cu(1125): error: identifier "c_iters" is undefined

pyrlk.cu(1199): error: expected a declaration

pyrlk.cu(1139): error: identifier "sparse_caller" is undefined

pyrlk.cu(1139): error: type name is not allowed

pyrlk.cu(1139): error: the global scope has no "call"

pyrlk.cu(1139): error: type name is not allowed

pyrlk.cu(1139): error: the global scope has no "call"

pyrlk.cu(1139): error: type name is not allowed

pyrlk.cu(1139): error: the global scope has no "call"

pyrlk.cu(1139): error: type name is not allowed

pyrlk.cu(1139): error: the global scope has no "call"

pyrlk.cu(1139): error: type name is not allowed

pyrlk.cu(1139): error: the global scope has no "call"

pyrlk.cu(1140): error: type name is not allowed

pyrlk.cu(1140): error: the global scope has no "call"

pyrlk.cu(1140): error: type name is not allowed

pyrlk.cu(1140): error: the global scope has no "call"

pyrlk.cu(1140): error: type name is not allowed

pyrlk.cu(1140): error: the global scope has no "call"

pyrlk.cu(1140): error: type name is not allowed

pyrlk.cu(1140): error: the global scope has no "call"

pyrlk.cu(1140): error: type name is not allowed

pyrlk.cu(1140): error: the global scope has no "call"

pyrlk.cu(1141): error: type name is not allowed

pyrlk.cu(1141): error: the global scope has no "call"

pyrlk.cu(1141): error: type name is not allowed

pyrlk.cu(1141): error: the global scope has no "call"

pyrlk.cu(1141): error: type name is not allowed

pyrlk.cu(1141): error: the global scope has no "call"

pyrlk.cu(1141): error: type name is not allowed

pyrlk.cu(1141): error: the global scope has no "call"

pyrlk.cu(1141): error: type name is not allowed

pyrlk.cu(1141): error: the global scope has no "call"

pyrlk.cu(1142): error: type name is not allowed

pyrlk.cu(1142): error: the global scope has no "call"

pyrlk.cu(1142): error: type name is not allowed

pyrlk.cu(1142): error: the global scope has no "call"

pyrlk.cu(1142): error: type name is not allowed

pyrlk.cu(1142): error: the global scope has no "call"

pyrlk.cu(1142): error: type name is not allowed

pyrlk.cu(1142): error: the global scope has no "call"

pyrlk.cu(1142): error: type name is not allowed

pyrlk.cu(1142): error: the global scope has no "call"

pyrlk.cu(1143): error: type name is not allowed

pyrlk.cu(1143): error: the global scope has no "call"

pyrlk.cu(1143): error: type name is not allowed

pyrlk.cu(1143): error: the global scope has no "call"

pyrlk.cu(1143): error: type name is not allowed

pyrlk.cu(1143): error: the global scope has no "call"

pyrlk.cu(1143): error: type name is not allowed

pyrlk.cu(1143): error: the global scope has no "call"

pyrlk.cu(1143): error: type name is not allowed

pyrlk.cu(1143): error: the global scope has no "call"

pyrlk.cu(1139): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1139): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1139): error: too many initializer values

pyrlk.cu(1140): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1140): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1140): error: too many initializer values

pyrlk.cu(1141): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1141): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

pyrlk.cu(1141): error: too many initializer values

pyrlk.cu(1142): error: a value of type "int" cannot be used to initialize an entity of type "const func_t"

@dtmoodie
Contributor Author

Could you tell me a bit more about your machine?
I removed the dead code and C++11 features. But a few things don't add up, like you have an error on line 864:
cudaSafeCall(cudaGetLastError());
Which I haven't changed.
Also you have errors inside of the dense kernel, which I haven't touched either.

@vinograd47
Contributor

I was able to build it on my machine (Ubuntu 12.04, GCC 4.6, CUDA 7.0) after some modifications. See pyrlk.patch.txt.

But now I have sanity test failures. To reproduce them, download the opencv_extra repository from https://github.com/Itseez/opencv_extra and run the following commands:

$ export OPENCV_TEST_DATA_PATH=<path to opencv_extra>/testdata
$ ./opencv_perf_cudaoptflow --perf_min_samples=1 --perf_force_samples=1 --perf_verify_sanity

Test log:

[----------] 8 tests from ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/0
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.12185859680175781, which exceeds eps, where
expect_min evaluates to 0.85204124450683594,
actual_min evaluates to 0.97389984130859375, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 1, 1)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =   47192916 = 47.19ms
median    =   47192916 = 47.19ms
gmean     =   47192916 = 47.19ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =   47192916 = 47.19ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/0, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 1, 1) (93 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/1
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.064218521118164062, which exceeds eps, where
expect_min evaluates to 0.92028141021728516,
actual_min evaluates to 0.98449993133544922, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 1, 30)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =   29338617 = 29.34ms
median    =   29338617 = 29.34ms
gmean     =   29338617 = 29.34ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =   29338617 = 29.34ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/1, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 1, 30) (64 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/2
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.096344947814941406, which exceeds eps, where
expect_min evaluates to 0.89337730407714844,
actual_min evaluates to 0.98972225189208984, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 3, 1)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =   56599094 = 56.60ms
median    =   56599094 = 56.60ms
gmean     =   56599094 = 56.60ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =   56599094 = 56.60ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/2, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 3, 1) (85 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/3
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.065741539001464844, which exceeds eps, where
expect_min evaluates to 0.91993236541748047,
actual_min evaluates to 0.98567390441894531, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 3, 30)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =   76048678 = 76.05ms
median    =   76048678 = 76.05ms
gmean     =   76048678 = 76.05ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =   76048678 = 76.05ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/3, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), false, 8000, 21, 3, 30) (106 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/4
[ PERFSTAT ]    (samples = 1, mean = 1.89, median = 1.89, stddev = 0.00 (0.0%))
[ VALUE    ]    (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 1, 1)
[       OK ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/4 (30 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/5
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 3.4332275390625e-05, which exceeds eps, where
expect_min evaluates to 0.91812229156494141,
actual_min evaluates to 0.91815662384033203, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 1, 30)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =    2928387 = 2.93ms
median    =    2928387 = 2.93ms
gmean     =    2928387 = 2.93ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =    2928387 = 2.93ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/5, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 1, 30) (30 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/6
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.89392757415771484, which exceeds eps, where
expect_min evaluates to -0.0052967071533203125,
actual_min evaluates to 0.88863086700439453, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 3, 1)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =    4825479 = 4.83ms
median    =    4825479 = 4.83ms
gmean     =    4825479 = 4.83ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =    4825479 = 4.83ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/6, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 3, 1) (34 ms)
[ RUN      ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/7
/home/vlad/opencv3.0/modules/ts/src/ts_perf.cpp:359: Failure
The difference between expect_min and actual_min is 0.79467868804931641, which exceeds eps, where
expect_min evaluates to 0.125,
actual_min evaluates to 0.91967868804931641, and
eps evaluates to 2.2204460492503131e-16.
Argument "gpu_nextPts" has unexpected minimal value

params    = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 3, 30)
termination reason:  reached maximum number of iterations
bytesIn   =          0
bytesOut  =          0
samples   =          1
outliers  =          0
frequency = 1000000000
min       =    8783498 = 8.78ms
median    =    8783498 = 8.78ms
gmean     =    8783498 = 8.78ms
gstddev   = 0.00000000 = 0.00ms for 97% dispersion interval
mean      =    8783498 = 8.78ms
stddev    =          0 = 0.00ms
[  FAILED  ] ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse.PyrLKOpticalFlowSparse/7, where GetParam() = (("gpu/opticalflow/frame0.png", "gpu/opticalflow/frame1.png"), true, 8000, 21, 3, 30) (36 ms)
[----------] 8 tests from ImagePair_Gray_NPts_WinSz_Levels_Iters_PyrLKOpticalFlowSparse (478 ms total)

@dtmoodie
Contributor Author

I've integrated the patch that you created. It seems to work fine so I'll push that soon.

If I understand the sanity check code correctly, it's basically computing some statistics on the points tracked by optical flow. If so, is it reasonable to expect a slight difference in values due to the different texture fetch methods, i.e. cudaReadModeElementType vs. cudaReadModeNormalizedFloat? If I change the sanity test to pass in a float image and a float BGRA image, then the test passes. Would that be sufficient?

How were the expected values determined in the performance test? Was it generated by the cpu or gpu optical flow code? Was it verified to be correct? I ask because one of the expected min values is -0.0052967071533203125 which is outside of the image. Should I generate new truth data and more tests for the different datatypes?

return ret;
}

template <int cn, int PATCH_X, int PATCH_Y, bool calcErr, typename T = float>
Contributor

Please do not use default template parameters for functions (typename T = float). It is a C++0x (C++11) feature. Just remove = float; the code should compile without it.
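A minimal sketch of the requested change, using a made-up function rather than the kernel from this PR: the type parameter carries no default, so the template also compiles in pre-C++11 mode, and callers name the type explicitly where it cannot be deduced. `sumChannels` is a hypothetical example.

```cpp
#include <cassert>

// No default on the function template's type parameter, so this is also
// valid C++03. The hypothetical function adds up the cn channels of a pixel.
template <int cn, typename T>
T sumChannels(const T* pixel) {
    T acc = T();
    for (int c = 0; c < cn; ++c)
        acc += pixel[c];
    return acc;
}
```

A call site then reads `sumChannels<3, float>(px)`. When T appears in the function arguments, as it does here, `sumChannels<3>(px)` still works because T is deduced.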

@vinograd47
Contributor

> Would that be sufficient?

Yes, your solution is OK.

> How were the expected values determined in the performance test? Was it generated by the cpu or gpu optical flow code? Was it verified to be correct? I ask because one of the expected min values is -0.0052967071533203125 which is outside of the image. Should I generate new truth data and more tests for the different datatypes?

The expected values were generated by the same GPU code during a previous run. It is a simple regression test that compares the current results with those of previous launches.
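As a rough sketch of what such a regression check amounts to (the real logic lives in ts_perf.cpp and is not reproduced here), a stored value from an earlier run is compared with the current one under a tolerance. `SanityResult` and `checkAgainstPrevious` are hypothetical names; the numbers in the usage note are taken from the first failure in the log above.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper mirroring the comparison reported in the test log:
// a value recorded by an earlier run is compared with the current value.
struct SanityResult {
    double difference;  // absolute difference between the two values
    bool passed;        // true when the difference does not exceed eps
};

SanityResult checkAgainstPrevious(double expected, double actual, double eps) {
    const double diff = std::fabs(expected - actual);
    return SanityResult{diff, diff <= eps};
}
```

Calling `checkAgainstPrevious(0.85204124450683594, 0.97389984130859375, 2.2204460492503131e-16)` fails with a difference of about 0.1219. With an eps that small, any change in how pixels are fetched or interpolated is enough to trip the check.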

@dtmoodie
Contributor Author

Is there anything left that I need to do for this?

@vinograd47
Contributor

👍

@vinograd47
Contributor

Thank you for your contribution!

@alalek
Member

alalek commented Dec 29, 2015

@dtmoodie Could you please squash commits into one to remove useless intermediate changes?

In case of problems try to follow this opencv/opencv_contrib#290 (comment). Also feel free to ask for help.

@dtmoodie
Contributor Author

Sure thing, working on squashing now. It looks like TortoiseGit doesn't want to squash because of the way I had the repo set up. I'll have the squash finished and double-checked in a bit.

… which thus allows caching of image pyramids on successive calls.

Added unsigned char support for 1, 3, 4 channel images.
@alalek
Member

alalek commented Dec 30, 2015

Well done! 👍

@opencv-pushbot opencv-pushbot merged commit 66738d7 into opencv:master Jan 4, 2016
@paroj
Contributor

paroj commented Jan 6, 2016

Now we only need cuda::buildOpticalFlowPyramid to really take advantage of this.

@dtmoodie dtmoodie deleted the pyrlk branch March 6, 2017 21:00