gpu: Fix compilation errors in NVIDIA GPUs with cuDNN 8 #952

Merged
merged 4 commits into uxlfoundation:master on Jan 28, 2021

Conversation

Dr-Noob
Contributor

@Dr-Noob Dr-Noob commented Jan 18, 2021

Description

The current version does not build with cuDNN 8 because of changes in its API; the build fails, as I described in issue #885.

Deprecated calls in cuDNN 8 used in oneDNN:

  • cudnnGetConvolutionBackwardDataAlgorithm
  • cudnnGetConvolutionBackwardFilterAlgorithm
  • cudnnGetConvolutionForwardAlgorithm

All of them were replaced with their _v7 counterparts. They could also have been replaced with the cudnnFind... counterparts, but those do not behave in the same way, so I decided to use the _v7 functions. The API changed: the _v7 functions now return a list of candidate algorithms instead of the single fastest one, so I implemented a simple heuristic that chooses the fastest suitable algorithm from that list.

Because the _v7 functions also build successfully with cuDNN 7, I did not add any conditional to check the cuDNN version; both cuDNN 7 and 8 build successfully from the same code.

I am unable to run the GPU tests even with an untouched master branch, so I did not include any new tests.

Checklist

Code-change submissions

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally?
  • Have you formatted the code using clang-format?

Bug fixes

  • Have you added relevant regression tests?
  • Have you included information on how to reproduce the issue (either in a github issue or in this PR)?

@vpirogov vpirogov added this to the v2.2 milestone Jan 19, 2021
@dzarukin
Contributor

Hi @Dr-Noob, thank you for your contribution. We will take these changes, test them internally with both cuDNN v7 and v8, and come back to you with our results. Does that sound good to you? Thanks.

@dzarukin dzarukin self-assigned this Jan 19, 2021
@Dr-Noob
Contributor Author

Dr-Noob commented Jan 20, 2021

Great, please let me know if you find any problems or if further modifications are needed.

@dzarukin
Contributor

Hi @Dr-Noob, I've got good news and bad news.
The good news: the proposed changes are fully valid with both the v7 and v8 cuDNN versions of the library and work as expected (at least on the CI dataset our team monitors regularly). It means that we will promote them.
The bad news:

  1. It looks like the CUDA backend flow for convolutions has been broken for 1.5 months, since we de-tangled the service stream from the computational stream inside the library. This results in UNIMPLEMENTED cases when running problems in batch mode (as the --batch=test_conv_gpu_ci file does in benchdnn), starting from the second problem. At some point it just explodes. We decided to revert that change for the CUDA backend and investigate later; the explosion might be either a compiler problem or a problem in the integration of the CUDA code, we don't have a definite conclusion yet. This fix will land before we promote this PR.
  2. It seems that cuDNN v8 has removed or broken some support for convolutions/deconvolutions. This is one of 58 shapes reported as unimplemented:
320:UNIMPLEMENTED __REPRO: --conv --engine=gpu --cfg=f16 g1ic3id5oc17od5kd3pd1n"3d_tail_conv:1st"

With v7 this shape dispatched to CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM; it doesn't any more. Moreover, none of the other seven algorithms available for this problem work either. I'm not sure what that means, but to me it looks like a regression, and I don't know whether that is acceptable for you.

That said, we will inform you when we promote this PR. I would kindly ask you to rebase this PR on top of master (we updated it today, and it contains a fix for the broken names) and squash the last commit into the second one. I've also left some minor comments. Thank you.

@Dr-Noob
Contributor Author

Dr-Noob commented Jan 25, 2021

Hi,

I'm not sure I understand you correctly. I have rebased the PR on top of master and then pushed a new commit with the changes you suggested, but I don't know how to squash my commits. Which commits do you want me to squash? Do you want me to squash commits 1, 2, and 3 (fe6a2b1, f083f0c, and 10ced24) into a single commit and then leave the last one (356315f) as the second commit?

@dzarukin
Contributor

Hi @Dr-Noob, sorry for not being clear. We prefer to have commits that don't break any checks. Currently, commit three ("Fix style with clang-format") fixes an issue introduced in commit two ("gpu: Choose best suitable algorithm for convolutions"). I'm asking you to merge commit three into commit two; I'm fine with keeping the remaining three logically separate commits. Once the requested commits have been squashed, you may use git push --force on your branch. Thanks.
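For reference, this kind of squash is usually done with an interactive rebase. Below is a hypothetical, replayable sketch: the commit messages mirror this PR's, but the repository and file names are made up, and the rebase is driven non-interactively via GIT_SEQUENCE_EDITOR purely so the steps can be run verbatim (by hand, one would simply change "pick" to "fixup" in the editor).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Your Name"

# Recreate a four-commit history like the one in this PR.
for msg in "gpu: Fix deprecated cuDNN calls" \
           "gpu: Choose best suitable algorithm for convolutions" \
           "Fix style with clang-format" \
           "Apply review suggestions"; do
    echo "$msg" >> src.txt
    git add src.txt
    git commit -qm "$msg"
done

# Mark the third commit as a fixup of the second one: it is melded into
# its predecessor and its message is discarded.
GIT_SEQUENCE_EDITOR='sed -i "3s/^pick/fixup/"' git rebase -i --root

git log --oneline   # three commits remain
# The rewritten branch is then published with: git push --force
```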

@Dr-Noob
Contributor Author

Dr-Noob commented Jan 26, 2021

A little bit hacky, but I think I did what you requested. I hope it looks better now!

@dzarukin
Contributor

Thanks for updating, looks good to me. I left one more small comment. Thanks @echeresh for spotting it.

@Dr-Noob
Contributor Author

Dr-Noob commented Jan 27, 2021

Whoops...fixed

@dzarukin dzarukin merged commit dcf444e into uxlfoundation:master Jan 28, 2021
@dzarukin
Contributor

Thanks for your contribution!

vpirogov pushed a commit that referenced this pull request Feb 1, 2021
Replace previous calls with _v7 counterparts.
Improved best algorithm pick up.