Conversation

@chengduoZH (Contributor) commented on May 2, 2018

Fixes #10323.

CUDA 8:

T __shfl(T var, int srcLane, int width=warpSize);
T __shfl_down(T var, unsigned int delta, int width=warpSize);   

CUDA 9:

T __shfl_sync(unsigned mask, T var, int srcLane, int width=warpSize);
T __shfl_down_sync(unsigned mask, T var, unsigned delta, int width=warpSize);
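
For reference, here is a minimal sketch of the kind of version-guarded wrapper that bridges the two APIs; the names CudaShuffleDownSync, FULL_WARP_MASK, and WarpReduceSum are illustrative assumptions, not necessarily the identifiers this PR introduces:

#include <cuda.h>  // for CUDA_VERSION

#if CUDA_VERSION < 9000
// CUDA 8: no mask argument; shuffles are implicitly warp-wide.
#define FULL_WARP_MASK 0u
template <typename T>
__forceinline__ __device__ T CudaShuffleDownSync(unsigned mask, T var,
                                                 unsigned delta,
                                                 int width = 32) {
  // mask is accepted for interface compatibility but ignored pre-CUDA 9.
  (void)mask;
  return __shfl_down(var, delta, width);
}
#else
// CUDA 9+: the *_sync variants require an explicit mask of participating lanes.
#define FULL_WARP_MASK 0xFFFFFFFFu
template <typename T>
__forceinline__ __device__ T CudaShuffleDownSync(unsigned mask, T var,
                                                 unsigned delta,
                                                 int width = 32) {
  return __shfl_down_sync(mask, var, delta, width);
}
#endif

// Example use: a warp-level sum reduction written once against the wrapper.
__device__ float WarpReduceSum(float val) {
  for (int offset = 16; offset > 0; offset >>= 1) {
    val += CudaShuffleDownSync(FULL_WARP_MASK, val, offset);
  }
  return val;
}

Keeping the mask in the wrapper's signature even on CUDA 8 means call sites compile unchanged on both toolkits; on CUDA 9+ the mask makes lane participation explicit, so the old intrinsics cannot simply be aliased without threading a mask through every caller.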

@chengduoZH force-pushed the fix_shfl_sync branch 2 times, most recently from 47e4a20 to 044f86d on May 2, 2018 03:29
@dzhwinter (Contributor) left a comment

LGTM

@dzhwinter (Contributor) commented

I forgot to change the wrapper in the old Paddle code, but how did it pass the CI test? So weird.

@chengduoZH chengduoZH merged commit 3222cf1 into PaddlePaddle:develop May 2, 2018

Successfully merging this pull request may close these issues.

Error: identifier "__shfl_sync" is undefined
