NMS behaviour w.r.t. fp16 vs fp32 #3371


Closed
SiftingSands opened this issue Feb 10, 2021 · 4 comments · Fixed by #3383


SiftingSands commented Feb 10, 2021

🐛 Bug

NMS gives significantly different outputs when switching boxes from FP32 to FP16. I couldn't find any related issue here or on the discussion board, and I didn't see an obvious cause from reading the docs.

To Reproduce

Call torchvision.ops.nms(boxes, scores, iou_threshold=0.2) with the attached boxes and scores (FP32) -> data.zip. NMS returns a single box.

Cast boxes to float16 (I used .to(torch.float16)) and NMS instead returns all 37 boxes — apparently no suppression is performed.
I wasn't expecting type conversion from FP32 to FP16 to dramatically alter results. Let me know if this is just a case of user error.
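The dtype sensitivity can be reproduced without the attached data. Below is a minimal sketch using a pure-NumPy greedy NMS (an illustrative reference, not torchvision's CUDA kernel) with two hypothetical heavily overlapping boxes. Because all arithmetic stays in the boxes' dtype, float16 areas above 65504 (the float16 maximum) overflow to inf, the IoU becomes NaN, and NaN never exceeds the threshold — so nothing is suppressed:

```python
import numpy as np

def nms_reference(boxes, scores, iou_threshold):
    """Greedy NMS in pure NumPy (reference sketch, not torchvision's kernel).

    All arithmetic stays in boxes.dtype, so float16 inputs overflow once
    box areas exceed 65504, the float16 maximum."""
    order = np.argsort(-scores)  # highest score first
    keep = []
    for i in order:
        x1a, y1a, x2a, y2a = boxes[i]
        area_a = (x2a - x1a) * (y2a - y1a)
        suppressed = False
        for j in keep:
            x1b, y1b, x2b, y2b = boxes[j]
            zero = boxes.dtype.type(0)
            iw = np.maximum(zero, np.minimum(x2a, x2b) - np.maximum(x1a, x1b))
            ih = np.maximum(zero, np.minimum(y2a, y2b) - np.maximum(y1a, y1b))
            inter = iw * ih
            area_b = (x2b - x1b) * (y2b - y1b)
            iou = inter / (area_a + area_b - inter)  # NaN when areas overflow to inf
            if iou > iou_threshold:  # NaN > t is False, so nothing gets suppressed
                suppressed = True
                break
        if not suppressed:
            keep.append(int(i))
    return keep

# Two heavily overlapping boxes whose areas (400 * 400 = 160000) exceed
# the float16 maximum of 65504:
boxes = np.array([[1000., 1000., 1400., 1400.],
                  [1010., 1010., 1410., 1410.]], dtype=np.float32)
scores = np.array([0.9, 0.8], dtype=np.float32)

print(len(nms_reference(boxes, scores, 0.2)))                     # 1: overlap suppressed
print(len(nms_reference(boxes.astype(np.float16), scores, 0.2)))  # 2: NaN IoU, nothing suppressed
```

The same mechanism would explain the reported behaviour: in FP16 the IoU comparison silently fails instead of raising, so every input box survives.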

Environment

PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
GPU 4: Tesla V100-SXM2-16GB
GPU 5: Tesla V100-SXM2-16GB
GPU 6: Tesla V100-SXM2-16GB
GPU 7: Tesla V100-SXM2-16GB

Nvidia driver version: 440.33.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.7.1+cu101
[pip3] torchaudio==0.7.2
[pip3] torchvision==0.8.2+cu101
[conda] Could not collect

Possible related issue?

voldemortX (Contributor) commented Feb 11, 2021

@SiftingSands I'm not familiar with the C++ code in torchvision, but I think float16 is not supported yet for nms. If you move the float16 tensors to the CPU you'll see an error thrown: RuntimeError: "nms" not implemented for 'Half'.

SiftingSands (Author) commented

Understood. I would have expected that error to show up on the GPU as well if FP16 is not supported. Is torchvision.ops.nms intended to be run only on the CPU?

Feel free to close this issue if this behavior is expected.

voldemortX (Contributor) commented

It shouldn't be CPU-only, and an error should probably be thrown on the GPU path as well? @datumbox

datumbox (Contributor) commented

@SiftingSands thanks for reporting the issue. This is indeed a bug, caused by numeric overflows. I'm going to send a PR to fix this soon.
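The overflow can be seen in isolation with a small sketch (illustrative, not torchvision's actual kernel code): float16 tops out at 65504, so the area of even a 400 x 400 box overflows to inf, and the IoU denominator then produces inf - inf = NaN, which never exceeds the threshold:

```python
import numpy as np

# float16 maxes out at 65504, so a 400 x 400 box area already overflows:
w = np.float16(400.0)
area = w * w                       # 160000 -> inf (float16 overflow)
print(area)                        # inf

# inf areas make the IoU denominator inf - inf = nan, and NaN never
# exceeds the threshold, so no box would ever be suppressed:
iou = area / (area + area - area)  # inf / nan -> nan
print(iou > 0.2)                   # False
```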
