NMS behaviour w.r.t. fp16 vs fp32 #3371


Closed
SiftingSands opened this issue Feb 10, 2021 · 4 comments · Fixed by #3383


SiftingSands commented Feb 10, 2021

🐛 Bug

NMS gives significantly different outputs when switching boxes from FP32 to FP16. I couldn't find any related issue here or on the discussion board, and I didn't see an obvious cause from reading the docs.

To Reproduce

Call torchvision.ops.nms(boxes, scores, iou_threshold=0.2) with the attached boxes and scores (FP32) -> data.zip. NMS returns a single box.

Cast boxes to float16 (I used .to(torch.float16)) and NMS instead returns all 37 boxes — apparently no suppression is performed.
I wasn't expecting type conversion from FP32 to FP16 to dramatically alter results. Let me know if this is just a case of user error.
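The dtype sensitivity can be reproduced without the attached data. Below is a minimal sketch using a pure-NumPy greedy NMS (an illustrative reference, not torchvision's CUDA kernel) with two hypothetical heavily overlapping boxes. Because all arithmetic stays in the boxes' dtype, float16 areas above 65504 (the float16 maximum) overflow to inf, the IoU becomes NaN, and NaN never exceeds the threshold — so nothing is suppressed:

```python
import numpy as np

def nms_reference(boxes, scores, iou_threshold):
    """Greedy NMS in pure NumPy (reference sketch, not torchvision's kernel).

    All arithmetic stays in boxes.dtype, so float16 inputs overflow once
    box areas exceed 65504, the float16 maximum."""
    order = np.argsort(-scores)  # highest score first
    keep = []
    for i in order:
        x1a, y1a, x2a, y2a = boxes[i]
        area_a = (x2a - x1a) * (y2a - y1a)
        suppressed = False
        for j in keep:
            x1b, y1b, x2b, y2b = boxes[j]
            zero = boxes.dtype.type(0)
            iw = np.maximum(zero, np.minimum(x2a, x2b) - np.maximum(x1a, x1b))
            ih = np.maximum(zero, np.minimum(y2a, y2b) - np.maximum(y1a, y1b))
            inter = iw * ih
            area_b = (x2b - x1b) * (y2b - y1b)
            iou = inter / (area_a + area_b - inter)  # NaN when areas overflow to inf
            if iou > iou_threshold:  # NaN > t is False, so nothing gets suppressed
                suppressed = True
                break
        if not suppressed:
            keep.append(int(i))
    return keep

# Two heavily overlapping boxes whose areas (400 * 400 = 160000) exceed
# the float16 maximum of 65504:
boxes = np.array([[1000., 1000., 1400., 1400.],
                  [1010., 1010., 1410., 1410.]], dtype=np.float32)
scores = np.array([0.9, 0.8], dtype=np.float32)

print(len(nms_reference(boxes, scores, 0.2)))                     # 1: overlap suppressed
print(len(nms_reference(boxes.astype(np.float16), scores, 0.2)))  # 2: NaN IoU, nothing suppressed
```

The same mechanism would explain the reported behaviour: in FP16 the IoU comparison silently fails instead of raising, so every input box survives.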

Environment

PyTorch version: 1.7.1+cu101
Is debug build: False
CUDA used to build PyTorch: 10.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.5 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2

Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: 
GPU 0: Tesla V100-SXM2-16GB
GPU 1: Tesla V100-SXM2-16GB
GPU 2: Tesla V100-SXM2-16GB
GPU 3: Tesla V100-SXM2-16GB
GPU 4: Tesla V100-SXM2-16GB
GPU 5: Tesla V100-SXM2-16GB
GPU 6: Tesla V100-SXM2-16GB
GPU 7: Tesla V100-SXM2-16GB

Nvidia driver version: 440.33.01
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.19.5
[pip3] torch==1.7.1+cu101
[pip3] torchaudio==0.7.2
[pip3] torchvision==0.8.2+cu101
[conda] Could not collect

Possible related issue?

voldemortX (Contributor) commented Feb 11, 2021

@SiftingSands I'm not familiar with the C++ code in torchvision, but I think float16 is not supported yet for nms. If you move the float16 tensors to the CPU you'll see an error thrown: RuntimeError: "nms" not implemented for 'Half'.

SiftingSands (Author) commented

Understood. I would have expected that error to show up on the GPU as well if FP16 is not supported. Is torchvision.ops.nms intended to be run only on the CPU?

Feel free to close this issue if this behavior is expected.

voldemortX (Contributor) commented

It shouldn't be CPU-only, and an error should probably be thrown on the GPU path as well? @datumbox

datumbox (Contributor) commented

@SiftingSands thanks for reporting the issue. This is indeed a bug, caused by numeric overflows. I'm going to send a PR to fix this soon.
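The overflow can be seen in isolation with a small sketch (illustrative, not torchvision's actual kernel code): float16 tops out at 65504, so the area of even a 400 x 400 box overflows to inf, and the IoU denominator then produces inf - inf = NaN, which never exceeds the threshold:

```python
import numpy as np

# float16 maxes out at 65504, so a 400 x 400 box area already overflows:
w = np.float16(400.0)
area = w * w                       # 160000 -> inf (float16 overflow)
print(area)                        # inf

# inf areas make the IoU denominator inf - inf = nan, and NaN never
# exceeds the threshold, so no box would ever be suppressed:
iou = area / (area + area - area)  # inf / nan -> nan
print(iou > 0.2)                   # False
```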
