
[inductor][cpu] fastNLP_Bert, hf_BigBird, hf_Reformer, soft_actor_critic fp32 Dynamic shape CPP wrapper accuracy crashed #122292


Closed
WeizhuoZhang-intel opened this issue Mar 20, 2024 · 3 comments
Assignees
Labels
module: inductor oncall: cpu inductor CPU Inductor issues for Intel team to triage

Comments

@WeizhuoZhang-intel
Contributor

WeizhuoZhang-intel commented Mar 20, 2024

πŸ› Describe the bug

new_failures in 2024-01-31

| suite | name | thread | accuracy | perf | reason (reference only) |
| --- | --- | --- | --- | --- | --- |
| torchbench | fastNLP_Bert | multiple | X | √ | torch._dynamo.exc.TorchRuntimeError: Failed running call_method new_full(*(FakeTensor(..., size=(s0, 473), dtype=torch.int64), (4.0, 475)), **{'fill_value': 3667}) |

Versions

| name | target_branch | target_commit |
| --- | --- | --- |
| torchbench | main | ff42d907 |
| torch | main | e3cde68 |
| torchvision | main | 0.18.0a0+0be6c7e |
| torchtext | main | 0.16.0a0+b0ebddc |
| torchaudio | main | 2.2.0a0+02586da |
| torchdata | main | 0.7.1a0+0790338 |
| dynamo_benchmarks | main | nightly |

Error:

loading model: 0it [00:01, ?it/s]cpu  eval  fastNLP_Bert                       

ERROR:common:Failed running call_method new_full(*(FakeTensor(..., size=(s0, 473), dtype=torch.int64), (4.0, 475)), **{'fill_value': 3667}):
new_full(): argument 'size' (position 1) must be tuple of ints, but found element of type float at pos 0

from user code:
   File "/opt/conda/lib/python3.8/site-packages/fastNLP/embeddings/bert_embedding.py", line 458, in torch_dynamo_resume_in_forward_at_445
    word_pieces = words.new_full((batch_size, min(max_word_piece_length + 2, self._max_position_embeddings)),

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
Traceback (most recent call last):
  File "/workspace/pytorch/benchmarks/dynamo/common.py", line 2441, in check_accuracy
    new_result = optimized_model_iter_fn(model_copy, example_inputs)
  File "/workspace/pytorch/torch/_dynamo/eval_frame.py", line 452, in _fn
    return fn(*args, **kwargs)
  File "/workspace/pytorch/benchmarks/dynamo/common.py", line 2174, in run_n_iterations
    self.model_iter_fn(mod, inputs, collect_outputs=False)
  File "benchmarks/dynamo/torchbench.py", line 469, in forward_pass
    return mod(*inputs)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/fastNLP/models/bert.py", line 265, in forward
    sequence_output = self.bert(words)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/fastNLP/embeddings/bert_embedding.py", line 137, in forward
    outputs = self.model(words)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/workspace/pytorch/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/fastNLP/embeddings/bert_embedding.py", line 445, in forward
    max_word_piece_length = batch_word_pieces_length.sum(dim=-1).max().item()  # length of the word pieces (including padding)
  File "/workspace/pytorch/torch/_dynamo/eval_frame.py", line 614, in catch_errors
    return callback(frame, cache_entry, hooks, frame_state)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 748, in _convert_frame
    result = inner_convert(frame, cache_entry, hooks, frame_state)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 390, in _convert_frame_assert
    return _compile(
  File "/opt/conda/lib/python3.8/contextlib.py", line 75, in inner
    return func(*args, **kwds)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 650, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 248, in time_wrapper
    r = func(*args, **kwargs)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 531, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/workspace/pytorch/torch/_dynamo/bytecode_transformation.py", line 1033, in transform_code_object
    transformations(instructions, code_options)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 155, in _fn
    return fn(*args, **kwargs)
  File "/workspace/pytorch/torch/_dynamo/convert_frame.py", line 496, in transform
    tracer.run()
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 2125, in run
    super().run()
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 787, in run
    and self.step()
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 750, in step
    getattr(self, inst.opname)(inst)
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 469, in wrapper
    return inner_fn(self, inst)
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 1249, in CALL_FUNCTION_KW
    self.call_function(fn, args, kwargs)
  File "/workspace/pytorch/torch/_dynamo/symbolic_convert.py", line 651, in call_function
    self.push(fn.call_function(self, args, kwargs))
  File "/workspace/pytorch/torch/_dynamo/variables/misc.py", line 583, in call_function
    return self.obj.call_method(tx, self.name, args, kwargs)
  File "/workspace/pytorch/torch/_dynamo/variables/tensor.py", line 772, in call_method
    return wrap_fx_proxy(
  File "/workspace/pytorch/torch/_dynamo/variables/builder.py", line 1285, in wrap_fx_proxy
    return wrap_fx_proxy_cls(target_cls=TensorVariable, **kwargs)
  File "/workspace/pytorch/torch/_dynamo/variables/builder.py", line 1370, in wrap_fx_proxy_cls
    example_value = get_fake_value(proxy.node, tx, allow_non_graph_fake=True)
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1653, in get_fake_value
    raise TorchRuntimeError(str(e)).with_traceback(e.__traceback__) from None
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1599, in get_fake_value
    ret_val = wrap_fake_exception(
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1140, in wrap_fake_exception
    return fn()
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1600, in <lambda>
    lambda: run_node(tx.output, node, args, kwargs, nnmodule)
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1720, in run_node
    raise RuntimeError(fn_str + str(e)).with_traceback(e.__traceback__) from e
  File "/workspace/pytorch/torch/_dynamo/utils.py", line 1701, in run_node
    return getattr(args[0], node.target)(*args[1:], **kwargs)
torch._dynamo.exc.TorchRuntimeError: Failed running call_method new_full(*(FakeTensor(..., size=(s0, 473), dtype=torch.int64), (4.0, 475)), **{'fill_value': 3667}):
new_full(): argument 'size' (position 1) must be tuple of ints, but found element of type float at pos 0

from user code:
   File "/opt/conda/lib/python3.8/site-packages/fastNLP/embeddings/bert_embedding.py", line 458, in torch_dynamo_resume_in_forward_at_445
    word_pieces = words.new_full((batch_size, min(max_word_piece_length + 2, self._max_position_embeddings)),

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True

TorchDynamo optimized model failed to run because of following error
fail_to_run
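The failure pattern above can be illustrated without PyTorch: `new_full()` requires every element of its `size` argument to be an int, but under dynamic-shape tracing the `.item()` result that feeds `min(max_word_piece_length + 2, ...)` is traced as a float, so the size tuple becomes `(4.0, 475)`. The sketch below is a minimal, hypothetical mimic of that validation (the helper `check_size` is not a real PyTorch API) showing why the float is rejected and how an explicit `int()` cast would sidestep it:

```python
def check_size(size):
    """Illustrative stand-in for new_full()'s 'tuple of ints' check."""
    for pos, dim in enumerate(size):
        # Reject non-integral elements, mirroring the reported error message.
        if not isinstance(dim, int) or isinstance(dim, bool):
            raise TypeError(
                "new_full(): argument 'size' (position 1) must be tuple of "
                f"ints, but found element of type {type(dim).__name__} at pos {pos}"
            )
    return tuple(size)

max_word_piece_length = 4.0          # float leaked from a traced .item() call
try:
    check_size((max_word_piece_length, 475))  # reproduces the reported failure
except TypeError as e:
    print(e)

# Casting the traced value back to int satisfies the check:
print(check_size((int(max_word_piece_length), 475)))
```

This only demonstrates the symptom; the actual fix landed in the compiler stack (see the suspected guilty commit and the referenced PR below), not in user code.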

Repro:
inductor_single_run.sh

bash inductor_single_run.sh multiple inference accuracy torchbench fastNLP_Bert float32 first dynamic cpp

Suspected guilty commit: 4e456fd

torchbench-fastNLP_Bert-inference-float32-dynamic-cpp-multiple-accuracy-crash_guilty_commit.log.txt

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @peterbell10 @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @aakhundov @ColinPeppler @amjames @desertfire @chauhang @ezyang @msaroufim @bdhirsh @anijain2305 @zou3519 @WeizhuoZhang-intel @chuanqi129 @zxd1997066

@malfet malfet added oncall: pt2 oncall: cpu inductor CPU Inductor issues for Intel team to triage labels Mar 20, 2024
@WeizhuoZhang-intel WeizhuoZhang-intel changed the title [inductor][cpu] fastNLP_Bert fp32 Dynamic shape CPP wrapper accuracy crashed [inductor][cpu] fastNLP_Bert, hf_BigBird, hf_Reformer, soft_actor_critic fp32 Dynamic shape CPP wrapper accuracy crashed Mar 21, 2024
@chunyuan-w
Collaborator

From the error message, this may be related to the issue I'm fixing in #122297.

@desertfire
Contributor

Assigning to myself as a reminder.

@chuanqi129
Collaborator

According to the latest report, the issue has been fixed.
