❓ [Question] How do you find the exact line of python code that triggers a backend compiler error? #2356

Open
@BDHU

I was trying to compile the huggingface Llama 2 model using the following code:

import os
import torch
import torch_tensorrt
import torch.backends.cudnn as cudnn
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch._dynamo as dynamo
from optimum.onnxruntime import ORTModelForCausalLM

base_model = 'llama-2-7b'
comp_method = 'magnitude_unstructured'
comp_degree = 0.2

model_path = f'vita-group/{base_model}_{comp_method}'
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    revision=f's{comp_degree}',
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    device_map="auto")
model.save_pretrained("model_ckpt/")
model.eval()

# setting
# torch._dynamo.config.suppress_errors = True
enabled_precisions = {torch.float, torch.int, torch.long}
debug = False
workspace_size = 20 << 30
min_block_size = 7
torch_executed_ops = {}

compilation_kwargs = {
    "enabled_precisions": enabled_precisions,
    "debug": debug,
    "workspace_size": workspace_size,
    "min_block_size": min_block_size,
    "torch_executed_ops": torch_executed_ops,
}


with torch.no_grad():
    optimized_model = torch.compile(
        model.generate,
        backend="torch_tensorrt",
        dynamic=True,
        options=compilation_kwargs,
    )

    tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-2-7b-hf')
    input_ids = tokenizer('Hello! I am a VITA-compressed-LLM chatbot!', return_tensors='pt').input_ids.cuda()

    #outputs = model.generate(input_ids, max_new_tokens=128)
    outputs = optimized_model(input_ids, max_new_tokens=128)

And here is the complete log:

INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:1 supported operations detected in subgraph containing 2 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

/usr/local/lib/python3.10/dist-packages/torch/overrides.py:111: UserWarning: 'has_cuda' is deprecated, please use 'torch.backends.cuda.is_built()'
  torch.has_cuda,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:112: UserWarning: 'has_cudnn' is deprecated, please use 'torch.backends.cudnn.is_available()'
  torch.has_cudnn,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:118: UserWarning: 'has_mps' is deprecated, please use 'torch.backends.mps.is_built()'
  torch.has_mps,
/usr/local/lib/python3.10/dist-packages/torch/overrides.py:119: UserWarning: 'has_mkldnn' is deprecated, please use 'torch.backends.mkldnn.is_available()'
  torch.has_mkldnn,
Traceback (most recent call last):
  File "/workspace/workspace/scripts/vita/test.py", line 59, in <module>
    outputs = optimized_model(input_ids, max_new_tokens=128)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 333, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/external_utils.py", line 17, in inner
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1408, in generate
    self._validate_model_class()
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1415, in <resume in generate>
    new_generation_config = GenerationConfig.from_model_config(self.config)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1426, in <resume in generate>
    generation_config = copy.deepcopy(generation_config)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1429, in <resume in generate>
    self._validate_model_kwargs(model_kwargs.copy())
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1429, in <resume in generate>
    self._validate_model_kwargs(model_kwargs.copy())
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1432, in <resume in generate>
    logits_processor = logits_processor if logits_processor is not None else LogitsProcessorList()
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1433, in <resume in generate>
    stopping_criteria = stopping_criteria if stopping_criteria is not None else StoppingCriteriaList()
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1602, in <resume in generate>
    return self.greedy_search(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2404, in greedy_search
    eos_token_id_tensor = torch.tensor(eos_token_id).to(input_ids.device) if eos_token_id is not None else None
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2450, in <resume in greedy_search>
    outputs = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/eval_frame.py", line 493, in catch_errors
    return callback(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 624, in _convert_frame
    result = inner_convert(frame, cache_size, hooks, frame_state)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 132, in _fn
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 370, in _convert_frame_assert
    return _compile(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 554, in _compile
    guarded_code = compile_inner(code, one_graph, hooks, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 180, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 465, in compile_inner
    out_code = transform_code_object(code, transform)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/bytecode_transformation.py", line 1028, in transform_code_object
    transformations(instructions, code_options)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/convert_frame.py", line 432, in transform
    tracer.run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2071, in run
    super().run()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 724, in run
    and self.step()
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 688, in step
    getattr(self, inst.opname)(inst)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/symbolic_convert.py", line 2159, in RETURN_VALUE
    self.output.compile_subgraph(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 836, in compile_subgraph
    self.compile_and_call_fx_graph(tx, pass2.graph_output_vars(), root)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 936, in compile_and_call_fx_graph
    compiled_fn = self.call_user_compiler(gm)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 180, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 992, in call_user_compiler
    raise BackendCompilerFailed(self.compiler_fn, e).with_traceback(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/output_graph.py", line 988, in call_user_compiler
    compiled_fn = compiler_fn(gm, self.example_inputs())
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/repro/after_dynamo.py", line 117, in debug_wrapper
    compiled_gm = compiler_fn(gm, example_inputs)
  File "/usr/local/lib/python3.10/dist-packages/torch/__init__.py", line 1586, in __call__
    return self.compiler_fn(model_, inputs_, **self.kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 36, in torch_tensorrt_backend
    compiled_mod: torch.nn.Module = DEFAULT_BACKEND(gm, sample_inputs, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 55, in aot_torch_tensorrt_aten_backend
    return aot_module_simplified(
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3795, in aot_module_simplified
    compiled_fn = create_aot_dispatcher_function(
  File "/usr/local/lib/python3.10/dist-packages/torch/_dynamo/utils.py", line 180, in time_wrapper
    r = func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 3333, in create_aot_dispatcher_function
    compiled_fn = compiler_fn(flat_fn, fake_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2120, in aot_wrapper_dedupe
    return compiler_fn(flat_fn, leaf_flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 2300, in aot_wrapper_synthetic_base
    return compiler_fn(flat_fn, flat_args, aot_config, fw_metadata=fw_metadata)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1574, in aot_dispatch_base
    compiled_fw = compiler(fw_module, flat_args)
  File "/usr/local/lib/python3.10/dist-packages/torch/_functorch/aot_autograd.py", line 1492, in f
    out_f = compiler(fx_g, inps)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/backend/backends.py", line 80, in _pretraced_backend
    trt_compiled = compile_module(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/compile.py", line 220, in compile_module
    trt_mod = convert_module(
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/conversion.py", line 40, in convert_module
    Input.from_tensors(inputs, disable_memory_format_check=True),
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 376, in from_tensors
    return [
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 377, in <listcomp>
    cls.from_tensor(t, disable_memory_format_check=disable_memory_format_check)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
    return cls(shape=t.shape, dtype=t.dtype, format=frmt)
torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
AttributeError: 'SymInt' object has no attribute 'shape'

Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information


You can suppress this exception and fall back to eager by setting:
    import torch._dynamo
    torch._dynamo.config.suppress_errors = True
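
A traceback like the one above mixes model code with many compiler-internal frames, so it can help to pull out the last frame that belongs to the model's own package. The sketch below does that with the standard library only. The helper name `last_frame_in` and the `package_hint` parameter are hypothetical, not part of PyTorch or Torch-TensorRT, and the sample text is a shortened copy of three frames from the traceback above. Note that this frame is the call that caused Dynamo to start compiling, not necessarily the operation the backend rejected.

```python
import re

# Matches the standard CPython traceback frame header:
#   File "<path>", line <n>, in <function>
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>.+)')


def last_frame_in(traceback_text, package_hint):
    """Return (path, line, func) for the last frame whose path contains
    package_hint, or None if no frame matches.

    package_hint is a plain substring such as "/transformers/" (an assumption
    about how the package appears in the installed path).
    """
    last = None
    for match in FRAME_RE.finditer(traceback_text):
        if package_hint in match["path"]:
            last = (match["path"], int(match["line"]), match["func"].strip())
    return last


# Shortened excerpt of the traceback above, used only for illustration.
sample = '''Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2450, in <resume in greedy_search>
    outputs = self(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
    return cls(shape=t.shape, dtype=t.dtype, format=frmt)
'''

print(last_frame_in(sample, "/transformers/"))
```

For this excerpt the helper points at `transformers/generation/utils.py`, line 2450, which is the `outputs = self(` call inside `greedy_search`; everything after that frame is Dynamo, AOT Autograd, or the Torch-TensorRT backend.
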

If I add torch._dynamo.config.suppress_errors = True, it shows the following messages:

[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 772
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:30:11,458] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 614
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:04,803] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT _prepare_decoder_attention_mask /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 591
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,499] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT _make_causal_mask /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 43
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT _expand_mask /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 61
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:05,856] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

INFO:torch_tensorrt.dynamo.conversion._TRTInterpreter:TRT INetwork construction elapsed time: 0:00:00.019075
INFO:torch_tensorrt.dynamo.conversion._TRTInterpreter:Build TRT engine elapsed time: 0:00:09.090515
INFO:torch_tensorrt.dynamo.conversion._TRTInterpreter:TRT Engine uses: 360710144 bytes of Memory
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 396
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/conversion.py", line 37, in <listcomp>
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING]     output_dtypes = [output.dtype for output in module_outputs]
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] AttributeError: 'int' object has no attribute 'dtype'
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 292
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/dynamo/conversion/conversion.py", line 37, in <listcomp>
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING]     output_dtypes = [output.dtype for output in module_outputs]
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'dtype'
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:25,133] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:6 supported operations detected in subgraph containing 6 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT apply_rotary_pos_emb /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 180
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:26,538] torch._dynamo.convert_frame: [WARNING]
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:5 supported operations detected in subgraph containing 6 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

WARNING:torch_tensorrt.dynamo.compile:0 supported operations detected in subgraph containing 0 computational nodes. Skipping this subgraph, since min_block_size was detected to be 7
INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

INFO:torch_tensorrt.dynamo.utils:Using Default Torch-TRT Runtime (as requested by user)
INFO:torch_tensorrt.dynamo.utils:Compilation Settings: CompilationSettings(precision=torch.float32, debug=False, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=False, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False)

[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 202
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] due to:
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] Traceback (most recent call last):
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING]   File "/usr/local/lib/python3.10/dist-packages/torch_tensorrt/_Input.py", line 357, in from_tensor
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING]     return cls(shape=t.shape, dtype=t.dtype, format=frmt)
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt' raised:
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING] Set TORCH_LOGS="+dynamo" and TORCHDYNAMO_VERBOSE=1 for more information
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING]
[2023-10-02 01:31:27,967] torch._dynamo.convert_frame: [WARNING]
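
Each WON'T CONVERT warning in the log above names the function, file, and starting line of the frame that Dynamo gave up on, followed a few lines later by the exception the backend raised. A minimal sketch, using only the standard library, that collects those pairs into a list is shown here. The helper `summarize_skips` is hypothetical, the error pattern is deliberately limited to the AttributeError lines seen in this log, and the sample text is two entries copied from the log above with the intermediate lines dropped.

```python
import re
from collections import Counter

# Header line logged by torch._dynamo.convert_frame when a frame is skipped.
SKIP_RE = re.compile(r"WON'T CONVERT (?P<func>\S+) (?P<path>\S+) line (?P<line>\d+)")
# Underlying exception line; restricted to AttributeError for brevity.
ERROR_RE = re.compile(r"\[WARNING\] (?P<error>AttributeError: .+)")


def summarize_skips(log_text):
    """Pair each skipped frame with the first AttributeError logged after it."""
    skips = []
    current = None
    for line in log_text.splitlines():
        skip = SKIP_RE.search(line)
        if skip:
            current = (skip["func"], skip["path"], int(skip["line"]))
            continue
        error = ERROR_RE.search(line)
        if error and current is not None:
            skips.append((*current, error["error"].strip()))
            current = None
    return skips


# Two entries copied from the log above, shortened for illustration.
sample_log = """\
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT _make_causal_mask /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 43
[2023-10-02 01:31:05,658] torch._dynamo.convert_frame: [WARNING] AttributeError: 'SymInt' object has no attribute 'shape'
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] WON'T CONVERT forward /usr/local/lib/python3.10/dist-packages/transformers/models/llama/modeling_llama.py line 396
[2023-10-02 01:31:23,793] torch._dynamo.convert_frame: [WARNING] AttributeError: 'int' object has no attribute 'dtype'
"""

skips = summarize_skips(sample_log)
for func, path, line, error in skips:
    print(f"{func} ({path}:{line}) -> {error}")

# Count how often each distinct error occurs across all skipped frames.
print(Counter(error for *_, error in skips))
```

Run against the full log, a summary like this makes it easier to see that the skipped frames fall into a small number of distinct failures (here, a `.shape` access and a `.dtype` access on non-tensor values) rather than one failure per function.
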

I'd like to know which specific line of code causes this problem, and what the error message means. The error AttributeError: 'SymInt' object has no attribute 'shape' appears throughout the compilation process, and most forward functions are skipped as a result, which largely negates the performance gain from compilation. The error seems to be related to dynamic shapes, with SymInt possibly representing a symbolic variable, but I'm not sure of the specifics.
