TensorrtExecutionProvider documentation #1395

### System Info

```shell
main, docs
```


### Who can help?

@fxmarty 

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction (minimal, reproducible, runnable)

The method described in the docs for [TRT engine building](https://huggingface.co/docs/optimum/onnxruntime/usage_guides/gpu#tensorrt-engine-build-and-warmup) is outdated, as first mentioned [here](https://github.com/huggingface/optimum/issues/842#issuecomment-1568766399). I tested the dynamic shapes method in `optimum-benchmark` [here](https://github.com/huggingface/optimum-benchmark/pull/55#issuecomment-1721180586).

### Expected behavior

We can update the docs with this snippet:

```python
import torch
from optimum.onnxruntime import ORTModelForCausalLM

provider_options = {
    # Cache the built TensorRT engine on disk so it can be reused across runs.
    "trt_engine_cache_enable": True,
    "trt_engine_cache_path": "tmp/trt_cache_gpt2_example",
    # Explicit shape profile: batch size 1, sequence length between 16 and 64.
    "trt_profile_min_shapes": "input_ids:1x16,attention_mask:1x16",
    "trt_profile_max_shapes": "input_ids:1x64,attention_mask:1x64",
    "trt_profile_opt_shapes": "input_ids:1x32,attention_mask:1x32",
}

ort_model = ORTModelForCausalLM.from_pretrained(
    "gpt2",
    export=True,
    use_cache=False,
    provider="TensorrtExecutionProvider",
    provider_options=provider_options,
)

# Warm up: a 16-token prompt plus exactly 48 new tokens, for a total length of 64.
ort_model.generate(
    input_ids=torch.tensor([[1] * 16]).to("cuda"),
    max_new_tokens=64 - 16,
    min_new_tokens=64 - 16,
    pad_token_id=0,
    eos_token_id=0,
)
```

However, it's still not clear to me what effect `trt_profile_opt_shapes` has.
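
The three `trt_profile_*_shapes` strings in the snippet share the pattern `name:d1xd2,name:d1xd2`, so they are easy to get out of sync when written by hand. Below is a minimal sketch of a helper that builds them from tuples. `format_profile` and `build_trt_profile_options` are hypothetical names for illustration, not part of Optimum or ONNX Runtime, and the `min <= opt <= max` check is my assumption about what a valid TensorRT profile requires.

```python
from typing import Dict, Sequence

Shape = Sequence[int]


def format_profile(shapes: Dict[str, Shape]) -> str:
    """Render {"input_ids": (1, 16)} as "input_ids:1x16" (comma-separated per input)."""
    parts = []
    for name, shape in shapes.items():
        dims = "x".join(str(dim) for dim in shape)
        parts.append(f"{name}:{dims}")
    return ",".join(parts)


def build_trt_profile_options(
    min_shapes: Dict[str, Shape],
    opt_shapes: Dict[str, Shape],
    max_shapes: Dict[str, Shape],
) -> Dict[str, str]:
    """Hypothetical helper: build the three profile strings with a sanity check."""
    for name, low in min_shapes.items():
        opt, high = opt_shapes[name], max_shapes[name]
        # Assumption: every dimension must satisfy min <= opt <= max.
        if not all(a <= b <= c for a, b, c in zip(low, opt, high)):
            raise ValueError(f"{name}: expected min <= opt <= max, got {low}, {opt}, {high}")
    return {
        "trt_profile_min_shapes": format_profile(min_shapes),
        "trt_profile_opt_shapes": format_profile(opt_shapes),
        "trt_profile_max_shapes": format_profile(max_shapes),
    }


input_names = ("input_ids", "attention_mask")
profile_options = build_trt_profile_options(
    min_shapes={name: (1, 16) for name in input_names},
    opt_shapes={name: (1, 32) for name in input_names},
    max_shapes={name: (1, 64) for name in input_names},
)
```

The resulting dict can be merged into `provider_options` next to the engine cache settings, e.g. `{**cache_options, **profile_options}`, where `cache_options` is whatever dict holds the `trt_engine_cache_*` keys.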
