✨[Feature] Weight specific engine caching #3146

Closed
@narendasan

Is your feature request related to a problem? Please describe.

Engine caching is currently weight-agnostic: a cached engine can be reused regardless of the weight values, but at the cost of lower-performance engines.

Describe the solution you'd like

If we know that the weights will be identical, we can cache higher-performance engines. The caching system would need to distinguish between these two kinds of cached engines and select the right one based on user settings.

TensorRT has a builder flag called kREFIT_IDENTICAL for this workflow.
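
To make the idea of two distinguishable caches concrete, here is a minimal sketch of one possible keying scheme. It is not the existing cache implementation: `EngineCache`, `weights_digest`, the `weight_specific` and `allow_weight_specific` parameters, and the use of a plain string to stand in for the graph are all hypothetical names chosen for illustration. The assumption is that a weight-agnostic entry is keyed by the graph alone, while a weight-specific entry is keyed by the graph plus a digest of the weights.

```python
import hashlib
from typing import Dict, Optional, Tuple


def weights_digest(weights: Dict[str, bytes]) -> str:
    """Hash weight names and raw bytes in a stable (sorted) order."""
    h = hashlib.sha256()
    for name in sorted(weights):
        data = weights[name]
        h.update(name.encode("utf-8"))
        h.update(b"\0")  # separator between name and payload
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    return h.hexdigest()


class EngineCache:
    """Hypothetical in-memory cache holding both kinds of engines."""

    def __init__(self) -> None:
        # Key: (graph hash, weights digest or None for weight-agnostic).
        self._store: Dict[Tuple[str, Optional[str]], bytes] = {}

    @staticmethod
    def _key(graph: str, weights: Dict[str, bytes], weight_specific: bool):
        graph_hash = hashlib.sha256(graph.encode("utf-8")).hexdigest()
        return (graph_hash, weights_digest(weights) if weight_specific else None)

    def save(self, graph: str, weights: Dict[str, bytes], engine: bytes,
             weight_specific: bool) -> None:
        self._store[self._key(graph, weights, weight_specific)] = engine

    def load(self, graph: str, weights: Dict[str, bytes],
             allow_weight_specific: bool) -> Optional[Tuple[bytes, str]]:
        # Prefer the weight-specific engine when the user setting allows it
        # and the weights match exactly.
        if allow_weight_specific:
            hit = self._store.get(self._key(graph, weights, True))
            if hit is not None:
                return hit, "weight_specific"
        # Otherwise fall back to the weight-agnostic engine for this graph.
        hit = self._store.get(self._key(graph, weights, False))
        if hit is not None:
            return hit, "weight_agnostic"
        return None
```

With this layout, a lookup with changed weights misses the weight-specific entry and falls back to the weight-agnostic one, which would then presumably be refitted as it is today. Whether both entries should live in the same store or in separate ones is a design choice this sketch does not settle.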

Describe alternatives you've considered

Additional context
