
Serializing int32_t dtypes #15669

@AdrianLundell

Description


🚀 The feature, motivation and pitch

Problem statement

I am working on integrating CMSIS-NN, an external library of optimized kernels, into ExecuTorch in the Cortex-M backend. These kernels take their integer arguments, e.g. quantization parameters and dimensions, as int32_t.

However, if I define the operator registration with ints using the .yaml-api:

- func: cortex_m::quantized_conv2d.out(Tensor input,
  Tensor weight,
  Tensor? bias,
  int[] stride,
  int[] padding,
  int[] dilation,
  int input_offset,
  int output_offset,
  int[] requantize_multiplier,
  int[] requantize_shifts,
  Scalar activation_min,
  Scalar activation_max, *, Tensor(a!) out) -> Tensor(a!)

the resulting kernel call signature will end up with int64_t types. Looking into the program.fbs schema, it seems that integers in the EXIR graph are generally serialized as int64:

table Int {
  int_val: long;
}

For single ints this is not a big issue, since it is reasonably safe to cast to int32_t at runtime. For lists of integers of unknown length, on the other hand, the conversion would require either dynamic memory allocation or pre-allocating a buffer of some maximum length, both of which are suboptimal.

Proposed solution

My suggestion is to add support for explicitly using int32 in the EXIR graph and in the kernel registration.
This problem seems general enough to motivate a common solution, rather than having multiple backends implement their own workarounds.
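At the schema level, one way this could look — a sketch only; the table names and fields below are hypothetical, not an agreed design — is a parallel 32-bit value type alongside the existing Int table, using FlatBuffers' native 32-bit int scalar:

```
table Int32 {
  int_val: int;
}

table Int32List {
  items: [int];
}
```

The runtime could then hand kernels an int32_t-backed list directly, with no cast or copy at dispatch time.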

@lucylq @JacobSzwejbka @psiddh @SS-JIA

Alternatives

Using a Tensor for IntLists

Using a tensor appears to be a viable workaround for the IntLists, but it comes with some overhead compared to a plain list. Before adopting this approach, we should verify that the performance impact is negligible.

Using Scalar

According to https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/native/README.md, Scalar should only be used when all dtypes are supported, not to support one single type. There is also no ScalarList, so it would not solve the main issue.

Use a modified schema rather than extending it.

I believe that introducing parallel ways of serializing the graph risks creating more complex issues, where the runtime has to know which serialization version was used.

