feat(atenlib): add ops(layer_norm) #459
xiaowuhu merged 20 commits into microsoft:main from xiaowuhu:xiaowu/addOps(layer_norm)
Conversation
Codecov Report
@@ Coverage Diff @@
## main #459 +/- ##
==========================================
+ Coverage 72.37% 72.46% +0.08%
==========================================
Files 109 109
Lines 10580 10603 +23
Branches 1089 1094 +5
==========================================
+ Hits 7657 7683 +26
+ Misses 2616 2613 -3
Partials 307 307
| """layer_norm(Tensor input, int[] normalized_shape, Tensor? weight=None, Tensor? bias=None, float eps=1e-05, bool cudnn_enable=True) -> Tensor""" | ||
|
|
||
| raise NotImplementedError() | ||
| axes = [-i for i in range(len(normalized_shape), 0, -1)] |
So, the actual value specified in normalized_shape is irrelevant for aten_layer_norm? That sounds odd, so checking.
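Indeed, only the length of normalized_shape is used here: it selects how many trailing axes to reduce over, counted from the end of the input shape. A minimal pure-Python sketch of that line (layer_norm_axes is a hypothetical helper name, not from the PR):

```python
def layer_norm_axes(normalized_shape):
    # Only len(normalized_shape) matters: it picks the trailing axes
    # to normalize over, counted backwards from the end of the shape.
    return [-i for i in range(len(normalized_shape), 0, -1)]
```

So for an input of shape (N, C, H, W) with normalized_shape=[H, W], the reduction runs over axes [-2, -1], regardless of the concrete H and W values.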
| "convolution": core_ops.aten_convolution, | ||
| "empty_like": core_ops.aten_empty_like, | ||
| "index_select": core_ops.aten_index_select, | ||
| # "native_layer_norm": core_ops.aten_layer_norm, # FIXME: 3 outputs != 1 output |
I would duplicate the OpInfo and update its op.
skips=(),
supports_out=False,
),
opinfo_core.OpInfo(
Let's move this on top of "nn.functional.conv3d" for alphabetical order.
|
There is actually an op for LayerNorm in the ONNX opset.
@@ -3984,18 +3995,18 @@ def _aten_native_layer_norm_onnx(
) -> Tuple[TReal, TReal, TReal]:
axes should be an input now:

axes: Sequence[INT64],

And other places using ReduceMax/ReduceMean/ReduceMin should all be updated; otherwise the model would raise errors saying ReduceXXX has an unexpected input/attribute `axes`, depending on whether the opset version is 17 or 18.
Not necessarily. The interface of an aten op/function shouldn't change because of its implementation. Since an attribute can be promoted to an input, it should be okay to leave the interface as is. But it is important to ensure that the implementation works correctly, e.g., use the noop_with_empty_axes attribute as appropriate to ensure correct behavior when axes is empty, etc.
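The noop_with_empty_axes point can be illustrated with a NumPy sketch of the opset-18 ReduceMean semantics under discussion (reduce_mean_opset18 is a hypothetical helper for illustration, not part of this PR):

```python
import numpy as np

def reduce_mean_opset18(data, axes=None, keepdims=1, noop_with_empty_axes=0):
    # In opset 18, `axes` moved from a node attribute to an optional input.
    # When `axes` is empty or missing, the default is to reduce over ALL
    # axes, unless noop_with_empty_axes=1, in which case the input is
    # returned unchanged.
    if axes is None or len(axes) == 0:
        if noop_with_empty_axes:
            return data
        axes = tuple(range(data.ndim))
    return data.mean(axis=tuple(axes), keepdims=bool(keepdims))
```

This is why a function compiled against opset 17 (axes as attribute) can fail validation under opset 18 and vice versa: the same `axes` value binds to a different slot in the node.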
Are you referring to the annotation (Sequence[INT64]) or the usage (op.ReduceMean(input, axes=axes))? I think both of them need to be modified to be compatible with opset 18?
I am saying that the annotation should ideally depend on the aten specification (and not on the onnx opset). The call to onnx ops like ReduceMean, of course, will depend on the onnx opset.
Makes sense, but for an ONNX function (aten interface), I think we still rely on function_ir to differentiate attrs/inputs. I am not sure whether function_ir changed due to the opset version bump from 17 to 18 in this case?
> I am saying that the annotation should ideally depend on the aten specification (and not on the onnx opset). The call to onnx ops like ReduceMean, of course, will depend on the onnx opset.

I agree. We could change it to INT64 (instead of Sequence[INT64]) to prepare for symint inputs. The evaluator should be able to handle this (maybe it already does).
@@ -3984,18 +3995,18 @@ def _aten_native_layer_norm_onnx(
) -> Tuple[TReal, TReal, TReal]:

# FIXME(justinchuby): Use opset18 when it is supported by onnxruntime