CoreML CPU InstanceNorm3d behaves differently depending on track_running_stats #2666

@0xShug0

Description

🐛 Describe the bug

CoreML CPU InstanceNorm3d behaves differently depending on track_running_stats

Summary

I found this with opdiff while comparing PyTorch modules across backends.

For torch.nn.InstanceNorm3d(256, affine=False) on the CoreML CPU FP32 path (FP16 is also affected, but the example script only uses FP32), I see different behavior depending on track_running_stats:

  • track_running_stats=False:
    export succeeds, CoreML inference succeeds, and output matches PyTorch closely
  • track_running_stats=True:
    export begins the same way, but ct.convert(...) fails with:
    NotImplementedError: Unsupported fx node alias, kind alias

So this looks like an inconsistency triggered only by enabling running stats on the same module family / backend configuration.


Minimal Repro

Self-contained repro script:

coreml_instancenorm3d_running_stats.py

Core setup:

model = nn.InstanceNorm3d(
    256,
    affine=False,
    track_running_stats=track_running_stats,
).eval()

Input shape:

x = torch.randn(1, 256, 32, 28, 28, dtype=torch.float32)

CoreML config:

mlmodel = ct.convert(
    exported,
    inputs=[ct.TensorType(shape=x.shape)],
    convert_to="mlprogram",
    minimum_deployment_target=ct.target.iOS18,
    compute_units=ct.ComputeUnit.CPU_ONLY,
    compute_precision=ct.precision.FLOAT32,
)

Export path:

exported = torch.export.export(model, (x,))
exported = exported.run_decompositions()

Observed Behavior

track_running_stats=False

This succeeds end to end:

  • torch.export.export OK
  • run_decompositions() OK
  • ct.convert(...) OK
  • mlmodel.predict(...) OK

Output parity is good:

  • max_abs_diff = 1.430511474609375e-06
  • mean_abs_diff = 5.992006890664925e-08

track_running_stats=True

This fails at conversion:

NotImplementedError: Unsupported fx node alias, kind alias

So the “off” mode works, while the “on” mode fails for a setup that is otherwise identical down to the backend and precision settings.


Why this seems notable

This does not look like a generic InstanceNorm3d failure, because the same repro succeeds when track_running_stats=False.

That makes it seem specifically tied to the running-stats version of the module, possibly due to how buffers / aliasing are represented after export + decomposition.


Expected Behavior

I’d expect one of these:

  • both configurations convert and run, or
  • the unsupported case is rejected explicitly, with a clear error message or a documented limitation

Right now it looks surprising that toggling track_running_stats flips the CoreML CPU FP32 result from “works with good parity” to “conversion failure”.


Environment

  • PyTorch 2.10.0
  • coremltools 9.0
  • Python 3.11
  • macOS / Apple Silicon

Metadata

Assignees: none
Labels: PyTorch (traced), bug (unexpected behaviour that should be corrected)