It would be really great to get a speedup on the DirectML/ONNX pipelines from Diffusers, e.g. OnnxStableDiffusionPipeline, OnnxStableDiffusionImg2ImgPipeline, and OnnxStableDiffusionInpaintPipeline.