Skip to content

[Bug]: All Reference Preprocessors are generating errors for SD1.5 and SDXL models #2329

@Shangooriginal

Description

@Shangooriginal

Is there an existing issue for this?

  • I have searched the existing issues and checked the recent builds/commits of both this extension and the webui

What happened?

All the Reference Preprocessors are generating errors when I try to use them either with SD1.5 and SDXL models.

Steps to reproduce the problem

I upload picture for reference and adjust the settings, click generate and error message appears. See below

RuntimeError: shape '[81920, 8, 40]' is invalid for input of size 3276800

What should have happened?

It should just work and generate a new reference image output

Commit where the problem happens

webui: version: [v1.7.0-RC-4-g120a84bd]
controlnet: [a13bd2f]

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

x-formers

List of enabled extensions

Non applicable

Console logs

RuntimeError: shape '[81920, 8, 40]' is invalid for input of size 3276800

*** Error completing request
*** Arguments: ('task(2bl0opahhbql4kh)', 'Fashion model, Asian ethnicity, futuristic urban wear, metallic accents, sleek lines, white background', '', ['Easy_Bad_NegPrompt'], 30, 'DPM++ 2M SDE Karras', 1, 1, 7, 640, 512, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', '', '', [], <gradio.routes.Request object at 0x000001CA12ADA560>, 0, -1, False, -1, 0, 0, 0, False, '', 0.8, False, False, False, False, 'base', False, False, {'ad_model': 'face_yolov8n.pt', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'Euler a', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'Euler a', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, 'MultiDiffusion', False, True, 1024, 1024, 96, 96, 48, 4, 'None', 2, False, 10, 1, 1, 64, False, False, False, False, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 0.4, 0.4, 0.2, 0.2, '', '', 'Background', 0.2, -1.0, False, 3072, 192, True, True, True, False, False, 7, 100, 'Constant', 0, 'Constant', 0, 4, True, 'MEAN', 'AD', 1, UiControlNetUnit(enabled=True, module='reference_adain+attn', model='None', weight=1, image={'image': array([[[193, 193, 194],
***         [193, 193, 196],
***         [191, 192, 194],
***         ...,
***         [183, 184, 183],
***         [182, 184, 184],
***         [183, 182, 183]],
***
***        [[192, 191, 193],
***         [193, 193, 195],
***         [193, 191, 195],
***         ...,
***         [183, 183, 184],
***         [183, 183, 184],
***         [182, 182, 184]],
***
***        [[191, 190, 193],
***         [191, 192, 193],
***         [193, 193, 194],
***         ...,
***         [184, 184, 185],
***         [183, 184, 183],
***         [184, 182, 183]],
***
***        ...,
***
***        [[225, 224, 229],
***         [224, 223, 228],
***         [224, 224, 227],
***         ...,
***         [223, 225, 228],
***         [222, 224, 227],
***         [223, 223, 228]],
***
***        [[225, 225, 229],
***         [224, 224, 227],
***         [224, 224, 228],
***         ...,
***         [221, 223, 225],
***         [221, 222, 225],
***         [222, 223, 226]],
***
***        [[225, 224, 229],
***         [224, 224, 229],
***         [225, 223, 230],
***         ...,
***         [223, 223, 227],
***         [222, 224, 226],
***         [222, 222, 224]]], dtype=uint8), 'mask': array([[[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]],
***
***        [[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]],
***
***        [[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]],
***
***        ...,
***
***        [[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]],
***
***        [[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]],
***
***        [[0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0],
***         ...,
***         [0, 0, 0],
***         [0, 0, 0],
***         [0, 0, 0]]], dtype=uint8)}, resize_mode='Crop and Resize', low_vram=False, processor_res=-1, threshold_a=0.5, threshold_b=-1, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=64, threshold_a=64, threshold_b=64, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=64, threshold_a=64, threshold_b=64, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), UiControlNetUnit(enabled=False, module='none', model='None', weight=1, image=None, resize_mode='Crop and Resize', low_vram=False, processor_res=64, threshold_a=64, threshold_b=64, guidance_start=0, guidance_end=1, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), False, True, 3, 4, 0.15, 0.3, 'bicubic', 0.5, 2, True, False, False, False, 'Matrix', 'Columns', 'Mask', 'Prompt', '1,1', '0.2', False, False, False, 'Attention', [False], '0', '0', '0.4', None, '0', '0', False, None, False, '0', '0', 'inswapper_128.onnx', 'CodeFormer', 1, True, 'None', 1, 1, False, True, 1, 0, 0, False, 0.5, True, False, 'CUDA', False, 0, 'None', '', None, False, False, 0, None, [], 0, False, [], [], False, 0, 1, False, False, 0, None, [], -2, False, [], False, 0, None, None, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', [], 0, '', [], 0, '', [], True, False, False, False, 0, False, None, None, False, None, None, False, None, None, False, None, None, False, 50, [], 30, '', 4, [], 1, '', '', '', '') {}
    Traceback (most recent call last):
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\call_queue.py", line 57, in f
        res = list(func(*args, **kwargs))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\call_queue.py", line 36, in f
        res = func(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\txt2img.py", line 55, in txt2img
        processed = processing.process_images(p)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\processing.py", line 734, in process_images
        res = process_images_inner(p)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\processing.py", line 868, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 423, in process_sample
        return process.sample_before_CN_hack(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\processing.py", line 1142, in sample
        samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in sample
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_samplers_common.py", line 261, in launch_sampling
        return func()
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_samplers_kdiffusion.py", line 235, in <lambda>
        samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\sampling.py", line 626, in sample_dpmpp_2m_sde
        denoised = model(x, sigmas[i] * s_in, **extra_args)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_samplers_cfg_denoiser.py", line 169, in forward
        x_out = self.inner_model(x_in, sigma_in, cond=make_condition_dict(cond_in, image_cond_in))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 112, in forward
        eps = self.get_eps(input * c_in, self.sigma_to_t(sigma), **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\k-diffusion\k_diffusion\external.py", line 138, in get_eps
        return self.inner_model.apply_model(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_hijack_utils.py", line 17, in <lambda>
        setattr(resolved_obj, func_path[-1], lambda *args, **kwargs: self(*args, **kwargs))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_hijack_utils.py", line 28, in __call__
        return self.__orig_func(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 858, in apply_model
        x_recon = self.model(x_noisy, t, **cond)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\models\diffusion\ddpm.py", line 1335, in forward
        out = self.diffusion_model(x, t, context=cc)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 840, in forward_webui
        raise e
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 837, in forward_webui
        return forward(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 722, in forward
        outer.original_forward(
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_unet.py", line 91, in UNetModel_forward
        return original_forward(self, x, timesteps, context, *args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 797, in forward
        h = module(h, emb, context)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\openaimodel.py", line 84, in forward
        x = layer(x, context)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\attention.py", line 334, in forward
        x = block(x, context=context[i])
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\attention.py", line 269, in forward
        return checkpoint(self._forward, (x, context), self.parameters(), self.checkpoint)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 121, in checkpoint
        return CheckpointFunction.apply(func, len(inputs), *args)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\autograd\function.py", line 506, in apply
        return super().apply(*args, **kwargs)  # type: ignore[misc]
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\repositories\stable-diffusion-stability-ai\ldm\modules\diffusionmodules\util.py", line 136, in forward
        output_tensors = ctx.run_function(*ctx.input_tensors)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\hook.py", line 884, in hacked_basic_transformer_inner_forward
        self_attn1 = self.attn1(x_norm1, context=self_attention_context)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\extensions-builtin\hypertile\hypertile.py", line 307, in wrapper
        out = params.forward(x, *args[1:], **kwargs)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\modules\sd_hijack_optimizations.py", line 496, in xformers_attention_forward
        out = xformers.ops.memory_efficient_attention(q, k, v, attn_bias=None, op=get_xformers_flash_attention_op(q, k, v))
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 192, in memory_efficient_attention
        return _memory_efficient_attention(
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 290, in _memory_efficient_attention
        return _memory_efficient_attention_forward(
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\__init__.py", line 310, in _memory_efficient_attention_forward
        out, *_ = op.apply(inp, needs_gradient=False)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\flash.py", line 235, in apply
        ) = _convert_input_format(inp)
      File "C:\Users\xxxxx\Documents\WebUI-Stable-Diffusion\stable-diffusion-webui\venv\lib\site-packages\xformers\ops\fmha\flash.py", line 177, in _convert_input_format
        key=key.reshape([batch * seqlen_kv, num_heads, head_dim_q]),
    RuntimeError: shape '[81920, 8, 40]' is invalid for input of size 3276800

Additional information

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    Missing informationSome key information is missed in the bug reportNot reproducableThe issue cannot be reproduced

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions