Hello, in https://github.com/HiDream-ai/HiDream-I1/blob/main/hi_diffusers/models/attention_processor.py#L71, I noticed that the attention mask is applied by multiplying the mask with the key tensor.
Could you explain why it is done this way instead of filling the masked positions with -inf before the softmax, as PyTorch's attention does?
Thanks.
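To make the question concrete, here is a minimal numpy sketch of the two masking styles as I understand them (this is an illustration of the general difference, not a reproduction of the repo's exact code): multiplying the mask into the keys zeroes the masked key's score, but softmax(0) is still positive, whereas additive -inf masking drives the masked weight to exactly zero.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
q = rng.normal(size=4)          # one query vector
K = rng.normal(size=(3, 4))     # three keys; suppose key 2 is padding
mask = np.array([1.0, 1.0, 0.0])

# Multiplicative masking (mask * key, roughly as in the linked line):
# the masked key becomes the zero vector, so its score is 0, but
# softmax still assigns it a nonzero attention weight.
w_mul = softmax(q @ (K * mask[:, None]).T)

# Additive masking (PyTorch-style): masked scores are set to -inf,
# so exp(-inf) = 0 and the masked key gets exactly zero weight.
scores = q @ K.T
w_inf = softmax(np.where(mask > 0, scores, -np.inf))

print(w_mul)  # masked position still carries some probability mass
print(w_inf)  # masked position has exactly zero weight
```

So the two approaches are not equivalent: with `mask * key`, a masked position still receives attention proportional to softmax of a zero score, which is why I am asking about the design choice.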