Hi, I was wondering how we could support DistriFuser for the Flux models by Black Forest Labs.
They use FluxTransformerBlock and FluxSingleTransformerBlock as their attention blocks. Can these be supported with patch parallelism (pp)?
With 2 GPUs, you split latents of batch size 2 so that each GPU gets batch size 1. But for Flux, the latents already have shape (1, 4096, 64), so there is no batch dimension left to split :(((
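
To make the question concrete, here is a rough sketch of what I have in mind: since the batch dimension is already 1, split along the token dimension (4096) instead. This is only an illustration under my own assumptions. `split_token_range` is a hypothetical helper, not something from DistriFuser or diffusers, and it only computes which slice each GPU would own. It does not handle the cross-GPU attention communication that real patch parallelism would need.

```python
def split_token_range(num_tokens, world_size, rank):
    """Return the [start, end) slice of the token dimension owned by `rank`.

    Hypothetical helper for illustration only; assumes the token count
    divides evenly across GPUs.
    """
    if num_tokens % world_size != 0:
        raise ValueError("num_tokens must be divisible by world_size")
    per_rank = num_tokens // world_size
    start = rank * per_rank
    return start, start + per_rank


# Latent shape reported above: (batch, tokens, channels).
batch, tokens, channels = 1, 4096, 64
world_size = 2  # number of GPUs

local_shapes = []
for rank in range(world_size):
    start, end = split_token_range(tokens, world_size, rank)
    # With a real tensor this would be latents[:, start:end, :].
    local_shapes.append((batch, end - start, channels))

print(local_shapes)
```

With 2 GPUs each rank would hold a (1, 2048, 64) slice. Whether the two Flux block types can work on such slices is exactly what I am unsure about.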