This proposes adding an `IWAE_DReG` loss, similar to the `ELBO` class, that implements the [doubly reparameterized gradient estimator](https://arxiv.org/abs/1810.04152). @iffsid points out in https://github.com/pytorch/pytorch/issues/25783 that this would be much easier if `Distribution` objects had a `.detach()` method.
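To make the role of `.detach()` concrete, here is a minimal sketch of the DReG surrogate in plain PyTorch for a toy Gaussian model. It is not the proposed `IWAE_DReG` class. The helper names `detach_normal` and `dreg_surrogates` are hypothetical, and `detach_normal` only stands in for the missing `Distribution.detach()` by rebuilding a `Normal` from detached parameters. The toy model (standard normal prior, unit-variance likelihood) is an assumption chosen for illustration.

```python
import torch
from torch.distributions import Normal


def detach_normal(q):
    # Hypothetical stand-in for the proposed `Distribution.detach()`:
    # rebuild the distribution from parameters cut out of the autograd graph.
    return Normal(q.loc.detach(), q.scale.detach())


def dreg_surrogates(log_w):
    # log_w has shape (K, ...): log importance weights for K particles.
    # The normalized weights are treated as constants (stop-gradient).
    w_norm = torch.softmax(log_w, dim=0).detach()
    # Differentiate this one w.r.t. model parameters (IWAE gradient).
    model_surrogate = (w_norm * log_w).sum(0)
    # Differentiate this one w.r.t. guide parameters (DReG gradient).
    guide_surrogate = (w_norm.pow(2) * log_w).sum(0)
    return model_surrogate, guide_surrogate


torch.manual_seed(0)
K = 8
x = torch.tensor(1.5)

# Guide q(z | x) = Normal(loc, exp(log_scale)) with learnable parameters.
loc = torch.tensor(0.0, requires_grad=True)
log_scale = torch.tensor(0.0, requires_grad=True)
q = Normal(loc, log_scale.exp())

# Reparameterized samples: gradients reach loc and log_scale through z.
z = q.rsample((K,))

# Toy model (assumed): p(z) = Normal(0, 1), p(x | z) = Normal(z, 1).
log_p = Normal(0.0, 1.0).log_prob(z) + Normal(z, 1.0).log_prob(x)

# Score q with detached parameters, so the only path to the guide
# parameters is through the sample z.
log_q = detach_normal(q).log_prob(z)
log_w = log_p - log_q

model_surrogate, guide_surrogate = dreg_surrogates(log_w)
grad_loc, grad_log_scale = torch.autograd.grad(guide_surrogate, [loc, log_scale])
```

Rebuilding the distribution by hand works for `Normal`, but it has to be repeated for every distribution type and every parameterization, which is the inconvenience a generic `.detach()` method would remove.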