Fully vectorized sample posterior predictive #3603
Conversation
It seems I hadn't run the whole test suite locally and there is a single failing test. I was a bit puzzled at first because it tested mixture distributions, so apparently no observed multivariate distributions, but the failing model looks like:

```python
with pm.Model() as model:
    pi = pm.Dirichlet('pi', np.ones(K))
    comp_dist = []
    mu = []
    packed_chol = []
    chol = []
    for i in range(K):
        mu.append(pm.Normal('mu%i' % i, 0, 10, shape=D))
        packed_chol.append(
            pm.LKJCholeskyCov('chol_cov_%i' % i,
                              eta=2,
                              n=D,
                              sd_dist=pm.HalfNormal.dist(2.5))
        )
        chol.append(pm.expand_packed_triangular(D, packed_chol[i],
                                                lower=True))
        comp_dist.append(pm.MvNormal.dist(mu=mu[i], chol=chol[i]))
    pm.Mixture('x_obs', pi, comp_dist, observed=X)
```

As you see, the model has an observed mixture of multivariate normals.
```diff
@@ -9,6 +9,7 @@
 - Added `Matern12` covariance function for Gaussian processes. This is the Matern kernel with nu=1/2.
 - Progressbar reports number of divergences in real time, when available [#3547](https://github.com/pymc-devs/pymc3/pull/3547).
 - Sampling from variational approximation now allows for alternative trace backends [#3550].
+- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from operation vectorization. This will implementation is likely to run into problems with multivariate distributions for now.
```
Suggested change:

```diff
-- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from operation vectorization. This will implementation is likely to run into problems with multivariate distributions for now.
+- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from vectorizing. This implementation will likely run into problems with multivariate distributions.
```
Made minor corrections to @junpenglao's edit. Should we add a phrase to explain why there are likely to be problems with multivariate distributions? Is this a shape issue, a problem with the way these distributions' `random` methods work, or both?
This is really careful and useful code! Thanks for the extensive tests, they make me trust this a lot more -- I did my best to understand what was going on, and it looks good to me (also thanks for the comments!)
```python
]
tail = []
offset = len(size) + len(extra_self_shape)
for i, (cs, ss) in enumerate(
```
The comments in the code here are really good -- could I ask you to replace `cs` and `ss` with short names, like `core_shape` and `self_shape`?
```python
# shape == (extended_size + size + core_output_shape)
output = np.reshape(output, extended_size + size + core_draw_shape)
# Now we move the axis to the final destination and reshape
output = np.moveaxis(
```
Have never seen `np.moveaxis`, cool!
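For reference, a minimal standalone illustration of what `np.moveaxis` does (plain NumPy, not the PR's code — the shapes are made up for the example):

```python
import numpy as np

# Draws come out with the sample axis first; moveaxis relocates it
# to the end without copying the data (it returns a view).
a = np.zeros((100, 4, 3))    # (draws, rows, cols)
b = np.moveaxis(a, 0, -1)    # move axis 0 to the last position
print(b.shape)               # (4, 3, 100)
```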
```python
self.rand,
size=None,
not_broadcast_kwargs=not_broadcast_kwargs,
# We don't know anything about what self.rand returns so we
```
We are capturing all exceptions here when trying to generate a test sample. Should we:
- Log the exception? and/or
- Signal a warning about the exception?
Finally, should we simply give up at this point? As I understand things, we are generating this single sample as a way to try to determine the shape of the RV in question. I believe that this is already a fall-back position. If we have not been able to identify the shape here, should we give up and raise an exception, instead of trying to continue? Couldn't this cause us to compute "garbage out" from "garbage in"? Or cause us to raise a cryptic exception later that the user will have a harder time diagnosing?
```python
# means that we must trust that self.shape is correct
single_draw_shape = tuple()
if single_draw_shape[:len(size)] == size and point is not None:
    # The test shape has the size prepend and the supplied
```
"prepend" should be "prepended."
```python
if single_draw_shape[:len(size)] == size and point is not None:
    # The test shape has the size prepend and the supplied
    # point is not empty. This can happen when sampling from
    # the posterior predictive so we must take special care
```
Will you please say why we must take special care and/or what is the thing that we need to take special care about?
Thanks!
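To make the "size prepended" check being discussed concrete, here is a standalone NumPy sketch of the shape convention (the example shapes are invented; only the prefix-comparison idea comes from the diff):

```python
import numpy as np

size = (2, 5)        # requested number of draws
core_shape = (3,)    # the distribution's own event shape
draw = np.random.standard_normal(size + core_shape)

# Convention in the diff above: the requested size is prepended to
# the draw's shape, so comparing the shape's prefix against ``size``
# distinguishes the sampling dimensions from the core shape.
assert draw.shape[:len(size)] == size
assert draw.shape[len(size):] == core_shape
```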
```python
if keep_size:
    for k, ary in ppc_trace.items():
        ppc_trace[k] = ary.reshape((nchain, len_trace, *ary.shape[1:]))
# Now we extract all the data from the trace and build a single dictionary,
```
In a mod on `master` I revised this so we don't go through the intermediate step of building a Python list/array, and jump directly into populating the numpy array of samples. Could we pull that over here? I believe there are memory issues in building these temporary Python data structures.

My alternative attempt introduced an alternative name for the faster ppc, which lets me test the vectorized implementation and the legacy implementation, which I believe we need to keep because it handles multivariate distributions better. Should we pull this over here?
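The list-vs-preallocation point can be sketched like this (the function and parameter names are hypothetical; the actual code on `master` may differ):

```python
import numpy as np

def collect_draws_list(draw_fn, ndraws):
    # Intermediate Python list: every draw is alive twice in memory
    # at the end (once in the list, once in the stacked array).
    return np.stack([draw_fn() for _ in range(ndraws)])

def collect_draws_prealloc(draw_fn, ndraws, core_shape, dtype=np.float64):
    # Preallocate once and fill in place: no temporary list of arrays.
    out = np.empty((ndraws,) + core_shape, dtype=dtype)
    for i in range(ndraws):
        out[i] = draw_fn()
    return out
```

Both return an array of shape `(ndraws,) + core_shape`; the second avoids holding all draws in a Python list alongside the final array, which is the memory concern mentioned above.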
@lucianopaz Is this one still active?
@twiecki, no. It was just an alternative implementation of vectorised posterior predictive sampling, but given the status of our multivariate distributions and mixture distributions, it would require fixing a bunch of other stuff elsewhere in the code base first. We can safely close this PR, given that Robert merged his vectorised posterior predictive sampling in #3597 already.
Thanks!
I opened this PR to show a working fully vectorized implementation of `sample_posterior_predictive`. It is based on the proposal I did on this gist:

- It does not use the `draw_values`, `_draw_value` and `generate_samples` infrastructure as is.
- It uses `draw_values` and passes the entire content held in the passed `trace` as a `point` dictionary.
- It uses the `size` aware broadcasting rules introduced in Improve shape handling in generate_samples #3456. It makes a view of each variable's sampled chain held in the trace so that its prepended shape matches the number of posterior predictive draws desired.
- It changes `DensityDist.random` to make it work properly when the distribution is observed.

This proposal is intended only as a wishful implementation that can serve as a comparison to #3597, because it is likely that models that have observed `MultiVariate` distributions will raise various exceptions. However, none of the tests present at the moment of writing this PR failed (at least locally), so I'll later try to make a model with observed `MultiVariate`s that fails to draw samples from the posterior predictive.

As such, the notes added to RELEASE-NOTES.md are kind of a mock. If we discuss and agree that this approach is worth it, I can write a better note and also add the proper deprecations to `sample_posterior_predictive`'s arguments.
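The "view of each variable's sampled chain" idea can be illustrated with plain NumPy broadcasting (a sketch under assumed names, not the PR's implementation):

```python
import numpy as np

def tile_chain_to_draws(chain, ndraws):
    """Return a read-only broadcast view of ``chain`` with the number
    of posterior predictive draws prepended to its shape (no copy)."""
    return np.broadcast_to(chain, (ndraws,) + chain.shape)

# 3 posterior samples of a 2-vector, viewed as 5 predictive draws each
posterior = np.arange(6.0).reshape(3, 2)
view = tile_chain_to_draws(posterior, 5)
print(view.shape)  # (5, 3, 2)
```

Because `np.broadcast_to` returns a view, the prepended draw dimension costs no extra memory, which is what makes prepending the draw count to the trace shapes cheap.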