Fully vectorized sample posterior predictive #3603
Conversation
It seems I hadn't run the whole test suite locally and there is a single failing test. I was a bit puzzled at first because it tested mixture distributions, so apparently no observed multivariate distributions, but the failing model looks like:

```python
with pm.Model() as model:
    pi = pm.Dirichlet('pi', np.ones(K))
    comp_dist = []
    mu = []
    packed_chol = []
    chol = []
    for i in range(K):
        mu.append(pm.Normal('mu%i' % i, 0, 10, shape=D))
        packed_chol.append(
            pm.LKJCholeskyCov('chol_cov_%i' % i,
                              eta=2,
                              n=D,
                              sd_dist=pm.HalfNormal.dist(2.5))
        )
        chol.append(pm.expand_packed_triangular(D, packed_chol[i],
                                                lower=True))
        comp_dist.append(pm.MvNormal.dist(mu=mu[i], chol=chol[i]))
    pm.Mixture('x_obs', pi, comp_dist, observed=X)
```

As you see, the model has an observed mixture of multivariate normals.
```diff
@@ -9,6 +9,7 @@
 - Added `Matern12` covariance function for Gaussian processes. This is the Matern kernel with nu=1/2.
 - Progressbar reports number of divergences in real time, when available [#3547](https://github.com/pymc-devs/pymc3/pull/3547).
 - Sampling from variational approximation now allows for alternative trace backends [#3550].
+- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from operation vectorization. This will implementation is likely to run into problems with multivariate distributions for now.
```
Suggested change:

```diff
-- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from operation vectorization. This will implementation is likely to run into problems with multivariate distributions for now.
+- Included `fast_sample_posterior_predictive`, which draws samples from the posterior predictive distribution taking the maximum advantage possible from vectorizing. This implementation will likely run into problems with multivariate distributions.
```
Made minor corrections to @junpenglao's edit. Should we add a phrase to explain why there are likely to be problems with multivariate distributions? Is this a shape issue, a problem with the way these distributions' `random` methods work, or both?
This is really careful and useful code! Thanks for the extensive tests, they make me trust this a lot more -- I did my best to understand what was going on, and it looks good to me (also thanks for the comments!)
```python
]
tail = []
offset = len(size) + len(extra_self_shape)
for i, (cs, ss) in enumerate(
```
The comments in the code here are really good -- could I ask you to replace `cs` and `ss` with short names, like `core_shape` and `self_shape`?
```python
# shape == (extended_size + size + core_output_shape)
output = np.reshape(output, extended_size + size + core_draw_shape)
# Now we move the axis to the final destination and reshape
output = np.moveaxis(
```
Have never seen `np.moveaxis`, cool!
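For reference, a minimal standalone illustration of what `np.moveaxis` does (plain NumPy, not the PR's code — the shapes are made up for the example):

```python
import numpy as np

# Draws come out with the sample axis first; moveaxis relocates it
# to the end without copying the data (it returns a view).
a = np.zeros((100, 4, 3))    # (draws, rows, cols)
b = np.moveaxis(a, 0, -1)    # move axis 0 to the last position
print(b.shape)               # (4, 3, 100)
```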
```python
self.rand,
size=None,
not_broadcast_kwargs=not_broadcast_kwargs,
# We don't know anything about what self.rand returns so we
```
We are capturing all exceptions here when trying to generate a test sample. Should we:
- Log the exception? and/or
- Signal a warning about the exception?
Finally, should we simply give up at this point? As I understand things, we are generating this single sample as a way to try to determine the shape of the RV in question. I believe that this is already a fall-back position. If we have not been able to identify the shape here, should we give up and raise an exception, instead of trying to continue? Couldn't this cause us to compute "garbage out" from "garbage in"? Or cause us to raise a cryptic exception later that the user will have a harder time diagnosing?
```python
# means that we must trust that self.shape is correct
single_draw_shape = tuple()
if single_draw_shape[:len(size)] == size and point is not None:
    # The test shape has the size prepend and the supplied
```
"prepend" should be "prepended."
```python
if single_draw_shape[:len(size)] == size and point is not None:
    # The test shape has the size prepend and the supplied
    # point is not empty. This can happen when sampling from
    # the posterior predictive so we must take special care
```
Will you please say why we must take special care and/or what is the thing that we need to take special care about?
Thanks!
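To make the "size prepended" check being discussed concrete, here is a standalone NumPy sketch of the shape convention (the example shapes are invented; only the prefix-comparison idea comes from the diff):

```python
import numpy as np

size = (2, 5)        # requested number of draws
core_shape = (3,)    # the distribution's own event shape
draw = np.random.standard_normal(size + core_shape)

# Convention in the diff above: the requested size is prepended to
# the draw's shape, so comparing the shape's prefix against ``size``
# distinguishes the sampling dimensions from the core shape.
assert draw.shape[:len(size)] == size
assert draw.shape[len(size):] == core_shape
```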
```python
if keep_size:
    for k, ary in ppc_trace.items():
        ppc_trace[k] = ary.reshape((nchain, len_trace, *ary.shape[1:]))
# Now we extract all the data from the trace and build a single dictionary,
```
In a mod on `master` I revised this so we don't go through the intermediate step of building a Python list/array, and jump directly into populating the numpy array of samples. Could we pull that over here? I believe there are memory issues in building these temporary Python data structures.

My alternative attempt introduced an alternative name for the faster ppc, which lets me test the vectorized implementation and the legacy implementation, which I believe we need to keep because it handles multivariate distributions better. Should we pull this over here?
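The list-vs-preallocation point can be sketched like this (the function and parameter names are hypothetical; the actual code on `master` may differ):

```python
import numpy as np

def collect_draws_list(draw_fn, ndraws):
    # Intermediate Python list: every draw is alive twice in memory
    # at the end (once in the list, once in the stacked array).
    return np.stack([draw_fn() for _ in range(ndraws)])

def collect_draws_prealloc(draw_fn, ndraws, core_shape, dtype=np.float64):
    # Preallocate once and fill in place: no temporary list of arrays.
    out = np.empty((ndraws,) + core_shape, dtype=dtype)
    for i in range(ndraws):
        out[i] = draw_fn()
    return out
```

Both return an array of shape `(ndraws,) + core_shape`; the second avoids holding all draws in a Python list alongside the final array, which is the memory concern mentioned above.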
@lucianopaz Is this one still active?
@twiecki, no. It was just an alternative implementation of vectorised posterior predictive sampling, but given the status of our multivariate distributions and mixture distributions, it would require fixing a bunch of other stuff elsewhere in the code base first. We can safely close this PR, given that Robert merged his vectorised posterior predictive sampling in #3597 already.
Thanks!
I opened this PR to show a working fully vectorized implementation of `sample_posterior_predictive`. It is based on the proposal I did on this gist:

- It does not use the `draw_values`, `_draw_value` and `generate_samples` infrastructure as is.
- It uses `draw_values` and passes the entire content held in the passed `trace` as a `point` dictionary.
- It uses the `size` aware broadcasting rules introduced in Improve shape handling in generate_samples #3456. It makes a view of each variable's sampled chain held in the trace so that its prepended shape matches the number of posterior predictive draws desired.
- It changes `DensityDist.random` to make it work properly when the distribution is observed.

This proposal is intended only as a wishful implementation that can serve as a comparison to #3597, because it is likely that models that have observed `MultiVariate` distributions will raise various exceptions. However, none of the tests present at the moment of writing this PR failed (at least locally), so I'll later try to make a model with observed `MultiVariate`s that fails to draw samples from the posterior predictive.

As such, the notes added to RELEASE-NOTES.md are kind of a mock. If we discuss and agree that this approach is worth it, I can write a better note and also add the proper deprecations to `sample_posterior_predictive`'s arguments.
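The "view of each variable's sampled chain" idea can be illustrated with plain NumPy broadcasting (a sketch under assumed names, not the PR's implementation):

```python
import numpy as np

def tile_chain_to_draws(chain, ndraws):
    """Return a read-only broadcast view of ``chain`` with the number
    of posterior predictive draws prepended to its shape (no copy)."""
    return np.broadcast_to(chain, (ndraws,) + chain.shape)

# 3 posterior samples of a 2-vector, viewed as 5 predictive draws each
posterior = np.arange(6.0).reshape(3, 2)
view = tile_chain_to_draws(posterior, 5)
print(view.shape)  # (5, 3, 2)
```

Because `np.broadcast_to` returns a view, the prepended draw dimension costs no extra memory, which is what makes prepending the draw count to the trace shapes cheap.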