Invalid logic in sample prior predictive for transformed variables #4484

Closed
ricardoV94 opened this issue Feb 23, 2021 · 3 comments · Fixed by #4983

@ricardoV94
Member

ricardoV94 commented Feb 23, 2021

Prior predictive sampling draws values from the untransformed variables and then attempts to recreate the transformed values via transformation.forward_val(untransformed_values).

However, this logic is not sufficient to recover the correct values when there are stochastic bounds (e.g., pm.Uniform('x', lower=0, upper=y)), because we also need to know what the values of the stochastic bounds were in each sample (i.e., the y).
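To make the point above concrete, here is a minimal NumPy-only sketch. It is not the PyMC3 code path: interval_forward is a hypothetical stand-in for the interval transform's forward step, and y_redrawn is an assumed way of mimicking bounds that are sampled again instead of being taken from the draw that produced x.

```python
import numpy as np


def interval_forward(x, a, b):
    """Hypothetical stand-in for the interval transform: log(x - a) - log(b - x)."""
    # Silence the warnings NumPy emits for log of zero or negative inputs;
    # the NaNs themselves are what we want to inspect.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.log(x - a) - np.log(b - x)


rng = np.random.default_rng(1)
n = 10_000

y = rng.uniform(0.0, 1.0, size=n)          # stochastic upper bound, one per draw
x = rng.uniform(0.0, y)                    # x ~ Uniform(0, y), elementwise
y_redrawn = rng.uniform(0.0, 1.0, size=n)  # bounds sampled again, unrelated to x

# Transform with the bounds that actually generated each x.
matched = interval_forward(x, 0.0, y)
# Transform with bounds that do not belong to the same draw.
mismatched = interval_forward(x, 0.0, y_redrawn)

print("NaN fraction, matched bounds:   ", np.mean(np.isnan(matched)))
print("NaN fraction, mismatched bounds:", np.mean(np.isnan(mismatched)))
```

With matched bounds every x lies inside its interval, so no NaNs appear. With mismatched bounds some x exceed the bound they are transformed against, which yields NaNs, and the remaining finite values still differ from the matched ones.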

People don't usually care about transformed variables, but this is critical, for example, in sample_smc(), as it creates its particles from an initial prior predictive call.

Hopefully this will be gone in V4.0, but I thought it best to document it just in case (and in the meantime).

Minimal reproducible example:

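import numpy as np
import pymc3 as pm

np.random.seed(1)
with pm.Model() as m:
    y = pm.Uniform('y', 0, 1)
    x = pm.Uniform('x', 0, y)  # upper bound is itself a random variable
    prior = pm.sample_prior_predictive()

# Fraction of NaNs among the transformed values of x
print(np.mean(np.isnan(prior['x_interval__'])))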
/home/ricardo/Documents/Projects/pymc3/pymc3/distributions/transforms.py:294: RuntimeWarning: invalid value encountered in log
  return floatX(np.log(x - a) - np.log(b - x))

0.314

The 31.4% NaNs correspond to cases where x fell outside the interval (a, b), because a and b are randomly sampled in the block linked below instead of being taken from the draw that produced x. The NaNs are the most obvious symptom, but all values are in fact incorrect:

https://github.com/pymc-devs/pymc3/blob/32b5c941c8f43b485afd3da1ae5544c3c50b9e77/pymc3/distributions/transforms.py#L289-L294

This is called at the end of sample_prior_predictive, without providing a point or any other contextual information:

https://github.com/pymc-devs/pymc3/blob/37ca5ea9b25aa00ed4d460f0fb417712d125d248/pymc3/sampling.py#L1941-L1957

@ricardoV94 ricardoV94 added the bug label Feb 23, 2021
@ricardoV94 ricardoV94 changed the title invalid logic in sample prior predictive for auto-transformed variables Invalid logic in sample prior predictive for transformed variables Feb 23, 2021
@brandonwillard
Contributor

Let's try this out on the v4 branch and fix any issues from there.

@ricardoV94
Member Author

Possibly behind this commonly failing test: #4346?

@ricardoV94
Member Author

A similar issue might arise when using pymc3.util.update_start_vals():

https://github.com/pymc-devs/pymc3/blob/ea1b03811f7d58a38c09a56a61144294f3732d71/pymc3/util.py#L182-L197

In that function, the transformation is conditioned on the old values of b, which might have been changed in a.
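A toy sketch of that concern, using plain dictionaries rather than PyMC3 objects. It assumes a and b refer to the two dict arguments of the function; defaults, user_start and interval_forward are hypothetical names chosen for illustration, and the numbers are made up.

```python
import math


def interval_forward(x, lower, upper):
    """Hypothetical stand-in for the interval transform's forward step."""
    return math.log(x - lower) - math.log(upper - x)


# Plays the role of b: default start values, with x bounded above by y.
defaults = {"y": 0.5, "x": 0.25}
# Plays the role of a: user-supplied start values that override both.
user_start = {"y": 0.9, "x": 0.45}

# Conditioning on the stale bound from the defaults (y = 0.5).
stale = interval_forward(user_start["x"], 0.0, defaults["y"])

# Conditioning on the merged point, where the user's y = 0.9 wins.
merged = {**defaults, **user_start}
fresh = interval_forward(user_start["x"], 0.0, merged["y"])

print("transformed x using stale y:  ", stale)
print("transformed x using updated y:", fresh)
```

The two results differ even though no NaN is produced, which suggests this kind of mismatch could go unnoticed in practice.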
