Idea: Add Measure-Valued Pólya Urn (MVPU) Distribution for dependent sequences #8191
gl-alexander started this conversation in Ideas
Replies: 0 comments
The Problem
Currently, PyMC supports discrete Bayesian nonparametrics primarily through stick-breaking representations of the Dirichlet Process. This enforces only localized reinforcement (observing category $j$ only increases the probability of $j$). When modeling spatial, temporal, or semantic sequences, we often need non-local reinforcement, where observing category $j$ increases the probability of neighboring or correlated categories via a transition kernel.
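To make the contrast concrete, here is a minimal sketch in plain Python. The function `predictive` and the example kernel `neighbor` are hypothetical illustrations, not PyMC API. With an identity reinforcement matrix the update is the classical Pólya urn, while a banded kernel shares part of the added mass with adjacent categories.

```python
def predictive(mu0, R, observed):
    """Normalized predictive weights after reinforcing with rows of R.

    mu0: initial weight per category (hypothetical name).
    R: K x K list of lists; row j is the mass added when j is observed.
    observed: list of category indices seen so far.
    """
    weights = list(mu0)
    for j in observed:
        weights = [w + r for w, r in zip(weights, R[j])]
    total = sum(weights)
    return [w / total for w in weights]


mu0 = [1.0, 1.0, 1.0, 1.0]

# Classical Polya urn: observing j adds one unit of mass to j only.
identity = [[1.0 if i == j else 0.0 for i in range(4)] for j in range(4)]

# Illustrative non-local kernel: half the mass stays on j, the rest
# goes to adjacent categories (each row sums to 1).
neighbor = [
    [0.50, 0.50, 0.00, 0.00],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.00, 0.00, 0.50, 0.50],
]

local_probs = predictive(mu0, identity, observed=[1])
nonlocal_probs = predictive(mu0, neighbor, observed=[1])
```

After observing category 1 once, the classical urn gives `[0.2, 0.4, 0.2, 0.2]`, whereas the neighbor kernel gives `[0.25, 0.3, 0.25, 0.2]`: categories 0 and 2 receive part of the reinforcement instead of losing probability to category 1.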
The Proposal
I propose adding a `MeasureValuedPolyaUrn` sequence distribution to `pymc.distributions.discrete`. It generalizes the standard Pólya urn by replacing the unit-mass reinforcement with a stochastic transition kernel.
Mathematically, the predictive probability at step $n$ for $K$ discrete categories is determined by the initial measure $\mu_0$ and the $K \times K$ reinforcement matrix $R$.
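For reference, one common way to write an MVPU predictive that is consistent with these definitions is the following. This is an assumption about the intended form, with $X_i$ denoting the category observed at step $i$; the prototype's indexing or normalization may differ.

$$
P(X_{n+1} = j \mid X_1, \dots, X_n) = \frac{\mu_0(j) + \sum_{i=1}^{n} R_{X_i, j}}{\sum_{k=1}^{K} \left( \mu_0(k) + \sum_{i=1}^{n} R_{X_i, k} \right)}
$$

Under that assumed rule, the log-probability of a whole sequence can be evaluated with a cumulative sum rather than an explicit loop. The sketch below uses NumPy as a stand-in for PyTensor, and the function name `mvpu_logp` is hypothetical; it is not the prototype's code.

```python
import numpy as np


def mvpu_logp(x, mu0, R):
    """Log-probability of a category sequence under the assumed MVPU rule.

    x: integer sequence of length n, categories in 0..K-1.
    mu0: positive array of shape (K,), initial measure.
    R: non-negative array of shape (K, K), reinforcement matrix.
    """
    x = np.asarray(x, dtype=int)
    mu0 = np.asarray(mu0, dtype=float)
    R = np.asarray(R, dtype=float)

    # Row R[x[i]] is the mass added to the urn after observing x[i].
    added = R[x]                                # shape (n, K)
    # Mass accumulated strictly before each step (exclusive cumsum).
    before = np.cumsum(added, axis=0) - added   # shape (n, K)
    weights = mu0 + before                      # urn composition per step
    # Probability of the category actually observed at each step.
    probs = weights[np.arange(len(x)), x] / weights.sum(axis=1)
    return np.log(probs).sum()
```

With `R` set to the identity matrix this reduces to the classical Pólya urn, which gives a simple sanity check: for `mu0 = [1, 1]` and the sequence `[0, 0, 1]` the step probabilities are 1/2, 2/3, and 1/4.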
Current Prototype Status
To avoid `pytensor.scan` compilation bottlenecks, I've prototyped the recursive sequence by vectorizing it using `pt.cumsum` and indexing. NUTS can sample the continuous hyperparameters over the marginalized discrete sequence without divergences, provided the scale is anchored to ensure identifiability (e.g., using `pm.Dirichlet` for the initial measure and kernel proportions).
PyTensor `logp` logic:
Questions for the Core Team
Before I draft an official Feature Request Issue and PR with the `RandomVariable` Op, the `check_pymc_draws_match_reference` tests, and the Sphinx docstrings: should this live in `pymc.distributions.discrete`, or is a distribution of this nature better suited for `pymc-experimental` first?