add utility method to plot the pdf of a variable #5032

drbenvincent · 2021-09-28T16:58:23Z

In discussions in PyMC Labs, a client asked if there was a simple utility function to plot the pdf of a PyMC variable.

As a quick hack I came up with the following:

Continuous

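import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm  # PyMC3-era API: .random() and .logp() are distribution methods

def plot_cont(self):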
    """Plot pdf of continuous dist, with semi clever setting of range"""
    samples = self.random(size=10_000)
    x = np.linspace(np.min(samples), np.max(samples), 1000)
    plt.plot(x, np.exp(self.logp(x)).eval())

# add plot method to abstract class Continuous
pm.Continuous.plot = plot_cont

# example use
pm.Normal.dist(mu=1, sd=2).plot()
pm.Laplace.dist(mu=0, b=1).plot()

which results in the following

Discrete

def plot_disc(self):
    """Plot pmf of discrete dist, with semi clever setting of range"""
    samples = self.random(size=10_000)
    x = np.arange(np.min(samples), np.max(samples)+1)
    plt.plot(x, np.exp(self.logp(x)).eval(), "-o")

# add plot method to abstract class Discrete
pm.Discrete.plot = plot_disc

# example use
pm.Binomial.dist(n=10, p=0.2).plot()
pm.Geometric.dist(p=0.5).plot()

which results in this

So this issue is to see what people think about adding this functionality.

I've contributed example notebooks, but not to core PyMC code before, so it may be naive but... Maybe some more polished version could be added as methods of pm.Continuous and pm.Discrete?

Thoughts and feedback welcome.


michaelosthege · 2021-09-28T18:23:21Z

In the past I used np.exp(pm.Normal().logp(x).eval()), but that has been replaced by the functional pm.logp API, which is hard to find.

The bigger question though is what to do with multidimensional RVs?

drbenvincent · 2021-09-28T18:39:13Z

The bigger question though is what to do with multidimensional RVs?

In the first instance I think it makes sense to implement for univariate distributions. And override the method in distributions which are inherently multidimensional, potentially with a not implemented exception.

At a later stage you could implement custom plot functions for bivariate distributions, for example.

michaelosthege · 2021-09-28T22:33:07Z

If you're looking for a place to put such a plotting function, that'd probably be pm.plots.

bwengals · 2021-10-04T04:54:47Z

Can't help but chime in and +1 this, I at least make plots like these just about every single time I use pymc3. Though I call plt.hist on samples and then fiddle with the x limits, so this is much better. I also think it would be extra nice if arviz could be leveraged for visual continuity if possible.

aloctavodia · 2021-10-04T07:59:36Z

Somewhat related, in Bambi we have a method to plot prior samples. https://bambinos.github.io/bambi/main/api_reference.html#bambi.models.Model.plot_priors

drbenvincent · 2021-10-04T08:11:58Z

I also think it would be extra nice if arviz could be leveraged for visual continuity if possible.

I'm assuming we don't want to make PyMC dependent upon arviz? In which case we'd have to just emulate the style? But it would presumably be pretty easy by optionally doing

import arviz as az
az.style.use("arviz-darkgrid")

Overall I think we could make this pretty minimal. Users may want to compose their own complex and possibly multi-panel plots.

michaelosthege · 2021-10-04T09:30:39Z

We have a dependency on ArviZ and since ArviZ no longer depends on PyMC that's fine.

But IMO any change of default style is something the user needs to do. I for example don't like the "arviz-darkgrid" style 😱

drbenvincent · 2021-10-04T09:36:09Z

But IMO any change of default style is something the user needs to do. I for example don't like the "arviz-darkgrid" style 😱

I agree. I prefer simpler styles that are more suitable for publication, for example.

ferrine · 2021-10-30T14:01:22Z

I like that proposal. We can raise an error on multidimensional variables. Additionally, a more flexible API is needed here. Like any plotting function in the plt namespace, this should take optional axes as input and return the axes it used.
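The axes convention and the multidimensional guard might look roughly like the sketch below. It is only an illustration: `plot_density` is a hypothetical helper name, and it takes precomputed `x` and `density` arrays rather than a PyMC variable, so it does not assume any particular PyMC API.

```python
import matplotlib.pyplot as plt
import numpy as np


def plot_density(x, density, ax=None, **plot_kwargs):
    """Plot a univariate density; accept optional axes and return the axes used."""
    x = np.asarray(x)
    density = np.asarray(density)
    # Guard against anything that is not a single 1-D curve.
    if x.ndim != 1 or density.shape != x.shape:
        raise NotImplementedError("only univariate densities are supported")
    # Create a new figure only when the caller did not supply axes.
    if ax is None:
        _, ax = plt.subplots()
    ax.plot(x, density, **plot_kwargs)
    return ax


# Usage: compose a two-panel figure from caller-owned axes.
x = np.linspace(-4, 4, 200)
fig, (left, right) = plt.subplots(1, 2)
plot_density(x, np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi), ax=left)  # standard normal
plot_density(x, 0.5 * np.exp(-np.abs(x)), ax=right, color="C1")     # Laplace(0, 1)
```

Returning the axes leaves titles, labels, and styling to the caller, which fits the earlier point that style choices belong to the user.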

drbenvincent · 2021-10-30T15:43:44Z

Yes. Was planning on full axes luxury

ricardoV94 · 2021-10-30T20:21:08Z

This should probably not be a method, because we don't really have class instances in V4 anymore, but something that accepts an RV.

rv1 = pm.Normal.dist(0, 1)
with pm.Model() as m:
    rv2 = pm.Normal("rv2", 0, 1)

pm.plot(rv1)
pm.plot(rv2)

Both would yield the same plot. This has the advantage that you can plot the variables you defined in the model without having to rewrite them with the .dist API.

In V4, rv1 and rv2 are identical for this purpose.

lancechua · 2021-11-08T06:27:00Z

Hi, wanted to add to this since I'd find this feature quite useful as well.
The approach I took was to determine the plot domain based on the inverse cdf.
This has the advantage of not needing to sample, but does require implementing the quantile function.
I'd say it's decent by default but needs tweaking for longer tailed and positive only distributions.

Here's a colab demo notebook for proof of concept.
https://colab.research.google.com/drive/1yLczeGk09eeEnrlUoaJ57yQKJgCnPNdg
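The quantile-based idea can be sketched without PyMC at all. The example below is only an illustration, assuming a distribution object that exposes an inverse CDF. It uses `statistics.NormalDist` from the Python standard library as a stand-in; the helper names `quantile_domain` and `pdf_grid` and the default `tail=0.001` are made up for this sketch and are not taken from the linked notebook.

```python
from statistics import NormalDist


def quantile_domain(dist, tail=0.001):
    """Return (lo, hi) that leaves `tail` probability outside on each side.

    `dist` is assumed to expose `inv_cdf`, as `statistics.NormalDist` does.
    """
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def pdf_grid(dist, n_points=200, tail=0.001):
    """Evaluate the pdf on an evenly spaced grid over the quantile domain."""
    lo, hi = quantile_domain(dist, tail)
    step = (hi - lo) / (n_points - 1)
    xs = [lo + i * step for i in range(n_points)]
    return xs, [dist.pdf(x) for x in xs]


# Stand-in for a Normal prior with mu=1, sigma=2; no sampling is needed.
xs, density = pdf_grid(NormalDist(mu=1, sigma=2))
```

A fixed `tail` works for roughly symmetric distributions; as noted above, long-tailed or positive-only distributions would probably need a different rule.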

aloctavodia · 2022-01-21T11:00:59Z

To help interpret the pdf we can complement it with statistics such as the mean, HDI and/or quantiles. For example, in this picture the dot is the mean, the thick line the IQR and the thin line the 94% HDI. This is similar to the information presented in a forestplot.

ricardoV94 · 2022-01-21T11:03:08Z

Should this be an Arviz feature? Prior predictive is pretty fast anyway

drbenvincent · 2022-01-21T11:14:36Z

Not sure if you just mean moments/HDI should be part of Arviz or not?

But as a defence of the idea of having distribution plotting in PyMC...

While it does make sense to separate plotting, I think the core attraction here would be ultra-rapid one-liner visualisation when building PyMC models.
You could also say, why not just do it by plotting SciPy distributions, but they often don't have the same parameterisation, making it a headache.

ricardoV94 · 2022-01-21T11:16:26Z

I think it's a slippery slope to re-introducing plots in PyMC xD

ricardoV94 · 2022-01-21T11:17:24Z

Would these plot utilities need anything else other than draws?

aloctavodia · 2022-01-21T11:19:00Z

I am planning on adding the statistic/forest-plot stuff into ArviZ. I guess in az.plot_posterior (whose name is a misnomer). But as ArviZ works with samples, that is going to return KDEs, not PDFs. Of course a user could do something like...

az.plot_posterior(pm.draw(rv, draws=1000))

Having pm.plot() is nicer because the user gets THE prior instead of an approximation, but with enough samples, that's generally a detail.
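How much of a detail it is can be checked numerically. The following is a rough sketch that uses a plain normalised histogram rather than ArviZ's KDE, with `statistics.NormalDist` standing in for a PyMC prior. The helper names, the bin count, and the range are choices made only for this illustration.

```python
import random
from statistics import NormalDist


def histogram_density(samples, lo, hi, n_bins):
    """Normalised histogram: bin centres and estimated density per bin."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            # min() guards against a rounding edge case right at the upper bound.
            counts[min(int((s - lo) / width), n_bins - 1)] += 1
    centres = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centres, [c / (len(samples) * width) for c in counts]


def max_abs_error(n_draws, seed=0):
    """Largest gap between the histogram estimate and the exact pdf."""
    dist = NormalDist(0, 1)
    rng = random.Random(seed)
    samples = [rng.gauss(0, 1) for _ in range(n_draws)]
    centres, est = histogram_density(samples, -4.0, 4.0, 40)
    return max(abs(e - dist.pdf(c)) for c, e in zip(centres, est))


# The error of the sample-based estimate shrinks as the number of draws grows.
for n in (1_000, 100_000):
    print(n, round(max_abs_error(n), 4))
```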

ricardoV94 · 2022-01-21T12:00:54Z

I see we would be plotting the pdf itself... yeah that is not arvizable :D

ricardoV94 · 2022-01-21T12:06:12Z

By the way, pm.draw is taken already!

aloctavodia · 2022-01-21T12:10:54Z

Sorry I meant pm.plot

ghost · 2022-01-21T20:29:44Z

I am thinking similarly to @ricardoV94 -- I'd be hesitant to bring plotting back. However, it is something that I do quite a lot, especially:

sns.distplot(pm.Normal.dist(mu=0, sigma=1, shape=(1000)).eval())

So my heart says yes, but my head says no. What about something for pymc-experimental?

ricardoV94 added the request discussion label Sep 28, 2021

ricardoV94 added the feature request label Nov 28, 2021

hottwaj mentioned this issue May 24, 2022

Distribution variable API difficult to use for basic debugging: logp, logcdf, random samples, ppf #5798

Closed
