Skip to content

Create guide for writing custom distributions #185

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Sep 1, 2021

Conversation

ally-lee
Copy link
Contributor

Description

Addresses issue #184 and aims to advance it to

@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And tests?

And maybe to consider contributing it with a PR ?


Reply via ReviewNB

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The next cell (implementation) should be introduced by the text. For example, that it should be an implementation of the above formula using just Aesara operations.


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find confusing that it uses np.log and tt.switch 

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With https://github.com/pymc-devs/pymc/pull/4867 we're adding a mechanism for setting initial values. The code example doesn't need to include that, but maybe it should be mentioned somewhere.


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this guide supposed to be for 3.x or for 4.x as of now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe I'm still working with 3.x

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #3.                          chains=2, cores=1, init='adapt_diag', random_seed=42)

For v4 the typical variable name would be idata for the returned InferenceData object.


Reply via ReviewNB

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even if using v3, all notebooks should already use return_inferencedata=True, and if possible use coords and dims instead of shape.

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pm.traceplot is deprecated in favor of either arviz.plot_trace or its alias pm.plot_trace.


Reply via ReviewNB

@michaelosthege
Copy link
Member

michaelosthege commented Jul 30, 2021

@ally-lee nice work! I left some comments of which some may be nitpicky.

I really like the practical application example!!

@@ -0,0 +1,425 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking in terms of https://github.com/pymc-devs/pymc/issues/4899 (so it doesn't need to be done in this PR, we can add this to the issue and nothing else). pmf/pdf maybe also over/underdispersion look like terms that should go in the glossary.


Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be helpful if I added these terms/definitions in a comment on that issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be great yes! Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@OriolAbril
Copy link
Member

This looks great, sorry it took so long to review. I have added some comments related to latest updates and best practices with pymc3 and arviz which will require some changes to the plotting. I am happy to help with those changes

@ally-lee
Copy link
Contributor Author

ally-lee commented Aug 8, 2021

@michaelosthege @OriolAbril Thank you both for your reviews! I made a bunch of updates based on your comments. Do you mind taking it over to make the more technical changes? I think that would be best since you're the experts on this technology. I'm feeling good about what I've submitted so far and am okay with whatever changes you need to make

@@ -0,0 +1,471 @@
{
Copy link
Member

@ricardoV94 ricardoV94 Aug 8, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two extra details that are worth mentioning:

  1. For continuous distributions you also have to define the default transform, or inherit from a more specific class like PositiveContinuous which specifies what the default transform should be.
  2. On V3 you are supposed to also specify at least one ”default value" for the distribution during init such as self.mode, self.median and self.mean (the latter only for continuous distributions). This is used by some samplers or other compound distributions. It didn't show up in your case because you are not sampling from the distribution (as it it's observed)

Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pointing these out! I'll add these notes to the text block. I also realized the reason I needed to set a testval when sampling was because I hadn't specified a "default value" for the distribution, so I can take that out now.

@@ -0,0 +1,471 @@
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest adding a prior predictive step, not only because it's part of the standard Bayesian worflow, but also because it uses your distribution in a different way than in sample or posterior_predictive, as an unobserved random variable. That in itself of a good sanity check that the distribution is well implemented and can be sampled from (with mcmc that is)


Reply via ReviewNB

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to add this, but I think it won't work because pm.AR doesn't have a random method. Is it ok if I leave this step out?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually what I had in mind was prior predictive sampling with pm.sample by simply removing the observed keyword.

But that's not necessarily useful for this example...

Feel free to ignore it then

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like your idea of doing a sanity check though! I think it would be nice to compare what samples look like from the standard Poisson vs generalized Poisson

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point!

@ally-lee
Copy link
Contributor Author

Hey @OriolAbril just wanted to check in, are you ok to make those final changes related to latest updates/best practices? Is there anything else I can do to help get this ready to release?

@chiral-carbon
Copy link
Collaborator

hey @ally-lee, I would be up for making the final changes to get this notebook to best practices, if that's alright with you?
@MarcoGorelli would that be okay?

@MarcoGorelli
Copy link
Contributor

sure, no objections from me

@chiral-carbon
Copy link
Collaborator

@MarcoGorelli would you merge this notebook then, if everything else (except using return_inferencedata=True in pm.sample and plotting using ArviZ) seems alright? because then I could open a new PR with the new changes.

@MarcoGorelli
Copy link
Contributor

@MarcoGorelli would you merge this notebook then, if everything else (except using return_inferencedata=True in pm.sample and plotting using ArviZ) seems alright? because then I could open a new PR with the new changes.

I haven't had a chance to look at this yet, but you could always make a new branch from this one so as to keep @ally-lee 's commits and to add your ones on top

@ally-lee
Copy link
Contributor Author

Hey @chiral-carbon thanks for picking this up! Just curious if you have a status update on this

@chiral-carbon
Copy link
Collaborator

hey @ally-lee I will be able to push some commits in a day or two. hope that's alright!

Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to merge this PR so that then @chiral-carbon can work on the notebook and open a new PR directly to main. This will avoid PRs on PRs which I don't like by no reason at all and given that we squash merge PRs here, will leave two commits related to this notebook in the history, one by @ally-lee and one by @chiral-carbon

Unless someone opposes I'll merge tomorrow

@chiral-carbon
Copy link
Collaborator

I made a few changes to this NB by adding allylee's remote locally and creating a new branch but I could always save that notebook and just move it so no problem by me @OriolAbril

@OriolAbril
Copy link
Member

Then it's probably easier for you to open the PR to this branch/PR I think 🤔

@chiral-carbon
Copy link
Collaborator

either is fine tbh. I would also have doubts and probably make some changes incrementally, so if having lots of comments under this one PR would not be too messy then I'll just push here. otherwise, I can open a new PR, no issues there.

@chiral-carbon
Copy link
Collaborator

hey @ally-lee I tried pushing commits to this PR but I get a git error saying permission denied. could you check if the option "Allow edits from maintainers" is checked in the PR? it should be a checkbox on the right side of this page!

@ally-lee
Copy link
Contributor Author

Hmm it's already checked
Screen Shot 2021-08-31 at 4 22 11 PM

@chiral-carbon
Copy link
Collaborator

oh okay 🤔 not sure what's wrong then

@MarcoGorelli
Copy link
Contributor

oh okay 🤔 not sure what's wrong then

can you show the command you ran, and the output?

@OriolAbril
Copy link
Member

I think the problem is that @chiral-carbon only has triage permissions so she isn't seen as "maintainer" by github

@chiral-carbon
Copy link
Collaborator

chiral-carbon commented Sep 1, 2021

@OriolAbril @MarcoGorelli I searched online and I think only the owners/maintainers of a project can push commits to a PR opened by someone else in the project, or if I had write access to @ally-lee's fork then also it might work.
this is the error message I get on doing git push allylee HEAD:custom-distribution ( I had added @ally-lee's fork as a remote locally with git remote add allylee https://github.com/ally-lee/pymc-examples.git):

remote: Permission to ally-lee/pymc-examples.git denied to chiral-carbon.
fatal: unable to access 'https://github.com/ally-lee/pymc-examples.git/': The requested URL returned error: 403

will you consider merging this PR in that case? and I'll open a new one?

Copy link
Contributor

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, looks good to me - I've finally had a chance to read through the notebook, and I think it's really good!

@MarcoGorelli MarcoGorelli merged commit 2002ebd into pymc-devs:main Sep 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants