-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Add icdf functions for Lognormal, Half Cauchy and Half Normal distributions #6766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #6766 +/- ##
=======================================
Coverage 91.89% 91.90%
=======================================
Files 95 95
Lines 16185 16197 +12
=======================================
+ Hits 14874 14886 +12
Misses 1311 1311
|
I am having an issue when developing the test for this function. So, I just got the already implemented icdf function for the normal and added np.exp() to it. But it looks like the implemented function for the normal distribution icdf has a tiny difference if compared to the scipy implementation : But when this is passed to the exponential function, the difference becomes larger than the tolerance of 6 decimals and the test fail. ![]() Using the values of the failed test above, I get the same values of error locally, and the result for the icdf differs both from scipy and R stats ( Investigating it further, the Scipy implementation uses a different way to compute the icdf by approximating it with a routine described by this script: https://github.com/scipy/scipy/blob/2f3831503aff159994eafa75745c9537f8db060f/scipy/special/cephes/ndtri.c#L1 Which is also different from the R stats qnorm function: Hence the difference in the results. Any ideas on how to handle this divergences on implementations during the tests? |
Small numerical differences are fine. You can get around them by either tuning Those are pretty extreme values you're getting with For testing locally, don't forget to set We have been thinking about changing the precision criteria because the one used now is pretty dumb: #6159 |
Ricardo, thank you for the comments! I will follow your suggestions and submit a PR soon. |
@ricardoV94, thank you for the comments on handling the precision issues. I have successfully added tests for the lognormal icdf function.
Instead I used the formulas consistent with the implementations found in SciPy and R extraDistr packages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, only some small changes needed
You're right, thanks for checking. Opened related PR to fail explicitly for the icdf |
pymc/distributions/continuous.py
Outdated
@@ -856,6 +856,15 @@ def logcdf(value, loc, sigma): | |||
msg="sigma > 0", | |||
) | |||
|
|||
def icdf(value, loc, sigma): | |||
res = Normal.icdf((value + 1.0) / 2.0, loc, sigma) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit annoying because we don't allow users to create a HalfNormal
with loc != 0
but in theory they could call icdf
or reach this function with a HalfNormal that has a custom non-zero loc
.
My question is, will this expression work for non-zero loc
?
This is also a question for the HalfCauchy
.
I wonder if the best solution is to reimplement these RandomVariables directly in PyMC in a way that they don't accept a loc
argument (just like the HalfStudentTRV
in this file). This way we don't have to worry about loc
in the logp/logcdf/icdf/moment functions.
If we go down that path, we can remove the current HalfNormal
and HalfCauchy
RandomVariables from PyTensor as they aren't really needed there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implemented formula work well even when loc != 0
and is consistent with SciPy results:
Allow me some more time to bring some screenshots calling the implemented functions with pm.HalfNormal.icdf()
and pm.HalfCauchy.icdf()
If you think the best solution is still to reimplement RandomVariables directly in PyMC in a way that they don't accept a loc argument, let me know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to do it in this PR, but the fact that it isn't being tested in our suite (and not just the new icdf) is already a good argument to drop it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good news that it works though!
Thanks @amyoshino! |
Thanks for your guidance @ricardoV94 ! |
My pleasure @amyoshino. Looking forward to you next PRs! |
What is this PR about?
Adds ICDF functions to Lognormal Distribution
Issue #6612
Comment on issues found while working on Issue #6747
References:
Lognormal:
HalfCauchy: matches formulas used in either SciPy and R extraDistr
HalfNormal: matches formulas used in either SciPy and R extraDistr
Checklist
Major / Breaking Changes
New features
Bugfixes
Documentation
Maintenance
📚 Documentation preview 📚: https://pymc--6766.org.readthedocs.build/en/6766/