rjmcmc stepper and example test case #20
Conversation
Hi @TeaWolf
I'm sorry my feedback is rather generic and mostly about code organization / OOP, but I think addressing this first will make it easier to read and understand what the stepper is doing.
```python
# Generate data from the true model
timepoints = np.linspace(0, 5, 1000)
true_deltas = np.array([[0, 1], [0, 0]])
true_ks = np.array([[0, 2], [0, 0]])
initial_values = np.array([5, 0])
true_model = get_vectorized_solution_for_matrix(true_deltas, true_ks, initial_values)
true_data = [true_model[i](timepoints) for i in range(2)]
```
General advice
One shouldn't put execution code at the 0th indentation level.
That's because the Python interpreter will execute it as soon as the module is imported.
It also leads to recursion problems with multiprocessing, because the child processes import the module again.
The typical pattern is this:
```python
# Place this at the end of the file:
if __name__ == "__main__":
    # Now call other functions defined above:
    run_main_example()
```
The `__name__ == "__main__"` condition is only `True` if the script is the primary entry point: `python my_script.py`.
You can start adopting this pattern by simply indenting most of your code and putting it into a function.
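Applied to the snippet above, that could look roughly like this (a sketch; `run_main_example` is just an illustrative name, and `get_vectorized_solution_for_matrix` is the helper already defined in the PR's script):

```python
import numpy as np

def run_main_example():
    # Generate data from the true model (moved out of module scope).
    timepoints = np.linspace(0, 5, 1000)
    true_deltas = np.array([[0, 1], [0, 0]])
    true_ks = np.array([[0, 2], [0, 0]])
    initial_values = np.array([5, 0])
    true_model = get_vectorized_solution_for_matrix(true_deltas, true_ks, initial_values)
    true_data = [true_model[i](timepoints) for i in range(2)]
    # ... rest of the example (model setup, sampling, plotting) ...
    return true_data

if __name__ == "__main__":
    run_main_example()
```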
How you can make this into a test
If you also name the file to start with `test_` (for example `test_rjmcmc.py`), then pytest will recognize it as a file potentially containing tests. Any top-level function named `test_*` will then be collected as a test case, for example `def test_rjmcmc_example_one():`.
Instead of `python custom_rjmcmc_sketch.py` you can then run `pytest --verbose test_rjmcmc.py`.
...And that's how you can have your first test case.
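A sketch of what the renamed file could contain, assuming the example has been wrapped into a `run_main_example` function as suggested above:

```python
# test_rjmcmc.py

def test_rjmcmc_example_one():
    # The weakest useful assertion for now: the example runs
    # end to end without raising.
    run_main_example()
```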
I wasn't aware of the problem with multiprocessing and 0th-level indentation, thanks for that!
I wasn't really meaning to make a unit test out of this, since I'm not sure what I can assert at the moment; I meant it more as a usage example to clarify my intention. Is there a better place for me to put this script, then? Or should I just make it assert that there's been no error?
Consider it an integration test. You could run it with fewer iterations to make it faster.
A basic assert could be whether it samples the prior/posterior correctly.
You probably also have attributes on the step method instance that you could assert.
A seeded test that asserts individual posterior samples is also an option.
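For example, a prior-recovery check along these lines (a sketch using `pm.Metropolis` as a stand-in; the RJMCMC stepper from this PR would take its place):

```python
import numpy as np
import pymc as pm

def test_stepper_recovers_prior():
    # With no observed data the posterior equals the prior, so a correct
    # stepper should reproduce the prior's moments.
    with pm.Model():
        x = pm.Normal("x", 0.0, 1.0)
        idata = pm.sample(
            2000, step=pm.Metropolis(), tune=1000, chains=2,
            random_seed=123, progressbar=False,
        )
    samples = idata.posterior["x"].values.ravel()
    np.testing.assert_allclose(samples.mean(), 0.0, atol=0.15)
    np.testing.assert_allclose(samples.std(), 1.0, atol=0.15)
```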
So I've started doing some better unit testing on the stepper to be sure of its correctness (I'll have to refactor it before putting it here). But I've realized the logp I'm using in the jumps doesn't seem to be correct. I'm trying to base it on the way `delta_logp` was written for the Metropolis stepper, but the aesara API is confusing me. Is `compile_logp()` not providing me with the posterior I'm expecting (potential and all prior terms)?
Could there be variable transforms going on? What model are you testing with?
Yes, there are variable transforms going on. The model parameters have uniform priors, so they're transformed using the interval transform. But when comparing logp to my analytical one I reverse the transformation using this.
I think I got this method of obtaining the transformation from some of the pymc test code. Is there a better way now?
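For reference, one way to invert the transform in recent pymc versions is through the model's transform mapping (a sketch, assuming pymc >= 5; older versions exposed the transform on the value variable's tag instead):

```python
import pymc as pm

with pm.Model() as model:
    x = pm.Uniform("x", 0.0, 2.0)

transform = model.rvs_to_transforms[x]   # the interval transform of the Uniform prior
value_var = model.rvs_to_values[x]       # the unconstrained value variable ("x_interval__")
constrained = transform.backward(value_var, *x.owner.inputs)
print(constrained.eval({value_var: 0.0}))  # 1.0: the midpoint of [0, 2]
```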
I just wondered if the differences you are seeing between your analytical and pymc logp are because you are forgetting the jacobian of the transformations. You can ignore them with `compile_logp(jacobian=False)`.
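A quick way to see the difference (a minimal sketch):

```python
import pymc as pm

with pm.Model() as model:
    x = pm.Uniform("x", 0.0, 1.0)

point = model.initial_point()
logp_jac = model.compile_logp()                  # includes transform jacobian terms
logp_nojac = model.compile_logp(jacobian=False)  # raw logp of the transformed value
print(logp_jac(point), logp_nojac(point))        # the two differ by the log-jacobian
```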
Ah, I see, the documentation wasn't clear on which jacobian terms these were or how they were included. I've been fetching the jacobian terms myself when I needed them through
So if `jacobian=True` is the default, `compile_logp()` already includes those jacobian terms?
Yes
Usually you always want the jacobian, except in select cases like optimization (such as `find_MAP`).
But for instance, if I were to reverse the value_var transformation in order to make my new point proposal in the original space, the value_var transformation's jacobian would not appear in the acceptance fraction. So there I would need `jacobian=False`.
Somewhat related, I realize that I need to have multivariate (continuous, discrete) priors, as in a joint prior over the discrete label parameters and the continuous model parameters.
Sorry for the delay,
There are two things: 1) the value transformation and 2) the jacobian adjustment.
The jacobian is there to ensure that sampling in the unconstrained space actually behaves as if sampling in the constrained space. If you sample without the jacobian adjustment, you are effectively sampling from a different distribution than the one you specified in the constrained space. Does this help?
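This is just the change-of-variables rule: if the sampler works on the unconstrained value $y = g(x)$, matching the constrained density requires the jacobian factor,

```latex
p_Y(y) = p_X\left(g^{-1}(y)\right)\,\left|\frac{d\,g^{-1}(y)}{dy}\right|
```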
I don't understand this question. We have some multivariate discrete and continuous distributions, but is there something specific you need? Is this for the model specification or for the sampler? For the latter we usually just use scipy/numpy random distribution utilities.
Thanks for the reply!
So in this formalism the prior is a joint function of both the model (specified as a vector of delta label parameters) and the continuous parameters. The fix I applied above seems to work fine, but is there a better way of getting this done in pymc?
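In other words (my notation, not from the PR), with $\delta$ the vector of model labels and $\theta$ the continuous parameters, the joint prior factorizes as

```latex
p(\delta, \theta) = p(\delta)\, p(\theta \mid \delta)
```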
Hi,
It was recommended to me to make a PR to track progress on this and get help. I'm new to this process so I apologize in advance for my newbieness.
I'm trying to build an RJMCMC stepper in pymc for a Bayesian model selection project I'm working on.
The structure is currently based largely on CompoundStep, but the pymc API is still a little fuzzy to me.
The basic premise is that a supermodel is specified as a `pm.Model()` that contains all possible sub-models and distinguishes these with label parameters (delta parameters).
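A toy illustration of that premise (not the PR's actual model; the names, priors, and sub-models here are made up):

```python
import numpy as np
import pymc as pm

timepoints = np.linspace(0, 5, 50)
data = 5 * np.exp(-2 * timepoints)  # synthetic observations

with pm.Model() as supermodel:
    delta = pm.Bernoulli("delta", p=0.5)   # label parameter: which sub-model is active
    k = pm.Uniform("k", 0.0, 5.0)          # continuous parameter of the decay sub-model
    # Sub-model 1: exponential decay; sub-model 0: constant mean.
    mu = pm.math.switch(delta, 5 * pm.math.exp(-k * timepoints), 5.0)
    pm.Normal("obs", mu=mu, sigma=0.1, observed=data)
```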
At each step, the RJMCMC kernel randomly chooses to stay in the current sub-model's parameter space or to jump.
If staying in the sub-model's parameter space, it delegates to a NUTS sampler on those variables.
Otherwise it randomly chooses a destination space and delegates to the corresponding Jump stepper.
The Jump stepper updates the label parameters to indicate that the current sub-model has changed, as well as the continuous parameters, according to some intuition about how the models should be mapped onto each other.
The implementation I currently have seems to work for the most part, but I still need to fix stat collection and initial tuning, as well as a problem with the final trace that prevents me from using `az.plot_trace(trace, ...)`, forcing me to use `az.plot_trace(trace.posterior, ...)` instead. The interface I came up with for creating a Jump is also more than a little awkward...