[WIP] Blogpost for `do` and `observe` functionality #3
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
CC @lucianopaz, the inline comment doesn't seem to do it: https://app.reviewnb.com/pymc-labs/research/blob/do-observe-blogpost/blog-posts%2Fdo_operator%2Fdo_operator_blogpost.ipynb/discussion/#comment-1ec9f806 |
Just finished reading through the first draft and holy shit, this is shaping up to be a post for the ages. I think you introduce the new concepts perfectly and take the reader step by step from where they are to where they want to be. Readers will feel like that scene in The Matrix where they get uploaded Kung Fu skills. |
Committed a ton of updates. But I'd perhaps hold off giving any more feedback for the moment. After some discussion with Ricardo, going through the full workflow (including simulated data generation) creates some problems which may just confuse the core message. Going to come back to this tomorrow with some perspective and potentially simplify the structure of the post a bit. I am leaning towards extracting the simulated data generation / parameter recovery study approach into a separate blog post. Kind of like, "hey, you know how we told you about |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-08T18:27:54Z new is twice, I think you can remove the second new as it's implied by introducing. drbenvincent commented on 2023-06-12T17:08:12Z fixed |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-08T18:28:40Z Maybe also add that they will be included in the main-repo after a little while. drbenvincent commented on 2023-06-12T17:08:22Z Done |
@drbenvincent I just wanna say I am very excited about this (and subsequent) posts! Amazing team effort to bring pymc even closer to causal inference 💪 ! |
The weird arrows happen when you set multiple nodes to equivalent constants (they get merged). If you set 0.00001, 0.00002, and so on, instead of 0, it looks alright. This hack should fix it for both. It also explains why sampling was still working. |
Actually graphviz is the culprit! It triggers some graph rewrite that modifies the variables. Here is a snippet that triggers it, without any pymc experimental stuff:

```python
import pymc as pm

with pm.Model(coords_mutable={'i': [0]}) as m:
    beta_z = pm.ConstantData("beta_z", 0)
    beta_y = pm.ConstantData("beta_y", 0)
    z = pm.Bernoulli("z", p=pm.invlogit(beta_z), dims="i")
    y = pm.Normal("y", mu=beta_y + z, dims="i")

pm.model_to_graphviz(m)
```

Any operation would actually do this, |
There was another similar issue, for instance if one of the replacement values was 1.0, because the Sigma also had one input as 1.0 and it created an artificial dependency on graphviz. Again you can fix this by making the replacements odd like 1.000001 or 0.00001 (and all different from each other). This should allow you to sketch the blog until that PyTensor PR is in. We can cut a release soon for the blogpost. Also you should be able to use |
Some value must somehow still be repeated but it worked for me after the fixes so I wouldn't worry. |
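The workaround discussed above (give every replacement constant a slightly different value so that equal constants cannot be merged) can be sketched in plain Python. This is only an illustration: the helper name `distinct_constants` and its `eps` step are hypothetical and not part of PyMC or PyTensor. The result is just a name-to-value mapping of the kind one might use for the replacements.

```python
def distinct_constants(names, base=0.0, eps=1e-5):
    """Map each name to a slightly different value near `base`.

    Hypothetical helper. Offsets start at `eps` rather than 0 so that no
    value equals `base` itself, which might coincide with another constant
    already present in the graph.
    """
    return {name: base + (i + 1) * eps for i, name in enumerate(names)}


# Names taken from the snippet in the thread; values are near 0 but all differ.
replacements = distinct_constants(["beta_z", "beta_y"])

# Sanity check: no two replacement values are equal.
assert len(set(replacements.values())) == len(replacements)
```

The same helper with `base=1.0` would cover the case mentioned in the thread where a replacement value of 1.0 coincided with another input.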
Sounds like a good idea to wait for the new release before going ahead |
The graphviz issue is now fixed with the latest package versions |
Though this does not work |
What doesn't work? The issue with sample_posterior_predictive mentioned above is still present (i.e., posterior_predictive will always sample observed variables from the prior). That is not an "observe" issue but the expected behavior of pymc so far. Is there another problem besides that? |
At the moment we have this kind of workflow:
This final step is the one that 'doesn't work' in the sense that it doesn't give expected results given this workflow. So to get the results I wanted I switched to this workflow:
So when you wrote
I took that to mean that the first workflow would work. But I think I interpreted that wrong. But now I'm not clear on what you did mean. Am I right in thinking that the first workflow above is not going to be possible? If so, then I think the post should forget about creating an empty model with no data and just start from a 'traditional' pymc model where we do provide simulated data. In which case the workflow would be like this:
Which is fine with me, but then there will be no examples in the post where we use |
I just meant the arrows issue.
Still have to discuss that with @lucianopaz, I am unsure about whether it makes conceptual sense or not. Technically, there is no challenge to opting for the "new" behavior. In any case, you don't need to recreate a new model to get the kind of results you want right now. You can use a Even if you don't want to do this, there's no reason to create a model for |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:01:00Z I don't find it trivial that |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:15Z I think I would move this to the top. drbenvincent commented on 2023-06-23T12:34:56Z Done. I've re-ordered the examples |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:16Z "In this post" sounds like this post just beginning. drbenvincent commented on 2023-06-23T12:36:21Z fixed |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:17Z Any story / example you can come up with for this? drbenvincent commented on 2023-06-23T13:13:37Z Done :) |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:17Z *kniown
Maybe also say that usually we wouldn't know that and want to infer. drbenvincent commented on 2023-06-23T13:19:07Z Done |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:18Z Maybe add that in this model, the parameters are not fixed like before, because here we want to infer them from data to see if we can recover the true values. The workflow is a bit implicit and readers might get lost. drbenvincent commented on 2023-06-23T13:48:09Z Added some clarification. Also added a schematic plot at the end of the example for people to recap what we've done |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:19Z Rather than referring to future blog posts, you could just explain what's going on intuitively, for example with an example of a drug and we care about the difference between giving and not giving the drug. drbenvincent commented on 2023-06-23T13:51:26Z Done. This was a bit lazy of me. |
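The drug example suggested in this review comment can be made concrete with a small plain-Python simulation. This is only an illustrative sketch, not the blog post's model and not PyMC's `do`/`observe` API: the variable names (`severity`, `drug`, `recovery`), the coefficients, and the `simulate` helper are all assumptions invented for the illustration. It contrasts comparing patients who happened to take the drug with setting the drug by intervention.

```python
import random


def simulate(n, do_drug=None, seed=0):
    """Draw (severity, drug, recovery) rows from a toy causal model.

    severity influences both drug and recovery, so it confounds them.
    If do_drug is 0 or 1, the drug is set by intervention and no longer
    depends on severity.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = 1 if rng.random() < 0.5 else 0
        if do_drug is None:
            # Observational regime: sicker patients take the drug more often.
            drug = 1 if rng.random() < (0.8 if severity else 0.2) else 0
        else:
            # Interventional regime: everyone gets the same assignment.
            drug = do_drug
        # True causal effect of the drug on recovery is +1.0 by construction.
        recovery = 1.0 * drug - 2.0 * severity + rng.gauss(0.0, 1.0)
        rows.append((severity, drug, recovery))
    return rows


def mean_recovery(rows, drug):
    """Average recovery among rows with the given drug value."""
    values = [r for _, d, r in rows if d == drug]
    return sum(values) / len(values)


# Naive comparison on observational data (confounded by severity).
observed = simulate(20_000)
observed_diff = mean_recovery(observed, 1) - mean_recovery(observed, 0)

# Comparison under intervention: give the drug to everyone vs. to no one.
treated = simulate(20_000, do_drug=1, seed=1)
untreated = simulate(20_000, do_drug=0, seed=2)
causal_diff = mean_recovery(treated, 1) - mean_recovery(untreated, 0)

print(f"observed difference: {observed_diff:.2f}")
print(f"difference under intervention: {causal_diff:.2f}")
```

In this toy setup the observed difference understates the drug's benefit (it can even look harmful) because sicker patients are more likely to take it, whereas the difference under intervention should land close to the built-in effect of 1.0.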
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-22T22:13:20Z the all the values drbenvincent commented on 2023-06-23T13:53:24Z fixed |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-23T14:08:48Z We've... ? |
View / edit / reply to this conversation on ReviewNB twiecki commented on 2023-06-23T14:08:49Z Maybe example 2 can just go, I don't know what it adds. |
There are notable errors here. Yet to pin down whether this is down to my use of `observe`, or if my examples are highlighting bugs that need to be fixed.