Variable selection for pm.model_to_graphviz
#5527
Comments
@michaelosthege Please can I work on this
Sure, go ahead
Thinking out loud: when you say plot variables and their dependencies, say that we have a simple hierarchical model x | mu ~ N(mu, 1) with mu ~ N(0, 1). Would we want the graph to include mu if we just specify
Yes, always include ancestors. With the other kwarg I just want to exclude data variables that don't have edges
I would suggest reusing the same API that arviz uses for including / excluding variables, instead of creating new specialized keyword arguments like the disconnected data thing.
Without
The second option is risky - if you forget a downstream variable it will just not show up. Instead of a boolean kwarg maybe a setting for all data variables is better:
Why?
Maybe, I would just suggest to start as simple as possible. Always easier to add complexity than to remove it.
I'm never in favor of complex solutions and having the code to filter data nodes in PyMC's The thing is: I need this feature. Soon. As in before next Wednesday, actually. No pressure, I'm happy that you @larryshamalama want to pick it up.
I can give it a shot and ask questions along the way since it will be a bit out of my comfort zone :) My first question is: any insights why
In the above example,
Thanks! I'll give this a shot today when I have a bit more time.
I've been able to make some progress. On a related note, should descendants be included? Say we have Z -> Y -> X and we specify I have yet to address some of the other discussion points
I'd say no. Only ancestors and the node itself.
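To make the "only ancestors and the node itself" rule concrete, here is a minimal sketch on a plain dictionary graph. It does not use PyMC at all: the function name `select_with_ancestors` and the `parents` mapping are hypothetical, chosen only to illustrate the selection rule discussed above on the Z -> Y -> X example.

```python
from typing import Dict, Iterable, List, Set


def select_with_ancestors(
    parents: Dict[str, List[str]], var_names: Iterable[str]
) -> Set[str]:
    """Return the requested nodes plus all of their ancestors (no descendants)."""
    selected: Set[str] = set()
    stack = list(var_names)
    while stack:
        name = stack.pop()
        if name in selected:
            continue  # already visited via another path
        selected.add(name)
        # Walk upwards only: push the direct parents of this node.
        stack.extend(parents.get(name, []))
    return selected


# Z -> Y -> X, stored as "node: list of direct parents" (hypothetical layout).
parents = {"Z": [], "Y": ["Z"], "X": ["Y"]}

print(sorted(select_with_ancestors(parents, ["Y"])))  # ['Y', 'Z']
```

Selecting Y pulls in its ancestor Z but leaves out the descendant X, which matches the answer given in the thread.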
Description of your problem
I have a big big model and would like to make `pm.model_to_graphviz` plots of only some of the variables inside. For example after adding `gp.conditional`s and then only wanting to plot the nodes my new variable depends on.
My model also includes some `ConstantData` variables I use to store indexing information. They aren't relevant for model understanding and I'd like to hide them.
Proposal
The `pm.model_to_graphviz` function could take additional kwargs to customize the plot:
- `var_names` (or `vars`?) to plot only certain variables and their dependencies
- `show_disconnected_data: bool` to hide `ConstantData`/`MutableData` nodes that don't contribute to (selected) model variables
Implementation
Variable selection should be straightforward since it's already in the constructor, just not accessible via a kwarg:
pymc/pymc/model_graph.py
Lines 33 to 36 in a3bab7d
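The referenced lines are not reproduced here. As a rough illustration of what the proposed `show_disconnected_data` switch could do, the sketch below filters a plain dictionary graph; it is not the code in `model_graph.py`. The function `nodes_to_plot`, the `parents` layout, and the variable names are all hypothetical, and the rule it applies is the one stated in the thread: drop data variables that have no edges.

```python
from typing import Dict, List, Set


def nodes_to_plot(
    parents: Dict[str, List[str]],
    data_nodes: Set[str],
    show_disconnected_data: bool = False,
) -> Set[str]:
    """Return node names to draw, optionally hiding data nodes without edges."""
    # Collect every node that appears as a key or as somebody's parent.
    nodes: Set[str] = set(parents)
    for node_parents in parents.values():
        nodes.update(node_parents)
    if show_disconnected_data:
        return nodes

    # A node is "connected" if it takes part in at least one edge.
    connected: Set[str] = set()
    for child, node_parents in parents.items():
        if node_parents:
            connected.add(child)
            connected.update(node_parents)

    # Non-data nodes are always kept; data nodes only if they have an edge.
    return {n for n in nodes if n not in data_nodes or n in connected}


# Hypothetical model: x depends on mu and on an index stored as data,
# while coords_lookup is a data node that nothing uses.
parents = {"mu": [], "x": ["mu", "obs_idx"], "obs_idx": [], "coords_lookup": []}
data_nodes = {"obs_idx", "coords_lookup"}

print(sorted(nodes_to_plot(parents, data_nodes)))  # ['mu', 'obs_idx', 'x']
```

Keeping the default at `False` hides the unused `coords_lookup` node, while `show_disconnected_data=True` returns every node unchanged.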