Skip to content

Variable selection for graphviz visualization via kwarg #5634

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 22 commits into from
May 22, 2022

Conversation

larryshamalama
Copy link
Member

@larryshamalama larryshamalama commented Mar 21, 2022

Fixes #5527. Some elements are hackish and currently need more work.

  • Do not plot all descendants when specifying an intermediate rv unless the variable is observed (plot observed node)
  • Add var_names as kwarg
  • Rename previous var_names class instance by _all_var_names

@codecov
Copy link

codecov bot commented Mar 21, 2022

Codecov Report

Merging #5634 (2d67473) into main (a95dd71) will increase coverage by 0.07%.
The diff coverage is 97.91%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #5634      +/-   ##
==========================================
+ Coverage   89.28%   89.36%   +0.07%     
==========================================
  Files          74       74              
  Lines       13824    13804      -20     
==========================================
- Hits        12343    12336       -7     
+ Misses       1481     1468      -13     
Impacted Files Coverage Δ
pymc/model_graph.py 93.60% <97.91%> (+8.08%) ⬆️
pymc/distributions/censored.py 94.73% <0.00%> (ø)

@larryshamalama larryshamalama changed the title Graphviz filter vars Variable selection for graphviz visualization via kwarg Mar 21, 2022
@larryshamalama
Copy link
Member Author

@michaelosthege Can you have a look at these changes on a model or two? Happy to hear comments. It seems to work on my model in this notebook but we can't see the figures on GitHub

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's going into the right direction.
Here's a minimal example that unveils two problems:

  1. Selecting nodes causes some of them to change formatting
  2. Nodes of observed data have an inverted dependency - this could be tricky.. One can get the desired result by selecting the observed RV, but because of the reversed arrow direction it's counterintuitive.
import pymc as pm
from IPython.display import display

with pm.Model() as pmodel2:
    a = pm.Normal("a")
    b = pm.Normal("b")
    pm.Normal("c", a * b)
    intermediate = pm.Deterministic("intermediate", a + b)
    pred = pm.Deterministic("pred", intermediate * 3)

    obs = pm.ConstantData("obs", 1.75)

    pm.Normal("L", mu=1 + 0.5 * pred, observed=obs)

display(pm.model_to_graphviz(pmodel2))
display(pm.model_to_graphviz(pmodel2, selected_vars=["L"]))
display(pm.model_to_graphviz(pmodel2, selected_vars=["obs"]))

grafik

@larryshamalama
Copy link
Member Author

Thanks for the pointers and example. Will look into it.

Do you know approximately when the code has been written? My guess is that _get_ancestors was written sometime during v3 and doesn't work well anymore

@michaelosthege
Copy link
Member

Thanks for the pointers and example. Will look into it.

Do you know approximately when the code has been written? My guess is that _get_ancestors was written sometime during v3 and doesn't work well anymore

You can check the git blame: https://github.com/pymc-devs/pymc/blame/main/pymc/model_graph.py

But generally the code works. Also for my biggest model all variables are shown

@larryshamalama
Copy link
Member Author

I believe that the plots now display desired graphs in your example above. Can you verify?

I am using ancestors from aesara.graph.basic.py and there seems to many tools in that file that can be beneficial for this file. get_parents in ModelGraph still seems to work but I'm not sure about _get_ancestors, which is why I am using the Aesara one.

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the plots now display desired graphs in your example above. Can you verify?

Yes, it's looking good for my big model too! Tomorrow morning I can then finally plot my model and print it on an A3 sheet of paper for the meeting on thursday 😃 So thank you for getting it to work 💯

Maybe a bit unexpected that selecting "L" also shows "obs", but it's probably a good idea to do that. It's a feature.

I added some type hint suggestions, and renaming of variables. My suggestions are incomplete, but I think you get the idea.

You can add tests to https://github.com/pymc-devs/pymc/blob/main/pymc/tests/test_model_graph.py. These tests work by comparing the dict representation of the grpah/plates with the expected outcome.

@larryshamalama
Copy link
Member Author

Thanks so much for the suggestions! I'll address them one by one. Just wanted you to have the figures for your meeting and work on polishing the code more slowly

@larryshamalama larryshamalama force-pushed the graphviz-filter-vars branch 2 times, most recently from bbc5e33 to a706229 Compare March 29, 2022 02:08
@larryshamalama
Copy link
Member Author

Reminder for myself to add a test

@larryshamalama
Copy link
Member Author

larryshamalama commented Mar 31, 2022

The solution currently is very ad-hoc as I do not understand _get_ancestors and get_parents. However, the code seems to work. Things to consider for next for this PR:

  • Remove variables from model with ~
  • Add a test to catch error when variable is not in the model

Things not sure if I should look into more

@michaelosthege can I get your input?

@larryshamalama
Copy link
Member Author

Quick comment that I did not forget about this PR, just busy with many other things. There's ongoing work with networkx integration, but I need to read up more on that.

@larryshamalama
Copy link
Member Author

larryshamalama commented May 18, 2022

I'm revisiting this PR and would like to push it to the finish line after a long hiatus 😅 In summary, this PR:

  • adds the possibility to specify a subgraph to plot;
  • adds a corresponding test for the above;
  • refactors get_parents and removes (dangerously...) code that is no longer in use.

I'm hoping that all tests pass. Happy to hear any comments about this progress as it would be worthwhile to address issues #5616 and #5766 shortly afterwards. I imagine that I should add type hints to the methods?

@larryshamalama larryshamalama marked this pull request as ready for review May 18, 2022 05:27
@larryshamalama larryshamalama requested a review from ricardoV94 May 18, 2022 05:28
@larryshamalama larryshamalama self-assigned this May 18, 2022
@michaelosthege
Copy link
Member

Type hints are always good 👍

But also make sure to explain the new var_names kwarg in the docstring of model_to_graphviz.

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@larryshamalama we'll only need that docstinrg, then this is good to go!

@larryshamalama
Copy link
Member Author

Sounds good! I'll fix this later today on the train 😅

@larryshamalama
Copy link
Member Author

I only added the docstring to pm.model_to_graphviz since it seemed to be the only user-facing function. I also added type hints and, instead of str, I used VarName.

On another note, I refactored a lot of the code since, with v4, many of it seemed irrelevant. The hope is that nothing breaks as I rely on functions fromaesara.graph.basic, but I have a bit of uncertainty there. I'm happy to look into this file if anything else breaks in the future.

Copy link
Member

@michaelosthege michaelosthege left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a few typing changes, otherwise this looks great 👍

I'll apply my suggestions & fix the pre-commit if needed... ⏳

@michaelosthege
Copy link
Member

/pre-commit-run

@michaelosthege michaelosthege merged commit d0af6b1 into pymc-devs:main May 22, 2022
@michaelosthege
Copy link
Member

Thanks @larryshamalama !

I'm sure this will prove to be a very useful feature for the still underappreciated pm.model_to_graphviz

@larryshamalama larryshamalama deleted the graphviz-filter-vars branch May 22, 2022 12:26
@larryshamalama
Copy link
Member Author

Thanks to you for your suggestions too :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Variable selection for pm.model_to_graphviz
4 participants