Fix dim lengths created from coords or ConstantData #5751 #5763
Great strategy. Thanks for taking such a principled approach to development. It seems that there is a type being changed somewhere. Might be as simple as updating this test. |
Codecov Report
@@ Coverage Diff @@
## main #5763 +/- ##
=======================================
Coverage 89.27% 89.28%
=======================================
Files 74 74
Lines 13811 13822 +11
=======================================
+ Hits 12330 12341 +11
Misses 1481 1481
|
Failing test looks to be down to sampling error?
And it's highlighted in an outstanding issue #5739. |
This test is failing because y is not the correct shape,
|
Maybe do more asserts?
with pm.Model(coords={"feature": [1], "group": ["A", "B"]}) as pmodel:
    assert pmodel.dim_lengths["feature"].eval() == 1
    assert pmodel.dim_lengths["group"].eval() == 2 |
My thinking for that test was to make sure that size and dims can be passed at the same time and we get the expected shape out which will not have been possible before this type change in |
When That's why I suggested to only pass dims, and actually it's a little concerning that the test doesn't pass. Maybe call
with pm.Model() as pmodel:
    # Pseudocode: run once with fixed=True and once with fixed=False.
    pmodel.add_coord("feature", [1], fixed=True/False)
    pmodel.add_coord("group", ["A", "B"], fixed=True/False)
    assert pmodel.dim_lengths["feature"].eval() == 1
    assert pmodel.dim_lengths["group"].eval() == 2
    x = pm.Normal("x", 0, 1, dims="feature")
    y = pm.Normal("y", x[..., None], 1, dims=("feature", "group"))
    assert x.eval().shape == (1,)
    assert y.eval().shape == (1, 2) |
When I try that, all cases of fixed=True/False lead to y.eval().shape == (1, 1). |
Oh, now I see why. The This works 👇 because then the RV is
# ...
x = pm.Normal("x", 0, 1, dims="feature")
y = pm.Normal("y", x, 1, dims=("group", "feature"))
assert x.eval().shape == (1,)
assert y.eval().shape == (2, 1) |
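The shapes that the asserts in this exchange expect follow from NumPy-style broadcasting of the mean against the dim lengths. The sketch below only illustrates that rule. `broadcast_shape` is a hypothetical pure-Python helper (not part of PyMC or Aesara), and the dim lengths feature=1 and group=2 are taken from the coords used in the comments above.

```python
from itertools import zip_longest


def broadcast_shape(*shapes):
    """Return the shape that NumPy-style broadcasting yields for `shapes`."""
    result = []
    # Align the shapes on their trailing axes; missing leading axes count as 1.
    for sizes in zip_longest(*(reversed(s) for s in shapes), fillvalue=1):
        non_unit = {n for n in sizes if n != 1}
        if len(non_unit) > 1:
            raise ValueError(f"shapes {shapes} are not broadcastable")
        result.append(non_unit.pop() if non_unit else 1)
    return tuple(reversed(result))


feature, group = 1, 2

# x has dims="feature", so x[..., None] has shape (feature, 1).
print(broadcast_shape((feature, 1), (feature, group)))  # (1, 2)

# Passing x directly with dims=("group", "feature").
print(broadcast_shape((feature,), (group, feature)))  # (2, 1)
```

A result of (1, 1) for the first case therefore suggests that the dim lengths did not take part in determining the shape of y.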
I think I'm still misunderstanding something here. Wouldn't passing dims like that still be allowed prior to this PR? So I don't understand what it is testing with respect to this PR. |
Yes, that's already allowed, but where this test comes in is in combination with the changes from #5761. Actually, it's a little unfortunate that these changes are on separate branches. I'm inclined to merge #5761 even with those coverage gaps (I'll review it really carefully right now). |
* Support TensorConstant entries in `Model.dim_lengths`.
* Raise errors on attempts to resize dims with `TensorConstant` length.
* Raise errors on attempts to resize dims that are linked to non-shared variables.
* Warn about resizing dims that weren't initialized from a shared variable (supposedly via `add_coord(..., fixed=False)`, see #5763).
* Raise errors when attempting to resize a dim that had coord values without providing new coord values.

Closes #5760 by anticipating that not all symbolic dim lengths originate from RVs.

Co-authored-by: Michael Osthege <[email protected]>
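The resize rules listed in this commit message can be sketched with a toy registry. This is not PyMC's implementation: `ToyDims`, its `kind` labels, and the stand-in `ShapeError` and `ShapeWarning` classes are all invented for illustration, and which exception type each rule raises is an assumption.

```python
import warnings


class ShapeError(Exception):
    """Hypothetical stand-in for a shape-related error type."""


class ShapeWarning(UserWarning):
    """Hypothetical stand-in for a shape-related warning type."""


class ToyDims:
    """Toy dim registry that mimics the resize rules; not PyMC code."""

    # How a dim length was created (labels invented for this sketch).
    # "other" covers any origin that is not a dedicated shared variable.
    KINDS = {"constant", "shared", "linked_nonshared", "other"}

    def __init__(self):
        self.lengths = {}
        self.kinds = {}
        self.coords = {}

    def add_dim(self, name, length, kind, values=None):
        if kind not in self.KINDS:
            raise ValueError(f"unknown kind {kind!r}")
        if values is not None and len(values) != length:
            raise ValueError("coord values must match the dim length")
        self.lengths[name] = length
        self.kinds[name] = kind
        self.coords[name] = None if values is None else tuple(values)

    def set_dim(self, name, new_length, coord_values=None):
        kind = self.kinds[name]
        if kind == "constant":
            raise ShapeError(f"dim {name!r} has a constant length")
        if kind == "linked_nonshared":
            raise ShapeError(f"dim {name!r} is linked to a non-shared variable")
        if self.coords[name] is not None and coord_values is None:
            raise ValueError(f"dim {name!r} has coord values; pass new ones")
        if coord_values is not None and len(coord_values) != new_length:
            raise ValueError("new coord values must match the new length")
        if kind == "other":
            warnings.warn(
                f"dim {name!r} was not initialized from a shared variable",
                ShapeWarning,
                stacklevel=2,
            )
        self.lengths[name] = new_length
        if coord_values is not None:
            self.coords[name] = tuple(coord_values)


dims = ToyDims()
dims.add_dim("group", 2, "constant", values=["A", "B"])
dims.add_dim("obs", 3, "shared", values=[0, 1, 2])
dims.set_dim("obs", 4, coord_values=[0, 1, 2, 3])
print(dims.lengths["obs"])  # 4
```

In this sketch the checks that raise run before the warning is emitted, so a call that is going to fail does not also warn.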
@LukeLB the other PR was merged, so now you should be able to rebase and bring the coverage back up :) |
@michaelosthege OK I think I understand. I'll rebase and commit your suggested changes. |
Can you add test cases for the `ShapeWarning` and `ValueError`?
With `match=""` one can make sure that the exception is the correct one. That's important because there are two different `ShapeError`s that one could run into.
We should aim to cover all code paths from the diff here https://github.com/pymc-devs/pymc/pull/5761/files
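To illustrate why matching on the message matters when one exception type has several code paths, here is a minimal sketch using only the standard library. `assertRaisesRegex` and `assertWarnsRegex` play the role of `pytest.raises(..., match=...)` and `pytest.warns(..., match=...)`, which also apply the pattern with `re.search`. The `resize` function, its messages, and the two exception classes are hypothetical stand-ins, not the code under review.

```python
import unittest
import warnings


class ShapeError(Exception):
    """Hypothetical stand-in for the error type discussed above."""


class ShapeWarning(UserWarning):
    """Hypothetical stand-in for the warning type discussed above."""


def resize(kind):
    """Toy function with two distinct ShapeError paths and one warning path."""
    if kind == "constant":
        raise ShapeError("dimension has a constant length")
    if kind == "linked":
        raise ShapeError("dimension is linked to a non-shared variable")
    warnings.warn("dimension was not initialized from a shared variable", ShapeWarning)
    return "resized"


# A bare TestCase instance gives access to the assertion helpers.
check = unittest.TestCase()

# Same exception type, different messages: the pattern pins down the path.
with check.assertRaisesRegex(ShapeError, "constant length"):
    resize("constant")

with check.assertRaisesRegex(ShapeError, "non-shared variable"):
    resize("linked")

with check.assertWarnsRegex(ShapeWarning, "not initialized from a shared"):
    resize("other")
```

Without the pattern, a test aimed at the second path would also pass if the first path raised, because both raise the same type.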
I can't make the change suggestion, but the whole block containing the `ShapeWarning` can be removed.
Thanks @LukeLB!
This brings the coverage back up and should lead to fewer shape problems long term.
ah damn, I just realized that we forgot to actually use the new `fixed` kwarg.
I commented a change in one of the tests that should currently fail.
Let me know if you want me to take over - you put in a lot of effort already and I don't want you to get demotivated by all this back and forth
Not at all! The longer it goes the more I learn :) Thanks for the offer though. |
Why does the assert statement on line 758 fail? The |
The So what needs to be done is a change in the logic behind And I should add that |
OK so I think I have a solution, which I have pushed, but I'm running into a wall with a failing test which I can't seem to fix. Unfortunately I won't be able to work on this until after the weekend, so if there is time pressure on this then it will be worth someone else picking it up. |
Thanks @LukeLB, I will try to pick it up in the evening |
@LukeLB this weekend is fine! You've made great progress on this and I don't want to take away the joy of getting it merged! |
I rebased the branch and squashed the history into three logically separate commits:
I will force-push them one by one so we can confirm that the first two commits don't create new problems. |
Force-pushed from 0e40156 to 18f6fe5.
Needed to create resizable dims with coordinate values, because coords passed to the model will become immutable dims. See pymc-devs#5763. Also, because `pm.Data` variables can be N-dimensional, the coordinate values can't reliably be taken from the value.
Force-pushed from 2c37381 to 79fdfdd.
@LukeLB @canyon289 @ricardoV94 please review and note the new changes I made in the |
Looks good to me, left a small question and a suggestion about using the same keyword internally (fixed vs mutable)
if not mutable:
    # Use the dimension lengths from before it was tensorified.
    # These can still be tensors, but in many cases they are numeric.
    xshape = np.shape(arr)
doesn't matter much but `x.type.shape` should have the same info
This increases the safety of working with resizable dimensions. Dims created via `pm.Model(coords=...)` coordinates are now immutable. Only dimension lengths that are created anew from `pm.MutableData`, `pm.Data(mutable=True)`, or the underlying `add_coord(mutable=True)` will become shared variables. Co-authored-by: Michael Osthege <[email protected]>
Force-pushed from 79fdfdd to abad0a4.
I feel 👍 about merging as long as the tests go ✔.
Me too! |
Congrats @LukeLB! Thank you for leading this PR |
@canyon289 @michaelosthege @ricardoV94 thanks all!! It's been a pleasure :) |
Closes #5751. Fixes dimensions created by add_coord by making dimension length a TensorConstant. I have also added a test for this using some code from #5181.
I'm setting as WIP as @canyon289 suggested this PR is required for other work to become unblocked and I'm having problems with some failing tests (which could just be due to me running things locally with an incorrect set-up?). So it's worth seeing if they fail on here and then look at fixes, which will be faster than me trying to figure things out on my own!