Add sum-zero constrained intrinsic conditional autoregressive (ICAR) variable #4851

zoj613 · 2021-07-10T22:29:44Z

THis PR is a followup on #4518 (comment) . It impelements the sum-zero constrained intrinsic conditonal autoregressive (ICAR) model as described in [1].

References

[1] Keefe M.J., Ferreira M.A.R., Franck C.T. On the formal specification of sum-zero constrained intrinsic conditional autoregressive models Spat. Stat., 24 (2018), pp. 54-65

Depending on what your PR does, here are a few things you might want to address in the description:

important background, or details about the implementation
are the changes—especially new features—covered by tests and docstrings?
linting/style checks have been run
consider adding/updating relevant example notebooks
right before it's ready to merge, mention the PR in the RELEASE-NOTES.md

zoj613 · 2021-07-10T22:56:04Z

Copy of the article can be found here: #4518 (comment)

ricardoV94

Thanks for opening the PR. I left some suggestions.

I suggest you check this developer guide for the next steps (docstrings, tests, etc...): https://www.github.com/pymc-devs/pymc3/tree/main/docs%2Fsource%2Fdeveloper_guide_implementing_distribution.md

pymc3/distributions/multivariate.py

codecov · 2021-07-17T13:01:00Z

Codecov Report

Merging #4851 (b0a5853) into main (06d1412) will decrease coverage by 2.58%.
The diff coverage is 81.81%.

@@            Coverage Diff             @@
##             main    #4851      +/-   ##
==========================================
- Coverage   73.09%   70.51%   -2.59%     
==========================================
  Files          86       86              
  Lines       13856    13888      +32     
==========================================
- Hits        10128     9793     -335     
- Misses       3728     4095     +367

Impacted Files	Coverage Δ
pymc3/distributions/__init__.py	`100.00% <ø> (ø)`
pymc3/distributions/multivariate.py	`54.75% <81.81%> (-16.70%)`	⬇️
pymc3/printing.py	`22.22% <0.00%> (-63.64%)`	⬇️
pymc3/distributions/discrete.py	`79.25% <0.00%> (-19.73%)`	⬇️
pymc3/distributions/continuous.py	`84.33% <0.00%> (-11.93%)`	⬇️
pymc3/distributions/dist_math.py	`80.52% <0.00%> (-6.85%)`	⬇️
pymc3/distributions/bound.py	`28.44% <0.00%> (-3.67%)`	⬇️
pymc3/distributions/logprob.py	`94.52% <0.00%> (-1.37%)`	⬇️
pymc3/distributions/transforms.py	`91.30% <0.00%> (-1.09%)`	⬇️
... and 6 more

zoj613 · 2021-07-18T21:57:39Z

@ricardoV94 is there infrastructure to generate adjecency matrices for testing CAR variables? I'm having trouble creating input for the test_distributions.py test im supposed to write.

Normally, i'd use something like the following, assuming the area of interest is made up of a rectangular lattice:

import libpysal 

lattice_rows = 2
lattice_cols = 5
A = libpysal.weights.lat2SW(lattice_rows, lattice_cols, criterion='rook', row_st=False)

which gives:

>>> A.toarray()
array([[0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
       [1, 0, 1, 0, 0, 0, 1, 0, 0, 0],
       [0, 1, 0, 1, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 0, 1, 0, 0, 0, 1, 0],
       [0, 0, 0, 1, 0, 0, 0, 0, 0, 1],
       [1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 1, 0, 0, 0, 1, 0, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 1, 0, 1, 0],
       [0, 0, 0, 1, 0, 0, 0, 1, 0, 1],
       [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]], dtype=int8)

ricardoV94 · 2021-07-19T05:44:54Z

You can hardcode some predefined matrices if it's too complicated to generate then dynamically. If it's too many you can put them in a separate file.

zoj613 · 2021-07-19T14:53:00Z

You can hardcode some predefined matrices if it's too complicated to generate then dynamically. If it's too many you can put them in a separate file.

I added a simple function to generate such, though it's far from being robust. I am now getting weird errors locally for the logp tests. I think maybe my understanding of the Domain objects for multivariate distributions is lacking. I am getting IndexError: list index out of range and
`

Details

x = A

    def _is_sparse(x):
        """
    
        Returns
        -------
        boolean
            True iff x is a L{scipy.sparse.spmatrix} (and not a L{numpy.ndarray}).
    
        """
        if not isinstance(x, (scipy.sparse.spmatrix, np.ndarray, tuple, list)):
>           raise NotImplementedError(
                "this function should only be called on "
                "sparse.scipy.sparse.spmatrix or "
                "numpy.ndarray, not,",
                x,
            )
E           NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)

depending on how I define the input to check_logp

zoj613 · 2021-07-20T15:01:28Z

@ricardoV94 any suggestions regarding the above errors?

ricardoV94 · 2021-07-20T15:12:46Z

@ricardoV94 any suggestions regarding the above errors?

The edge entries in a Domain are excluded. If you have less than three entries (e.g., Domain([0, 1]), there will be nothing usable in the check_logp. Could this be your issue? For instance, of your tau domain only the middle value will ever be tested.

zoj613 · 2021-07-20T15:20:25Z

@ricardoV94 any suggestions regarding the above errors?

The edge entries in a Domain are excluded. If you have less than three entries (e.g., Domain([0, 1]), there will be nothing usable in the check_logp. Could this be your issue? For instance, of your tau domain only the middle value will ever be tested.

I tried that using the following:

        adj_mat = Domain([random_adjacency_matrix(n, rng) for _ in range(5)])
        dom = Domain([generate_vals(n, rng) for _ in range(5)])
        tau = Domain([0.1, 0.25, 1., 1.5, 2.])

But raises the first error I got, which is: NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A). It appears that the aesara.tensor.sharedvar.TensorSharedVariable arguments are not passed in correctly to the dist function.

See the full trace below:

Details

...
        adj_mat = Domain([random_adjacency_matrix(n, rng) for _ in range(5)])
        dom = Domain([generate_vals(n, rng) for _ in range(5)])
        tau = Domain([0.1, 0.25, 1., 1.5, 2.])
    
>       self.check_logp(
            ICAR,
            dom,
            {"A": adj_mat, "tau": tau},
            ref_icar_logp,
        )

pymc3/tests/test_distributions.py:2088: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pymc3/tests/test_distributions.py:657: in check_logp
    model, param_vars = build_model(pymc3_dist, domain, paramdomains, extra_args)
pymc3/tests/test_distributions.py:249: in build_model
    distfam(
pymc3/distributions/distribution.py:207: in __new__
    rv_out = cls.dist(*args, rng=rng, initval=None, **kwargs)
pymc3/distributions/multivariate.py:2130: in dist
    if aesara.sparse._is_sparse(A):
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

x = A

    def _is_sparse(x):
        """
    
        Returns
        -------
        boolean
            True iff x is a L{scipy.sparse.spmatrix} (and not a L{numpy.ndarray}).
    
        """
        if not isinstance(x, (scipy.sparse.spmatrix, np.ndarray, tuple, list)):
>           raise NotImplementedError(
                "this function should only be called on "
                "sparse.scipy.sparse.spmatrix or "
                "numpy.ndarray, not,",
                x,
            )
E           NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)

../.pyenv/versions/miniconda3-4.7.12/envs/pymc3/lib/python3.9/site-packages/aesara/sparse/type.py:18: NotImplementedError
=========================================================================================================== short test summary info ===========================================================================================================
FAILED pymc3/tests/test_distributions.py::TestMatchesScipy::test_icar[3] - NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)
FAILED pymc3/tests/test_distributions.py::TestMatchesScipy::test_icar[20] - NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)
FAILED pymc3/tests/test_distributions.py::TestMatchesScipy::test_icar[100] - NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)
FAILED pymc3/tests/test_distributions.py::TestMatchesScipy::test_icar[500] - NotImplementedError: ('this function should only be called on sparse.scipy.sparse.spmatrix or numpy.ndarray, not,', A)

EDIT: if I comment out the lines:

+        #if aesara.sparse._is_sparse(A):
+        #    A = aesara.sparse.dense_from_sparse(floatX(A))
+        #else:
+        #    A = at.as_tensor_variable(floatX(A))
+        A = at.as_tensor_variable(floatX(A))

the the tests run as expected. Is there a subtle bug in _is_sparse that cannot handle certain input?

EDIT2: I decided to use _is_sparse_variable instead and things work now.

zoj613 · 2021-07-20T16:28:33Z

I think this is ready for a round of review @brandonwillard @fonnesbeck @ricardoV94

The generation of the adjacency matrix can probably be improved by a lot to prevent the weird failures in tests.

brandonwillard

Looks like this needs some updates in order for the tests to pass and to get coverage for the logp implementation.

pymc3/distributions/multivariate.py

brandonwillard · 2021-08-06T23:55:24Z

pymc3/distributions/multivariate.py

+            + 0.5 * at.log(d).sum()
+            - 0.5 * tau * at.dot(value, at.dot(W, value))
+        )
+        return bound(out, at.math.abs_(at.sum(value)) < 1e-06)


Is this effectively being clipped for numerical reasons?

Yep, the sum of the values have to be 0 so I set the precision to 6 digits.

pymc3/distributions/multivariate.py

Co-authored-by: Ricardo Vieira <[email protected]>

Co-authored-by: Brandon T. Willard <[email protected]>

Co-authored-by: Ricardo Vieira <[email protected]>

ricardoV94 · 2023-03-15T00:01:28Z

Closing this as stale.

@zoj613 let us know if you want to pick this up at another time. Probably pymc-experimental would be a better target

zoj613 changed the title ~~Add sum-zero constrained intrinsic conditional autoregressive model~~ Add sum-zero constrained intrinsic conditional autoregressive (ICAR) model Jul 10, 2021

zoj613 force-pushed the icar branch from 696ebc4 to 2f84110 Compare July 10, 2021 22:33

zoj613 force-pushed the icar branch from 2f84110 to 6089164 Compare July 10, 2021 22:58

ricardoV94 reviewed Jul 11, 2021

View reviewed changes

pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved

pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved

pymc3/distributions/multivariate.py Show resolved Hide resolved

zoj613 changed the title ~~Add sum-zero constrained intrinsic conditional autoregressive (ICAR) model~~ Add sum-zero constrained intrinsic conditional autoregressive (ICAR) variable Jul 11, 2021

zoj613 force-pushed the icar branch from fdb1759 to 3e30132 Compare July 17, 2021 12:45

zoj613 force-pushed the icar branch from f16cbcb to a0ed224 Compare July 20, 2021 16:05

zoj613 marked this pull request as ready for review July 20, 2021 16:19

brandonwillard suggested changes Aug 6, 2021

View reviewed changes

ricardoV94 reviewed Aug 7, 2021

View reviewed changes

pymc3/distributions/multivariate.py Outdated Show resolved Hide resolved

zoj613 force-pushed the icar branch from 707fdd9 to b0a5853 Compare August 7, 2021 09:23

zoj613 and others added 10 commits August 7, 2021 11:27

Add sum-zero constrained intrinsic conditional autoregressive model

084a096

Apply suggestions from code review

34c2e87

Co-authored-by: Ricardo Vieira <[email protected]>

add docstring

5a5d978

use pseudo-inverse for hermitian matrices instead of general function

f32d01c

update docstrings and reduce duplication

758543a

account for sparse adjacency matrix input

28dfc67

Add input validation.

6120046

add tests for random number generation

056291f

add tests for logp

70a42d8

Assign the results of assertion tests, as suggested in review.

525bfe6

Co-authored-by: Brandon T. Willard <[email protected]>

import abs function from the module top level

b0a5853

Co-authored-by: Ricardo Vieira <[email protected]>

ricardoV94 closed this Mar 15, 2023

Uh oh!

Add sum-zero constrained intrinsic conditional autoregressive (ICAR) variable #4851

Add sum-zero constrained intrinsic conditional autoregressive (ICAR) variable #4851

Uh oh!

Conversation

zoj613 commented Jul 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

References

Uh oh!

zoj613 commented Jul 10, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ricardoV94 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

codecov bot commented Jul 17, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

zoj613 commented Jul 18, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ricardoV94 commented Jul 19, 2021

Uh oh!

zoj613 commented Jul 19, 2021

Uh oh!

zoj613 commented Jul 20, 2021

Uh oh!

ricardoV94 commented Jul 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zoj613 commented Jul 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

zoj613 commented Jul 20, 2021

Uh oh!

brandonwillard left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

brandonwillard Aug 6, 2021

Choose a reason for hiding this comment

Uh oh!

zoj613 Aug 7, 2021

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ricardoV94 commented Mar 15, 2023

Uh oh!

Uh oh!

zoj613 commented Jul 10, 2021 •

edited

Loading

zoj613 commented Jul 10, 2021 •

edited

Loading

ricardoV94 left a comment •

edited

Loading

codecov bot commented Jul 17, 2021 •

edited

Loading

zoj613 commented Jul 18, 2021 •

edited

Loading

ricardoV94 commented Jul 20, 2021 •

edited

Loading

zoj613 commented Jul 20, 2021 •

edited

Loading