
Heteroskedastic GPs notebook #161


Merged · 3 commits into pymc-devs:main on Jun 9, 2021

Conversation

@JohnGoertz (Contributor)

Description

Several other GP frameworks (namely GPflow and GPy) have examples of how to model heteroskedasticity with GPs, but PyMC3 doesn't. There are hints scattered throughout the Discourse on how to do so via latent GPs, through sparse approximations and/or coregionalization, but these largely rely on external links and aren't collected in one place.

I've put together a notebook that works through various ways to achieve this. Let me know if there are any changes or extensions you'd like to see!
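For readers skimming this thread, the core idea the notebook explores can be sketched in a few lines of PyMC3: one latent GP models the mean and a second latent GP models the log of the noise scale, so the likelihood's sigma varies with the input. The snippet below is only a minimal sketch of that pattern; the data, priors, and variable names are placeholders and are not taken from the notebook itself.

```python
import numpy as np
import pymc3 as pm

# Toy data with input-dependent noise (illustrative only, not the notebook's dataset)
rng = np.random.default_rng(2021)
X = np.linspace(0, 10, 50)[:, None]
y = np.sin(X.ravel()) + rng.normal(size=50) * (0.1 + 0.1 * X.ravel())

with pm.Model() as model:
    # Latent GP for the mean function
    ell_mu = pm.Gamma("ell_mu", alpha=2, beta=1)
    eta_mu = pm.HalfNormal("eta_mu", sigma=1)
    gp_mu = pm.gp.Latent(cov_func=eta_mu**2 * pm.gp.cov.ExpQuad(1, ell_mu))
    mu = gp_mu.prior("mu", X=X)

    # Second latent GP for the log noise scale, making sigma input-dependent
    ell_sd = pm.Gamma("ell_sd", alpha=2, beta=1)
    eta_sd = pm.HalfNormal("eta_sd", sigma=1)
    gp_sd = pm.gp.Latent(cov_func=eta_sd**2 * pm.gp.cov.ExpQuad(1, ell_sd))
    log_sd = gp_sd.prior("log_sd", X=X)

    pm.Normal("y", mu=mu, sigma=pm.math.exp(log_sd), observed=y)
    trace = pm.sample(target_accept=0.95, return_inferencedata=True)
```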

@twiecki (Member) commented May 21, 2021

I just skimmed but this looks awesome - thanks so much!

@drbenvincent or @bwengals: do you have time to take a look?

@drbenvincent (Contributor)

I can take a look. I've not used GPs, so my comments would probably be restricted to clarity / accessibility.

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:35Z
----------------------------------------------------------------

result -> results


JohnGoertz commented on 2021-05-22T18:52:58Z
----------------------------------------------------------------

Fixed!

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:36Z
----------------------------------------------------------------

Would _, axs = ... be clearer than having the trailing [1]? Just an idle comment, feel free to ignore.


JohnGoertz commented on 2021-05-21T16:21:06Z
----------------------------------------------------------------

Yeah, true, I got in the habit of the [1] notation because I rarely ever use the fig handle, but you're right, the other way is clearer.
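For concreteness, the two styles being compared are equivalent matplotlib calls; the names here are just illustrative:

```python
import matplotlib.pyplot as plt

axs = plt.subplots(1, 2)[1]   # trailing-index style: keep only the axes array
_, axs = plt.subplots(1, 2)   # unpacking style: discard the figure handle explicitly
```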

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:36Z
----------------------------------------------------------------

It's probably worth increasing target_accept to something like 0.9 (in the sampling options) to see if that avoids the divergences.


JohnGoertz commented on 2021-05-21T16:21:33Z
----------------------------------------------------------------

I'll give that a shot, good call.

JohnGoertz commented on 2021-05-22T18:55:42Z
----------------------------------------------------------------

Yeah, I had to go up to 0.95 to get down to only a couple dozen divergences for each Latent model. Unfortunately that roughly doubled sampling time, so I made a note at the end pointing out the trade-off.

drbenvincent commented on 2021-05-22T19:59:41Z
----------------------------------------------------------------

Sounds good

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:37Z
----------------------------------------------------------------

Any reason to manually define rather than use the commented out line?


JohnGoertz commented on 2021-05-21T16:23:37Z
----------------------------------------------------------------

Initially, I hadn't figured out how to make the kmeans initialization reproducible, but I've found out it's controlled through numpy's legacy random.seed(), so I'll change to that now.

OriolAbril commented on 2021-05-22T11:11:42Z
----------------------------------------------------------------

This is also probably worth an issue.

JohnGoertz commented on 2021-05-22T19:06:52Z
----------------------------------------------------------------

Yeah, it should be easy enough to fix; I can raise an issue and try to fix it myself by just inserting a call to np.random.seed() within the wrapper.

I found that the LMC implementation was actually quite sensitive to the placement of inducing points. Many initializations from kmeans led to zero-gradient errors (in particular in u_rotated) that I couldn't figure out how to fix by tweaking the priors. I couldn't decide if that was worth pointing out or if it was somewhat specific to this dataset. I found that if I just selected inducing points from the original inputs it worked fine, perhaps because the data was evenly spaced to begin with. To be honest I need to read more into what the prior on u_rotated is doing.
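As a concrete note on the reproducibility point: PyMC3's inducing-point helper wraps scipy's kmeans, which draws from numpy's legacy global RNG, so seeding that RNG is what makes the initialization repeatable. A minimal sketch, where the inputs, seed, and inducing-point count are all placeholders:

```python
import numpy as np
import pymc3 as pm

X = np.linspace(0, 10, 100)[:, None]  # placeholder inputs, not the notebook's data

# kmeans_inducing_points uses scipy's kmeans, which relies on numpy's legacy
# global RNG, so seeding it makes the initialization reproducible.
np.random.seed(2021)  # placeholder seed
Xu = pm.gp.util.kmeans_inducing_points(20, X)  # 20 inducing points, illustrative count

# Alternative mentioned above: when the inputs are evenly spaced to begin with,
# simply subsampling them avoids the sensitivity to the kmeans initialization.
Xu_alt = X[:: len(X) // 20]
```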

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:37Z
----------------------------------------------------------------

Same point... it's probably worth increasing target_accept to something like 0.9 (in the sampling options) to see if that avoids the divergences.


JohnGoertz commented on 2021-05-22T19:08:03Z
----------------------------------------------------------------

I got this down to 20 divergences with target_accept=0.95, but that ~tripled sampling time, so we probably don't want to go much higher.

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:38Z
----------------------------------------------------------------

Same point again about divergences


OriolAbril commented on 2021-05-22T11:15:07Z
----------------------------------------------------------------

Also, given the ESS warning, we should check bulk and quantile ESS and probably use a less extreme CI probability. Estimating the 95% interval with fewer than 200 ESS on some parameters is not a good idea.

JohnGoertz commented on 2021-05-22T19:16:50Z
----------------------------------------------------------------

Increasing target_accept to 0.95 improved this for all of them. The lowest ess_bulk for this model was 300 (one of the v's), though ℓ and η were only 370 and 383. The ess_tail for those three ended up being 131, 141, and 128. The non-correlated sparse latent did much better, with the smallest ess_bulk being 850 (η) and ess_tail all over 1100. r_hat was 1.01 or less all around, though. Do you think we should point that out and/or narrow the CI range? My only concern with a narrower CI is that it might obscure the difference in CI width for the two noise regimes, which is kind of the whole point. Maybe just a note at the end discussing the need for longer sampling of htsc to have good confidence in the CI?
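For reference, a quick way to inspect the diagnostics being discussed here with ArviZ, assuming the trace is an InferenceData object and the hyperparameters are named ℓ and η as above:

```python
import arviz as az

# Bulk/tail ESS and r_hat for the key hyperparameters
az.summary(trace, var_names=["ℓ", "η"], kind="diagnostics")

# Quantile ESS near the upper CI edge; a reliable 95% interval needs adequate tail ESS
az.ess(trace, var_names=["ℓ", "η"], method="quantile", prob=0.975)
```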

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:38Z
----------------------------------------------------------------

This feels a bit out of place here, especially without a brief explanation. It looks like it's not just a larger version of the last right hand subplot?


JohnGoertz commented on 2021-05-21T17:05:27Z
----------------------------------------------------------------

Admittedly, this was a bit of a hacky way for me to get around this issue with the way thumbnails are generated for the gallery page. The current behavior probably would have caused the thumbnail to crop the final three-panel figure down to just the middle panel, which I felt wasn't the most representative. I can try to find a way around it...

@review-notebook-app bot commented May 21, 2021 (conversation on ReviewNB)

drbenvincent commented on 2021-05-21T16:08:39Z
----------------------------------------------------------------

I don't know how strict people are about conforming to the suggested watermark in the style guide


JohnGoertz commented on 2021-05-21T16:30:22Z
----------------------------------------------------------------

Whoops, added the -p xarray bit.
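For anyone following along, the style guide's watermark cell looks roughly like the following; the exact flag combination here is an assumption based on this comment, with -p xarray being the addition discussed:

```python
%load_ext watermark
%watermark -n -u -v -iv -w -p xarray
```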

@bwengals (Collaborator)

this is REALLY cool

@review-notebook-app bot commented May 22, 2021 (conversation on ReviewNB)

OriolAbril commented on 2021-05-22T11:19:56Z
----------------------------------------------------------------

I would probably remove the jedi config and let users choose whatever they want


JohnGoertz commented on 2021-05-22T19:17:51Z
----------------------------------------------------------------

Oh, good point. Jedi wasn't behaving for me, so that's become boilerplate code in my notebooks; I'll take it out.
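The config in question is presumably the common IPython completion workaround, something along these lines (an assumption, since the cell isn't quoted here):

```python
%config Completer.use_jedi = False
```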

@review-notebook-app bot (conversation on ReviewNB)

OriolAbril commented on 2021-05-22T11:19:56Z
----------------------------------------------------------------

I am not too familiar with GP functions, so I don't know how good or bad dims integration is with them. We should use dims (either at model definition or via idata_kwargs, or both) and, if this turns out to be too complicated, open issues to signal the areas that need work. When using dims, I would then add the posterior predictive samples to the InferenceData object and average, select, etc. using labeled dims; i.e. instead of var(axis=0), which people reading the notebook may not know the meaning of, it would become var(dim=("chain", "draw")). This will probably break a lot of plotting code, but I can help with that, and it will also serve to indicate missing functionality in xarray, probably along the lines of https://github.com/arviz-devs/arviz/issues/1668, like random sample generation.

It may also be worth tackling this in a follow-up PR; right now I'm mostly trying to gauge how much work that will be, as well as how much (if any) extra clarity it can provide (the notebook is already amazing).
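To illustrate the kind of change being proposed, a minimal sketch; the variable name "f", the dim name "x_obs", and the model object are placeholders, not the notebook's code:

```python
import pymc3 as pm

with model:
    # Label the GP output with a named dimension when converting to InferenceData
    idata = pm.sample(
        return_inferencedata=True,
        idata_kwargs={"dims": {"f": ["x_obs"]}},
    )

# Labeled reductions state their intent instead of relying on positional axes:
f_mean = idata.posterior["f"].mean(dim=("chain", "draw"))
f_var = idata.posterior["f"].var(dim=("chain", "draw"))
```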


@JohnGoertz (Contributor, Author)

Okay, I've re-run with a higher target_accept, which reduced but didn't eliminate divergences. However, upon attempting to demonstrate how to get the learned correlation coefficient out of the LMC, it seems it's not really working as expected. It appears to have learned a slight negative correlation (-0.1), albeit with very wide 95% CI bounds (-0.9, 0.9). I added a note about this, as well as a discussion of the sampler statistics (particularly ess_tail), but maybe it's best to omit the LMC from this notebook? Perhaps this isn't the best example of its use, and it may be more confusing than helpful.
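For context on the correlation coefficient mentioned above, this is roughly the calculation involved for a two-output coregionalization (LMC) model; the variable names and shapes below are assumptions, not the notebook's code:

```python
import numpy as np

def lmc_correlation(W_draw, kappa_draw):
    # Each posterior draw of W and kappa implies an output covariance
    # B = W @ W.T + diag(kappa); the learned correlation between the two
    # outputs is the off-diagonal of B rescaled by the output std devs.
    B = W_draw @ W_draw.T + np.diag(kappa_draw)
    return B[0, 1] / np.sqrt(B[0, 0] * B[1, 1])

# Assuming posterior draws W with shape (n_draws, 2, 1) and kappa with shape
# (n_draws, 2), pulled from the trace under those (assumed) names:
# rhos = np.array([lmc_correlation(w, k) for w, k in zip(W, kappa)])
# np.percentile(rhos, [2.5, 50, 97.5])  # posterior median and 95% CI of the correlation
```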

@drbenvincent (Contributor)

All my points were addressed; this all looks great to me. But I'm not experienced with GPs, so I think this notebook just needs a final check-over from someone with more GP knowledge?

@twiecki merged commit cc5350b into pymc-devs:main on Jun 9, 2021
@twiecki (Member) commented Jun 9, 2021

Thanks @JohnGoertz and @drbenvincent!

@JohnGoertz (Contributor, Author)

That's great, thanks for looking it over and glad to contribute!
