Remove tune stat from steps and fix non-discarding of tuning draws from trace#8015

Merged
ricardoV94 merged 7 commits into pymc-devs:main from eclipse1605:remove-tune-sampler-stat
Mar 8, 2026

Conversation

@eclipse1605
Contributor

Description

I tried to make “tuning vs. draws” a driver-owned concept again. Right now, parts of sampling/postprocessing infer the warmup length from a per-step "tune" sampler stat, which can get out of sync with the driver (e.g. a step method returning "tune": False everywhere makes PyMC think n_tune == 0, so warmup isn't discarded and the logs look wrong).
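The failure mode can be sketched in a few lines (illustrative names and numbers only, not PyMC's actual internals):

```python
import numpy as np

# A step method with a buggy "tune" stat that reports False on every
# iteration: inferring warmup length from the stat then yields zero.
tune_stat = np.zeros(1500, dtype=bool)   # never reports tuning
n_tune_inferred = int(tune_stat.sum())   # stat-based inference: 0 warmup draws
assert n_tune_inferred == 0              # so nothing would be discarded

# Driver-owned bookkeeping instead: the driver already knows how many
# tuning iterations it requested, so it slices the trace directly.
n_tune_requested = 500
draws = np.arange(1500)
posterior = draws[n_tune_requested:]     # warmup discarded unconditionally
assert len(posterior) == 1000
```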

Related Issues

Fixes: #7997
Context: #7776 (progressbar/stat refactor that exposed the mismatch)
Related discussion/attempts: #7730, #7721, #7724, #8014

@ricardoV94
Member

@OriolAbril / @aloctavodia does any part of Arviz require the step samples to have a tune flag? Is it enough that we have warmup / posterior distinction, each with their number of draws?

@ricardoV94
Member

ricardoV94 commented Dec 21, 2025

Taking a step back, would it make sense for a tune=None mode where the sampler(s) decide how much tune they need? In that case it would make sense for the individual steps to report back whether they're tuning or not.

Even if that's the case, I think it still makes sense to remove this currently useless stat and reintroduce in a separate PR (provided nobody finds a reason why it is actually useful/needed).

CC @aloctavodia, @lucianopaz @aseyboldt

@codecov

codecov bot commented Dec 21, 2025

Codecov Report

❌ Patch coverage is 90.32258% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.56%. Comparing base (9082a04) to head (e9c066b).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
pymc/backends/ndarray.py 50.00% 1 Missing ⚠️
pymc/sampling/parallel.py 0.00% 1 Missing ⚠️
pymc/sampling/population.py 0.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #8015      +/-   ##
==========================================
+ Coverage   84.55%   84.56%   +0.01%     
==========================================
  Files         124      124              
  Lines       19872    19866       -6     
==========================================
- Hits        16802    16799       -3     
+ Misses       3070     3067       -3     
Files with missing lines Coverage Δ
pymc/backends/arviz.py 96.04% <100.00%> (ø)
pymc/backends/base.py 88.26% <100.00%> (-0.44%) ⬇️
pymc/backends/mcbackend.py 99.28% <100.00%> (+0.02%) ⬆️
pymc/backends/zarr.py 93.87% <100.00%> (+0.05%) ⬆️
pymc/sampling/mcmc.py 90.61% <100.00%> (+0.27%) ⬆️
pymc/smc/sampling.py 96.55% <100.00%> (ø)
pymc/step_methods/compound.py 98.68% <100.00%> (+0.81%) ⬆️
pymc/step_methods/hmc/base_hmc.py 92.25% <ø> (ø)
pymc/step_methods/hmc/hmc.py 94.59% <ø> (ø)
pymc/step_methods/hmc/nuts.py 97.61% <ø> (ø)
... and 6 more

@michaelosthege
Member

Taking a step back, would it make sense for a tune=None mode where the sampler(s) decide how much tune they need? In that case it would make sense for the individual steps to report back whether they're tuning or not.

Automatically stopping the warmup early would be nice. I think we should agree on cleanly separated definitions of warmup, burn-in and tuning. Samplers not needing to tune parameters doesn't mean that there's no need for a warmup phase of burn-in iterations (however one might call it).

Our current implementation is bad because it doesn't separate the concepts.

@aloctavodia
Member

ArviZ does not require or use a "tune" stat anywhere.

@eclipse1605
Contributor Author

#7997 (comment)

Member

@michaelosthege michaelosthege left a comment


I like where this is going!

Using a slightly different naming I think we can simplify a bit more.

@eclipse1605
Contributor Author

@michaelosthege does this make sense?

@eclipse1605
Contributor Author

@michaelosthege check this out

Member

@michaelosthege michaelosthege left a comment


I'm not familiar with how the progress bar gets updated. Possibly my two comments on that matter are invalid, but please check them.

I'll also trigger the CI tests

Comment on lines -296 to -302
tune = mtrace._straces[0].get_sampler_stats("tune")
assert isinstance(tune, np.ndarray)
# warmup is tracked by the sampling driver
if discard_warmup:
    assert tune.shape == (7, 3)
    assert len(mtrace) == 7
else:
    assert tune.shape == (12, 3)
    pass
Member


can this test remain as before, but using the in_warmup stat instead?

Member


@eclipse1605 this comment still sounds relevant though

@eclipse1605
Contributor Author

I'm not familiar with how the progress bar gets updated. Possibly my two comments on that matter are invalid, but please check them.

Hey, sorry for the delay, but I think they're valid because warmup bookkeeping is now explicitly driver-owned.

@eclipse1605
Contributor Author

@michaelosthege I've made the tests consistent with the changes; if you re-run the CI tests they should mostly pass now.

Member

@michaelosthege michaelosthege left a comment


Looks good to me!

Thanks @eclipse1605 for your endurance with this!

@eclipse1605
Contributor Author

Looks good to me!

Thanks @eclipse1605 for your endurance with this!

Thanks a ton for the reviews and guidance @michaelosthege and @ricardoV94, I really appreciate the patience since I'm still getting my bearings here :)

test_dict = {
    "posterior": ["u1", "n1"],
-   "sample_stats": ["~tune", "accept"],
+   "sample_stats": ["~in_warmup", "accept"],
Member

@ricardoV94 ricardoV94 Jan 7, 2026


I'm not sure about changing the output variable name, this seems like a breaking change for users?

The specific line I pointed to may not be relevant. The general question is whether we changed anything in MultiTrace/InferenceData output with this PR other than the tune flag not existing per step.

Contributor Author


It now writes the warmup flag once as in_warmup, but nothing new shows up for users. When we persist sampler stats (e.g. in mcbackend) we store that boolean and keep trace.get_sampler_stats("tune") working by aliasing it to the new field. The default NDArray backend still omits both names, just like before, and to_inference_data continues to drop whichever warmup marker exists, so the resulting InferenceData matches main; the test only switches the "absent" check to the new internal name. No other MultiTrace/InferenceData variables changed.
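The aliasing idea can be sketched roughly like this (a hypothetical class, not PyMC's actual backend code — the point is only that the legacy "tune" name resolves to the stored in_warmup boolean):

```python
import numpy as np

class StatsStore:
    """Toy stand-in for a backend that stores sampler stats."""

    # legacy stat names redirected to their replacements
    _aliases = {"tune": "in_warmup"}

    def __init__(self, in_warmup):
        # the warmup flag is stored exactly once, under the new name
        self._stats = {"in_warmup": np.asarray(in_warmup)}

    def get_sampler_stats(self, name):
        # old call sites asking for "tune" transparently get in_warmup
        return self._stats[self._aliases.get(name, name)]

store = StatsStore([True, True, False, False])
assert np.array_equal(
    store.get_sampler_stats("tune"),
    store.get_sampler_stats("in_warmup"),
)
```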

Member

@ricardoV94 ricardoV94 left a comment


This looks sleek, I just want to do a manual integration test locally before merging

@eclipse1605
Contributor Author

This looks sleek, I just want to do a manual integration test locally before merging

sounds good!

@eclipse1605
Contributor Author

Hey @ricardoV94, I tried to understand the failed test but didn't really get very far with it. Is it failing because JAX spits out NaNs when the Dirichlet concentration is super skewed, so the multinomial never sees a clean prob vector?

@ricardoV94
Member

That one fails now and then, don't worry about it

@eclipse1605
Contributor Author

@eclipse1605 did you see the missed comment above about doing more minimal changes to the pre-existing test?

Do you mean this?

@ricardoV94
Member

#8015 (comment)

@eclipse1605
Contributor Author

I saw that, but I wanted clarification on whether we want to add an explicit tune alias assertion to preserve that compatibility.

Comment on lines +304 to +305
assert all(len(s) == 7 for s in in_warmup)
assert all(not np.any(s) for s in in_warmup)
Member


What is this in_warmup object we're seeing here? From the test alone I have a hard time figuring it out. Is it a numpy array?

Member


It's unclear to me why this changed, it seemed like we just moved the source of tune/warmup, not the final stored contents?

Member


@eclipse1605 ^ comment

@eclipse1605
Contributor Author

@ricardoV94 any changes required in this?

@ricardoV94
Member

I have a question about why the test changed, thought the output would still be the same. Also we merged another PR so this one now has conflicts that need to be solved. Let me know if you need help

@eclipse1605
Contributor Author

As I said above, the test changed because the tune sampler stat has been removed: warmup tracking is now handled by the sampling driver, and the backend no longer stores tune. Previously, the test retrieved tune via get_sampler_stats("tune") and checked its shape to verify the number of warmup and posterior samples.

The test asserted that tune was a NumPy array and checked its shape:

  • if discard_warmup was True, the shape was (7, 3).
  • if discard_warmup was False, the shape was (12, 3).

Because warmup tracking is now managed directly by the sampling driver, tune no longer needs to be stored in the backend, so the test cannot retrieve it. Instead of checking the shape of tune, the test now checks the length of the MultiTrace object (len(mtrace)) to determine the number of posterior samples.

let me know if that helps clarify things.
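For concreteness, the arithmetic behind those shapes can be written out (illustrative numbers taken from the test: 12 total iterations, 5 of them warmup):

```python
total_iters, n_tune = 12, 5

def expected_trace_len(discard_warmup: bool) -> int:
    # the driver discards warmup itself, so the trace length alone
    # tells us how many posterior draws survived
    return total_iters - n_tune if discard_warmup else total_iters

assert expected_trace_len(True) == 7    # old check: tune.shape == (7, 3)
assert expected_trace_len(False) == 12  # old check: tune.shape == (12, 3)
```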

@eclipse1605
Contributor Author

also, if we move on to merging this, I'll likely need help fixing the merge conflicts

@jessegrabowski
Member

also, if we move on to merging this, I'll likely need help fixing the merge conflicts

I'm happy to help but I'd want to let #8047 go in first because it will change things around again.

@eclipse1605
Contributor Author

also, if we move on to merging this, I'll likely need help fixing the merge conflicts

I'm happy to help but I'd want to let #8047 go in first because it will change things around again.

Sure, makes sense. Let me know when we want to merge this, given there are no more changes required :)

@michaelosthege michaelosthege force-pushed the remove-tune-sampler-stat branch from 4f537f0 to e9c066b on March 8, 2026 at 12:48

@ricardoV94
Member

Thanks @michaelosthege. The failing tests are a known issue. I'll try to fix them after, but need not block this PR.

@ricardoV94 ricardoV94 merged commit b41a3bd into pymc-devs:main Mar 8, 2026
37 of 42 checks passed
ricardoV94 pushed a commit to ricardoV94/pymc that referenced this pull request Mar 8, 2026
…from trace (pymc-devs#8015)

Co-authored-by: Michael Osthege <michael.osthege@outlook.com>
@michaelosthege
Member

Thanks @michaelosthege. The failing tests are a known issue. I'll try to fix them after, but need not block this PR.

😅 I just fixed them. Will do another PR then ;)

ricardoV94 pushed a commit that referenced this pull request Mar 8, 2026
…from trace (#8015)

Co-authored-by: Michael Osthege <michael.osthege@outlook.com>
@ricardoV94 ricardoV94 mentioned this pull request Mar 27, 2026


Development

Successfully merging this pull request may close these issues.

BUG: CategoricalGibbsMetropolis doesn't respect the tune parameter

5 participants