Print inline draws in falsifying example #4716

Open
DRMacIver wants to merge 23 commits into master from DRMacIver/deferred-pretty-printing
@DRMacIver

Member

If Hegel has taught us anything it's that inline draws are awesome, and that it's a shame that the UX for them in Hypothesis isn't better. This makes the UX better. You now get good printing of draws as part of the falsifying example, and you can use the DataObject in @example.

from hypothesis import given, strategies as st

@given(data=st.data())
def test(data):
    x = data.draw(st.integers(), label="Something")
    ...

This will now print as:

Falsifying example: test(
    data=DataObject(draws=[
        # Something
        0,
    ]),
)

@DRMacIver DRMacIver requested a review from Liam-DeVoe April 24, 2026 14:04
@Zac-HD

Zac-HD commented Apr 24, 2026

Member

Nice! There's a similar trick I've been meaning to set up for st.functions(), representing them as a dict lookup for pure functions and something like lambda ..., __returns__=[...][::-1]: __returns__.pop() for impure functions. Just never got around to implementing it and seeing whether I think it's actually an improvement...

@DRMacIver

Member Author

> Nice! There's a similar trick I've been meaning to set up for st.functions(), representing them as a dict lookup for pure functions and something like lambda ..., __returns__=[...][::-1]: __returns__.pop() for impure functions. Just never got around to implementing it and seeing whether I think it's actually an improvement...

Yeah I think the deferred pretty-printer approach will work well for a lot of other similar cases. st.functions() and st.randoms(use_true_random=False) are both on my hit list.
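
A rough plain-Python sketch of the two representations discussed above may help picture them. The names `pure_table`, `pure_fn` and `impure_fn` and all sample values are hypothetical; this only illustrates the shape of the idea and is not what st.functions() prints today.

```python
# Hypothetical reprs for generated functions; every value here is made up.

# Pure function: the same arguments always give the same result,
# so a dict lookup captures the whole behaviour.
pure_table = {(0,): "a", (1,): "b"}
pure_fn = lambda *args: pure_table[args]

# Impure function: results depend on call order. The default argument is
# built once and reversed, so pop() hands values back in the original order.
impure_fn = lambda *args, __returns__=[10, 20, 30][::-1]: __returns__.pop()

pure_results = [pure_fn(0), pure_fn(1), pure_fn(0)]
impure_results = [impure_fn(), impure_fn("ignored"), impure_fn()]
```

The reversed list is what lets a cheap `pop()` from the end return values in the order they were originally produced.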

@Liam-DeVoe Liam-DeVoe left a comment

Member

  • Can we split this into two PRs, one for the new pretty-printing logic and one for the new ability to specify interactive draws on @example? The release notes in this PR bury the lede, because the latter is substantially more impactful to the user API than the former
  • For the @example-PR: DataObject is now part of the public API, and we should update the docs accordingly

Comment on lines +569 to +571

def finalize(self) -> None:
    """Replay all outstanding deferreds created on this printer and

Member

I don't love this name, because Python already uses the term "finalize" for GC: https://docs.python.org/3/library/weakref.html#weakref.finalize

Member Author

Yeah, good point. I'll rename it.

Comment thread hypothesis-python/tests/common/utils.py Outdated
Comment on lines +373 to +377
"""
import functools
import inspect

from hypothesis import given

Member

move imports to top level

Member Author

Ugh I'm going to add a lint and format-refactor for this.

Comment on lines 50 to 55
     with raises(AssertionError) as err:
         test()
-    assert "Draw 1 (Some numbers): [0, 0]" in err.value.__notes__
-    assert "Draw 2 (A number): 0" in err.value.__notes__
+    notes = "\n".join(err.value.__notes__)
+    assert "# Some numbers" in notes
+    assert "# A number" in notes

Member

can we change this to assert a multiline regex? This loses the assertion of the minimal values
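
One way such a multiline-regex assertion could look is sketched below. The `notes` string is a made-up stand-in modelled on the output format shown in the PR description; the exact whitespace and surrounding lines in the real test are an assumption.

```python
import re

# Stand-in for "\n".join(err.value.__notes__); the exact layout is assumed.
notes = """Falsifying example: test(
    data=DataObject(draws=[
        # Some numbers
        [0, 0],
        # A number
        0,
    ]),
)"""

# Each label comment must be followed, on the next line, by its minimal value.
pattern = re.compile(
    r"^\s*# Some numbers\n\s*\[0, 0\],\n\s*# A number\n\s*0,$",
    re.MULTILINE,
)
match = pattern.search(notes)
```

Unlike the two substring checks, this pattern fails if the shrunk values change, which restores the minimal-value assertion.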

@DRMacIver DRMacIver force-pushed the DRMacIver/deferred-pretty-printing branch from cbd2311 to 41c122f Compare April 27, 2026 12:12
@DRMacIver

Member Author
>   • Can we split this into two PRs, one for the new pretty-printing logic and one for the new ability to specify interactive draws on @example? The release notes in this PR bury the lede, because the latter is substantially more impactful to the user API than the former

I can definitely improve the release notes to not bury the lede, but I think the answer is... not really, in any sensible way. Literally the only change the pretty-print logic is there to enable is that you can print st.data() draws like this, and it's sort of crazy to add that printing and have it not work.

@Liam-DeVoe

Liam-DeVoe commented Apr 30, 2026

Member

OK, let's definitely make the ability to use @example with interactive draws the headlining feature of this release, not the changed output format. And let's add some tests for this new capability of @example.


We'll need to figure out the right way to expose DataObject here. I don't want to expose it as-is in this PR, with the data parameter and with many undocumented attributes—count, conjecture_data, draws, etc. Thoughts:

  • make all of those underscore-private
  • split DataObject in two: publicly-constructed and privately-constructed

@DRMacIver DRMacIver force-pushed the DRMacIver/deferred-pretty-printing branch from 41c122f to 0e24595 Compare May 20, 2026 12:38
@DRMacIver DRMacIver requested a review from Liam-DeVoe May 20, 2026 14:21
DRMacIver and others added 21 commits May 30, 2026 21:19
deferred() returns a new printer whose output will be inserted at the
position deferred() was called once finalize() is invoked. Primitive
calls are recorded as concrete operations so the recording is unaffected
by later mutation of pretty-printed objects.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

finalize() is now called on the parent printer (the one deferred() was
called on) rather than on the returned deferred printer. Any deferred
printer - nested deferreds included - raises on further use after its
parent is finalized, while new deferreds can still be created on the
parent afterwards.

Remove the pre-deferred buffer flush so that line-wrap decisions made
by the parent printer are not forced prematurely by the act of calling
deferred(), aligning replayed output with what the same sequence of
primitive calls would produce without deferral.

Add a stateful test that fuzzes printing programs through both a direct
printer and a printer driven via deferred()/finalize(), asserting their
outputs agree.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
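
The deferred()/finalize() behaviour described in the two commit messages above can be pictured with a toy model. This is only a sketch: `ToyPrinter` and `ToyDeferred` are hypothetical classes, not Hypothesis's RepresentationPrinter, and they ignore line wrapping and nested deferreds entirely.

```python
class ToyDeferred:
    """Records text now; the parent decides later where it appears."""

    def __init__(self):
        self._recording = []
        self._dead = False

    def text(self, value):
        if self._dead:
            raise RuntimeError("deferred printer used after finalize()")
        # Store the concrete string immediately, so later mutation of
        # `value` cannot change what gets printed.
        self._recording.append(str(value))


class ToyPrinter:
    def __init__(self):
        self._parts = []  # plain strings and ToyDeferred placeholders

    def text(self, value):
        self._parts.append(str(value))

    def deferred(self):
        child = ToyDeferred()
        self._parts.append(child)  # marks the insertion position
        return child

    def finalize(self):
        resolved = []
        for part in self._parts:
            if isinstance(part, ToyDeferred):
                part._dead = True
                resolved.append("".join(part._recording))
            else:
                resolved.append(part)
        self._parts = resolved
        return "".join(resolved)


printer = ToyPrinter()
printer.text("DataObject(draws=[")
slot = printer.deferred()
printer.text("])")  # written before any draw has happened
slot.text(0)
slot.text(", ")
slot.text([0])
output = printer.finalize()
```

In this toy, finalize() is called on the parent and kills the outstanding deferreds, mirroring the ownership rule described above.
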
Add a `draws=` parameter to DataObject (with the ConjectureData parameter
now optional, exactly one of the two required). In replay mode, each
draw() call reads the next value off the pre-recorded draws list.

DataObject._repr_pretty_ now prints "DataObject(draws=[", opens a
deferred printer on the parent, stores it as a class-level attribute
`DataObject.printer`, and closes with "])". Each subsequent draw() call
records a snapshot of the drawn value onto that deferred. The test
runner finalizes the parent printer after the test body returns so the
recorded draws are spliced into the reported output.

Result: when a test using `st.data()` fails, the falsifying example now
shows `DataObject(draws=[0, [0]])` with the actual values drawn,
rather than the opaque `data(...)` placeholder. Because draws are
pretty-printed at draw time, mutations made to drawn values after
data.draw() returns do not appear in the output.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
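
As a minimal sketch of replay mode and draw-time snapshots, assuming nothing about the real implementation: `ReplayData` below is a hypothetical stand-in for DataObject, and raising IndexError when the list runs out is an assumption, not documented behaviour.

```python
class ReplayData:
    """Toy model of replay mode; not the real DataObject."""

    def __init__(self, draws):
        self._draws = list(draws)
        self._index = 0
        self.rendered = []  # repr snapshots taken at draw time

    def draw(self, strategy=None, label=None):
        # The strategy is ignored: the next recorded value is returned as-is.
        if self._index >= len(self._draws):
            raise IndexError("test drew more values than were recorded")
        value = self._draws[self._index]
        self._index += 1
        self.rendered.append(repr(value))  # snapshot before the test can mutate it
        return value


data = ReplayData(draws=[0, [0]])
first = data.draw()
second = data.draw()
second.append(99)  # mutation after draw() returns
summary = f"DataObject(draws=[{', '.join(data.rendered)}])"
```

Because the repr is captured inside draw(), the later `append` changes `second` but not the rendered summary.
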
…notes

- Move the deferred printer from a class attribute to an instance
  attribute. This removes the shared-state hack and the corresponding
  reset in core.py.
- Render labeled draws with the label as a comment on the preceding line,
  e.g. ``# Cool thing`` above the value. Each draw is always emitted on
  its own line using `break_()` so the surrounding indentation (e.g. from
  ``repr_call``) is respected.
- Drop the per-draw ``Draw N: value`` notes, both the in-test ``note()``
  call and the observability tail that iterated
  ``data._observability_args``. The same information now appears inline in
  ``DataObject(draws=[...])``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pin the exact rendered form for several draw scenarios - empty, single
unlabeled, multiple unlabeled, labeled, all-labeled, mixed, nested value,
alongside other args, and the two-st.data()-args case - so that any
changes to indentation, comma placement, or label formatting have to be
made deliberately by regenerating the snapshot.

The two_data_args snapshot captures an existing quirk: because
DataStrategy.do_draw memoizes its DataObject on the underlying
ConjectureData, ``@given(st.data(), st.data())`` yields the same
DataObject instance for both args, so both draws end up attributed to
the second printer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

``@snapshot_given(*strategies)`` now decorates the test body directly.
It builds the corresponding Hypothesis property test (always forcing a
failure) and returns a pytest test function taking the ``snapshot``
fixture that asserts the captured falsifying-example output equals the
snapshot value. The decorated function's own name is used for the test
name, so the "Falsifying example: <name>(" line matches the test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Snapshot tests go alongside the other snapshot-based tests under
``tests/snapshots/`` and use the shared ``SNAPSHOT_SETTINGS`` +
``run_test_for_falsifying_example`` helpers. The
``test_combinators.py::test_data_draw`` snapshot is also regenerated to
reflect the new ``DataObject(draws=[...])`` rendering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- Each draw now registers its choice-node range on
  ``conjecture_data.arg_slices``, so the explain phase varies it and
  populates ``slice_comments`` for it.  When the deferred printer
  records the drawn value, we check the printer's ``slice_comments``
  for that range and, if present, emit the comment next to the value
  (matching the top-level ``repr_call`` annotation style, e.g.
  ``0,  # or any other generated value``).

- Add a re-entrancy guard (``_pretty_printing_draw``): when ``draw()``
  calls ``printer.pretty(result)`` on a value that happens to be the
  DataObject itself, the re-entrant ``_repr_pretty_`` sees the flag and
  emits ``DataObject(...)`` rather than trying to open another deferred
  (which previously produced garbled output).

- Switch the snapshot-test ``snapshot_given`` helper to
  ``EXPLAIN_SETTINGS`` so the new explain annotations are exercised in
  the snapshots; update all snapshots accordingly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The ``snapshot_given`` decorator is a general-purpose helper for any
snapshot test that wants to assert on the falsifying-example output of
a Hypothesis property test, so it belongs in
``tests/snapshots/conftest.py`` alongside ``SNAPSHOT_SETTINGS`` /
``EXPLAIN_SETTINGS`` rather than inline in ``test_data_object.py``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ls.py

``SNAPSHOT_SETTINGS``, ``EXPLAIN_SETTINGS`` and ``snapshot_given`` now
live alongside ``run_test_for_falsifying_example`` in
``tests/common/utils.py``, which is where the rest of the shared test
utilities already live. ``tests/snapshots/conftest.py`` no longer has
any content and is removed; the other ``tests/snapshots/test_*.py``
modules are updated to import from ``tests.common.utils``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

``_DeferredPrinter`` now copies its parent's ``stack`` at creation time.
That way, when a ``data.draw(st.just(data))`` later re-enters
``pretty(data)`` through the deferred, the pretty-printer's normal cycle
mechanism sees ``id(data)`` already in the inherited stack, sets
``cycle=True``, and ``DataObject._repr_pretty_`` can bail to
``DataObject(...)`` with no ad-hoc instance/class flags.

The mutual-recursion test is kept as a snapshot so the behaviour in
that case is pinned rather than hidden.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
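
The stack-inheritance idea in the commit message above can be sketched with a toy printer. `ToyPrinter` and `ToyData` are hypothetical names, and the sketch only models the cycle check; it does not splice deferred output back into the parent.

```python
class ToyPrinter:
    def __init__(self, stack=None):
        self.stack = list(stack or [])  # ids of objects currently being printed
        self.out = []

    def pretty(self, obj):
        cycle = id(obj) in self.stack
        self.stack.append(id(obj))
        try:
            if isinstance(obj, ToyData):
                obj.render(self, cycle)
            else:
                self.out.append(repr(obj))
        finally:
            self.stack.pop()

    def deferred(self):
        # Copy the stack as it is *now*, while the caller is still on it.
        return ToyPrinter(stack=self.stack)


class ToyData:
    def __init__(self):
        self.printer = None

    def render(self, p, cycle):
        if cycle:
            p.out.append("DataObject(...)")  # bail out instead of recursing
            return
        p.out.append("DataObject(draws=[")
        self.printer = p.deferred()
        p.out.append("])")

    def draw(self, value):
        self.printer.pretty(value)
        return value


data = ToyData()
top = ToyPrinter()
top.pretty(data)   # the outer call pops id(data) when it returns...
data.draw(data)    # ...but the deferred's copied stack still holds it
data.draw(5)
```

The copy taken in `deferred()` is what keeps `id(data)` visible after the outer `pretty()` call has returned.
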
Stack inheritance alone catches self-reference (``data.draw(st.just(data))``)
- the inherited stack still contains ``id(data)`` when the draw later
pretty-prints it, so the pretty-printer sets ``cycle=True``.

Mutual recursion (``d1.draw(st.just(d2)); d2.draw(st.just(d1))`` with
two distinct DataObjects) isn't caught by stack inheritance, because by
the time ``d1.draw`` pretty-prints ``d2``, the outer ``pretty(d2)`` has
already popped its entry. The result was that ``d2``'s top-level
rendering got emptied out while its content was nested inside ``d1``.

``RepresentationPrinter`` now exposes a ``root`` attribute (itself for
top-level printers, inherited for deferreds), and
``DataObject._repr_pretty_`` treats "my live printer shares its root
with the caller ``p``" as a cycle too - so the second pretty of d2
bails to ``DataObject(...)`` and d2's top-level draws list is
preserved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Wrap the deferred in ``p.group(4)`` so that ``break_()`` calls recorded
inside it emit at ``parent_indent + 4``, and drop the literal
``text("    ")`` spaces that previously did the indenting. That literal
was fixed-width and didn't compose when a DataObject was drawn as a
value inside another DataObject's draws list - e.g.
``test_data_from_data`` was rendering as

    d1=DataObject(draws=[
        DataObject(draws=[
        0,  # or any other generated value
    ]),
        1,
    ]),

with the inner ``0`` and ``])`` visually misaligned.  It now renders as

    d1=DataObject(draws=[
        DataObject(draws=[
            0,  # or any other generated value
        ]),
        1,
    ]),

Also drop the trailing ``break_()`` from each draw - it was producing a
double-newline against the closing ``break_()`` emitted before ``])``.
Closing break is kept so ``])`` sits on its own line at the outer
indent.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
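
To illustrate why indentation relative to the parent composes where a fixed four-space literal does not, here is a small sketch. The `render` function is hypothetical and uses a nested list as a stand-in for a nested DataObject; it is not the pretty-printer's `group()` mechanism and it omits the explain-phase comments.

```python
def render(draws, indent=0):
    """Render a draws list with each level indented relative to its parent."""
    lines = ["DataObject(draws=["]
    for value in draws:
        if isinstance(value, list):
            # Toy convention: a nested list stands in for a nested DataObject.
            rendered = render(value, indent + 4)
        else:
            rendered = repr(value)
        lines.append(" " * (indent + 4) + rendered + ",")
    lines.append(" " * indent + "])")
    return "\n".join(lines)


nested = "    d1=" + render([[0], 1], indent=4) + ","
```

Passing `indent + 4` down the recursion plays the role that wrapping the deferred in a group plays in the commit: the inner `0` and `])` are placed relative to the enclosing level rather than at a fixed column.
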
Verify that ``@example(DataObject(draws=[...]))`` plus ``@given(st.data())``:

- Feeds the drawn values to the test body in order, ignoring the
  ``strategy`` argument (mixed types are fine).
- Accepts an empty draws list for tests that don't call ``data.draw``.
- Works with multiple ``@example`` decorators, each with their own
  draws.
- Renders the example's drawn values in the falsifying-example output
  ("Falsifying explicit example: ...") just like regular runs.
- Preserves labels: labelled draws show up as ``# label`` comments
  above the value in the rendering.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

- ``pretty._replay_calls``: add ``assert child._recording is not None``
  so mypy accepts it as ``list[...]`` rather than ``list[...] | None``.
- ``test_custom_reprs::test_reprs_as_created_interactive``: regenerate
  snapshot (was the pre-feature ``data=data(...)\nDraw 1: Bar(10)`` form).
- ``test_provider::test_realization_with_verbosity_draw`` /
  ``test_realization_with_observability``: update expected strings - the
  per-draw ``Draw N: value`` notes are no longer emitted (their info is
  inline in ``DataObject(draws=[...])``); the verbosity test now just
  asserts the symbolic marker is present.
- ``test_data_object_pretty`` falsifying-example regexes: with the
  ``explain`` phase on (the CI default for these tests), each draw's
  line has an inline ``# or any other generated value`` comment that
  broke the tight ``\s*,\s*`` patterns. Extract the ``draws=[...]``
  section first, then check the values appear in order.
- Re-run ``shed`` for the formatting nits the check-format job flagged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The format and lint jobs on CI run a newer shed/ruff than my local
toolchain; they flag adjacent string-literal concatenations like
``"foo" "bar"`` (20 occurrences in the previous commit) and want them
merged into single strings. Merged the literals and dropped a spurious
blank line in ``test_pretty_deferred_stateful.py``.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The ``if printer is not None and printer._dead:`` branch in ``draw()``
clears a stale deferred-printer reference left over from a previous
printing session (e.g. the printer was finalized externally between
the pretty-print and the first subsequent ``data.draw(...)`` call).
CI's coverage check flagged lines 2401-2402 as uncovered. Add a test
that exercises that exact sequence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

… API

- Rename RepresentationPrinter.finalize() to resolve() to avoid
  collision with Python's weakref.finalize terminology.
- Move inline imports to top level in tests/common/utils.py.
- Restore minimal-value assertions in test_prints_labels_if_given_on_failure
  using multiline regex.
- Split DataObject into a public class (draws= only, no internal fields)
  and an internal _DataObject subclass that carries conjecture_data and
  engine plumbing. Re-export DataObject from top-level hypothesis.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

DRMacIver and others added 2 commits May 30, 2026 21:26
_DataObject overrides draw() completely and never reads _draws,
so there's no need to initialize it. This avoids a type conflict
with the parent class where _draws is typed as list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…pshot

- Use :class:`~hypothesis.strategies.DataObject` in RELEASE.rst so
  Sphinx can resolve the reference.
- Add test_fuzz_DataObject to the ghostwriter recorded output since
  DataObject is now in hypothesis.__all__.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@DRMacIver DRMacIver force-pushed the DRMacIver/deferred-pretty-printing branch from c77b61d to ebb8e3e Compare May 30, 2026 21:26
3 participants