A very small external seam around observe -> selector -> extract #2005

Rul1an · 2026-04-16T09:03:52Z

Hi,

We just shipped a small Assay-side sample built around one observe-derived selector and one selector-scoped extract result:

https://github.com/Rul1an/assay/tree/main/examples/stagehand-selector-scoped-extract-evidence

What we liked about this seam is that it stayed small and reviewable. We did not have to treat the DOM, page snapshots, or Stagehand runtime semantics as ground truth. We could stay on one selector anchor that came from observe(), then one bounded structured result from extract() scoped to that selector.

That felt like a pretty nice external-consumer seam, especially with the selector-scoping direction already showing up in the docs and repo conversations.

We also tried to keep ourselves honest. The sample is runtime-backed from a local probe, but we are not claiming the full provider-live path is proven yet. A local ollama/llama3.2:3b probe fell over on the structured observe() prompt, so we kept that gap explicit instead of smoothing it away.

Main question from our side: if someone wants the smallest honest Stagehand surface for an external evidence consumer, does observe-derived selector -> selector-scoped extract feel like the right place to start, or is there an even thinner official seam you would rather point them at?

Would love your gut read. This lane feels close, but we are very happy to tighten it further if there is a more natural boundary upstream.

aaronlab · 2026-05-10T19:45:10Z

This feels like the right first seam if the external consumer needs evidence/provenance rather than replay semantics.

The useful part of observe -> selector -> selector-scoped extract is that it gives you a bounded chain:

what Stagehand was asked to find;
which observed selector/anchor was chosen;
what structured result was extracted from that bounded target;
where the chain failed if the provider, prompt, selector, or schema did not hold.
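As a rough illustration of that chain, here is a minimal Python sketch of one evidence record. The class, field, and stage names (`EvidenceChain`, `failed_at`, `STAGES`) are hypothetical and do not come from Stagehand or Assay; the sketch only shows the four links and where a failure would be recorded.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stage names for illustration; not Stagehand or Assay identifiers.
STAGES = ("provider", "prompt", "selector", "schema")


@dataclass(frozen=True)
class EvidenceChain:
    observe_instruction: str          # what Stagehand was asked to find
    selector: Optional[str]           # which observed selector/anchor was chosen
    extract_result: Optional[dict]    # structured result from that bounded target
    failed_at: Optional[str] = None   # which link failed, if any

    def __post_init__(self) -> None:
        # A failure must name one of the known stages.
        if self.failed_at is not None and self.failed_at not in STAGES:
            raise ValueError(f"unknown failure stage: {self.failed_at!r}")
        # A chain without a failure must carry both the anchor and the result.
        if self.failed_at is None and (
            self.selector is None or self.extract_result is None
        ):
            raise ValueError("a successful chain needs a selector and a result")

    @property
    def ok(self) -> bool:
        return self.failed_at is None
```

A failed observe step would then be recorded as, for example, `EvidenceChain("find the price element", None, None, failed_at="prompt")`, which keeps the gap explicit instead of dropping the record.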

I would probably not make it thinner by storing only the final structured extract result, because then the consumer loses the provenance that makes the evidence reviewable. I also would not make it broader by importing DOM snapshots, screenshots, or full runtime state unless the downstream use case is audit/replay rather than evidence import.

The metadata I would keep attached is small: Stagehand version, model/provider, URL or route pattern if safe to store, observe instruction, selected selector/anchor, extract schema version/hash, extraction status/error, and timestamp/run id.
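A minimal sketch of that metadata as a flat record, assuming the extract schema is available as a JSON-serializable dict. The names `EvidenceMetadata` and `schema_hash` are hypothetical, and hashing sorted-key JSON with SHA-256 is one possible choice here, not something Stagehand or Assay prescribes.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def schema_hash(schema: dict) -> str:
    """Hash a JSON-serializable schema; sorted keys make the hash order-independent."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EvidenceMetadata:
    stagehand_version: str
    model_provider: str
    route_pattern: Optional[str]   # None when the URL is not safe to store
    observe_instruction: str
    selector: Optional[str]        # None when observe did not yield an anchor
    extract_schema_hash: str
    status: str                    # assumed values: "ok" or "error"
    error: Optional[str]
    timestamp: str                 # ISO 8601 string
    run_id: str
```

Sorted-key serialization is only a lightweight canonical form. If schemas are produced by different tools, a stricter canonicalization may be needed before the hashes are comparable.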

For BrowserTrace's Stagehand adapter work I have been thinking in terms of preserving call boundaries (act / extract / observe) plus outputs and errors, because failure debugging needs provenance. Your seam is a lower-noise version of the same idea: observation anchor -> scoped evidence result. That seems like a good external-consumer boundary to test before widening it.

Rul1an · 2026-05-10T19:57:41Z

Thanks, this is a really useful read.

The "final extract alone is too thin, full runtime state is too broad" framing is exactly the line we were trying to hold. The provenance chain is the valuable part here, not just the extracted JSON.

I’ll keep the first Assay-side boundary at:

observe instruction -> selected selector/anchor -> selector-scoped extract result -> status/error

And keep the attached metadata small: Stagehand version, model/provider, safe route or URL pattern, schema hash/version, timestamp/run id.

The BrowserTrace comparison is helpful too. Preserving call boundaries for debugging makes sense; for this lane I’ll keep the evidence import as the lower-noise version of that, and only widen if the use case becomes audit/replay rather than bounded evidence.
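One way to hold that boundary in code is an allowlist reducer. This is only a sketch: the field names and the `reduce_evidence` function are hypothetical, and rejecting out-of-boundary fields (rather than silently dropping them) is a design assumption, not current Assay behavior.

```python
# Hypothetical allowlist for the bounded evidence import.
ALLOWED_FIELDS = frozenset({
    "observe_instruction", "selector", "extract_result", "status", "error",
    "stagehand_version", "model_provider", "route_pattern",
    "schema_hash", "timestamp", "run_id",
})


def reduce_evidence(raw: dict) -> dict:
    """Return the record unchanged in content, or raise if it is wider than the boundary."""
    extra = sorted(set(raw) - ALLOWED_FIELDS)
    if extra:
        # Failing loudly keeps DOM snapshots or screenshots from leaking in unnoticed.
        raise ValueError(f"fields outside the evidence boundary: {extra}")
    return {key: raw[key] for key in sorted(raw)}
```

With this shape, widening toward audit/replay would mean adding a separate reducer rather than growing the allowlist.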

Appreciate the careful gut check.

aaronlab May 10, 2026

That shape looks good. One guardrail I would keep explicit: tie the selected selector/anchor and the extraction schema hash/version to the same observation attempt or run id. That way a later consumer cannot accidentally compare an extract result from one anchor with provenance from another.
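A minimal sketch of that guardrail, assuming both the anchor and the extract result carry the run id they were produced under. `Anchor`, `ExtractResult`, and `pair_evidence` are hypothetical names used for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    run_id: str
    selector: str


@dataclass(frozen=True)
class ExtractResult:
    run_id: str
    schema_hash: str
    data: dict


def pair_evidence(anchor: Anchor, result: ExtractResult) -> dict:
    """Join an anchor and an extract result only when they share a run id."""
    if anchor.run_id != result.run_id:
        raise ValueError(
            f"run id mismatch: anchor {anchor.run_id!r} vs extract {result.run_id!r}"
        )
    return {
        "run_id": anchor.run_id,
        "selector": anchor.selector,
        "schema_hash": result.schema_hash,
        "data": result.data,
    }
```

Doing the check at join time means a consumer never holds a combined record whose provenance and result came from different attempts.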

If the use case widens toward audit/replay later, I would keep that as a second boundary rather than promoting screenshots, DOM, or full runtime state into this evidence reducer.
