
Integration: Add langextract as a local visual trace layer (observability HTML viewer + CLI) #1412

@MervinPraison

Description


Overview

Add first-class integration with google/langextract so that a user can run a PraisonAI workflow and view the full execution — agent boundaries, LLM turns, tool calls, final output — as an interactive, self-contained HTML visualization that highlights every step grounded in the source prompt/input text.

This is the visualization analogue of the n8n integration (praisonai n8n open) we just shipped. Where n8n gave us a visual editor for workflows, langextract will give us a visual viewer for workflow executions — a zero-server, zero-sign-up way to inspect what an agent actually did, grounded in the exact spans of input it reasoned about.

Primary UX target:

# Run any YAML workflow with langextract observability
praisonai agents.yaml --observe langextract
# -> produces trace.jsonl + trace.html, opens browser to HTML viewer

# Or render an existing trace/session to HTML
praisonai langextract view trace.jsonl --open

Python API target:

from praisonaiagents import Agent
from praisonai.observability import LangextractSink, LangextractSinkConfig

agent = Agent(name="researcher", instructions="Summarize text")
with LangextractSink(config=LangextractSinkConfig(output_path="run.html")):
    agent.start("Long input document ...")
# run.jsonl + run.html written; open run.html to explore the trace

Background

What is langextract?

"A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization." β€” google/langextract README

Key properties that make it ideal as PraisonAI's visualization layer:

  1. Precise source grounding — every extraction maps to an exact char_interval of the source text.
  2. Self-contained interactive HTML — lx.visualize("extractions.jsonl") produces a single HTML file with highlights, a timeline, and filters. No server, no sign-up.
  3. Stable data model — lx.data.Extraction(extraction_class, extraction_text, attributes, char_interval) + lx.data.AnnotatedDocument is small and round-trippable.
  4. Apache-2.0 — compatible with PraisonAI licensing.
  5. Optional dependency — only activated when the user opts in.
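
The round-trippable data model (property 3) can be illustrated with plain dicts in a JSONL-like shape. This is a sketch, not the real schema: the field names mirror the lx.data classes above, but the exact on-disk format is owned by langextract.

```python
# Minimal sketch of the round-trippable data model using plain dicts.
# Field names mirror lx.data.Extraction / lx.data.AnnotatedDocument;
# the real on-disk JSONL schema is langextract's, so treat this as illustrative.
import json

extraction = {
    "extraction_class": "tool_call",
    "extraction_text": "web_search",
    "attributes": {"agent_name": "researcher"},
    "char_interval": None,  # ungrounded events carry no source offsets
}
doc = {"document_id": "praisonai-run", "text": "input ...", "extractions": [extraction]}

line = json.dumps(doc)  # one AnnotatedDocument per JSONL line
assert json.loads(line)["extractions"][0]["extraction_text"] == "web_search"
```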

Why this is valuable

  • Zero-config review UX — today, to understand what an agent did, the user has to trawl terminal logs, Langfuse cloud, or enable verbose mode. A single local HTML file is dramatically simpler.
  • Grounding debugging — for agents that read large inputs (docs, web pages, transcripts), seeing which spans the agent actually used vs. hallucinated is the #1 debugging ask. langextract renders this natively.
  • Shareable, offline — the HTML is self-contained; drop it into Slack or a PR for review.
  • Complements — does not replace — Langfuse: Langfuse is cloud SaaS + production tracing; langextract is local-file, run-time review.
  • Matches the n8n pattern — "one command, external UI becomes your eyes on the workflow" is proven user-friendly.

Current ecosystem state

Today users who want visual trace review must either:

  • Use --observe langfuse (requires cloud sign-up, API keys, internet).
  • Write a custom notebook that wires Agent events into a plotting library.
  • Use output="actions" which only prints to the terminal.

langextract closes the "local-only, zero-install-extra-infra, self-contained HTML review" gap.


Architecture Analysis

Current Implementation

Existing trace / observability infrastructure (ready for reuse — DRY):

| Component | Path | Role |
|---|---|---|
| ActionEvent / ActionEventType | src/praisonai-agents/praisonaiagents/trace/protocol.py | Canonical event schema (AGENT_START/END, TOOL_START/END, ERROR, OUTPUT) |
| TraceSinkProtocol | src/praisonai-agents/praisonaiagents/trace/protocol.py | Pluggable sink interface (emit, flush, close) |
| ContextTraceEmitter + get_context_emitter() | src/praisonai-agents/praisonaiagents/trace/context_events.py | Already wired into Agent.chat/Agent.start — emits agent_start, agent_end, llm_response, tool_start, tool_end |
| LangfuseSink | src/praisonai/praisonai/observability/langfuse.py | Reference implementation of a TraceSinkProtocol adapter — we copy this pattern |
| --observe flag | src/praisonai/praisonai/cli/app.py:124-153 | Already parses the provider name; currently only accepts langfuse. Extend with langextract. |
| _setup_langfuse_observability() | src/praisonai/praisonai/cli/app.py:153 | Pattern for how a sink is wired up at CLI entry — copy for _setup_langextract_observability |

Zero existing langextract integration in any of:

  • src/praisonai-agents/praisonaiagents/tools/ (no langextract_tools.py)
  • src/praisonai/praisonai/observability/ (only langfuse.py)
  • examples/python/tools/ (only the user's unofficial ~/test/langextract/app.py)
  • PraisonAI-tools/ (nothing)
  • PraisonAIDocs/docs/observability/ (not covered)

So this is a greenfield addition that slots cleanly into existing extension points — no protocol changes, no breaking changes.

Key File Locations

| File | Purpose | Why it matters here |
|---|---|---|
| src/praisonai-agents/praisonaiagents/trace/protocol.py | ActionEvent, TraceSinkProtocol, NoOpSink, ListSink | Our new adapter implements TraceSinkProtocol. Zero changes here. |
| src/praisonai-agents/praisonaiagents/trace/context_events.py | ContextTraceEmitter, ContextEvent (with richer LLM token / content fields) | Source of the events we map to lx.data.Extraction. Zero changes here. |
| src/praisonai/praisonai/observability/langfuse.py (306 lines) | Reference TraceSinkProtocol adapter | Template for the new LangextractSink. |
| src/praisonai/praisonai/observability/__init__.py | Lazy-loaded exports | Register LangextractSink / LangextractSinkConfig here. |
| src/praisonai/praisonai/cli/app.py:124-153 | --observe typer option + _setup_*_observability | Extend to accept langextract and wire up the sink. |
| src/praisonai/praisonai/cli/commands/ | Per-feature CLI sub-apps | New langextract.py sub-app with view, render, open commands. |
| src/praisonai-agents/praisonaiagents/tools/ | Built-in tools | New optional langextract_tools.py wrapping lx.extract as a callable tool for agents. |

Data-flow (proposed)

Agent.start(input_text)
  └─ ContextTraceEmitter.agent_start/llm_response/tool_start/tool_end/agent_end
       └─ LangextractSink.emit(ActionEvent)          ← NEW
            └─ maps event → lx.data.Extraction
                 · extraction_class  = "agent" | "llm_turn" | "tool_call" | "final_output" | "error"
                 · extraction_text   = verbatim span from input_text (or from agent output for output classes)
                 · char_interval     = exact offsets (None for non-grounded events)
                 · attributes        = {agent_name, tool_name, duration_ms, tokens, finish_reason, status}
            └─ buffered list of Extractions
       └─ on agent_end / close():
            · build lx.data.AnnotatedDocument(text=input_text, extractions=[...])
            · lx.io.save_annotated_documents([doc], output_name=path.jsonl)
            · html = lx.visualize(path.jsonl)
            · write path.html, optionally webbrowser.open()
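
The grounding step in the flow above can be sketched as a pure helper. The name ground_span is illustrative (not an existing API): spans found verbatim in the source get exact offsets, everything else stays ungrounded with char_interval=None.

```python
# Sketch of the grounding decision in the proposed event -> Extraction mapping.
# `ground_span` is a hypothetical helper name, not a PraisonAI or langextract API.
from typing import Optional, Tuple

def ground_span(source: str, span: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) character offsets of `span` in `source`, or None
    when the span is not a verbatim substring (tool names, errors, etc.)."""
    if not span:
        return None
    start = source.find(span)
    if start == -1:
        return None  # ungrounded: rendered in the side panel, not inline
    return (start, start + len(span))

source = "Summarize the quarterly revenue figures for ACME Corp."
assert ground_span(source, "quarterly revenue") == (14, 31)  # exact offsets
assert ground_span(source, "web_search") is None             # ungrounded event
```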

Gap Analysis

| Area | Current State | Gap | Severity | Placement |
|---|---|---|---|---|
| Observability adapter | Only LangfuseSink exists | No LangextractSink implementing TraceSinkProtocol | Critical (blocks feature) | praisonai/observability/langextract.py (wrapper) |
| Event→Extraction mapping | No mapper; ContextEvent has rich fields already | Need a pure function that grounds events against input text and produces an lx.data.Extraction list | Critical | praisonai/observability/langextract.py |
| CLI observe provider | --observe langfuse only; hard-coded check at cli/app.py:151 | Need langextract accepted; need _setup_langextract_observability() | High | praisonai/cli/app.py |
| CLI sub-command | No langextract command | Need praisonai langextract view, render, open for ad-hoc + session rendering | High | praisonai/cli/commands/langextract.py |
| Built-in tool | tavily_tools.py, agentql examples — nothing for langextract | Optional langextract_tools.py exposing lx.extract as a tool (so agents can use langextract, not only be visualized by it) | Medium | praisonaiagents/tools/langextract_tools.py |
| Python export | No public symbols | Export LangextractSink, LangextractSinkConfig via lazy __getattr__ in praisonai/observability/__init__.py | High | Lazy loader already exists; just add two entries |
| Packaging | langextract not in any extras | Add [langextract] extra to src/praisonai/pyproject.toml optional-dependencies table | High | pyproject.toml |
| Docs | No page | Add PraisonAIDocs/docs/observability/langextract.mdx with quick-start, API, CLI, examples | Medium | Docs repo |
| Examples | Only the user's unofficial ~/test/langextract/app.py | Add examples/python/observability/langextract_basic.py and langextract_with_tools.py | Medium | examples/python/observability/ |
| Tests | None | Unit tests for the event→Extraction mapper + sink + CLI wiring; a real agentic smoke test that runs an Agent and asserts HTML + JSONL are produced | High | src/praisonai/tests/test_langextract_integration.py |
| Perf (import time) | — | langextract is heavy; must be a lazy import gated behind the extra | High | Implementation discipline |

No existing test, example, doc, or CLI contract is broken by any of the above — this is strictly additive.


Proposed Implementation

Phase 1: Minimal (MVP) — LangextractSink adapter + --observe langextract

  1. Add the adapter — src/praisonai/praisonai/observability/langextract.py:
"""
Langextract TraceSinkProtocol Implementation for PraisonAI.

Provides LangextractSink adapter that implements TraceSinkProtocol from the core SDK,
producing self-contained interactive HTML visualizations of agent runs grounded in
the original input text.

Architecture:
- Core SDK (praisonaiagents): Defines TraceSinkProtocol (unchanged)
- Wrapper (praisonai): Implements LangextractSink adapter (this file)
- Pattern: Protocol-driven design per AGENTS.md §4.1 — mirrors LangfuseSink
"""

from __future__ import annotations
import os
import threading
import webbrowser
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Optional

from praisonaiagents.trace.protocol import (
    ActionEvent,
    ActionEventType,
    TraceSinkProtocol,
)


@dataclass
class LangextractSinkConfig:
    """Configuration for the langextract trace sink."""
    output_path: str = "praisonai-trace.html"
    jsonl_path: Optional[str] = None           # derived from output_path if None
    document_id: str = "praisonai-run"
    auto_open: bool = False                     # open HTML in browser on close()
    include_llm_content: bool = True            # include response text in attributes
    include_tool_args: bool = True
    enabled: bool = True


class LangextractSink:
    """
    Implements `TraceSinkProtocol` by accumulating ActionEvents and, on `close()`,
    rendering them as a langextract AnnotatedDocument + interactive HTML.

    Grounding strategy:
      - We record the first AGENT_START's `metadata["input"]` as the source text.
      - OUTPUT events produce extractions grounded against the agent's output.
      - TOOL_* events produce ungrounded extractions (char_interval=None) whose
        `attributes` carry the tool name, args summary, duration, status.
      - AGENT_START/END bracket a run; we emit a single parent "agent" extraction
        spanning the whole document for overview.
    """

    __slots__ = ("_config", "_lock", "_events", "_source_text", "_closed")

    def __init__(self, config: Optional[LangextractSinkConfig] = None) -> None:
        self._config = config or LangextractSinkConfig()
        self._lock = threading.Lock()
        self._events: List[ActionEvent] = []
        self._source_text: Optional[str] = None
        self._closed = False

    # ---- TraceSinkProtocol -------------------------------------------------

    def emit(self, event: ActionEvent) -> None:
        if not self._config.enabled or self._closed:
            return
        with self._lock:
            # Capture source text from first AGENT_START
            if (
                self._source_text is None
                and event.event_type == ActionEventType.AGENT_START.value
                and event.metadata
            ):
                self._source_text = event.metadata.get("input") or ""
            self._events.append(event)

    def flush(self) -> None:
        pass  # no-op; HTML is built on close()

    def close(self) -> None:
        if self._closed:
            return
        self._closed = True
        try:
            self._render()
        except Exception as e:
            # Observability must never break the agent
            import logging
            logging.getLogger(__name__).warning("LangextractSink render failed: %s", e)

    # ---- Context manager (enables `with LangextractSink(...):`) -----------

    def __enter__(self) -> "LangextractSink":
        # Register on the global emitter for the duration of the run
        from praisonaiagents.trace import get_context_emitter
        get_context_emitter().add_sink(self)
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()

    # ---- Rendering ---------------------------------------------------------

    def _render(self) -> None:
        # Lazy import — langextract is optional
        import langextract as lx  # type: ignore

        source = self._source_text or ""
        extractions = list(self._events_to_extractions(lx, source))
        doc = lx.data.AnnotatedDocument(
            document_id=self._config.document_id,
            text=source,
            extractions=extractions,
        )

        jsonl = self._config.jsonl_path or (Path(self._config.output_path).with_suffix(".jsonl").as_posix())
        Path(jsonl).parent.mkdir(parents=True, exist_ok=True)
        lx.io.save_annotated_documents([doc], output_name=os.path.basename(jsonl), output_dir=os.path.dirname(jsonl) or ".")

        html = lx.visualize(jsonl)
        html_text = html.data if hasattr(html, "data") else html
        Path(self._config.output_path).write_text(html_text, encoding="utf-8")

        if self._config.auto_open:
            webbrowser.open(f"file://{Path(self._config.output_path).resolve()}")

    def _events_to_extractions(self, lx, source: str):
        """Pure mapper: ActionEvent list -> lx.data.Extraction generator."""
        for ev in self._events:
            et = ev.event_type
            attrs: Dict[str, Any] = {
                "agent_name": ev.agent_name,
                "duration_ms": ev.duration_ms,
                "status": ev.status,
            }
            if et == ActionEventType.AGENT_START.value:
                yield lx.data.Extraction(
                    extraction_class="agent_run",
                    extraction_text=(source[:200] if source else ev.agent_name or "agent"),
                    attributes={**attrs, "kind": "start"},
                )
            elif et == ActionEventType.TOOL_START.value:
                yield lx.data.Extraction(
                    extraction_class="tool_call",
                    extraction_text=ev.tool_name or "tool",
                    attributes={
                        **attrs,
                        "tool_name": ev.tool_name,
                        "tool_args": ev.tool_args if self._config.include_tool_args else None,
                    },
                )
            elif et == ActionEventType.TOOL_END.value:
                yield lx.data.Extraction(
                    extraction_class="tool_result",
                    extraction_text=ev.tool_result_summary or "(empty)",
                    attributes={**attrs, "tool_name": ev.tool_name},
                )
            elif et == ActionEventType.OUTPUT.value:
                yield lx.data.Extraction(
                    extraction_class="final_output",
                    extraction_text=(ev.metadata or {}).get("content", "")[:1000],
                    attributes=attrs,
                )
            elif et == ActionEventType.ERROR.value:
                yield lx.data.Extraction(
                    extraction_class="error",
                    extraction_text=ev.error_message or "error",
                    attributes=attrs,
                )
            # AGENT_END is summary-only — skip for now; could produce a run-stats extraction
  2. Extend the CLI --observe flag — src/praisonai/praisonai/cli/app.py:
# Replace cli/app.py:150-153
if observe:
    if observe == "langfuse":
        _setup_langfuse_observability(verbose=verbose)
    elif observe == "langextract":
        _setup_langextract_observability(verbose=verbose)
    else:
        raise typer.BadParameter(
            f"Unsupported observe provider: {observe}. "
            "Choose one of: langfuse, langextract."
        )

_setup_langextract_observability() mirrors _setup_langfuse_observability(): builds a LangextractSinkConfig from env vars (PRAISONAI_LANGEXTRACT_OUTPUT, PRAISONAI_LANGEXTRACT_AUTO_OPEN), constructs LangextractSink, registers it on the global ContextTraceEmitter.
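
A sketch of that env-var parsing, using a stand-in dataclass. SinkConfigSketch and config_from_env are hypothetical names; the real function would build LangextractSinkConfig, construct the sink, and register it on the global emitter.

```python
# Hypothetical sketch of the env parsing inside _setup_langextract_observability().
# `SinkConfigSketch` stands in for LangextractSinkConfig; the env var names are
# the ones proposed above (PRAISONAI_LANGEXTRACT_OUTPUT / _AUTO_OPEN).
from dataclasses import dataclass
from typing import Mapping

@dataclass
class SinkConfigSketch:
    output_path: str = "praisonai-trace.html"
    auto_open: bool = False

def config_from_env(env: Mapping[str, str]) -> SinkConfigSketch:
    """Translate the observability env vars into a sink config."""
    return SinkConfigSketch(
        output_path=env.get("PRAISONAI_LANGEXTRACT_OUTPUT", "praisonai-trace.html"),
        auto_open=env.get("PRAISONAI_LANGEXTRACT_AUTO_OPEN", "").lower()
        in ("1", "true", "yes"),
    )

cfg = config_from_env({"PRAISONAI_LANGEXTRACT_OUTPUT": "run.html",
                       "PRAISONAI_LANGEXTRACT_AUTO_OPEN": "true"})
assert cfg.output_path == "run.html" and cfg.auto_open is True
```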

  3. Register lazy exports — append to src/praisonai/praisonai/observability/__init__.py:
# In __getattr__
elif name == "LangextractSink":
    from .langextract import LangextractSink
    return LangextractSink
elif name == "LangextractSinkConfig":
    from .langextract import LangextractSinkConfig
    return LangextractSinkConfig
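
The lazy-export mechanism relied on here is module-level __getattr__ (PEP 562). A self-contained demonstration with a synthetic in-memory module (observability_demo is a made-up name, not a real package):

```python
# Demonstrates the PEP 562 module-level __getattr__ pattern the wrapper's
# observability package uses for lazy exports. "observability_demo" is a
# synthetic module built in memory purely for illustration.
import sys
import types

mod = types.ModuleType("observability_demo")

def _module_getattr(name):
    if name == "LangextractSink":
        # A real package would do: from .langextract import LangextractSink
        return type("LangextractSink", (), {})
    raise AttributeError(name)

mod.__getattr__ = _module_getattr  # PEP 562 hook lives in the module namespace
sys.modules["observability_demo"] = mod

import observability_demo
cls = observability_demo.LangextractSink  # resolved lazily, on first access
assert cls.__name__ == "LangextractSink"
```

Until an attribute is first accessed, nothing heavy is imported, which is exactly what keeps import time flat.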
  4. Packaging extra — src/praisonai/pyproject.toml:
[project.optional-dependencies]
langextract = ["langextract>=1.0.0"]

Phase 2: Production — CLI sub-app + tool + session rendering

  1. New CLI sub-app — src/praisonai/praisonai/cli/commands/langextract.py:
import typer
from pathlib import Path
from typing import Optional

app = typer.Typer(name="langextract", help="Render PraisonAI traces with langextract.")


@app.command(name="view")
def view(
    jsonl_path: Path = typer.Argument(..., help="Path to annotated-documents JSONL"),
    output_html: Path = typer.Option("trace.html", "--output", "-o"),
    no_open: bool = typer.Option(False, "--no-open"),
):
    """Render an existing annotated-documents JSONL to an interactive HTML."""
    import langextract as lx
    import webbrowser

    html = lx.visualize(str(jsonl_path))
    html_text = html.data if hasattr(html, "data") else html
    output_html.write_text(html_text, encoding="utf-8")
    typer.echo(f"✅ Wrote {output_html}")
    if not no_open:
        webbrowser.open(f"file://{output_html.resolve()}")


@app.command(name="render")
def render(
    yaml_path: Path = typer.Argument(..., help="PraisonAI YAML workflow"),
    output_html: Path = typer.Option("workflow.html", "--output", "-o"),
    no_open: bool = typer.Option(False, "--no-open"),
    api_url: Optional[str] = typer.Option(None, "--api-url"),
):
    """Run a workflow end-to-end with LangextractSink attached, then open the HTML."""
    from praisonai.observability import LangextractSink, LangextractSinkConfig
    from praisonai import PraisonAI

    sink = LangextractSink(
        config=LangextractSinkConfig(output_path=str(output_html), auto_open=not no_open)
    )
    # attach sink to the global trace emitter for the duration of the run
    from praisonaiagents.trace import get_context_emitter
    get_context_emitter().add_sink(sink)
    try:
        result = PraisonAI(agent_file=str(yaml_path)).main()
        typer.echo(result)
    finally:
        sink.close()
    typer.echo(f"✅ Trace rendered: {output_html}")
  2. Built-in tool (Phase 2b) — src/praisonai-agents/praisonaiagents/tools/langextract_tools.py:
"""Thin wrapper exposing lx.extract as a callable tool."""
from typing import Any, Dict, List


def langextract_extract(
    text: str,
    prompt_description: str,
    examples: List[Dict[str, Any]],
    model_id: str = "gemini-2.5-flash",
    extraction_passes: int = 1,
    max_workers: int = 10,
) -> Dict[str, Any]:
    """Run langextract over `text` using the given prompt and few-shot examples.
    Returns the serialized extraction result dict."""
    try:
        import langextract as lx  # lazy
    except ImportError as e:
        raise ImportError("pip install 'praisonai[langextract]'") from e

    # Convert dict examples -> lx.data.ExampleData (DRY helper elsewhere)
    ex_objs = [_dict_to_example(lx, e) for e in examples]
    result = lx.extract(
        text_or_documents=text,
        prompt_description=prompt_description,
        examples=ex_objs,
        model_id=model_id,
        extraction_passes=extraction_passes,
        max_workers=max_workers,
    )
    return {
        "text": result.text,
        "extractions": [
            {
                "extraction_class": e.extraction_class,
                "extraction_text": e.extraction_text,
                "attributes": e.attributes or {},
                "char_interval": (
                    {"start": e.char_interval.start_pos, "end": e.char_interval.end_pos}
                    if e.char_interval else None
                ),
            }
            for e in result.extractions
        ],
    }


def _dict_to_example(lx, d: Dict[str, Any]):
    return lx.data.ExampleData(
        text=d["text"],
        extractions=[
            lx.data.Extraction(
                extraction_class=e["extraction_class"],
                extraction_text=e["extraction_text"],
                attributes=e.get("attributes"),
            )
            for e in d.get("extractions", [])
        ],
    )

Agents can then use this tool directly:

from praisonaiagents import Agent
from praisonaiagents.tools.langextract_tools import langextract_extract

agent = Agent(
    name="extractor",
    instructions="Extract entities from provided text",
    tools=[langextract_extract],
)
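
For reference, the examples argument is a list of plain dicts that _dict_to_example converts. An illustrative shape, assuming it mirrors lx.data.ExampleData (no LLM call involved here):

```python
# Illustrative shape of one few-shot example dict for langextract_extract.
# The nested keys mirror lx.data.Extraction; validated with plain Python only.
example = {
    "text": "Dr. Ada Lovelace pioneered computing.",
    "extractions": [
        {
            "extraction_class": "person",
            "extraction_text": "Ada Lovelace",
            "attributes": {"role": "pioneer"},
        }
    ],
}

# Few-shot extraction_text should appear verbatim in the example text,
# consistent with langextract's source-grounding model.
for ext in example["extractions"]:
    assert ext["extraction_text"] in example["text"]
```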

Files to Create / Modify

New files

| File | Purpose |
|---|---|
| src/praisonai/praisonai/observability/langextract.py | LangextractSink + LangextractSinkConfig adapter |
| src/praisonai/praisonai/cli/commands/langextract.py | praisonai langextract view / render / open sub-commands |
| src/praisonai-agents/praisonaiagents/tools/langextract_tools.py | Optional built-in tool wrapping lx.extract |
| src/praisonai/tests/test_langextract_integration.py | Unit + integration + smoke tests (mirrors test_n8n_integration.py) |
| src/praisonai/tests/fixtures/sample_trace.jsonl | Deterministic fixture for mapper tests |
| examples/python/observability/langextract_basic.py | "Run an agent → open HTML" minimal example |
| examples/python/observability/langextract_with_tools.py | Multi-tool agent rendered as an annotated workflow |
| PraisonAIDocs/docs/observability/langextract.mdx | Quick-start, API, CLI, examples, screenshot |

Modified files

| File | Change |
|---|---|
| src/praisonai/praisonai/observability/__init__.py | Add LangextractSink/LangextractSinkConfig to the lazy __getattr__; keep __all__ symbolic for TYPE_CHECKING |
| src/praisonai/praisonai/cli/app.py | Replace the hard-coded observe != "langfuse" check with a dispatch that also accepts langextract; add _setup_langextract_observability() |
| src/praisonai/praisonai/cli/commands/__init__.py (or wherever sub-apps register) | Register the new langextract Typer sub-app |
| src/praisonai/pyproject.toml | Add langextract = ["langextract>=1.0.0"] under [project.optional-dependencies] |
| src/praisonai-agents/praisonaiagents/tools/__init__.py | Lazy export langextract_extract |
| PraisonAIDocs/docs/observability/index.mdx (if it exists) / sidebar config | Add a sidebar entry for the new page |

Zero files need to be removed; zero public APIs change.


Technical Considerations

Dependencies

  • langextract>=1.0.0 (Apache-2.0) — pulls google-genai, openai, httpx, beautifulsoup4, pydantic. ~50 MB install.
  • Must be an optional extra (pip install 'praisonai[langextract]') — not a hard dependency.

Performance impact

  • Zero when --observe langextract is not set: LangextractSink is never imported; the lazy __getattr__ in praisonai/observability/__init__.py keeps import time flat.
  • With the sink active: a bounded in-memory list of ActionEvents (typically tens to hundreds per run). Rendering happens exactly once, in close() — amortized across the whole run.
  • langextract itself is only imported inside _render() — adding the sink alone (without triggering close()) does not pull in the heavy dependency.
  • Target: no measurable impact on import praisonaiagents (<200 ms invariant per AGENTS.md §4.2).

Safety / approval

  • Read-only observability — no network calls in the default path (lx.visualize is pure local templating). No user data leaves the machine.
  • If users also use langextract_extract as a tool, that calls an LLM — it must honor the agent's existing approval / policy hooks automatically (no new hook needed; the tool just runs like any other tool).
  • HTML output is written to a user-chosen path; nothing is executed, only rendered.

Multi-agent safety

  • LangextractSink instances are per-run, not global. cli/app.py wires one sink per CLI invocation.
  • Internally _events is guarded by threading.Lock (same pattern as LangfuseSink).
  • For AgentTeam / concurrent agents: each agent's ContextTraceEmitter gets the same sink, which accumulates all events and groups them by agent_name in the rendered HTML.

Backward compatibility

  • Strictly additive: no existing behavior changes. --observe langfuse continues to work identically.
  • TraceSinkProtocol is not modified.
  • New optional extra — default installs are unchanged.

Import-time discipline (MUST)

  • src/praisonai/praisonai/observability/langextract.py must not import langextract at module scope. Use import langextract as lx inside _render() only.
  • Likewise langextract_tools.py imports inside the function body.

Acceptance Criteria

  • pip install 'praisonai[langextract]' works and pulls langextract.
  • from praisonai.observability import LangextractSink, LangextractSinkConfig succeeds without triggering a langextract import.
  • praisonai agents.yaml --observe langextract runs the workflow and writes both praisonai-trace.jsonl and praisonai-trace.html.
  • praisonai langextract view trace.jsonl writes an HTML file and (by default) opens it in the browser.
  • praisonai langextract render agents.yaml -o run.html runs the workflow and opens run.html.
  • praisonai --observe langextract agents.yaml when langextract is NOT installed fails with a clear message pointing to the extra (not an unhandled ImportError).
  • Python API: the with LangextractSink(...) context manager works (implement __enter__/__exit__ in addition to close()).
  • Unit tests cover the event→Extraction mapper with ≥90% line coverage; tests are deterministic (no network, no LLM calls).
  • Real agentic smoke test: runs a simple Agent with one tool, asserts the JSONL is valid and the HTML contains the agent name, tool name, and input text.
  • Multi-agent test: two agents running sequentially produce one HTML that shows both agents distinctly.
  • Cold-import benchmark: python -c "import praisonaiagents" is within ±2% of main (no regression).
  • Cold-import benchmark: python -c "from praisonai.observability import LangextractSink" does not import langextract.
  • Docs page PraisonAIDocs/docs/observability/langextract.mdx published with: prerequisites, one-liner quick-start, screenshot of HTML viewer, CLI reference, Python API reference.
  • Added to the --observe help text: Enable observability (langfuse, langextract).
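
To keep the unit tests deterministic, the lazy import langextract inside _render() can be satisfied with a stub module instead of the real ~50 MB dependency. A sketch using only the standard library; the stubbed surface (data.Extraction, data.AnnotatedDocument, io, visualize) is an assumption based on the calls used in the adapter above.

```python
# Sketch: stubbing the lazily imported langextract module so mapper/sink tests
# run with no network, no LLM, and no real langextract install. The stubbed
# surface mirrors the calls the proposed _render() makes.
import sys
import types
from unittest import mock

def make_fake_langextract() -> types.ModuleType:
    fake = types.ModuleType("langextract")
    fake.data = types.SimpleNamespace(
        Extraction=lambda **kw: kw,            # record kwargs instead of real objects
        AnnotatedDocument=lambda **kw: kw,
    )
    fake.io = types.SimpleNamespace(save_annotated_documents=lambda *a, **k: None)
    fake.visualize = lambda path: "<html>stub</html>"
    return fake

with mock.patch.dict(sys.modules, {"langextract": make_fake_langextract()}):
    import langextract as lx  # resolves to the stub inside this block only
    ext = lx.data.Extraction(extraction_class="tool_call", extraction_text="search")
    assert ext == {"extraction_class": "tool_call", "extraction_text": "search"}
    assert lx.visualize("any.jsonl") == "<html>stub</html>"
```

mock.patch.dict restores sys.modules on exit, so the stub never leaks into other tests.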

Implementation Notes

Key files to read first

  1. src/praisonai/praisonai/observability/langfuse.py (306 lines) — mirror this pattern exactly for the new sink.
  2. src/praisonai-agents/praisonaiagents/trace/protocol.py — ActionEvent and TraceSinkProtocol (unchanged).
  3. src/praisonai-agents/praisonaiagents/trace/context_events.py — how events are emitted from Agent.chat; note that LLM_RESPONSE and CONTEXT_SNAPSHOT carry the richest data.
  4. src/praisonai/praisonai/cli/app.py:115-180 — where --observe is parsed and dispatched.
  5. examples/python/observability/ — existing examples for the shape of a good example file.
  6. Langextract upstream docs: https://github.com/google/langextract#3-visualize-the-results

Critical integration points

  1. Event capture: hook into get_context_emitter() (already global, already wired into Agent.chat). Use the same registration pattern used by LangfuseSink — no SDK changes needed.
  2. Source text grounding: the source text comes from AGENT_START.metadata["input"]. If missing (e.g., programmatic agent.chat() without input metadata), fall back to the concatenation of all prompts. Document this clearly.
  3. Close timing: close() must run even when the agent errors out. Use a try/finally in _setup_langextract_observability so close() fires at interpreter shutdown (via atexit.register) and on exception.
  4. lx.visualize return shape: may be a plain string or an IPython.display.HTML object (.data attribute). Handle both branches (see the LangfuseSink .get() / hasattr pattern).
  5. Char-grounding for ungrounded events: tool calls and errors are not in the source text. Leave char_interval=None; langextract explicitly supports this and will render them in the side panel, not inline.
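
The close-timing discipline in point 3 can be sketched with a toy sink: atexit.register covers interpreter shutdown, while an explicit close on the error path covers in-process failures. DemoSink is illustrative only.

```python
# Toy illustration of the close-timing discipline: close() must be idempotent
# (atexit may fire it a second time) and must run even when the agent errors.
import atexit

class DemoSink:
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        if self.closed:
            return  # idempotent: safe to call from both atexit and finally
        self.closed = True

sink = DemoSink()
atexit.register(sink.close)  # safety net at interpreter shutdown

try:
    raise RuntimeError("agent failed mid-run")
except RuntimeError:
    pass
finally:
    sink.close()  # explicit close on the error path

assert sink.closed
```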

Testing commands

# Unit + integration
pytest src/praisonai/tests/test_langextract_integration.py -v

# Lazy-import check — must NOT trigger langextract import
python - <<'PY'
import sys
from praisonai.observability import LangextractSink
assert 'langextract' not in sys.modules, "LangextractSink import pulled heavy dep!"
print("OK: LangextractSink is lazy")
PY

# Real agentic smoke test
cat > /tmp/agents.yaml <<'YAML'
name: Demo
agents:
  writer:
    role: Writer
    goal: Write a haiku
    llm: gpt-4o-mini
YAML
praisonai /tmp/agents.yaml --observe langextract --quiet
ls -la praisonai-trace.html praisonai-trace.jsonl
python -c "import pathlib; html=pathlib.Path('praisonai-trace.html').read_text(); assert '<html' in html.lower() and 'writer' in html.lower(), 'HTML missing agent'; print('OK')"

# CLI sub-app
praisonai langextract view praisonai-trace.jsonl --no-open -o /tmp/out.html
test -s /tmp/out.html && echo "OK: view produced HTML"
