Add first-class integration with google/langextract so that a user can run a PraisonAI workflow and view the full execution — agent boundaries, LLM turns, tool calls, final output — as an interactive, self-contained HTML visualization that highlights every step grounded in the source prompt/input text.

This is the visualization analogue of the n8n integration (praisonai n8n open) we just shipped. Where n8n gave us a visual editor for workflows, langextract will give us a visual viewer for workflow executions — a zero-server, zero-sign-up way to inspect what an agent actually did, grounded in the exact spans of input it reasoned about.
Primary UX target:
```bash
# Run any YAML workflow with langextract observability
praisonai agents.yaml --observe langextract
# -> produces trace.jsonl + trace.html, opens browser to HTML viewer

# Or render an existing trace/session to HTML
praisonai langextract view trace.jsonl --open
```
Python API target:
```python
from praisonaiagents import Agent
from praisonai.observability import LangextractSink, LangextractSinkConfig

agent = Agent(name="researcher", instructions="Summarize text")

with LangextractSink(config=LangextractSinkConfig(output_path="run.html")):
    agent.start("Long input document ...")

# run.jsonl + run.html written; open run.html to explore the trace
```
Background
What is langextract?
"A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization." — google/langextract README
Key properties that make it ideal as PraisonAI's visualization layer:
- Precise source grounding — every extraction maps to an exact `char_interval` of the source text.
- Self-contained interactive HTML — `lx.visualize("extractions.jsonl")` produces a single HTML file with highlights, timeline, filters. No server, no sign-up.
- Stable data model — `lx.data.Extraction(extraction_class, extraction_text, attributes, char_interval)` + `lx.data.AnnotatedDocument` is small and round-trippable.
- Apache-2.0 — compatible with PraisonAI licensing.
- Optional dependency — only activated when the user opts in.
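The data model is small enough to sketch with stdlib dataclasses. The classes below are an illustrative mimic of the shapes named above (the real ones live in `lx.data` and may differ in detail); the point is how a grounded extraction points back into the source text, while `char_interval=None` marks an ungrounded one:

```python
from dataclasses import dataclass, field
from typing import Optional

# Stdlib-only mimic of langextract's data model, for illustration only;
# the real classes live in lx.data and may differ in detail.
@dataclass
class CharInterval:
    start_pos: int
    end_pos: int

@dataclass
class Extraction:
    extraction_class: str                          # e.g. "llm_turn", "tool_call"
    extraction_text: str                           # the grounded span of source text
    attributes: dict = field(default_factory=dict)
    char_interval: Optional[CharInterval] = None   # None => ungrounded (side panel)

source = "Summarize the quarterly report and email the result."
start = source.find("quarterly report")
ex = Extraction(
    extraction_class="llm_turn",
    extraction_text="quarterly report",
    attributes={"agent": "researcher"},
    char_interval=CharInterval(start, start + len("quarterly report")),
)
# The interval always slices back to the exact span in the source:
assert source[ex.char_interval.start_pos:ex.char_interval.end_pos] == "quarterly report"
```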
Why this is valuable
- Zero-config review UX — today, to understand what an agent did, the user has to trawl terminal logs, Langfuse cloud, or enable verbose mode. A single local HTML file is dramatically simpler.
- Grounding debugging — for agents that read large inputs (docs, web pages, transcripts), seeing which spans the agent actually used vs. hallucinated is the #1 debugging ask. langextract renders this natively.
- Shareable, offline — the HTML is self-contained; drop it into Slack/a PR for review.
- Complements — does not replace — Langfuse: Langfuse is cloud SaaS + production tracing; langextract is local file + run-time review.
- Matches the n8n pattern — "one command, external UI becomes your eyes on the workflow" is proven user-friendly.
Current ecosystem state
Today users who want visual trace review must either:
- Use `--observe langfuse` (requires cloud sign-up, API keys, internet).
- Write a custom notebook that wires `Agent` events into a plotting library.
- Use `output="actions"`, which only prints to the terminal.
langextract closes the "local-only, zero-install-extra-infra, self-contained HTML review" gap.
Architecture Analysis
Current Implementation
Existing trace / observability infrastructure (ready for reuse — DRY):

- `_setup_langextract_observability()` mirrors `_setup_langfuse_observability()`: builds a `LangextractSinkConfig` from env vars (`PRAISONAI_LANGEXTRACT_OUTPUT`, `PRAISONAI_LANGEXTRACT_AUTO_OPEN`), constructs `LangextractSink`, registers it on the global `ContextTraceEmitter`.
- Register lazy exports — append to `src/praisonai/praisonai/observability/__init__.py`:
```python
# In __getattr__
elif name == "LangextractSink":
    from .langextract import LangextractSink
    return LangextractSink
elif name == "LangextractSinkConfig":
    from .langextract import LangextractSinkConfig
    return LangextractSinkConfig
```
- Must be an optional extra (`pip install praisonai[langextract]`) — not a hard dependency.
Performance impact
- Zero when `--observe langextract` is not set: `LangextractSink` is never imported; lazy `__getattr__` in `praisonai/observability/__init__.py` keeps import time flat.
- With the sink active: a bounded in-memory list of `ActionEvent`s (typically 10s–100s per run). Rendering happens exactly once in `close()` — amortized across the whole run.
- `langextract` itself is only imported inside `_render()` — adding the sink alone (without triggering `close()`) does not pull the heavy dep.
- Target: no measurable impact on `import praisonaiagents` (<200 ms invariant per AGENTS.md §4.2).
Safety / approval
- Read-only observability — no network calls in the default path (`lx.visualize` is pure local templating). No user data leaves the machine.
- If users also use `langextract_extract` as a tool, that calls an LLM — it must honor the agent's existing approval / policy hooks automatically (no new hook needed; the tool just runs like any other tool).
- HTML output is written to a user-chosen path; nothing is executed, only rendered.
Multi-agent safety
- `LangextractSink` instances are per-run, not global. `cli/app.py` wires one sink per CLI invocation.
- Internally `_events` is guarded by `threading.Lock` (same pattern as `LangfuseSink`).
- For `AgentTeam` / concurrent agents: each agent's `ContextTraceEmitter` gets the same sink, which accumulates all events and groups them by `agent_name` in the rendered HTML.
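The locking pattern above can be sketched in a few lines. `BufferedSink` is an illustrative stand-in, not the real sink; it shows why a plain list plus a lock is enough for concurrent agents sharing one sink:

```python
import threading

# Minimal sketch of the thread-safety pattern: a per-run sink whose event
# buffer is guarded by a lock so concurrent agents can share it safely.
class BufferedSink:
    def __init__(self):
        self._events = []
        self._lock = threading.Lock()

    def emit(self, event: dict) -> None:
        with self._lock:
            self._events.append(event)

    def by_agent(self) -> dict:
        # Group events by agent_name, mirroring how the HTML groups them.
        with self._lock:
            groups = {}
            for e in self._events:
                groups.setdefault(e.get("agent_name", "unknown"), []).append(e)
            return groups

sink = BufferedSink()
threads = [
    threading.Thread(target=lambda n=n: sink.emit({"agent_name": n, "type": "llm_response"}))
    for n in ("researcher", "writer")
    for _ in range(50)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All 100 events land in the buffer, grouped per agent.
assert sum(len(v) for v in sink.by_agent().values()) == 100
```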
Backward compatibility
- Strictly additive: no existing behavior changes. `--observe langfuse` continues to work identically.
- `TraceSinkProtocol` is not modified.
- New optional extra — default installs are unchanged.
Import-time discipline (MUST)
- `src/praisonai/praisonai/observability/langextract.py` must not `import langextract` at module scope. Use `import langextract as lx` inside `_render()` only.
- Likewise, `langextract_tools.py` must import `langextract` inside the function body.
Acceptance Criteria
- `pip install 'praisonai[langextract]'` works and pulls `langextract`.
- `from praisonai.observability import LangextractSink, LangextractSinkConfig` succeeds without triggering a `langextract` import.
- `praisonai agents.yaml --observe langextract` runs the workflow and writes both `praisonai-trace.jsonl` and `praisonai-trace.html`.
- `praisonai langextract view trace.jsonl` writes an HTML file and (by default) opens it in the browser.
- `praisonai langextract render agents.yaml -o run.html` runs the workflow and opens `run.html`.
- `praisonai --observe langextract agents.yaml` still works when `langextract` is NOT installed — it fails with a clear message pointing to the extra (not an unhandled `ImportError`).
- Python API: the `with LangextractSink(...)` context manager works (implement `__enter__`/`__exit__` in addition to `close()`).
- Unit tests cover the event→`Extraction` mapper with ≥90% line coverage; tests are deterministic (no network, no LLM calls).
- Real agentic smoke test: runs a simple `Agent` with one tool, asserts the jsonl is valid and the HTML contains the agent name, tool name, and input text.
- Multi-agent test: two agents running sequentially produce one HTML that shows both agents distinctly.
- Cold-import benchmark: `python -c "import praisonaiagents"` is within ±2% of main (no regression).
- Cold-import benchmark: `python -c "from praisonai.observability import LangextractSink"` does not import `langextract`.
- Docs page `PraisonAIDocs/docs/observability/langextract.mdx` published with: prerequisites, one-liner quick-start, screenshot of HTML viewer, CLI reference, Python API reference.
- Added to the `--observe` help text: `Enable observability (langfuse, langextract)`.
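The context-manager criterion is a thin wrapper over `close()`. A sketch, assuming only that the sink exposes `close()` (`DemoSink` is a test stand-in):

```python
# Sketch: __enter__/__exit__ delegate to close(), so `with LangextractSink(...)`
# flushes and renders even when the body raises.
class SinkContextMixin:
    def close(self) -> None:
        raise NotImplementedError

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()   # flush + render even when the body raised
        return False   # never swallow exceptions

class DemoSink(SinkContextMixin):
    def __init__(self):
        self.closed = False

    def close(self) -> None:
        self.closed = True

with DemoSink() as s:
    pass
assert s.closed
```

Returning `False` from `__exit__` matters: the agent's exception still propagates, but the trace is rendered first.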
Implementation Notes
Key files to read first
- `src/praisonai/praisonai/observability/langfuse.py` (306 lines) — mirror this pattern exactly for the new sink.
- `src/praisonai-agents/praisonaiagents/trace/protocol.py` — `ActionEvent` and `TraceSinkProtocol` (unchanged).
- `src/praisonai-agents/praisonaiagents/trace/context_events.py` — how events are emitted from `Agent.chat`; note `LLM_RESPONSE` and `CONTEXT_SNAPSHOT` carry the richest data.
- `src/praisonai/praisonai/cli/app.py:115-180` — where `--observe` is parsed and dispatched.
- `examples/python/observability/` — existing examples for the shape of a good example file.
Critical integration points

- Event capture: hook into `get_context_emitter()` (already global, already wired into `Agent.chat`). Use the same registration pattern used by `LangfuseSink` — no SDK changes needed.
- Source text grounding: the source text comes from `AGENT_START.metadata["input"]`. If missing (e.g., programmatic `agent.chat()` without input metadata), fall back to the concatenation of all prompts. Document this clearly.
- Close timing: `close()` must run even when the agent errors out. Use a `try/finally` in `_setup_langextract_observability` so `close()` fires at interpreter shutdown (via `atexit.register`) and on exception.
- `lx.visualize` return shape: may be a plain string or an `IPython.display.HTML` object (`.data` attribute). Handle both branches (see the `LangfuseSink` `.get()`/`hasattr` pattern).
- Char-grounding for ungrounded events: tool calls and errors are not in the source text. Leave `char_interval=None`; langextract explicitly supports this and will render them in the side panel, not inline.
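The grounding rule can be sketched as a tiny helper. The substring-search fallback is an assumption about how spans would be located; the important behavior is that anything not found in the source yields `None` and therefore lands in the side panel:

```python
from typing import Optional, Tuple

# Sketch of the grounding rule: grounded events get a char interval via
# substring search in the source text; tool calls / errors get None.
def ground(source: str, fragment: Optional[str]) -> Optional[Tuple[int, int]]:
    if not fragment:
        return None           # e.g. a tool_start event with no source span
    start = source.find(fragment)
    if start == -1:
        return None           # text not present in source -> render in side panel
    return (start, start + len(fragment))

source = "Paris is the capital of France."
assert ground(source, "capital of France") == (13, 30)
assert ground(source, None) is None
assert ground(source, "hallucinated span") is None
```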
Testing commands
```bash
# Unit + integration
pytest src/praisonai/tests/test_langextract_integration.py -v

# Lazy-import check — must NOT trigger langextract import
python - <<'PY'
import sys
from praisonai.observability import LangextractSink
assert 'langextract' not in sys.modules, "LangextractSink import pulled heavy dep!"
print("OK: LangextractSink is lazy")
PY

# Real agentic smoke test
cat > /tmp/agents.yaml <<'YAML'
name: Demo
agents:
  writer:
    role: Writer
    goal: Write a haiku
    llm: gpt-4o-mini
YAML
praisonai /tmp/agents.yaml --observe langextract --quiet
ls -la praisonai-trace.html praisonai-trace.jsonl
python -c "import pathlib; html=pathlib.Path('praisonai-trace.html').read_text(); assert '<html' in html.lower() and 'writer' in html.lower(), 'HTML missing agent'; print('OK')"

# CLI sub-app
praisonai langextract view praisonai-trace.jsonl --no-open -o /tmp/out.html
test -s /tmp/out.html && echo "OK: view produced HTML"
```
Existing trace / observability infrastructure (ready for reuse — DRY):
| Component | Location | Notes |
| --- | --- | --- |
| `ActionEvent` / `ActionEventType` | `src/praisonai-agents/praisonaiagents/trace/protocol.py` | |
| `TraceSinkProtocol` | `src/praisonai-agents/praisonaiagents/trace/protocol.py` | `emit`, `flush`, `close` |
| `ContextTraceEmitter` + `get_context_emitter()` | `src/praisonai-agents/praisonaiagents/trace/context_events.py` | `Agent.chat`/`Agent.start` — emits `agent_start`, `agent_end`, `llm_response`, `tool_start`, `tool_end` |
| `LangfuseSink` | `src/praisonai/praisonai/observability/langfuse.py` | `TraceSinkProtocol` adapter — we copy this pattern |
| `--observe` flag | `src/praisonai/praisonai/cli/app.py:124-153` | currently `langfuse`. Extend with `langextract`. |
| `_setup_langfuse_observability()` | `src/praisonai/praisonai/cli/app.py:153` | mirror as `_setup_langextract_observability` |
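The sink contract in that table can be expressed with `typing.Protocol`. This is a sketch of its shape (`emit`/`flush`/`close`), not the literal code from `trace/protocol.py`; it shows why `LangextractSink` needs no inheritance, only the three methods:

```python
from typing import Protocol, runtime_checkable

# Sketch of the sink contract; the real protocol lives in trace/protocol.py.
@runtime_checkable
class TraceSink(Protocol):
    def emit(self, event) -> None: ...
    def flush(self) -> None: ...
    def close(self) -> None: ...

class ListSinkDemo:
    # Structural typing: no inheritance needed, just the three methods.
    def __init__(self):
        self.events = []

    def emit(self, event) -> None:
        self.events.append(event)

    def flush(self) -> None:
        pass

    def close(self) -> None:
        pass

assert isinstance(ListSinkDemo(), TraceSink)
```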
Zero existing langextract integration in any of:

- `src/praisonai-agents/praisonaiagents/tools/` (no `langextract_tools.py`)
- `src/praisonai/praisonai/observability/` (only `langfuse.py`)
- `examples/python/tools/` (only the user's unofficial `~/test/langextract/app.py`)
- `PraisonAI-tools/` (nothing)
- `PraisonAIDocs/docs/observability/` (not covered)

So this is a greenfield addition that slots cleanly into existing extension points — no protocol changes, no breaking changes.
Key File Locations
- `src/praisonai-agents/praisonaiagents/trace/protocol.py` — `ActionEvent`, `TraceSinkProtocol`, `NoOpSink`, `ListSink`. Zero changes here.
- `src/praisonai-agents/praisonaiagents/trace/context_events.py` — `ContextTraceEmitter`, `ContextEvent` (with richer LLM token / content fields). Zero changes here.
- `src/praisonai/praisonai/observability/langfuse.py` (306 lines) — `TraceSinkProtocol` adapter; template for `LangextractSink`.
- `src/praisonai/praisonai/observability/__init__.py` — register `LangextractSink` / `LangextractSinkConfig` here.
- `src/praisonai/praisonai/cli/app.py:124-153` — `--observe` typer option + `_setup_*_observability`; accept `langextract` and wire up the sink.
- `src/praisonai/praisonai/cli/commands/langextract.py` — new sub-app with `view`, `render`, `open` commands.
- `src/praisonai-agents/praisonaiagents/tools/langextract_tools.py` — new; wraps `lx.extract` as a callable tool for agents.

Data-flow (proposed)
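A rough sketch of that proposed flow: events buffered during the run are mapped, at `close()`, into extraction records that langextract can visualize. The event-to-class mapping below is an illustrative assumption, not a fixed schema:

```python
# Sketch of the proposed data flow: buffered ActionEvents -> extraction records.
# Class names are illustrative assumptions, not a fixed schema.
EVENT_TO_CLASS = {
    "agent_start": "agent",
    "agent_end": "agent",
    "llm_response": "llm_turn",
    "tool_start": "tool_call",
    "tool_end": "tool_call",
}

def to_extractions(events):
    records = []
    for e in events:
        records.append({
            "extraction_class": EVENT_TO_CLASS.get(e["type"], "event"),
            "extraction_text": e.get("text", e["type"]),
            "attributes": {"agent_name": e.get("agent_name")},
            "char_interval": None,  # grounding against the source is filled in separately
        })
    return records

run = [
    {"type": "agent_start", "agent_name": "writer", "text": "Write a haiku"},
    {"type": "llm_response", "agent_name": "writer", "text": "Autumn moonlight..."},
]
assert [r["extraction_class"] for r in to_extractions(run)] == ["agent", "llm_turn"]
```

In the real sink these records would become `lx.data.Extraction` instances and be handed to `lx.visualize` during `close()`.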
Gap Analysis
| Current state | What we add | Where |
| --- | --- | --- |
| `LangfuseSink` exists | `LangextractSink` implementing `TraceSinkProtocol` | `praisonai/observability/langextract.py` (wrapper) |
| `ContextEvent` has rich fields already | mapping to an `lx.data.Extraction` list | `praisonai/observability/langextract.py` |
| `--observe langfuse` only; hard-coded check at `cli/app.py:151` | `langextract` accepted; need `_setup_langextract_observability()` | `praisonai/cli/app.py` |
| no `langextract` command | `praisonai langextract view`, `render`, `open` for ad-hoc + session rendering | `praisonai/cli/commands/langextract.py` |
| `tavily_tools.py`, `agentql` examples — nothing for langextract | `langextract_tools.py` exposing `lx.extract` as a tool (so agents can use langextract, not only be visualized by it) | `praisonaiagents/tools/langextract_tools.py` |
| no lazy exports | `LangextractSink`, `LangextractSinkConfig` via lazy `__getattr__` | `praisonai/observability/__init__.py` |
| `langextract` not in any extras | `[langextract]` extra | `src/praisonai/pyproject.toml` optional-dependencies table |
| no docs page | `PraisonAIDocs/docs/observability/langextract.mdx` with quick-start, API, CLI, examples | `PraisonAIDocs/docs/observability/` |
| only the user's `~/test/langextract/app.py` | `examples/python/observability/langextract_basic.py` and `langextract_with_tools.py` | `examples/python/observability/` |
| no tests | integration tests | `src/praisonai/tests/test_langextract_integration.py` |

`langextract` is heavy; it must be a lazy import gated behind the extra. No existing test, example, doc, or CLI contract is broken by any of the above — this is strictly additive.
Proposed Implementation
Phase 1: Minimal (MVP) — `LangextractSink` adapter + `--observe langextract`

- `src/praisonai/praisonai/observability/langextract.py`: the sink adapter.
- `--observe` flag → `src/praisonai/praisonai/cli/app.py`: `_setup_langextract_observability()` mirrors `_setup_langfuse_observability()`: builds a `LangextractSinkConfig` from env vars (`PRAISONAI_LANGEXTRACT_OUTPUT`, `PRAISONAI_LANGEXTRACT_AUTO_OPEN`), constructs `LangextractSink`, registers it on the global `ContextTraceEmitter`.
- `src/praisonai/praisonai/observability/__init__.py`: register the lazy exports.
- `src/praisonai/pyproject.toml`: add the `[langextract]` extra.

Phase 2: Production — CLI sub-app + tool + session rendering

- `src/praisonai/praisonai/cli/commands/langextract.py`: the `view`, `render`, `open` sub-app.
- `src/praisonai-agents/praisonaiagents/tools/langextract_tools.py`: wraps `lx.extract` so agents can use it directly as a tool.
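A sketch of that Phase 2b tool wrapper. The `lx.extract` keyword names follow the langextract README; the defaults and the exact function signature here are assumptions, and the import stays inside the function body per the import-time rule:

```python
# Sketch of the Phase 2b tool: a plain callable wrapping lx.extract so agents
# can use langextract themselves. Parameter names/defaults are illustrative.
def langextract_extract(text: str, prompt: str, examples=None,
                        model_id: str = "gemini-2.5-flash"):
    """Extract structured spans from `text` according to `prompt`."""
    try:
        import langextract as lx  # only imported when the tool actually runs
    except ImportError as exc:
        raise RuntimeError(
            "Install the extra first: pip install 'praisonai[langextract]'"
        ) from exc
    # lx.extract performs an LLM call; it runs under the agent's normal
    # tool approval / policy hooks like any other tool.
    return lx.extract(
        text_or_documents=text,
        prompt_description=prompt,
        examples=examples,
        model_id=model_id,
    )
```

An agent would then receive `langextract_extract` in its `tools` list like any other callable tool.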
Files to Create / Modify
New files

- `src/praisonai/praisonai/observability/langextract.py` — `LangextractSink` + `LangextractSinkConfig` adapter
- `src/praisonai/praisonai/cli/commands/langextract.py`
- `src/praisonai-agents/praisonaiagents/tools/langextract_tools.py` — wraps `lx.extract`
- `src/praisonai/tests/test_langextract_integration.py` — modeled on `test_n8n_integration.py`
- `src/praisonai/tests/fixtures/sample_trace.jsonl`
- `examples/python/observability/langextract_basic.py`
- `examples/python/observability/langextract_with_tools.py`
- `PraisonAIDocs/docs/observability/langextract.mdx`

Modified files

- `src/praisonai/praisonai/observability/__init__.py` — add `LangextractSink`/`LangextractSinkConfig` to lazy `__getattr__`; keep `__all__` symbolic for TYPE_CHECKING
- `src/praisonai/praisonai/cli/app.py` — replace the `observe != "langfuse"` check with a dispatch that also accepts `langextract`; add `_setup_langextract_observability()`
- `src/praisonai/praisonai/cli/commands/__init__.py` (or wherever sub-apps register) — register the `langextract` Typer sub-app
- `src/praisonai/pyproject.toml` — `langextract = ["langextract>=1.0.0"]` under `[project.optional-dependencies]`
- `src/praisonai-agents/praisonaiagents/tools/__init__.py` — export `langextract_extract`
- `PraisonAIDocs/docs/observability/index.mdx` (if exists) / sidebar config

Zero files need to be removed; zero public APIs change.
Technical Considerations
Dependencies
- `langextract>=1.0.0` (Apache-2.0) — pulls `google-genai`, `openai`, `httpx`, `beautifulsoup4`, `pydantic`. ~50 MB install.
- Must be an optional extra (`pip install praisonai[langextract]`) — not a hard dependency.
References
- `/Users/praison/praisonai-package/src/praisonai-agents/AGENTS.md` — §4.1 protocol-driven core, §4.2 lazy imports, §4.9 naming
- `src/praisonai/praisonai/observability/langfuse.py`
- `~/test/langextract/app.py` — demonstrates `lx.extract` wrapped as a `BaseTool`; informed the Phase 2b tool design