
feat: Rich Markdown rendering with skin-aware themes and /markdown toggle#5150

Closed
lucaspirola wants to merge 9 commits into NousResearch:main from lucaspirola:feat/markdown-rich-v2

Conversation

@lucaspirola

Summary

Adds full Rich Markdown rendering for CLI responses using Rich's built-in Markdown class, with skin integration and user control:

  • Rich Markdown rendering — headings, bold, italic, code blocks (Pygments syntax highlighting), tables, lists, blockquotes rendered in the terminal
  • Skin-aware code themes — reads code_theme from the active skin, updates live on /skin change
  • /markdown [on|off] command (alias /md) — toggle rendering at runtime, persists to display.markdown in config.yaml
  • Fast-path plain-text detection — skips the markdown parser for responses with no markdown syntax
  • Graceful fallback — try/except wraps all render calls; renderer crashes fall back to plain text
  • Streaming support — block-boundary detection avoids re-rendering mid-word; code fence tracking prevents broken highlighting
  • Platform hint update — CLI prompt tells the LLM to use markdown freely
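The streaming strategy above can be sketched as follows. This is a hypothetical reconstruction of the block-boundary logic, not the PR's actual `_find_block_boundary()`: it flushes only at blank lines that fall outside an open code fence, so a fence is never split mid-highlight.

```python
def find_block_boundary(buf: str) -> int:
    """Return the index just past the last safe flush point in buf,
    or 0 if no complete markdown block is available yet.

    A safe flush point is a blank line that is not inside an open code
    fence; flushing mid-fence would break syntax highlighting.
    (Sketch: fence matching is simplified to a toggle on ``` / ~~~.)
    """
    boundary = 0
    fence_open = False
    pos = 0
    for line in buf.splitlines(keepends=True):
        if line.rstrip("\n").startswith(("```", "~~~")):
            fence_open = not fence_open
        pos += len(line)
        # A line containing only a newline separates two markdown blocks.
        if line.strip() == "" and not fence_open:
            boundary = pos
    return boundary
```

Everything before the returned boundary can be rendered; the tail stays buffered until the next chunk arrives.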

Files changed (3 files)

File Changes
cli.py _render_response(), _emit_stream_markdown(), _find_block_boundary(), _handle_markdown_command(), skin theme init
hermes_cli/commands.py /markdown command with /md alias
agent/prompt_builder.py CLI platform hint updated

Issues addressed

Closes #3621, closes #4236, relates to #684

Test plan

  • Unit tests: _has_markdown_syntax() (13 cases), config persistence, command registration
  • Interactive: headings, code blocks, tables, lists render correctly
  • Interactive: /markdown off shows raw markdown; /markdown on re-enables
  • Interactive: /skin ares updates code theme colours live

🤖 Generated with Claude Code

@trevorgordon981

Substantial feature — streaming block-boundary detection that respects code fences is the right approach for avoiding mid-render chopping, and the fast-path plain-text check + graceful fallback show good defensive thinking. Skin-aware theming is a nice touch.

A few concerns before merge:

1. Fast-path false negative on long responses with late markdown.

return bool(_MD_SYNTAX_RE.search(text[:500] if len(text) > 500 else text))

If markdown syntax doesn't appear in the first 500 chars, _has_markdown_syntax returns False and the response renders as plain text. Agentic responses often have a plain-text preamble ("I checked the file and found that...") followed by tool output, tables, or code blocks later in the response. Those will render as raw | col | col | text and unformatted code blocks.

The author's reasoning ("markdown almost always appears early") holds for short assistant chat but not for tool-heavy agentic outputs — exactly the workload Hermes is built for.

Two fixes:

  • Raise the scan window to something more permissive — 4-8k chars would cover most tool-heavy responses at minimal cost since re.search short-circuits on first match:
    _MD_SCAN_LIMIT = 8192
    return bool(_MD_SYNTAX_RE.search(text[:_MD_SCAN_LIMIT]))
  • Or drop the window entirely and trust re.search to short-circuit. Regex engines bail on first match; the worst case is a plain-text response with no markdown, which scans to end — and even then 100k chars scans in single-digit ms.

The current behavior will surprise users when some responses render beautifully and others show raw syntax.
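The first fix is a one-line change. A minimal sketch, assuming a combined syntax regex (the pattern below is illustrative; the PR's real `_MD_SYNTAX_RE` is not shown in the diff):

```python
import re

# Illustrative pattern -- covers headings, bold, fences, tables, list items.
_MD_SYNTAX_RE = re.compile(
    r"(^#{1,6} )|(\*\*)|(^```)|(^\|.*\|)|(^[-*] )", re.MULTILINE
)
_MD_SCAN_LIMIT = 8192  # wide enough for a plain-text preamble before late markdown

def has_markdown_syntax(text: str) -> bool:
    # re.search short-circuits on the first match, so the slice only bounds
    # the worst case: a long response with no markdown at all.
    return bool(_MD_SYNTAX_RE.search(text[:_MD_SCAN_LIMIT]))
```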

2. Prompt hint change is inconsistent with markdown_enabled=False users.

"cli": (
    "You are a CLI AI Agent. Your terminal supports full markdown "
    "rendering. Use markdown freely ..."
),

A user who turns off markdown rendering via /markdown off is still told by the system prompt that their terminal supports full markdown rendering. The LLM then emits markdown syntax that the CLI displays raw. This is a regression vs. the old behavior.

Make the prompt hint respect the setting:

# Pass markdown_enabled into the prompt context, or swap hints:
CLI_HINT_MARKDOWN_ON = "... Your terminal supports full markdown rendering. Use markdown freely..."
CLI_HINT_MARKDOWN_OFF = "You are a CLI AI Agent. Use plain text rendered inside a terminal..."

And select at prompt-build time based on the config.
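A minimal sketch of that selection, assuming the hint pair above; the function name and wiring are hypothetical, not the actual prompt_builder.py API:

```python
CLI_HINT_MARKDOWN_ON = (
    "You are a CLI AI Agent. Your terminal supports full markdown "
    "rendering. Use markdown freely."
)
CLI_HINT_MARKDOWN_OFF = (
    "You are a CLI AI Agent. Use plain text rendered inside a terminal; "
    "avoid markdown syntax."
)

def cli_platform_hint(markdown_enabled: bool) -> str:
    # Selected at prompt-build time so the hint always matches the
    # renderer state, including after a /markdown toggle.
    return CLI_HINT_MARKDOWN_ON if markdown_enabled else CLI_HINT_MARKDOWN_OFF
```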

3. Exception swallowing without logging.

except Exception:
    return _rich_text_from_ansi(text)

and

try:
    self._stream_md_console.print(_RichMarkdown(chunk, ...))
except Exception:
    self._stream_md_console.print(chunk)

If Rich Markdown ever chokes on valid markdown (edge case in the parser, unicode surrogate pair, etc.), users see plain output and there's no trail for debugging. At minimum logger.debug("markdown render failed: %s", e) so operators can find the bad input.
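The shape of that fix, as a hedged sketch (the wrapper and its callers are illustrative, not code from the PR):

```python
import logging

logger = logging.getLogger(__name__)

def render_markdown_safe(render, fallback, text: str):
    """Try the markdown renderer; log the failure and fall back to plain
    text instead of swallowing the exception silently."""
    try:
        return render(text)
    except Exception as e:  # deliberate catch-all around a third-party renderer
        logger.debug("markdown render failed: %s", e)
        return fallback(text)
```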

4. Skin change during stream isn't reflected.

_stream_md_code_theme / _stream_md_text_color are captured once at stream start:

if not self._stream_box_opened:
    ...
    self._stream_md_code_theme = _skin.get_color("code_theme", "monokai")
    self._stream_md_text_color = _skin.get_color("banner_text", "#FFF8DC")

Running /skin ares mid-response keeps the current response on the old theme. The PR claims "updates live on /skin change" — that's only true between responses. Worth clarifying in the PR description or refreshing theme per-chunk (cheap, just reads from _skin dict).
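The per-chunk refresh really is just a pair of lookups. A sketch, assuming `_skin.get_color()` is a dict lookup with a fallback (modelled on the PR's own snippet; the helper name is hypothetical):

```python
def current_theme(skin: dict) -> tuple:
    """Read the skin's theme values fresh; cheap enough to call per chunk."""
    code_theme = skin.get("code_theme", "monokai")
    text_color = skin.get("banner_text", "#FFF8DC")
    return code_theme, text_color
```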

5. Streaming state has 7 new instance variables.

self._stream_md_buf = ""
self._stream_md_rendered = 0
self._stream_md_fence_open = False
self._stream_md_console = None
self._stream_md_iobuf = None
self._stream_md_code_theme = "monokai"
self._stream_md_text_color = ""

Extraction into a StreamingMarkdownRenderer dataclass/class would keep the main agent cleaner and make it easier to unit-test the block-boundary logic in isolation. Not a blocker; nice-to-have for future maintainability.
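The extraction could look like this. Field names mirror the PR's seven instance variables, but the class itself is a sketch, not proposed code:

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class StreamingMarkdownRenderer:
    """Groups the per-response streaming state so the block-boundary
    logic can be unit-tested without instantiating the whole agent."""
    buf: str = ""
    rendered: int = 0
    fence_open: bool = False
    console: Optional[Any] = None
    iobuf: Optional[Any] = None
    code_theme: str = "monokai"
    text_color: str = ""

    def feed(self, chunk: str) -> None:
        self.buf += chunk

    def reset(self) -> None:
        # Restore per-response defaults between streamed responses.
        self.__dict__.update(StreamingMarkdownRenderer().__dict__)
```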

Minor nits:

  • text[:500] if len(text) > 500 else text — Python slicing handles short strings gracefully; just text[:500] would work identically.
  • shutil.get_terminal_size((80, 24)).columns called on every chunk render is a small syscall per chunk. Capture once per response.

Ship it with the scan-window fix (#1) and prompt hint consistency (#2) — those are user-visible regressions. The rest are polish.

@lucaspirola
Author

All addressed in the latest push — thanks for the thorough review.

#1 Scan window — raised to 8192 chars via a named _MD_SCAN_LIMIT constant. Agreed that agentic preamble + late code block is the common Hermes workload.

#2 Prompt hint — added a "cli_no_markdown" entry to PLATFORM_HINTS and the agent is now instantiated with platform="cli_no_markdown" when markdown_enabled=False, so the LLM gets the plain-text instruction consistently.

#3 Exception logging — both handlers now call logger.debug("... failed: %s", _e) before falling back.

#4 Skin/stream clarification — updated the in-code comment to explicitly say the theme is captured once at stream-open and a /skin change mid-stream takes effect on the next response. PR description updated accordingly.

Minor nits — text[:_MD_SCAN_LIMIT] (no conditional), _stream_md_term_width cached at stream open and reused per chunk.

On #5 (StreamingMarkdownRenderer dataclass) — agreed it would be a nice cleanup; left as a follow-up to keep this PR focused.

lucaspirola and others added 9 commits April 19, 2026 11:15
Make the markdown renderer adapt to the active skin's colour palette
instead of hardcoding monokai/white.  _render_response() now accepts
code_theme and text_color from the skin — banner_text becomes the base
paragraph colour (Rich Markdown's style= layers underneath element
styles, preserving heading/code/bold colours), and code_theme falls
back to monokai unless a skin overrides it via get_color("code_theme").

Zero changes to skin definitions or SkinConfig — existing skins and
user-defined skins work automatically through get_color() fallbacks.

Also adds docstrings and inline comments to all markdown rendering code
(regex, fast-path, block boundary detection, chunk rendering, streaming
strategy, flush, command handler) for clarity and maintainability.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CLI now has a full Rich Markdown renderer, so the platform hint
should tell the LLM to use markdown instead of discouraging it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…g, term width cache

- Raise fast-path scan window from 500 to 8192 chars so agentic
  responses with plain-text preambles before tables/code blocks
  still get markdown rendering
- Add cli_no_markdown platform hint; pass it when markdown_enabled=False
  so the LLM doesn't emit markdown that would display as raw syntax
- Add logger.debug() to both renderer exception handlers so render
  failures leave a diagnostic trail
- Cache terminal width at stream-open time to avoid a syscall per chunk
- Clarify in-code comment: skin theme is captured per response, not
  mid-stream (/skin takes effect on the next response)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
System prompt changes (prompt_builder.py, run_agent.py):
- Remove "unless otherwise directed below" loophole from DEFAULT_AGENT_IDENTITY
- Add TONE_AND_STYLE_GUIDANCE and OUTPUT_EFFICIENCY_GUIDANCE constants with
  explicit FORBIDDEN/WRONG→RIGHT examples to enforce concise responses
- Inject both early (position 2, after identity) for open-source models that
  weight earlier instructions more heavily
- Gate injection on non-messaging, non-cron platforms
- Add conciseness hint to cli and cli_no_markdown platform hints

CLI status bar changes (cli.py):
- Remove all decorative Panel borders and response labels (⚕ Hermes)
- Remove separator lines between queries (exchange divider, bg/btw task dividers)
- Status bar hides during inference; shows live counter while agent runs
- After response: status bar switches to ∑ total / ↩ last inference time display
- Timer absent before first message (no Hermes-uptime counter at startup)
- Add _inference_total_seconds and _last_inference_seconds session accumulators
- Add _response_received flag to switch status bar mode after first response
- Per-turn flags (_summary_printed_this_turn, _response_received) reset on each chat()

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
… bad rebase conflict resolutions

Previous rebase conflict resolutions in 866c194 and 48aa22e5 left
display.py with only 134 lines (missing all classes including KawaiiSpinner)
and commands.py with line-number prefixes from cat -n output.

Restore both files from upstream (3a63514) and re-apply the two small
additions our branch actually needed: the /markdown command registration
in commands.py and the skin-aware diff color infrastructure already
present in the upstream display.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…conflict resolution

The 60687a4e conflict resolution removed build_environment_hints from the
prompt_builder import, but run_agent.py still calls it at line 3827.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@lucaspirola lucaspirola force-pushed the feat/markdown-rich-v2 branch from 6bb19bb to 8ed6638 on April 19, 2026 23:15
@lucaspirola
Author

Closing in favour of two focused PRs cleanly rebased onto current main:

  • feat: Rich Markdown rendering → lucaspirola:feat/markdown-pr1-clean
  • feat: concise CLI output + status bar timers → lucaspirola:feat/concise-cli-pr2


Development

Successfully merging this pull request may close these issues.

feat(cli): Add native markdown rendering to the output
[Feature]: Native Markdown and md table rendering in CLI
