fix(prompt): allow ZWJ emoji in context files #18616

Open
liuhao1024 wants to merge 1 commit into NousResearch:main from liuhao1024:fix/issue-18581-zwj-emoji-allowed
@liuhao1024
Contributor

Summary

  • Remove U+200D (Zero Width Joiner) from _CONTEXT_INVISIBLE_CHARS so that standard Unicode emoji sequences no longer silently block entire context files
  • Add regression tests: ZWJ emoji passes through, other invisible chars still blocked

Problem

_CONTEXT_INVISIBLE_CHARS in agent/prompt_builder.py included \u200d (ZWJ), which is the standard Unicode mechanism for gendered and compound emoji (e.g. U+1F938 U+200D U+2640 U+FE0F for woman cartwheeling). When any context file (SOUL.md, AGENTS.md, HERMES.md, etc.) contained a ZWJ emoji, _scan_context_content() replaced the entire file with a [BLOCKED: ...] message, silently losing all context.
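
To make the trigger concrete, the following minimal sketch spells out the code points of the woman-cartwheeling sequence quoted above, using only the standard library. The `describe` helper is a hypothetical name introduced for illustration and is not part of the repository.

```python
import unicodedata

# Woman cartwheeling, written code point by code point as in the PR description:
# U+1F938, U+200D (ZWJ), U+2640, U+FE0F.
WOMAN_CARTWHEELING = "\U0001F938\u200D\u2640\uFE0F"


def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' entry per code point in text (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


for entry in describe(WOMAN_CARTWHEELING):
    print(entry)

# The joiner is an ordinary code point inside the string, so a per-character
# membership check against a set containing "\u200d" flags the whole emoji.
print("\u200d" in WOMAN_CARTWHEELING)
```

A single visible glyph is therefore four code points long, and the second one is the character the scanner was matching on.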

Fix

Remove \u200d from the set. The remaining characters are all legitimately suspicious in context files:

| Character | Name | Why suspicious |
| --- | --- | --- |
| \u200b | Zero-width space | Steganography, hidden text |
| \u200c | Zero-width non-joiner | Rare in normal text |
| \u2060 | Word joiner | Rare in normal text |
| \ufeff | BOM / ZWNBSP | Suspicious mid-text |
| \u202a-\u202e | Directional overrides | Prompt injection vector |

ZWJ (\u200d) is different: it is the Unicode Consortium's standard mechanism for joining emoji into sequences. Blocking it censors legitimate user content.
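
The effect of the change can be sketched as follows. This is a rough stand-in under stated assumptions, not the code in agent/prompt_builder.py: the names `INVISIBLE_BEFORE`, `INVISIBLE_AFTER`, and `scan`, the wording of the placeholder, and the reading of the last table row as the full U+202A to U+202E range are all assumptions made for illustration.

```python
# Hypothetical stand-ins; the real set and scanner live in agent/prompt_builder.py
# and may differ in shape and message format.
INVISIBLE_BEFORE = frozenset(
    "\u200b\u200c\u200d\u2060\ufeff"      # zero-width chars, word joiner, BOM
    "\u202a\u202b\u202c\u202d\u202e"      # assumed range of directional controls
)
INVISIBLE_AFTER = INVISIBLE_BEFORE - {"\u200d"}  # the fix: drop ZWJ only


def scan(content: str, filename: str, blocked: frozenset) -> str:
    """Return content unchanged, or a placeholder if it holds a blocked character."""
    found = sorted({f"U+{ord(ch):04X}" for ch in content if ch in blocked})
    if found:
        return f"[BLOCKED: {filename} contains invisible characters {', '.join(found)}]"
    return content


soul = "Be playful \U0001F938\u200D\u2640\uFE0F and concise."

# With ZWJ in the set, the whole file collapses to a placeholder.
print(scan(soul, "SOUL.md", INVISIBLE_BEFORE))

# Without it, the emoji passes through; a zero-width space is still caught.
print(scan(soul, "SOUL.md", INVISIBLE_AFTER) == soul)
print(scan("ignore\u200bprevious", "AGENTS.md", INVISIBLE_AFTER))
```

Removing one element from the set leaves every other check untouched, which is why the change is small enough to review by inspection.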

Tests

  • test_zwj_emoji_allowed_in_context: ZWJ emoji in SOUL.md passes through unchanged
  • test_other_invisible_chars_still_blocked: All other invisible characters remain blocked
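
The two tests might look roughly like the sketch below. Only the test names come from this PR; the `load_context_file` loader, the `BLOCKED` set, and the placeholder text are hypothetical stand-ins so the example runs on its own, and the real tests presumably exercise the project's own prompt builder instead.

```python
import tempfile
from pathlib import Path

# Post-fix set as described in the PR (assuming the full U+202A..U+202E range);
# U+200D is deliberately absent.
BLOCKED = frozenset("\u200b\u200c\u2060\ufeff\u202a\u202b\u202c\u202d\u202e")


def load_context_file(path: Path) -> str:
    """Hypothetical loader: return file text, or a placeholder if it is flagged."""
    text = path.read_text(encoding="utf-8")
    if any(ch in BLOCKED for ch in text):
        return f"[BLOCKED: {path.name}]"
    return text


def test_zwj_emoji_allowed_in_context() -> None:
    body = "I love cartwheels \U0001F938\u200D\u2640\uFE0F\n"
    with tempfile.TemporaryDirectory() as tmp:
        soul = Path(tmp) / "SOUL.md"
        soul.write_text(body, encoding="utf-8")
        # The ZWJ sequence must survive a round trip through the loader.
        assert load_context_file(soul) == body


def test_other_invisible_chars_still_blocked() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        soul = Path(tmp) / "SOUL.md"
        for ch in sorted(BLOCKED):
            soul.write_text(f"hidden{ch}text", encoding="utf-8")
            # Every remaining invisible character still blocks the file.
            assert load_context_file(soul).startswith("[BLOCKED:"), f"U+{ord(ch):04X}"


test_zwj_emoji_allowed_in_context()
test_other_invisible_chars_still_blocked()
```

Pairing a positive test with a negative one guards against the fix being widened later into dropping the check altogether.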

Closes #18581

U+200D (Zero Width Joiner) is the standard Unicode mechanism for
gendered and compound emoji sequences.  The invisible-char scanner
blocked entire context files (SOUL.md, AGENTS.md, etc.) when any
ZWJ was present, silently censoring legitimate emoji.

Remove U+200D from _CONTEXT_INVISIBLE_CHARS while keeping all other
truly suspicious invisible characters (zero-width space, directional
overrides, BOM).  Add regression tests for both the allowed ZWJ path
and the continued blocking of the remaining invisible characters.

Closes NousResearch#18581
alt-glitch added the type/bug (Something isn't working), P1 (High: major feature broken, no workaround), and comp/agent (Core agent loop, run_agent.py, prompt builder) labels on May 2, 2026
Cyrene963 pushed a commit to Cyrene963/hermes-agent that referenced this pull request May 3, 2026
Community PRs applied:
- NousResearch#18596: Enable secret redaction by default (SECURITY)
- NousResearch#18650: Sanitize malformed tool messages + auto-recover on API 400
- NousResearch#18607: Emergency compression before max_iterations exhaustion
- NousResearch#18603: Compression fallback to main model on 413 rate limit
- NousResearch#18638: Pass threshold_percent on model switch
- NousResearch#18663: Strip extra_content from tool_calls for strict APIs
- NousResearch#18618: Forward explicit_api_key to OpenRouter
- NousResearch#18632: Show cache tokens in /insights breakdown
- NousResearch#18614: Add idempotency guard for patch duplicate loops
- NousResearch#18600: Raise ValueError when HERMES_HOME unset in profile mode
- NousResearch#18616: Allow ZWJ emoji in context files
- NousResearch#18582: Reload .env on /restart
- NousResearch#18547: Stabilize system prompt prefix for KV cache reuse
- NousResearch#18692: Strip FTS5 operators from session search truncation terms

Fix: Add order_by_last_active=True to list_sessions_rich call
(pre-existing commit 142b4bf code sync)
Development

Successfully merging this pull request may close these issues.

🤸‍♀️ SOUL.md blocked by ZWJ emoji — cartwheel gymnast triggers prompt injection filter
