[Bug]: Matrix Gateway: Race condition between auto-redaction and message delivery with high-speed models #19075

@mshostack

Bug Description

NOTE: This is Matrix gateway related but your bug report form does not have a Matrix gateway option.

When using low-latency LLM models (I was using gemini-3.1-flash-lite-preview), the Matrix gateway only responds with "response truncated due to output length limit," even when the actual content is well within the 4,000-character limit. I was using the Element X client on Android. This was really frustrating: I could not self-diagnose from my phone over Matrix and had to wait until I was back at my laptop to get access to hermes chat.

Steps to Reproduce

  1. Use a high-speed model (gemini-3.1-flash-lite-preview) and message the bot through the Matrix gateway.
  2. Observe that the bot sends a message, followed immediately by a redaction of the system reactions.
  3. The gateway logs confirm successful delivery (a sent event) followed by a redacted event, but the user interface and logs report a delivery error.
  4. Comment out the redaction, then try again. The message comes through fine.

Expected Behavior

I expected to receive a non-error response to all my messages regardless of length. I even got this error when sending a simple "test" message.

Actual Behavior

"response truncated due to output length limit"

Affected Component

Gateway (Telegram/Discord/Slack/WhatsApp), Other

Messaging Platform (if gateway-related)

No response

Debug Report

I am not comfortable sharing the full debug report: a brief scan showed that some of those logs include PII.

I am a human writing this, this report is not clanker slop reporting.

Operating System

Ubuntu 24.04.4 LTS (Noble Numbat)

Python Version

3.12.3

Hermes Version

v2026.4.30-161-gf98b5d00a

Additional Logs / Traceback (optional)

Root Cause Analysis (optional)

This seems like it may be an architectural race condition in gateway/platforms/matrix.py. The MatrixAdapter performs automated cleanup (redacting 👀 processing reactions and bot-seeded approval ✅ ❌ buttons) in tight succession with the actual message delivery.

With high-speed models, the redaction request appears to reach the Matrix homeserver before, or at nearly the same time as, the message delivery confirmation. This triggers a false positive in the gateway's monitoring/tracking logic: it interprets the sudden "event missing" state (caused by the redaction) as a failure or truncation of the primary message delivery.
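To make the suspected ordering problem concrete, here is a toy model of it. This is not the Hermes code: `ToyDeliveryTracker`, `on_confirmed`, `on_redaction`, and `simulate` are hypothetical names, and the "redaction before confirmation means failure" rule is my assumption about how the tracking logic might behave, not something I have confirmed in gateway/platforms/matrix.py.

```python
import asyncio


class ToyDeliveryTracker:
    """Hypothetical delivery tracker, for illustration only."""

    def __init__(self) -> None:
        self.confirmed = False
        self.status = "pending"

    def on_confirmed(self) -> None:
        # Delivery confirmation arrives from the homeserver.
        self.confirmed = True
        if self.status == "pending":
            self.status = "delivered"

    def on_redaction(self) -> None:
        # Assumed flaw: a redaction seen before the confirmation is
        # misread as the primary message going missing.
        if not self.confirmed:
            self.status = "failed"


async def simulate(confirm_delay: float, redact_delay: float) -> str:
    """Fire both events after the given delays and return the final status."""
    tracker = ToyDeliveryTracker()

    async def confirm() -> None:
        await asyncio.sleep(confirm_delay)
        tracker.on_confirmed()

    async def redact() -> None:
        await asyncio.sleep(redact_delay)
        tracker.on_redaction()

    await asyncio.gather(confirm(), redact())
    return tracker.status


if __name__ == "__main__":
    # Confirmation first: reported as delivered.
    print(asyncio.run(simulate(confirm_delay=0.01, redact_delay=0.05)))
    # Redaction first: the same message is reported as failed.
    print(asyncio.run(simulate(confirm_delay=0.05, redact_delay=0.01)))
```

If the real tracker behaves anything like this, the outcome depends only on which event arrives first, which would match what I saw when commenting out the redaction.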

Proposed Fix (optional)

Decouple the redaction logic from the immediate message delivery/processing loop. A mandatory delay of, say, 5-10 seconds before auto-redaction cleanup might allow the message delivery status to stabilize, preventing the race condition and the resulting false-positive error report.
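A rough sketch of what that decoupling could look like with plain asyncio. Everything here is an assumption for illustration: `redact_after_delivery`, the `delivered` event, and the `redact` callback are hypothetical names, and the real adapter may track delivery differently. The idea is to wait for an explicit delivery confirmation, then a settle delay, and only then run the cleanup.

```python
import asyncio
from typing import Awaitable, Callable


async def redact_after_delivery(
    delivered: asyncio.Event,
    redact: Callable[[], Awaitable[None]],
    settle_seconds: float = 5.0,
    confirm_timeout: float = 30.0,
) -> bool:
    """Run `redact` only after delivery is confirmed plus a settle delay.

    Returns True if the redaction ran, False if no confirmation arrived
    within `confirm_timeout` seconds (the cleanup is then skipped).
    """
    try:
        await asyncio.wait_for(delivered.wait(), timeout=confirm_timeout)
    except asyncio.TimeoutError:
        return False
    await asyncio.sleep(settle_seconds)
    await redact()
    return True


async def demo() -> tuple[bool, list[str]]:
    log: list[str] = []
    delivered = asyncio.Event()

    async def fake_redact() -> None:
        # Stand-in for the real reaction cleanup.
        log.append("redacted")

    # Schedule cleanup in the background so the send path is not blocked.
    task = asyncio.create_task(
        redact_after_delivery(delivered, fake_redact, settle_seconds=0.01)
    )
    log.append("sent")
    delivered.set()  # would be set when the homeserver confirms the event
    log.append("confirmed")
    ran = await task
    return ran, log


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Gating on a confirmation event rather than on a fixed sleep alone means the delay can stay short; the 5-second default is just the lower end of the range suggested above. The caller should keep a reference to the background task so it is not garbage-collected before it runs.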

Are you willing to submit a PR for this?

  • I'd like to fix this myself and submit a PR

Metadata

Assignees

No one assigned

    Labels

    P1: High, major feature broken, no workaround
    comp/gateway: Gateway runner, session dispatch, delivery
    platform/matrix: Matrix adapter (E2EE)
    type/bug: Something isn't working
