A long queue of to-device messages can prevent outgoing federation working #17035

@shyrwall

Description

Hi

This will be a vague bug report, so I'm hoping someone will have an "aha" moment when reading it.

For the past week my homeserver has been unable to send outgoing federation messages to matrix.org. Right after starting Synapse some messages go through, but after a few seconds it stops and goes into a retry loop.
During these retries matrix.org, or rather Cloudflare, returns an HTTP 520 error.

Upon further inspection I managed to narrow it down to an m.direct_to_device (a very large JSON object) being posted by a single user.
After deleting the event from device_federation_outbox, everything worked again.
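
For anyone trying to find a similar entry, here is a minimal sketch of the size check I mean. It assumes the rows have already been fetched from device_federation_outbox as (destination, stream_id, messages_json) tuples; that row shape, the function name find_oversized and the 1024-byte threshold are illustrative assumptions, not anything Synapse defines.

```python
import json
from typing import Iterable, List, Tuple

# Hypothetical row shape: (destination, stream_id, messages_json).
# Check the column names against your own schema before relying on this.
Row = Tuple[str, int, str]


def find_oversized(rows: Iterable[Row], limit_bytes: int) -> List[Tuple[str, int, int]]:
    """Return (destination, stream_id, size_in_bytes) for rows above limit_bytes, largest first."""
    hits = []
    for destination, stream_id, messages_json in rows:
        # Measure the encoded size, since that is what goes over the wire.
        size = len(messages_json.encode("utf-8"))
        if size > limit_bytes:
            hits.append((destination, stream_id, size))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Made-up sample data standing in for real outbox rows.
rows = [
    ("matrix.org", 101, json.dumps({"messages": {"@a:example.org": {"DEV": {"k": "v"}}}})),
    ("matrix.org", 102, json.dumps({"messages": {"@b:example.org": {"DEV": {"k": "x" * 5000}}}})),
]
print(find_oversized(rows, limit_bytes=1024))
```

The threshold is arbitrary; I don't know what request size matrix.org or Cloudflare actually accepts.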

My theory is that the object was too large, so matrix.org/Cloudflare returned an error and Synapse just kept retrying.

If this is correct, is there a bug in Synapse, in that it should somehow split this into multiple requests?
Or is it a matrix.org bug, in that it has a low limit on request size?
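
To illustrate what I mean by splitting, here is a rough sketch, not how Synapse currently behaves. It assumes the large object is a mapping of user ID to device ID to message content, and greedily packs entries into chunks that stay under a byte limit. The names split_messages and encoded_size and the limit itself are hypothetical, and each chunk would presumably have to be sent as its own EDU in its own request.

```python
import json
from typing import Any, Dict, List

# Assumed shape: user_id -> device_id -> message content.
Messages = Dict[str, Dict[str, Any]]


def encoded_size(obj: Any) -> int:
    """Size in bytes of the compact JSON encoding of obj."""
    return len(json.dumps(obj, separators=(",", ":")).encode("utf-8"))


def split_messages(messages: Messages, max_bytes: int) -> List[Messages]:
    """Greedily split messages into chunks whose encoding fits in max_bytes.

    A single per-device message that is larger than max_bytes on its own
    cannot be split further here, so it ends up alone in its own chunk.
    """
    chunks: List[Messages] = []
    current: Messages = {}
    for user_id, devices in messages.items():
        for device_id, content in devices.items():
            # Tentatively add the entry, then back it out if the chunk grew too large.
            current.setdefault(user_id, {})[device_id] = content
            has_other_entries = len(current) > 1 or len(current[user_id]) > 1
            if encoded_size(current) > max_bytes and has_other_entries:
                del current[user_id][device_id]
                if not current[user_id]:
                    del current[user_id]
                chunks.append(current)
                current = {user_id: {device_id: content}}
    if current:
        chunks.append(current)
    return chunks


# Made-up example: five users with one ~400-byte message each, 1000-byte limit.
example = {f"@u{i}:example.org": {"DEV": {"body": "x" * 400}} for i in range(5)}
for chunk in split_messages(example, max_bytes=1000):
    print(sorted(chunk), encoded_size(chunk))
```

Whether the receiving side would treat several smaller messages the same as one large one is something I can't confirm.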

Attaching the deleted event.

Thank you
bad_edu.log

Steps to reproduce

Homeserver

xmr.se

Synapse Version

Multiple versions tested, 1.99 and up. Currently 1.103.0

Installation Method

pip (from PyPI)

Database

postgresql. Single, no, no

Workers

Multiple workers

Platform

Not relevant.

Configuration

No response

Relevant log output

2024-03-25 19:22:26,936 - synapse.http.matrixfederationclient - 755 - INFO - federation_transaction_transmission_loop-24 - {PUT-O-29} [matrix.org] Got response headers: 520
2024-03-25 19:22:26,936 - synapse.http.matrixfederationclient - 798 - INFO - federation_transaction_transmission_loop-24 - {PUT-O-29} [matrix.org] Request failed: PUT matrix-federation://matrix.org/_matrix/federation/v1/send/1711389457952: HttpResponseException('520: ')

Anything else that would be useful to know?

No response
