Skip to content
This repository was archived by the owner on Apr 26, 2024. It is now read-only.
This repository was archived by the owner on Apr 26, 2024. It is now read-only.

CS-API /messages requests fail on develop when using postgres due to NumericValueOutOfRange if some events have repeatedly failed to backfill #13929

Closed
@DMRobertson

Description

@DMRobertson

https://sentry.tools.element.io/organizations/element/issues/33447/?project=2&query=is%3Aunresolved

Running this query:

SELECT backward_extrem.event_id, event.depth FROM events AS event
/**
* Get the edge connections from the event_edges table
* so we can see whether this event's prev_events points
* to a backward extremity in the next join.
*/
INNER JOIN event_edges AS edge
ON edge.event_id = event.event_id
/**
* We find the "oldest" events in the room by looking for
* events connected to backwards extremeties (oldest events
* in the room that we know of so far).
*/
INNER JOIN event_backward_extremities AS backward_extrem
ON edge.prev_event_id = backward_extrem.event_id
/**
* We use this info to make sure we don't retry to use a backfill point
* if we've already attempted to backfill from it recently.
*/
LEFT JOIN event_failed_pull_attempts AS failed_backfill_attempt_info
ON
failed_backfill_attempt_info.room_id = backward_extrem.room_id
AND failed_backfill_attempt_info.event_id = backward_extrem.event_id
WHERE
backward_extrem.room_id = ?
/* We only care about non-state edges because we used to use
* `event_edges` for two different sorts of "edges" (the current
* event DAG, but also a link to the previous state, for state
* events). These legacy state event edges can be distinguished by
* `is_state` and are removed from the codebase and schema but
* because the schema change is in a background update, it's not
* necessarily safe to assume that it will have been completed.
*/
AND edge.is_state is ? /* False */
/**
* Exponential back-off (up to the upper bound) so we don't retry the
* same backfill point over and over. ex. 2hr, 4hr, 8hr, 16hr, etc.
*
* We use `1 << n` as a power of 2 equivalent for compatibility
* with older SQLites. The left shift equivalent only works with
* powers of 2 because left shift is a binary operation (base-2).
* Otherwise, we would use `power(2, n)` or the power operator, `2^n`.
*/
AND (
failed_backfill_attempt_info.event_id IS NULL
OR ? /* current_time */ >= failed_backfill_attempt_info.last_attempt_ts + /*least*/%s((1 << failed_backfill_attempt_info.num_attempts) * ? /* step */, ? /* upper bound */)
)
/**
* Sort from highest to the lowest depth. Then tie-break on
* alphabetical order of the event_ids so we get a consistent
* ordering which is nice when asserting things in tests.
*/
ORDER BY event.depth DESC, backward_extrem.event_id DESC

with args

[['!redacted_room_name:example.com', False, 1664362899049, 3600000, 604800000]]

from

txn.execute(
sql % (least_function,),
(
room_id,
False,
self._clock.time_msec(),
1000 * BACKFILL_EVENT_EXPONENTIAL_BACKOFF_STEP_SECONDS,
1000 * BACKFILL_EVENT_BACKOFF_UPPER_BOUND_SECONDS,
),
)

Failed handle request via "RoomMessageListRestServlet": "<XForwardedForRequest at 0x7fba292b3490 method='GET' uri='/_matrix/client/r0/rooms/!redacted_room_id:example.com/messages?from=s3305809885_757284974_68989_1616590947_1642068190_3598845_602714760_5320124732_0&dir=b&limit=50&filter=%7B%22lazy_load_members%22%3Atrue%7D' clientproto='HTTP/1.1' site='11106'>"

Running on 1.68.0 (b=matrix-org-hotfixes,3f30bdca19)

Metadata

Metadata

Assignees

Labels

A-Messages-Endpoint/messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill)O-UncommonMost users are unlikely to come across this or unexpected workflowS-MajorMajor functionality / product severely impaired, no satisfactory workaround.T-DefectBugs, crashes, hangs, security vulnerabilities, or other reported issues.X-RegressionSomething broke which worked on a previous releaseX-Release-BlockerMust be resolved before making a release

Type

No type

Projects

No projects

Relationships

None yet

Development

No branches or pull requests

Issue actions