chore: Add a test to validate Terminated message handling in Cluster Sharding #32771

@JustinPihony

Description

A race condition was tested with a local reproducer for #32766, but this scenario should be covered as part of the regular testing cadence.

The first Terminated signal, which is handled specially in #32756, actually ends up being put behind other stashed messages, delaying its processing.
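The ordering effect described above can be pictured with a small toy model. This is only a sketch under stated assumptions: it is not the sharding coordinator's code and does not reflect the actual change in #32756. The function `process`, the flag `stash_terminated`, and the message names `Request-1`, `Request-2`, `Request-3`, and `UpdateDone` are all hypothetical and exist only for illustration. The model assumes an actor that stashes incoming messages while a state update is in flight and replays them in arrival order once the update completes.

```python
from collections import deque


def process(mailbox, stash_terminated):
    """Toy model of an actor that stashes messages during a state update.

    mailbox: message names in arrival order; "UpdateDone" ends the update.
    stash_terminated: if True, "Terminated" is stashed like any other message;
                      if False, it is handled as soon as it arrives.
    Returns the order in which messages are actually handled.
    """
    handled = []
    stash = deque()
    updating = True  # assumption: a state update is already in progress

    for msg in mailbox:
        if not updating:
            handled.append(msg)
        elif msg == "UpdateDone":
            updating = False
            handled.append(msg)
            # Replay stashed messages in their original arrival order.
            while stash:
                handled.append(stash.popleft())
        elif msg == "Terminated" and not stash_terminated:
            # Hypothetical special case: do not queue behind the stash.
            handled.append(msg)
        else:
            stash.append(msg)

    return handled


# Hypothetical arrival order: Terminated arrives while the update is ongoing.
mailbox = ["Request-1", "Request-2", "Terminated", "UpdateDone", "Request-3"]
delayed = process(mailbox, stash_terminated=True)
prompt = process(mailbox, stash_terminated=False)
```

In this model, `delayed` handles `Terminated` only after both stashed requests have been replayed, while `prompt` handles it first. A regression test for the real behavior would need to assert on the corresponding ordering in the actual cluster, which this sketch does not attempt.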

Reproducing it required at least 9 nodes, with 3 being terminated. The main factor, though, was probably having shard allocation ongoing while the nodes were being terminated, so that the Terminated signals were received during a state update.
