This repository was archived by the owner on Nov 6, 2020. It is now read-only.

Race Condition and incorrectly dropped consensus messages #11082

@dforsten


In ethcore/src/client/client.rs, the queue() function of IoChannelQueue does not guarantee that self.currently_queued is incremented before it is decremented.

This can cause self.currently_queued to underflow, so that consensus messages are incorrectly dropped when new consensus messages are processed on another thread at the same time:

2019-09-19 09:17:54 UTC IO Worker #0 DEBUG sync  0 -> Dispatching packet: 21
2019-09-19 09:17:54 UTC IO Worker #0 TRACE sync  Received consensus packet from 0
2019-09-19 09:17:54 UTC IO Worker #0 DEBUG poa  Ignoring the message, error queueing: The queue is full (18446744073709551615)
2019-09-19 09:17:54 UTC IO Worker #0 DEBUG sync  0 -> Dispatching packet: 21
2019-09-19 09:17:54 UTC IO Worker #0 TRACE sync  Received consensus packet from 0
2019-09-19 09:17:54 UTC IO Worker #0 DEBUG poa  Ignoring the message, error queueing: The queue is full (18446744073709551615)
2019-09-19 09:17:54 UTC IO Worker #0 DEBUG sync  0 -> Dispatching packet: 21
2019-09-19 09:17:54 UTC IO Worker #0 TRACE sync  Received consensus packet from 0
2019-09-19 09:17:54 UTC IO Worker #0 DEBUG poa  Ignoring the message, error queueing: The queue is full (18446744073709551615)
2019-09-19 09:17:54 UTC IO Worker #2 DEBUG sync  0 -> Dispatching packet: 21
2019-09-19 09:17:54 UTC IO Worker #2 TRACE sync  Received consensus packet from 0
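
The queue size reported in the log, 18446744073709551615, is usize::MAX on a 64-bit target, which is what a wrapping decrement of zero produces. The following is a minimal sketch of that ordering, not the actual client code: the function names and the limit value are hypothetical, and only the standard library's AtomicUsize is used.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Simulates the problematic ordering: the queued closure runs (and
/// decrements) before the sending thread has incremented the counter.
/// Hypothetical helper for illustration only.
fn counter_after_early_decrement(count: usize) -> usize {
    let currently_queued = AtomicUsize::new(0);
    // Atomic fetch_sub wraps around on underflow instead of panicking.
    currently_queued.fetch_sub(count, Ordering::SeqCst);
    currently_queued.load(Ordering::SeqCst)
}

/// The kind of size check that rejects new messages (hypothetical name).
fn is_full(queued: usize, limit: usize) -> bool {
    queued >= limit
}
```

With a count of 1 the counter reads usize::MAX until the delayed increment wraps it back to zero. Any size check performed on another thread during that window sees a "full" queue, whatever the configured limit is.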

We have encountered this issue quite frequently when deploying Honey Badger validators. It causes the Honey Badger consensus engine to become stuck in configurations with low node counts.

Removing the queue size check promptly fixed the issue. One possible fix for the underflow is to increment self.currently_queued before calling channel.send(), and to decrement it again if channel.send() returns an error.
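
A rough sketch of that ordering is shown below. It is not the real IoChannelQueue: std::sync::mpsc stands in for the IO channel, and BoundedQueue, Job, QueueError, queued() and example() are hypothetical names chosen for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::mpsc::{channel, Sender};
use std::sync::Arc;

// Hypothetical stand-in for a message executed on a worker thread.
type Job = Box<dyn FnOnce() + Send + 'static>;

#[derive(Debug, PartialEq)]
enum QueueError {
    Full(usize),
    Channel,
}

struct BoundedQueue {
    currently_queued: Arc<AtomicUsize>,
    limit: usize,
}

impl BoundedQueue {
    fn new(limit: usize) -> Self {
        BoundedQueue { currently_queued: Arc::new(AtomicUsize::new(0)), limit }
    }

    fn queued(&self) -> usize {
        self.currently_queued.load(Ordering::SeqCst)
    }

    fn queue<F>(&self, sender: &Sender<Job>, count: usize, fun: F) -> Result<(), QueueError>
    where
        F: FnOnce() + Send + 'static,
    {
        let queue_size = self.currently_queued.load(Ordering::Relaxed);
        if queue_size >= self.limit {
            return Err(QueueError::Full(queue_size));
        }

        // Increment BEFORE sending, so the decrement inside the job can
        // never run ahead of the matching increment.
        self.currently_queued.fetch_add(count, Ordering::SeqCst);

        let counter = Arc::clone(&self.currently_queued);
        let job: Job = Box::new(move || {
            counter.fetch_sub(count, Ordering::SeqCst);
            fun();
        });

        sender.send(job).map_err(|_| {
            // The job was never queued, so roll the increment back.
            self.currently_queued.fetch_sub(count, Ordering::SeqCst);
            QueueError::Channel
        })
    }
}

/// Queues two jobs, then runs them; returns the counter before and after.
fn example() -> (usize, usize) {
    let (sender, receiver) = channel::<Job>();
    let queue = BoundedQueue::new(8);
    queue.queue(&sender, 1, || {}).unwrap();
    queue.queue(&sender, 1, || {}).unwrap();
    let before = queue.queued();
    while let Ok(job) = receiver.try_recv() {
        job();
    }
    (before, queue.queued())
}
```

This only addresses the underflow. The size check and the increment are still two separate atomic operations, so concurrent callers could briefly push the counter slightly above the limit in this sketch.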

Labels

F2-bug 🐞: The client fails to follow expected behavior.
M4-core ⛓: Core client code / Rust.
Q2-easy 💃: Can be fixed by copy and pasting from StackOverflow.
