RabbitMQ delays writing data to the socket until a heartbeat is sent. #1228

Closed
@pydevd

Environment:
 - Vagrant (1.9.1) Ubuntu 14.04:
     - Docker (version 1.12.5):
         - RabbitMQ v3.6.5
         - Client (Celery app v4.0.2, Python 2.7)
         - Server (Celery app v4.0.2, Python 3.5)

Use case: functional tests.

Workflow #1:
 1. py.test (TestApp) starts RabbitMQ and the Server application in Docker.
 2. TestApp registers a "new" Client by sending a task to the Server.
 3. TestApp starts the Client application in Docker.
 4. Client and Server perform a "handshake" using their own protocol.
 5. TestApp sends a task (TestTask) to the Client for test purposes.
 6. Client receives the task immediately and executes it.


Workflow #2:
 1. py.test (TestApp) starts RabbitMQ and the Server application in Docker.
 2. TestApp registers an "active" Client by sending a task to the Server.
 3. TestApp starts the Client application in Docker.
 4. TestApp sends a task (TestTask) to the Client for test purposes.
 5. Client receives the task after 60 seconds (THIS IS THE PROBLEM).


Workflow #3:
 1. py.test (TestApp) starts RabbitMQ and the Server application in Docker.
 2. TestApp registers an "active" Client by sending a task to the Server.
 3. TestApp starts the Client application in Docker.
 4. TestApp sleeps for 20 seconds.
 5. TestApp sends a task (TestTask) to the Client for test purposes.
 6. Client receives the task immediately and executes it.

What I've researched:

When the Celery app establishes a connection with RabbitMQ, the two sides negotiate a heartbeat timeout interval.
In my setup it is 60 seconds, the value set in the Celery configuration for the Client.
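
To make the negotiation concrete, here is a minimal sketch of the rule as I understand it from the RabbitMQ heartbeat documentation: if both sides propose a non-zero timeout, the lower value is used, and if either side proposes 0, the greater value is used. The function name `negotiate_heartbeat` is my own and is not part of Celery or py-amqp.

```python
def negotiate_heartbeat(client_timeout, server_timeout):
    """Return the heartbeat timeout (seconds) both peers would agree on.

    Assumption: non-zero values resolve to the lower one, and a zero
    on either side resolves to the greater one.
    """
    if client_timeout == 0 or server_timeout == 0:
        return max(client_timeout, server_timeout)
    return min(client_timeout, server_timeout)


# With the values from this report (Client configured for 60 seconds):
agreed = negotiate_heartbeat(60, 60)
```

Under this rule a Client configured for 60 seconds ends up with a 60 second interval unless the broker proposes something shorter.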

While debugging Celery internals in Workflow #2, I found that "epoll" reports the RabbitMQ connection socket as "ready for read"
only after RabbitMQ sends its heartbeat on this connection (after 60 seconds in my configuration).
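
The observation above can be illustrated without Celery or RabbitMQ. This is only a sketch of the mechanism, not a reproduction of the bug: a local `socket.socketpair()` stands in for the broker connection, and the standard `selectors` module (which uses epoll on Linux) stands in for the event loop. The helper name `wait_readable` is hypothetical. The bytes written are an AMQP 0-9-1 heartbeat frame (type 8, channel 0, empty payload, frame end 0xCE).

```python
import selectors
import socket


def wait_readable(sock, timeout):
    """Return True if sock is reported as ready for read within timeout seconds."""
    sel = selectors.DefaultSelector()
    try:
        sel.register(sock, selectors.EVENT_READ)
        return bool(sel.select(timeout))
    finally:
        sel.close()


# Stand-ins for the Client's socket and the broker's end of the connection.
client_side, broker_side = socket.socketpair()

# Nothing has been written yet, so the poll times out.
ready_before = wait_readable(client_side, 0.1)

# The "broker" writes a heartbeat frame; only now does the socket become readable.
broker_side.sendall(b"\x08\x00\x00\x00\x00\x00\x00\xce")
ready_after = wait_readable(client_side, 0.1)

client_side.close()
broker_side.close()
```

The poller can only report what has actually arrived on the socket, so if the task message is not delivered until the heartbeat goes out, the Client has nothing to wake up for before then.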

Workflows #1 and #3 show that if there is a short delay between Client startup and sending TestTask
(the handshake or a synthetic delay), everything works perfectly.

I have no idea what causes this behavior. I need tasks to be executed as soon as they are sent/retrieved, not after this long delay.

I can fix the tests by decreasing the heartbeat timeout interval, but this is not an option for production.
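
For reference, the test-only workaround could look roughly like this. It is a sketch, not a fix: the environment variable `CLIENT_BROKER_HEARTBEAT` and the helper `heartbeat_from_env` are hypothetical names, and the only real Celery setting involved is `broker_heartbeat`.

```python
import os

PRODUCTION_HEARTBEAT = 60  # seconds, the value used in this report


def heartbeat_from_env(environ=os.environ):
    """Return the heartbeat interval, letting the test run override the default."""
    raw = environ.get("CLIENT_BROKER_HEARTBEAT")  # hypothetical variable name
    return int(raw) if raw else PRODUCTION_HEARTBEAT


# In the Client's Celery configuration the value would feed the real setting:
#   app.conf.broker_heartbeat = heartbeat_from_env()
```

The functional tests would set the variable to a small value when starting the Client container, while production keeps 60 seconds.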

What can you suggest?
