This repository was archived by the owner on Feb 7, 2024. It is now read-only.

Randomly "Failed connect to pusher" When Sending Message While Server Short Spike #623

Closed
kelvin195eclipse opened this issue Nov 29, 2020 · 4 comments
Labels: bug (Something isn't working), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@kelvin195eclipse

Hi,

I have been using this package in my web app, running with Laravel Queue and Supervisord, since June, and everything was working great. My app announces lottery results and runs live from 7 to 8 pm every Wednesday and on weekends. During these periods, the app sees a constant high volume of visitors and requests (~5-10 users per second).

However, as my user base has grown recently, I've been facing intermittent "Failed connect to pusher" errors since last month. I checked all the logs to identify the error, and the failures seem to follow a pattern: they happen mostly during short load spikes on the server (load average > 1; I have a 6-core CPU).

What is actually happening in the background that causes these random failures, and what approaches can I try to resolve them?

Thank you!

@rennokki
Collaborator

rennokki commented Dec 1, 2020

Make sure you have enough file descriptors set: https://beyondco.de/docs/laravel-websockets/faq/deploying#open-connection-limit

Also check this: http://socketo.me/docs/deploy#ulimit
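As a rough illustration of what checking the file descriptor limit involves, here is a minimal Python sketch. It assumes a Linux server with `/proc` available; the function names are hypothetical, and the PID of the websockets server process has to be looked up separately (for example from Supervisor). The limit that matters is the one applied to the running server process, which may differ from the limit of an interactive shell.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def process_nofile_limits(pid):
    """Read the (soft, hard) 'Max open files' values of another process.

    Linux only: parses /proc/<pid>/limits. Values are returned as strings
    because the kernel may report 'unlimited'.
    """
    with open(f"/proc/{pid}/limits") as fh:
        for line in fh:
            if line.startswith("Max open files"):
                soft, hard = line.split()[3:5]
                return soft, hard
    return None


def open_fd_count(pid="self"):
    """Count the descriptors a process currently holds (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"this process: soft={soft} hard={hard}")
    if os.path.isdir("/proc/self/fd"):
        print(f"descriptors open in this process: {open_fd_count()}")
```

Comparing `open_fd_count(pid)` with the soft limit from `process_nofile_limits(pid)` during a peak session would show whether the server is actually close to the limit.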

rennokki added the good first issue, help wanted, and network (Issues caused by the network configuration) labels on Dec 1, 2020
@kelvin195eclipse
Author

kelvin195eclipse commented Dec 2, 2020

Thank you for the lead!

I did some research and found a command to check my server's file descriptor status:

[Screenshot, 2020-12-02 3:19:29 PM: output of the file descriptor check; image not preserved]

Is this the correct place to check whether the file descriptor limit is the root cause?

Thanks again!

rennokki added the bug label and removed the network label on Dec 2, 2020
@kelvin195eclipse
Author

Update:

My app has just been through the peak session again, and the "Failed connect to pusher" problem happened 12 times in total. From my observation of the app, this time I confirmed that the clients did update after some delay (even though the job was marked as failed afterwards), which means the failed jobs can be ignored. By the way, I also logged the response status code in PusherBroadcaster.php, and the status was 0 for those failed jobs.

Here are the logged items (note that all messages were sent successfully):

[2020-12-02 19:54:46][20274] Processing: App\Events\UpdateResult
[2020-12-02 19:55:12][20274] Processed: App\Events\UpdateResult
[2020-12-02 19:55:18][20275] Processing: App\Events\UpdateResult
[2020-12-02 19:55:48][20275] Failed: App\Events\UpdateResult
[2020-12-02 19:55:56][20276] Processing: App\Events\UpdateResult
[2020-12-02 19:56:26][20276] Failed: App\Events\UpdateResult
[2020-12-02 19:56:40][20277] Processing: App\Events\UpdateResult
[2020-12-02 19:57:10][20277] Failed: App\Events\UpdateResult
[2020-12-02 19:57:14][20278] Processing: App\Events\UpdateResult
[2020-12-02 19:57:44][20278] Failed: App\Events\UpdateResult
[2020-12-02 19:57:54][20279] Processing: App\Events\UpdateResult
[2020-12-02 19:58:23][20279] Processed: App\Events\UpdateResult
[2020-12-02 19:58:27][20280] Processing: App\Events\UpdateResult
[2020-12-02 19:58:57][20280] Failed: App\Events\UpdateResult
[2020-12-02 19:59:08][20281] Processing: App\Events\UpdateResult
[2020-12-02 19:59:38][20281] Failed: App\Events\UpdateResult
[2020-12-02 19:59:38][20282] Processing: App\Events\UpdateResult
[2020-12-02 20:00:04][20283] Processing: App\Events\UpdateResult
[2020-12-02 20:00:08][20282] Failed: App\Events\UpdateResult
[2020-12-02 20:00:20][20284] Processing: App\Events\UpdateResult
[2020-12-02 20:00:34][20283] Failed: App\Events\UpdateResult
[2020-12-02 20:00:50][20284] Failed: App\Events\UpdateResult
[2020-12-02 20:04:09][20285] Processing: App\Events\UpdateResult
[2020-12-02 20:04:27][20285] Processed: App\Events\UpdateResult
[2020-12-02 20:04:45][20286] Processing: App\Events\UpdateResult
[2020-12-02 20:04:59][20287] Processing: App\Events\UpdateResult
[2020-12-02 20:05:06][20288] Processing: App\Events\UpdateResult
[2020-12-02 20:05:11][20286] Processed: App\Events\UpdateResult
[2020-12-02 20:05:15][20289] Processing: App\Events\UpdateResult
[2020-12-02 20:05:24][20287] Processed: App\Events\UpdateResult
[2020-12-02 20:05:36][20288] Failed: App\Events\UpdateResult
[2020-12-02 20:05:45][20289] Failed: App\Events\UpdateResult

Thank you!
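One way to look for a pattern in the log above is to pair each Processing line with its Processed or Failed line and compute the elapsed time. The Python sketch below is only an illustration: the function name `job_durations` and the regular expression are assumptions written for this log layout, and the sample holds just the first four lines of the log.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2020-12-02 19:55:18][20275] Processing: App\Events\UpdateResult
LINE_RE = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:]+)\]\[(?P<job>\d+)\] "
    r"(?P<state>Processing|Processed|Failed):"
)


def job_durations(log_text):
    """Return {job_id: (final_state, seconds)} for every completed job."""
    started = {}
    finished = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue  # skip blank or unrelated lines
        ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S")
        job = match["job"]
        if match["state"] == "Processing":
            started[job] = ts
        elif job in started:
            elapsed = int((ts - started[job]).total_seconds())
            finished[job] = (match["state"], elapsed)
    return finished


# First four lines of the log, copied verbatim (raw string keeps backslashes).
SAMPLE = r"""
[2020-12-02 19:54:46][20274] Processing: App\Events\UpdateResult
[2020-12-02 19:55:12][20274] Processed: App\Events\UpdateResult
[2020-12-02 19:55:18][20275] Processing: App\Events\UpdateResult
[2020-12-02 19:55:48][20275] Failed: App\Events\UpdateResult
"""

if __name__ == "__main__":
    for job, (state, seconds) in job_durations(SAMPLE).items():
        print(f"{job}: {state} after {seconds}s")
```

Applied to the full log, every Failed job ends exactly 30 seconds after it started, while the Processed jobs finish in 18 to 29 seconds. That would be consistent with a timeout being hit rather than a refused connection, though nothing in this thread confirms which timeout is involved.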

@rennokki
Collaborator

rennokki commented Dec 7, 2020

@kelvin195eclipse If you run with Supervisor, please add a stdout_logfile as shown here: https://laravel.com/docs/8.x/queues#configuring-supervisor and, after the peak hour passes, check whether any errors appear. To me, it looks like the websockets app crashes under heavy load, and since the logging you have comes from Horizon, it still tells us nothing relevant. We need to know what Websockets is actually doing and whether it crashes.
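Once a stdout_logfile is in place, scanning it after the peak hour can be as simple as the Python sketch below. Everything here is an assumption for illustration: the function name, the keyword list, and whatever path is passed in; the actual wording of the websockets server's output is not shown in this thread, so the pattern may need adjusting.

```python
import re

# Generic keywords only; adjust to whatever the server actually prints.
ERROR_RE = re.compile(r"error|exception|fatal", re.IGNORECASE)


def find_error_lines(path):
    """Return (line_number, text) pairs for lines that look like errors."""
    hits = []
    with open(path, encoding="utf-8", errors="replace") as fh:
        for number, line in enumerate(fh, start=1):
            if ERROR_RE.search(line):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Calling `find_error_lines` with the path configured as stdout_logfile returns the suspicious lines together with their line numbers, which makes it easier to match them against the failed jobs by time.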
