Skip to content
This repository was archived by the owner on Feb 7, 2024. It is now read-only.

[BUG]: Duplicate connections when the previous connection was not closed. #724

Closed
myckhel opened this issue Mar 26, 2021 · 16 comments
Closed

Comments

@myckhel
Copy link

myckhel commented Mar 26, 2021

I have duplicate connections and subscriptions to channels in my redis database which is causing some huge bugs.

I will first of all ask

How do we handle connections that are not active for some time?

when i make a new websocket connection i and subscribe to eg. presence-user.1-online
i think my subscription will be saved in redis db presence-user.1-online/users with:

459436735.214668603 {"user_id":2002,"user_info":true}

now if i unsubscribe from presence-user.1-online the 459436735.214668603 | {"user_id":2002,"user_info":true} will be deleted.

But bug happens if my disconnection was not handled.
e.g
if my internet goes of and i close the app then we should expect that the websocket connection was not closed when the app closed.
which means i will still have my subscription persisted in the database: 459436735.214668603 | {"user_id":2002,"user_info":true}

now i have internet and i launched the app again
i got a new websocket connection 459436736.214668604 and i tried to subscribe to the same channel presence-user.1-online
now i have duplicate subscriptions:

459436735.214668603 {"user_id":2002,"user_info":true}
459436736.214668604 {"user_id":2002,"user_info":true}
-- --

when i unsubscribe i will have:

459436735.214668603 {"user_id":2002,"user_info":true}

i think this connection that was not closed will be there in db as long as possible causing unexpected issues.

but when i discovered this issue i was thinking this case should have been handled using something similar to the ping pong
maybe if the server pings the connection and does not respond within some amount of time then all the instances of the connection should be deleted.

i think something similar to this approach will avoid duplicate connections and cleanup unused resources in the db.

@myckhel myckhel changed the title BUG: Duplicate connections when the previous connection was not closed. [BUG]: Duplicate connections when the previous connection was not closed. Mar 26, 2021
@simonbuehler
Copy link
Contributor

should be fixed in #708

@myckhel
Copy link
Author

myckhel commented Mar 30, 2021

should be fixed in #708

This pull request has not solved the issue.

I hope you understood my explanations.

@simonbuehler
Copy link
Contributor

#708 aimed at the local Channelmanager, maybe something similar is hidden in the redis manager

@rennokki rennokki reopened this Mar 31, 2021
@myckhel
Copy link
Author

myckhel commented Mar 31, 2021

@rennokki
Copy link
Collaborator

@myckhel Starting with a Redis database fixes the job? It seems like the Lock might already exist in Redis and won't pass the checks: https://github.com/simonbuehler/laravel-websockets/blob/0dc2db12c43a344249149f26815f20edf96ff421/src/ChannelManagers/RedisChannelManager.php#L378

@myckhel
Copy link
Author

myckhel commented Apr 19, 2021

@myckhel Starting with a Redis database fixes the job? It seems like the Lock might already exist in Redis and won't pass the checks: https://github.com/simonbuehler/laravel-websockets/blob/0dc2db12c43a344249149f26815f20edf96ff421/src/ChannelManagers/RedisChannelManager.php#L378

You meant restarting?
if so then
restarting with redis database doesnt fix the issue
still have some socket connections not been cleaned up

screen shot of my redis db with lot of unused or expired sockets

Screenshot 2021-04-19 at 22 53 17

really dont know much about redis programmatically
thats why i have not been able to debug deeply

but i tried to do little and discovered the callback at never gets called: https://github.com/simonbuehler/laravel-websockets/blob/0dc2db12c43a344249149f26815f20edf96ff421/src/ChannelManagers/RedisChannelManager.php#L378

here is the screenshot of the debug

if the callback was ever called 2 should be printed in the console but only 1 got printed at every 10 secs.

Screenshot 2021-04-19 at 23 01 16

@myckhel
Copy link
Author

myckhel commented Apr 20, 2021

Debugging Further.

from the referenced line function: https://github.com/simonbuehler/laravel-websockets/blob/0dc2db12c43a344249149f26815f20edf96ff421/src/ChannelManagers/RedisChannelManager.php#L378

i tried moving the callback from inside and above the $this->lock()->get(fn)

$this->getConnectionsFromSet(0, now()->subMinutes(1)->format('U'))
          ->then(function ($connections) {
              foreach ($connections as $socketId => $appId) {
                  $connection = $this->fakeConnectionForApp($appId, $socketId);
                  dump($socketId, $connection);
                  $this->unsubscribeFromAllChannels($connection);
              }
          });

i reduced ->subMinutes(2) to ->subMinutes(1) for faster debugging.

Now i tried to reproduce the bug but this time everything worked fine as expected.

Reproduction

  • subscribe to a presence channel
  • stop the websocket server. (so that the subscription will still be active in the redis db)
  • restart the websocket server
  • resubscribe to presence channel. (so that there will be duplicate socket connections)
  • unsubscribe from presence channel. (the previous connection not properly closed will still be active in the db)

After above reproduction steps and while refreshing the redis db to see changes within one minute.
when unsubscribed from the presence channel (the previous connection which was not properly closed will still active in the db)

when 1 minute passed, i tried to refresh the db again to see if the (the previous connection which was not properly closed is still active in the db) but it was gone which means the function actually worked for connections that didint pong within 1 minute.

so i guess there is something wrong with the redis lock.

@myckhel
Copy link
Author

myckhel commented Apr 20, 2021

Manually deleting the {appname}_socket_database_laravel-websockets:channel-manager:lock key from the redis db fixed the issue.

i think some how the redis lock was not released and being persistent in the db forever.

and every time websocket tries to attempt a lock then the lock is always busy because it was never released.

so the get($callback) wont execute.

maybe there should be a way to forceRelease the lock if being locked for maxTime or else the lock will never be released from the db in that case.

Update:
As laravel database schema suggested https://laravel.com/docs/8.x/cache#atomic-locks-prerequisites-database

I think it should be a good idea to have expired_at key with value of the time when the lock will expire.

then from the removeObsoleteConnections() we check if the db is locked and the lock expired_at is less than current_time and forceRelease() the lock.

a similar approach will fix the issue when the redis lock was never released.

@myckhel
Copy link
Author

myckhel commented Apr 20, 2021

from https://github.com/simonbuehler/laravel-websockets/blob/0dc2db12c43a344249149f26815f20edf96ff421/src/ChannelManagers/RedisChannelManager.php#L812

protected function lock()
    {
        // set the expire  time to 5 secs. // debugging purpose only
        return new RedisLock($this->redis, static::$lockName, 5);
    }

AND
not making the release to release the lock: https://github.com/laravel/framework/blob/856cdbaf55b4304da0389b3ce33044f4b105e0fe/src/Illuminate/Cache/RedisLock.php#L51

public function release()
    {
        return (bool) true;//$this->redis->eval(LuaScripts::releaseLock(), 1, $this->name, $this->owner);
    }

Automatically

redis will delete the {appname}_socket_database_laravel-websockets:channel-manager:lock after 5 secs.
so when the lock was not release and no way to release it programmatically then redis expiry feature will delete the key, so we dont get stuck when the lock will never be released.

so setting an expiry seconds will be better.
i think this will be a better approach from the previous.

@zek
Copy link

zek commented Apr 24, 2021

Any news for this issue? I also have the exact problem

@myckhel
Copy link
Author

myckhel commented Apr 26, 2021

Any news for this issue? I also have the exact problem

@zek Only if you need the fix asap before being officially fixed.

i suggest you install the pulled_request branch temporarily and try it out.

Steps

// composer.json

"require": {
        ...
        "beyondcode/laravel-websockets": "dev-redis-lock-timeout",
        ...
},
"repositories": [
      {
          "type": "vcs",
          "url": "https://github.com/myckhel/laravel-websockets.git"
      }
    ],
composer update

@zek
Copy link

zek commented Apr 26, 2021

@myckhel I've tried it however I still have some issues. I stopped using redis database but it still persist. So I guess there're some other problems.

@myckhel
Copy link
Author

myckhel commented Apr 26, 2021

did you try to delete the lock key ({appname}_socket_database_laravel-websockets:channel-manager:lock) if exists from the redis database?

after i did that i have not had any issue like that since then.

@myckhel
Copy link
Author

myckhel commented May 19, 2021

@myckhel I've tried it however I still have some issues. I stopped using redis database but it still persist. So I guess there're some other problems.

i discovered too.

redis lock timeout solved some part of the issues.
there is still more issue within the removeObsoleteConnections method.

i guess it does not removeObsoleteConnections for other users which their connections are Obsolete but only removes for the current user i guess.

@danyelkeddah
Copy link

hey @myckhel any new findings or updates here ?

@myckhel
Copy link
Author

myckhel commented Jun 12, 2021

@danyelkeddah i guess i will have to debug removeObsoleteConnections later on because the method does not properly remove some obselete presence channels connections.

@mpociot mpociot closed this as completed Feb 7, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants