Checking for Dead Agents #365

@ryanstwrt

I have a centralized agent that continually checks whether other agents are still running. I currently loop through a dictionary list I created when each agent was initialized and grab each agent using self._proxy_server.proxy(agent), where self._proxy_server is proxy.NSProxy(). Once I have an agent, I use ka.get_attr('_running') to determine whether it is running. This worked in the past when I had fewer than 100 agents; however, I am now getting the following error:

Pyro4.errors.CommunicationError: cannot connect to ('localhost', 43296): [Errno 111] Connection refused)

This error is raised by self._proxy_server.proxy(agent). Is there a better way to determine whether agents have failed? On a side note, I apologize that I don't have a simple reproducible example; I've had no luck reproducing it on a smaller scale. Thank you!
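
For reference, here is a minimal sketch of the polling loop described above, changed so that a failed connection counts as a (possibly) dead agent instead of crashing the checker. It is only an illustration under assumptions: `find_dead_agents`, `probe`, and the `retries`/`delay` parameters are hypothetical names, not part of osbrain, and a refused connection may also have causes other than a dead agent. The connection and status calls are passed in as functions so the logic does not depend on any particular proxy class.

```python
import time


def probe(name, connect, is_running, errors, retries, delay):
    """Return True if the agent answers and reports that it is running.

    A connection failure is retried a few times before giving up, so a
    single transient error does not mark the agent as dead.
    """
    for attempt in range(retries + 1):
        try:
            return bool(is_running(connect(name)))
        except errors:
            if attempt < retries:
                time.sleep(delay)
    return False


def find_dead_agents(names, connect, is_running,
                     errors=(OSError,), retries=2, delay=0.05):
    """Return the names of agents that could not be confirmed as running."""
    return [
        name for name in names
        if not probe(name, connect, is_running, errors, retries, delay)
    ]


# Hypothetical wiring for the setup in this issue (not executed here):
#   dead = find_dead_agents(
#       agent_names,
#       connect=self._proxy_server.proxy,
#       is_running=lambda ka: ka.get_attr('_running'),
#       errors=(Pyro4.errors.CommunicationError,),
#   )
```

The exception types to treat as "unreachable" are a parameter because the error shown above is a Pyro4 exception rather than a built-in `OSError`; catching only the expected types keeps unrelated bugs visible.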
