Skip to content

Heartbeat Manager #106

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jan 8, 2019
Merged

Heartbeat Manager #106

merged 12 commits into from
Jan 8, 2019

Conversation

dgasmith
Copy link
Contributor

@dgasmith dgasmith commented Jan 6, 2019

Description

This PR adds manager heartbeats to the project. This allows jobs that have been allocated to managers that do not shut down properly to be recycled and given to other managers.

The heartbeat interval is set by the FractalServer instance and passed to the managers to ensure consistency.

Todos

  • Adds heartbeats, closes Manager Heartbeats #95.
  • Allows shutdown of CLI invocations via SIGINT and SIGTERM to have a better chance of a clean shutdown, closes Safe kill when run as background thread #90.
  • A new FractalClient requests some data from the server to ensure the connection is live. Error messages should be slightly more explicit now.
  • When a FractalServer owns a Manager, all manager REST queries are now run in a background thread so that communication is not blocked. This is not exactly an ideal solution as we need to continuously build threads. There are plenty of ways to improve in the future, but the current version works and is relatively robust.

Status

  • Ready to go

@dgasmith dgasmith added the Feature Adds a new feature to the project label Jan 6, 2019
@dgasmith dgasmith added this to the v0.4.0b milestone Jan 6, 2019
@QCArchiveBot
Copy link
Collaborator

This pull request introduces 3 alerts when merging 961923a into 75c8ccf - view on LGTM.com

new alerts:

  • 1 for Module is imported more than once
  • 1 for Unused local variable
  • 1 for Unused import

Comment posted by LGTM.com

@codecov
Copy link

codecov bot commented Jan 6, 2019

Codecov Report

Merging #106 into master will decrease coverage by 12.91%.
The diff coverage is 88.1%.

@codecov
Copy link

codecov bot commented Jan 6, 2019

Codecov Report

Merging #106 into master will decrease coverage by 0.21%.
The diff coverage is 88.29%.

@dgasmith
Copy link
Contributor Author

dgasmith commented Jan 7, 2019

PyTest and cov will fail due to:
pytest-dev/pytest-cov#253
conda-forge/pytest-cov-feedstock#14

Waiting another few hours should see it resolved.

@dgasmith
Copy link
Contributor Author

dgasmith commented Jan 7, 2019

If anyone has a moment to review this let me know, otherwise merging sometime tonight.

@dgasmith dgasmith merged commit b0676be into MolSSI:master Jan 8, 2019
@dgasmith dgasmith deleted the heartbeat branch January 15, 2019 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Adds a new feature to the project
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Manager Heartbeats Safe kill when run as background thread
2 participants