Commit 602fc5e

Merge pull request #4437 from pmtk/healthcheck-docs
USHIFT-4288: Update greenboot.mds with healthcheck
2 parents b7bde43 + bf7c1f1 commit 602fc5e

3 files changed: +29 -76 lines changed

docs/config/busybox_running_check.sh

Lines changed: 1 addition & 49 deletions
@@ -2,29 +2,10 @@
 set -e
 
 SCRIPT_NAME=$(basename $0)
-PODS_NS_LIST=(busybox)
-PODS_CT_LIST=(1 )
 
 # Source the MicroShift health check functions library
 source /usr/share/microshift/functions/greenboot.sh
 
-# Set the exit handler to log the exit status
-trap 'script_exit' EXIT
-
-# The script exit handler logging the FAILURE or FINISHED message depending
-# on the exit status of the last command
-#
-# args: None
-# return: None
-function script_exit() {
-    [ "$?" -ne 0 ] && status=FAILURE || status=FINISHED
-    echo $status
-}
-
-#
-# Main
-#
-
 # Exit if the current user is not 'root'
 if [ $(id -u) -ne 0 ] ; then
     echo "The '${SCRIPT_NAME}' script must be run with the 'root' user privileges"
@@ -33,36 +14,7 @@ fi
 
 echo "STARTED"
 
-# Exit if the MicroShift service is not enabled
-if [ $(systemctl is-enabled microshift.service 2>/dev/null) != "enabled" ] ; then
-    echo "MicroShift service is not enabled. Exiting..."
-    exit 0
-fi
-
 # Set the wait timeout for the current check based on the boot counter
 WAIT_TIMEOUT_SECS=$(get_wait_timeout)
 
-# Wait for pod images to be downloaded
-for i in ${!PODS_NS_LIST[@]}; do
-    CHECK_PODS_NS=${PODS_NS_LIST[$i]}
-
-    echo "Waiting ${WAIT_TIMEOUT_SECS}s for pod image(s) from the '${CHECK_PODS_NS}' namespace to be downloaded"
-    wait_for ${WAIT_TIMEOUT_SECS} namespace_images_downloaded
-done
-
-# Wait for pods to enter ready state
-for i in ${!PODS_NS_LIST[@]}; do
-    CHECK_PODS_NS=${PODS_NS_LIST[$i]}
-    CHECK_PODS_CT=${PODS_CT_LIST[$i]}
-
-    echo "Waiting ${WAIT_TIMEOUT_SECS}s for ${CHECK_PODS_CT} pod(s) from the '${CHECK_PODS_NS}' namespace to be in 'Ready' state"
-    wait_for ${WAIT_TIMEOUT_SECS} namespace_pods_ready
-done
-
-# Verify that pods are not restarting
-for i in ${!PODS_NS_LIST[@]}; do
-    CHECK_PODS_NS=${PODS_NS_LIST[$i]}
-
-    echo "Checking pod restart count in the '${CHECK_PODS_NS}' namespace"
-    namespace_pods_not_restarting ${CHECK_PODS_NS}
-done
+/usr/bin/microshift healthcheck -v=2 --timeout="${WAIT_TIMEOUT_SECS}s" --namespace busybox --deployments busybox-deployment
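
The added `microshift healthcheck` line hard-codes the `busybox` namespace and the `busybox-deployment` Deployment. For other workloads, the same invocation could be assembled from parameters. The following is a minimal sketch, assuming a hypothetical `build_healthcheck_cmd` helper and `HEALTHCHECK_CMD` array that are not part of MicroShift. It uses only the flags shown in this commit and prints the command instead of executing it, so it can be tried on a machine without MicroShift installed.

```bash
#!/bin/bash

# Hypothetical helper (not part of MicroShift): fill the HEALTHCHECK_CMD array
# with the healthcheck command line for a single Deployment.
# args: timeout in seconds, namespace, deployment name
build_healthcheck_cmd() {
    local timeout_secs=$1 namespace=$2 deployment=$3
    HEALTHCHECK_CMD=(
        /usr/bin/microshift healthcheck -v=2
        "--timeout=${timeout_secs}s"
        --namespace "${namespace}"
        --deployments "${deployment}"
    )
}

build_healthcheck_cmd 300 busybox busybox-deployment

# Print the assembled command for review.
# On a MicroShift host it could be executed with: "${HEALTHCHECK_CMD[@]}"
echo "${HEALTHCHECK_CMD[*]}"
```

Keeping the arguments in an array rather than a single string avoids word-splitting surprises if a value ever contains whitespace.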

docs/contributor/greenboot.md

Lines changed: 22 additions & 17 deletions
@@ -46,25 +46,30 @@ sudo journalctl -o cat -u greenboot-healthcheck.service
 
 ### Health Check Implementation
 
-The script utilizes the MicroShift health check functions that are available
-in the `/usr/share/microshift/functions/greenboot.sh` file to reuse procedures
-already implemented for the MicroShift core services. These functions need a
-definition of the user workload namespaces and the expected count of pods.
-
-```bash
-PODS_NS_LIST=(busybox)
-PODS_CT_LIST=(1 )
-```
+The script utilizes the MicroShift `healthcheck` command that is part of the
+`microshift` binary to reuse procedures already implemented for the MicroShift
+core services.
 
 The script starts by running sanity checks to verify that it is executed from
-the `root` account and that the MicroShift service is enabled.
-
-Finally, the MicroShift health check functions are called to perform the
-following actions:
-- Get a wait timeout of the current boot cycle for the `wait_for` function
-- Call the `namespace_images_downloaded` function to wait until pod images are available
-- Call the `namespace_pods_ready` function to wait until pods are ready
-- Call the `namespace_pods_not_restarting` function to verify pods are not restarting
+the `root` account.
+
+Then, it executes the `microshift healthcheck` command with the following options:
+- `-v=2` to increase verbosity of the output
+- `--timeout="${WAIT_TIMEOUT_SECS}s"` to override the default 300s timeout value
+- `--namespace busybox` to specify the Namespace of the workloads
+- `--deployments busybox-deployment` to specify the Deployment whose readiness is checked
+
+Internally, `microshift healthcheck` checks whether a workload of the provided type exists and,
+within the specified timeout, verifies that the number of ready replicas (Pods) matches the expected number.
+
+`microshift healthcheck` also accepts other parameters to specify other kinds
+of workload: `--daemonsets` and `--statefulsets`. These options take
+a comma-delimited list of resources, e.g.: `--daemonsets ovnkube-master,ovnkube-node`.
+
+Alternatively, a `--custom` option can be used with a JSON string, for example:
+```
+microshift healthcheck --custom '{"openshift-storage":{"deployments": ["lvms-operator"], "daemonsets": ["vg-manager"]}, "openshift-ovn-kubernetes":{"daemonsets": ["ovnkube-master", "ovnkube-node"]}}'
+```
 
 ## MicroShift Service Failure

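
The `--custom` JSON string documented in this diff maps each namespace to lists of workload names per kind, which is easy to misquote by hand. The following is a minimal sketch of assembling such a string in a script, assuming a hypothetical `json_array` helper (not part of MicroShift) and assuming that resource names contain no quotes or backslashes. It prints the JSON only; passing it to `microshift healthcheck --custom` is left as a comment.

```bash
#!/bin/bash

# Hypothetical helper (not part of MicroShift): print the arguments as a JSON
# array of strings. Assumes names need no JSON escaping.
json_array() {
    local out="" name
    for name in "$@"; do
        # Prepend a comma to every element except the first
        out+="${out:+,}\"${name}\""
    done
    printf '[%s]' "${out}"
}

DAEMONSETS=$(json_array ovnkube-master ovnkube-node)
CUSTOM_JSON="{\"openshift-ovn-kubernetes\":{\"daemonsets\":${DAEMONSETS}}}"

# On a MicroShift host: microshift healthcheck --custom "${CUSTOM_JSON}"
echo "${CUSTOM_JSON}"
# prints {"openshift-ovn-kubernetes":{"daemonsets":["ovnkube-master","ovnkube-node"]}}
```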
docs/user/greenboot.md

Lines changed: 6 additions & 10 deletions
@@ -49,16 +49,12 @@ following commands:
 Exiting the health check script with a non-zero status will have the boot declared
 as failed. The following validations are performed by the script.
 
-|Validation                                           |Pass  |Fail  |
-|-----------------------------------------------------|------|------|
-|Check the script runs with 'root' permissions        |Next  |exit 0|
-|Check microshift.service is enabled                  |Next  |exit 0|
-|Wait for microshift.service to be active (!failed)   |Next  |exit 1|
-|Wait for Kubernetes API health endpoints to be OK    |Next  |exit 1|
-|Wait for any Pod to start                            |Next  |exit 1|
-|For each core namespace, wait for images to be pulled|Next  |exit 1|
-|For each core namespace, wait for Pods to be ready   |Next  |exit 1|
-|For each core namespace, check Pods not restarting   |exit 0|exit 1|
+| Validation                                                  | Pass | Fail   |
+|-------------------------------------------------------------|------|--------|
+| Check the script runs with 'root' permissions               | Next | exit 0 |
+| Check microshift.service is enabled                         | Next | exit 0 |
+| Wait for microshift.service to be active (!failed)          | Next | exit 1 |
+| For each core namespace, wait for readiness of the workload | Next | exit 1 |
 
 The pre-rollback script runs the `sudo microshift-cleanup-data --ovn` command
 to prepare the system for a potential software downgrade.
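
The Pass and Fail columns of the updated validation table can be read as a chain of checks, where each failure ends the script with the listed exit status. The following is a rough sketch of that control flow only, assuming hypothetical `validate` and `run_validations` helpers that are not the actual MicroShift script. The checks are passed in as stand-in commands (`true` or `false`) instead of real calls to `id`, `systemctl`, or `microshift healthcheck`.

```bash
#!/bin/bash

# Hypothetical helper: run a check and exit with the given status if it fails.
# Intended to be called inside a subshell, since it uses 'exit'.
# args: exit status on failure, check command and its arguments
validate() {
    local fail_status=$1
    shift
    "$@" || exit "${fail_status}"
}

# Hypothetical driver mirroring the four table rows. The body runs in a
# subshell, so 'exit' ends only this chain and not the calling shell.
# args: root check, enabled check, active check, readiness check
run_validations() (
    validate 0 "$1"   # script runs with 'root' permissions
    validate 0 "$2"   # microshift.service is enabled
    validate 1 "$3"   # microshift.service is active
    validate 1 "$4"   # workloads in core namespaces are ready
)

# Stand-in run: the third check fails, so the chain ends with status 1
run_validations true true false true
echo "exit status: $?"
# prints: exit status: 1
```

In this sketch a failure of either of the first two checks ends the chain with status 0, so, as in the table, such a failure would not cause the boot to be declared as failed.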
