diff --git a/datacenter/ucp/2.1/guides/admin/backups-and-disaster-recovery.md b/datacenter/ucp/2.1/guides/admin/backups-and-disaster-recovery.md
index 3d44878fdb7..00d75ff1f7a 100644
--- a/datacenter/ucp/2.1/guides/admin/backups-and-disaster-recovery.md
+++ b/datacenter/ucp/2.1/guides/admin/backups-and-disaster-recovery.md
@@ -14,29 +14,56 @@ The next step is creating a backup policy and disaster recovery plan.
## Backup policy

As part of your backup policy you should regularly create backups of UCP.
-To create a backup of UCP, use the `docker/ucp backup` command. This creates
-a tar archive with the contents of the [volumes used by UCP](../architecture.md)
-to persist data, and streams it to stdout.
-You need to run the backup command on a UCP manager node. Since UCP stores
-the same data on all manager nodes, you only need to create a backup of a
-single node.
+To create a UCP backup, run the `{{ page.docker_image }} backup` command on a
+single UCP manager node. This command creates a tar archive with the contents
+of all the [volumes used by UCP](../architecture.md) to persist data and
+streams it to stdout.
+
+Since UCP stores the same data on all manager nodes, periodic backups of a
+single manager node are sufficient.

To create a consistent backup, the backup command temporarily stops the UCP
containers running on the node where the backup is being performed. User
-containers and services are not affected by this.
+resources, such as services, containers, and stacks, are not affected by this
+operation and will continue operating as expected. Any long-lasting `exec`,
+`logs`, `events` or `attach` operations on the affected manager node will
+be disconnected.

-To have minimal impact on your business, you should:
+Additionally, if UCP is not configured for high availability, you will be
+temporarily unable to:
+
+* Log in to the UCP Web UI
+* Perform CLI operations using existing client bundles
+
+To minimize the impact of the backup policy on your business, you should:

-* Schedule the backup to take place outside business hours.
* Configure UCP for high availability. This allows load-balancing user requests
-across multiple UCP controller nodes.
+across multiple UCP manager nodes.
+* Schedule the backup to take place outside business hours.

## Backup command

-The example below shows how to create a backup of a UCP controller node:
+The example below shows how to create a backup of a UCP manager node and
+verify its contents:
+
+```none
+# Create a backup and store it in /tmp/backup.tar
+$ docker run --rm -i --name ucp \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  {{ page.docker_image }} backup --interactive > /tmp/backup.tar
+
+# Ensure the backup is a valid tar archive and list its contents.
+# In a valid backup file, over 100 files should appear in the list
+# and the `./ucp-node-certs/key.pem` file should be present.
+$ tar --list -f /tmp/backup.tar
+```
+
+A backup file may optionally be encrypted using a passphrase, as in the
+following example:

-```bash
+```none
# Create a backup, encrypt it, and store it in /tmp/backup.tar
$ docker run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
@@ -47,71 +74,83 @@ $ docker run --rm -i --name ucp \
$ gpg --decrypt /tmp/backup.tar | tar --list
```
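+
+If you script your backups, you can automate the content check described
+above. The following is a minimal sketch; it assumes GNU `tar`, `grep`, and
+`wc` are available on the node, and is a suggestion rather than part of the
+official tooling:
+
+```none
+# Count the entries in the archive; a valid backup contains over 100 files
+$ tar --list -f /tmp/backup.tar | wc -l
+
+# Verify that the node certificate key is present in the archive
+$ tar --list -f /tmp/backup.tar | grep 'ucp-node-certs/key.pem'
+```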
-## Restore command
+## Restore your cluster
+
+The restore command can be used to create a new UCP cluster from a backup file.
+After the restore operation is complete, the following data will be recovered
+from the backup file:
+
+* Users, teams and permissions.
+* All UCP configuration options available under `Admin Settings`, such as the
+DDC subscription license, scheduling options, Content Trust and authentication
+backends.

-The example below shows how to restore a UCP controller node from an existing
-backup:
+There are two ways to restore a UCP cluster:

-```bash
+* On a manager node of an existing swarm that is not part of a UCP
+installation. In this case, a UCP cluster will be restored from the backup.
+* On a Docker Engine that is not participating in a swarm. In this case, a new
+swarm will be created and UCP will be restored on top of it.
+
+To restore an existing UCP installation from a backup, you will need to first
+uninstall UCP from the cluster by using the `uninstall-ucp` command.
+
+The example below shows how to restore a UCP cluster from an existing backup
+file, presumed to be located at `/tmp/backup.tar`:
+
+```none
$ docker run --rm -i --name ucp \
  -v /var/run/docker.sock:/var/run/docker.sock \
-  {{ page.docker_image }} restore < backup.tar
+  {{ page.docker_image }} restore < /tmp/backup.tar
```

-The restore command may also be invoked in interactive mode:
+If the backup file is encrypted with a passphrase, you will need to provide the
+passphrase to the restore operation:

-```bash
+```none
$ docker run --rm -i --name ucp \
-  -v /var/run/docker.sock:/var/run/docker.sock \
-  -v /path/to/backup.tar:/config/backup.tar \
-  {{ page.docker_image }} restore -i
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  {{ page.docker_image }} restore --passphrase "secret" < /tmp/backup.tar
```

-## Restore your cluster
+The restore command may also be invoked in interactive mode, in which case the
+backup file should be mounted into the container rather than streamed through
+stdin:

-The restore command can be used to create a new UCP cluster from a backup file.
-After the restore operation is complete, the following data will be copied from
-the backup file:
-
-* Users, Teams and Permissions.
-* Cluster Configuration, such as the default Controller Port or the KV store
-timeout.
-* DDC Subscription License.
-* Options on Scheduling, Content Trust, Authentication Methods and Reporting.
-
-The restore operation may be performed against any Docker Engine, regardless of
-swarm membership, as long as the target Engine is not already managed by a UCP
-installation. If the Docker Engine is already part of a swarm, that swarm and
-all deployed containers and services will be managed by UCP after the restore
-operation completes.
-
-As an example, if you have a cluster with three controller nodes, A, B, and C,
-and your most recent backup was of node A:
+```none
+$ docker run --rm -i --name ucp \
+  -v /var/run/docker.sock:/var/run/docker.sock \
+  -v /tmp/backup.tar:/config/backup.tar \
+  {{ page.docker_image }} restore -i
+```
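+
+Once the restore completes, you can confirm that the manager is healthy before
+moving on. The following commands are a minimal sanity check (a suggestion,
+not part of the restore procedure itself):
+
+```none
+# List the nodes known to the newly restored cluster
+$ docker node ls
+
+# Verify that the UCP system containers are running on the manager
+$ docker ps --filter name=ucp
+```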
-1. Uninstall UCP from the swarm using the `uninstall-ucp` operation.
-2. Restore one of the swarm managers, such as node B, using the most recent
-   backup from node A.
-3. Wait for all nodes of the swarm to become healthy UCP nodes.
+## Disaster recovery

-You should now have your UCP cluster up and running.
+In the event that half or more of the manager nodes are lost and cannot be
+recovered to a healthy state, the system is considered to have lost quorum and
+can only be restored through the following disaster recovery procedure.

-Additionally, in the event where half or more controller nodes are lost and
-cannot be recovered to a healthy state, the system can only be restored through
-the following disaster recovery procedure. It is important to note that this
-proceedure is not guaranteed to succeed with no loss of either swarm services or
-UCP configuration data:
+It is important to note that this procedure is not guaranteed to succeed with
+no loss of running services or configuration data. To properly protect against
+manager failures, the system should be configured for [high availability](configure/set-up-high-availability.md).

1. On one of the remaining manager nodes, perform `docker swarm init
-   --force-new-cluster`. This will instantiate a new single-manager swarm by
-   recovering as much state as possible from the existing manager. This is a
-   disruptive operation and any existing tasks will be either terminated or
-   suspended.
+   --force-new-cluster`. You may also need to specify an `--advertise-addr`
+   parameter, which is equivalent to the `--host-address` parameter of the
+   `docker/ucp install` operation. This will instantiate a new single-manager
+   swarm by recovering as much state as possible from the existing manager.
+   This is a disruptive operation and existing tasks may be either terminated
+   or suspended.
2. Obtain a backup of one of the remaining manager nodes if one is not already
   available.
-3. Perform a restore operation on the recovered swarm manager node.
-4. For all other nodes of the cluster, perform a `docker swarm leave --force`
-   and then a `docker swarm join` operation with the cluster's new join-token.
-5. Wait for all nodes of the swarm to become healthy UCP nodes.
+3. If UCP is still installed on the cluster, uninstall UCP using the
+   `uninstall-ucp` command.
+4. Perform a restore operation on the recovered swarm manager node.
+5. Log in to UCP and browse to the nodes page, or use the CLI `docker node ls`
+   command.
+6. If any nodes are listed as `down`, you'll have to manually [remove these
+   nodes](configure/scale-your-cluster.md) from the cluster and then re-join
+   them using a `docker swarm join` operation with the cluster's new
+   join-token, as shown in the sketch below.
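+
+The following sketch condenses the swarm-level commands used in steps 1 and 6.
+The address `192.0.2.10` and the join token are placeholders; substitute the
+values for your own cluster:
+
+```none
+# Step 1, on a surviving manager: rebuild a single-manager swarm
+$ docker swarm init --force-new-cluster --advertise-addr 192.0.2.10
+
+# Step 6, on each node listed as down: leave and re-join the new swarm
+$ docker swarm leave --force
+$ docker swarm join --token <new-join-token> 192.0.2.10:2377
+```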

## Where to go next
diff --git a/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services.md b/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services.md
index c8c56539351..d89feceefd3 100644
--- a/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services.md
+++ b/datacenter/ucp/2.1/guides/admin/configure/use-domain-names-to-access-services.md
@@ -192,4 +192,4 @@ If a service is not configured properly for use of the HTTP routing mesh, this
information is available in the UI when inspecting the service.

More logging from the HTTP routing mesh is available in the logs of the
-`ucp-controller` containers on your UCP controller nodes.
+`ucp-controller` containers on your UCP manager nodes.
diff --git a/datacenter/ucp/2.1/guides/admin/install/plan-installation.md b/datacenter/ucp/2.1/guides/admin/install/plan-installation.md
index 5e1b860eef9..82a3b259714 100644
--- a/datacenter/ucp/2.1/guides/admin/install/plan-installation.md
+++ b/datacenter/ucp/2.1/guides/admin/install/plan-installation.md
@@ -53,7 +53,7 @@ cause poor performance or even failures.
## Load balancing strategy

Docker UCP does not include a load balancer. You can configure your own
-load balancer to balance user requests across all controller nodes.
+load balancer to balance user requests across all manager nodes.

If you plan on using a load balancer, you need to decide whether you are going
to add the nodes to the load balancer using their IP address, or their FQDN.
@@ -83,10 +83,10 @@ need to have a certificate bundle that has:
* A ca.pem file with the root CA public certificate,
* A cert.pem file with the server certificate and any intermediate CA public
certificates. This certificate should also have SANs for all addresses used to
-reach the UCP controller,
+reach the UCP manager,
* A key.pem file with the server private key.

-You can have a certificate for each controller, with a common SAN. As an
+You can have a certificate for each manager, with a common SAN. As an
example, on a three node cluster you can have:

* node1.company.example.org with SAN ucp.company.org
@@ -94,9 +94,9 @@ example, on a three node cluster you can have:
* node3.company.example.org with SAN ucp.company.org

Alternatively, you can also install UCP with a single externally-signed
-certificate for all controllers rather than one for each controller node.
+certificate for all managers rather than one for each manager node.
In that case, the certificate files will automatically be copied to any new
-controller nodes joining the cluster or being promoted into controllers.
+manager nodes joining the cluster or being promoted into managers.

## Where to go next
diff --git a/datacenter/ucp/2.1/guides/admin/install/uninstall.md b/datacenter/ucp/2.1/guides/admin/install/uninstall.md
index 5c8246237f2..caa0536c7b3 100644
--- a/datacenter/ucp/2.1/guides/admin/install/uninstall.md
+++ b/datacenter/ucp/2.1/guides/admin/install/uninstall.md
@@ -5,21 +5,25 @@ title: Uninstall UCP
---

Docker UCP is designed to scale as your applications grow in size and usage.
-You can [add and remove nodes](../configure/scale-your-cluster.md) from the cluster, to make
-it scale to your needs.
+You can [add and remove nodes](../configure/scale-your-cluster.md) from the
+cluster, to make it scale to your needs.

You can also uninstall Docker Universal Control Plane from your cluster. In
this case the UCP services are stopped and removed, but your Docker Engines
will continue running in swarm mode. Your applications will continue running
normally.

+If you wish to remove a single node from the UCP cluster, you should instead
+[remove that node from the cluster](../configure/scale-your-cluster.md).
+
After you uninstall UCP from the cluster, you'll no longer be able to enforce
role-based access control to the cluster, or have a centralized way to monitor
and manage the cluster.

-After uninstalling UCP from the cluster, you will no longer be able to
-join new nodes using `docker swarm join` unless you reinstall UCP.
+After uninstalling UCP from the cluster, you will no longer be able to join new
+nodes using `docker swarm join` unless you reinstall UCP.

-To uninstall UCP, log in into a manager node using ssh, and run:
+To uninstall UCP, log in to a manager node using ssh, and run the following
+command:

```bash
$ docker run --rm -it \
@@ -29,9 +33,10 @@ $ docker run --rm -it \
```

This runs the uninstall command in interactive mode, so that you are prompted
-for any necessary configuration values.
-[Check the reference documentation](../../../reference/cli/index.md) to learn the options
-available in the `uninstall-ucp` command.
+for any necessary configuration values. Running this command on a single manager
+node will uninstall UCP from the entire cluster. [Check the reference
+documentation](../../../reference/cli/index.md) to learn the options available
+in the `uninstall-ucp` command.
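+
+After the uninstall completes, the engines remain in swarm mode. As an
+optional check (not part of the uninstall procedure itself), you can confirm
+this from any node:
+
+```bash
+# Shows "Swarm: active" while the engine is still participating in the swarm
+$ docker info | grep Swarm
+```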

## Swarm mode CA
diff --git a/datacenter/ucp/2.1/guides/architecture.md b/datacenter/ucp/2.1/guides/architecture.md
index 0a8be715b8a..ab77c123818 100644
--- a/datacenter/ucp/2.1/guides/architecture.md
+++ b/datacenter/ucp/2.1/guides/architecture.md
@@ -61,7 +61,7 @@ persist the state of UCP. These are the UCP services running on manager nodes:
| UCP component | Description |
|:--------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ucp-agent | Monitors the node and ensures the right UCP services are running |
-| ucp-reconcile | When ucp-agent detects that the node is not running the right UCP services, it starts the ucp-reconcile service to start or stop the necessary services to converge the node to its desired state |
+| ucp-reconcile | When ucp-agent detects that the node is not running the right UCP components, it starts the ucp-reconcile container to converge the node to its desired state. The ucp-reconcile container is expected to remain in an exited state when the node is healthy. |
| ucp-auth-api | The centralized service for identity and authentication used by UCP and DTR |
| ucp-auth-store | Stores authentication configurations, and data for users, organizations and teams |
| ucp-auth-worker | Performs scheduled LDAP synchronizations and cleans authentication and authorization data |
@@ -84,7 +84,7 @@ services running on worker nodes:
| UCP component | Description |
|:--------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| ucp-agent | Monitors the node and ensures the right UCP services are running |
-| ucp-reconcile | When ucp-agent detects that the node is not running the right UCP services, it starts the ucp-reconcile service to start or stop the necessary services to converge the node to its desired state |
+| ucp-reconcile | When ucp-agent detects that the node is not running the right UCP components, it starts the ucp-reconcile container to converge the node to its desired state. The ucp-reconcile container is expected to remain in an exited state when the node is healthy. |
| ucp-proxy | A TLS proxy. It provides UCP components with secure access to the local Docker Engine |
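+
+Because ucp-reconcile normally sits in the exited state on a healthy node, it
+does not appear in plain `docker ps` output. To inspect it, include stopped
+containers (an illustrative check, not part of the tables above):
+
+```bash
+# Lists the ucp-reconcile container even when it has exited
+$ docker ps -a --filter name=ucp-reconcile
+```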

## Volumes used by UCP
@@ -95,14 +95,14 @@ Docker UCP uses these named volumes to persist data in all nodes where it runs:
|:----------------------------|:-----------------------------------------------------------------------------------------|
| ucp-auth-api-certs | Certificate and keys for the authentication and authorization service |
| ucp-auth-store-certs | Certificate and keys for the authentication and authorization store |
-| ucp-auth-store-data | Data of the authentication and authorization store |
+| ucp-auth-store-data | Data of the authentication and authorization store, replicated across managers |
| ucp-auth-worker-certs | Certificate and keys for authentication worker |
| ucp-auth-worker-data | Data of the authentication worker |
| ucp-client-root-ca | Root key material for the UCP root CA that issues client certificates |
| ucp-cluster-root-ca | Root key material for the UCP root CA that issues certificates for swarm members |
| ucp-controller-client-certs | Certificate and keys used by the UCP web server to communicate with other UCP components |
| ucp-controller-server-certs | Certificate and keys for the UCP web server running in the node |
-| ucp-kv | UCP configuration data |
+| ucp-kv | UCP configuration data, replicated across managers |
| ucp-kv-certs | Certificates and keys for the key-value store |
| ucp-metrics-data | Monitoring data gathered by UCP |
| ucp-metrics-inventory | Configuration file used by the ucp-metrics service |
diff --git a/datacenter/ucp/2.1/guides/get-support.md b/datacenter/ucp/2.1/guides/get-support.md
index 64d093de843..a07ee8728f8 100644
--- a/datacenter/ucp/2.1/guides/get-support.md
+++ b/datacenter/ucp/2.1/guides/get-support.md
@@ -17,8 +17,25 @@ Be sure to use your company email when filing tickets.
## Download a support dump

-Docker Support engineers may ask you to provide a UCP support dump. For this:
+Docker Support engineers may ask you to provide a UCP support dump, which is an
+archive that contains UCP system logs and diagnostic information. To obtain a
+support dump:

-1. Log into UCP with an administrator account.
+1. Log into the UCP UI with an administrator account.

2. On the top-right menu, **click your username**, and choose **Support Dump**.
+   Your browser will download the archive after a brief delay.
+
+If the user interface is not accessible, you can instead perform the following
+steps to obtain a single-node version of the support dump:
+
+1. Obtain direct CLI access to the Docker daemon on a UCP manager node.
+
+2. Run the CLI support tool with the following command:
+   ```bash
+   $ docker run --rm \
+     --name ucp \
+     -v /var/run/docker.sock:/var/run/docker.sock \
+     {{ page.docker_image }} \
+     support > docker-support.tgz
+   ```
diff --git a/datacenter/ucp/2.1/guides/user/access-ucp/cli-based-access.md b/datacenter/ucp/2.1/guides/user/access-ucp/cli-based-access.md
index 70ce67742e6..16b94ea81a1 100644
--- a/datacenter/ucp/2.1/guides/user/access-ucp/cli-based-access.md
+++ b/datacenter/ucp/2.1/guides/user/access-ucp/cli-based-access.md
@@ -22,7 +22,7 @@ There are two different types of client certificates:

* Admin user certificate bundles: allow running docker commands on the Docker
Engine of any node,
* User certificate bundles: only allow running docker commands through a UCP
-controller node.
+manager node.

## Download client certificates
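+
+A downloaded client bundle is a zip file that typically contains an `env.sh`
+script alongside the certificates. As a sketch of how a bundle is used after
+downloading (the `ucp-bundle/` directory name is illustrative), a session
+looks like this:
+
+```bash
+# Point the Docker CLI at UCP using the bundle's certificates
+$ cd ucp-bundle
+$ eval "$(<env.sh)"
+
+# Subsequent docker commands are now routed through UCP
+$ docker ps
+```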