# PG-1127 Rewamped HA solution (17) #679
Open: nastena1606 wants to merge 21 commits into `17` from `PG-1127-HA-rewamp-17`
Commits (21 total; the diff below shows changes from 3 commits):
- bb9e170 Extended description of architecture (nastena1606)
- 8f3a3e8 PG-1127 Rewamp HA solution (nastena1606)
- 5673f82 Split setup into individual components pages (nastena1606)
- bcb0094 Reworked the intro (nastena1606)
- f86706a Added diagrams to overview (nastena1606)
- f9ab900 Moved components interaction to a separate page (nastena1606)
- f5cde85 Updated how components work together (nastena1606)
- cbf2a6d Reworked Components part (nastena1606)
- 0ca8732 Added Patroni description (nastena1606)
- 626f125 Markup polishing (nastena1606)
- 2f4b499 Patroni and pgBackRest config' (nastena1606)
- c6d2b03 Fixed commands, added disclaier about datadir cleanup for Patroni (nastena1606)
- a740284 keepalived setup (nastena1606)
- 0d5d349 Added HAproxy description to components (nastena1606)
- 4a4380a Added pgBackRest info (nastena1606)
- 3cd7746 Updated How components work page (nastena1606)
- 1ffa9b7 Updated images, archtecture with 2 types of diagrams, added 3rd HApro… (nastena1606)
- 82ede94 Updated diagram with watchdog component (nastena1606)
- 6b32bc0 Updated Patroni config for 4.0.x versions (nastena1606)
- 5cc3143 Added Troubleshoot Patroni startup options subsection (nastena1606)
- 51481a5 Modified HAProxy and keepalived configuration (nastena1606)
@@ -0,0 +1,32 @@
# Architecture layout

The following diagram shows the architecture of a three-node PostgreSQL cluster with a single-primary node.



## Components

The components in this architecture are:

- PostgreSQL nodes bearing the user data.

- Patroni - an automatic failover system.

- etcd - a Distributed Configuration Store that stores the state of the PostgreSQL cluster and handles the election of a new primary.

- HAProxy - the load balancer for the cluster and the single point of entry for client applications.

- pgBackRest - the backup and restore solution for PostgreSQL.

- Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster.

### How components work together

Each PostgreSQL instance in the cluster maintains consistency with the other members through streaming replication. We use the default asynchronous streaming replication, in which the primary doesn't wait for the secondaries to acknowledge receipt of the data before considering the transaction complete.

Each PostgreSQL instance also hosts Patroni and etcd. Patroni and etcd are responsible for creating and managing the cluster, monitoring the cluster health, and handling failover in the case of an outage.

Patroni periodically sends heartbeat requests with the cluster status to etcd. etcd writes this information to disk and sends the response back to Patroni. If the current primary fails to renew its status as leader within the specified timeout, Patroni reports the state change to etcd, which uses this information to elect the new primary and keep the cluster up and running.

Connections to the cluster do not go directly to the database nodes but are routed through a connection proxy such as HAProxy. The proxy determines the active node by querying the Patroni REST API.
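The primary-detection logic described above can be sketched as follows. It assumes Patroni's REST API convention that a `GET /primary` health probe answers HTTP 200 only on the current primary node; the probe responses are mocked here for illustration, while a real proxy issues HTTP requests to Patroni's REST port on each node.

```python
# Minimal sketch of how a proxy picks the writable node via Patroni's
# REST API. `probe` stands in for an HTTP GET against /primary; here it
# is a mocked lookup so the sketch is self-contained.

def pick_primary(nodes, probe):
    """Return the first node whose /primary probe answers 200, else None."""
    for node in nodes:
        if probe(node) == 200:
            return node
    return None

# Mocked probe results: pretend node2 currently holds the leader lock.
mock_status = {"node1": 503, "node2": 200, "node3": 503}
primary = pick_primary(["node1", "node2", "node3"], mock_status.get)
print(primary)  # node2
```

HAProxy implements the same idea with an `httpchk` health check per backend server rather than application code.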
@@ -0,0 +1,142 @@
# Configure etcd distributed store

The distributed configuration store provides a reliable way to store data that needs to be accessed by large-scale distributed systems. The most popular implementation of the distributed configuration store is etcd. etcd is deployed as a cluster for fault tolerance and requires a majority of members (n/2+1) to agree on updates to the cluster state, which is why an odd number of members is used. An etcd cluster helps establish a consensus among nodes during a failover and manages the configuration for the three PostgreSQL instances.
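The majority requirement can be made concrete with a short calculation; this is a minimal illustration, not part of the setup:

```python
# For an etcd cluster of n members, quorum is n // 2 + 1, so the cluster
# survives n - quorum member failures. Note that going from 3 to 4
# members does not improve fault tolerance, which is why odd sizes
# are preferred.

def quorum(n):
    return n // 2 + 1

def fault_tolerance(n):
    return n - quorum(n)

for n in (1, 3, 4, 5):
    print(f"{n} members: quorum={quorum(n)}, tolerates {fault_tolerance(n)} failure(s)")
```

For the three-node cluster in this guide, two members must be up for the cluster to accept writes.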
This document provides the configuration for etcd version 3.5.x. To learn how to configure an etcd cluster with earlier versions of etcd, read the blog post by _Fernando Laudares Camargos_ and _Jobin Augustine_: [PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios](https://www.percona.com/blog/postgresql-ha-with-patroni-your-turn-to-test-failure-scenarios/)

If you [installed the software from tarballs](../tarball.md), check how you [enable etcd](../enable-extensions.md#etcd).

The etcd cluster is first started on one node, and then the subsequent nodes are added to it using the `etcdctl member add` command.
!!! note

    Users with a deeper understanding of how etcd works can configure and start all etcd nodes at once and bootstrap the cluster using one of the following methods:

    * Static - when the IP addresses of the cluster nodes are known in advance
    * Discovery service - when the IP addresses of the cluster nodes are not known ahead of time

    See the [How to configure etcd nodes simultaneously](../how-to.md#how-to-configure-etcd-nodes-simultaneously) section for details.
### Configure `node1`

1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node name and IP address with the actual name and IP address of your node.

    ```yaml title="/etc/etcd/etcd.conf.yaml"
    name: 'node1'
    initial-cluster-token: PostgreSQL_HA_Cluster_1
    initial-cluster-state: new
    initial-cluster: node1=http://10.104.0.1:2380
    data-dir: /var/lib/etcd
    initial-advertise-peer-urls: http://10.104.0.1:2380
    listen-peer-urls: http://10.104.0.1:2380
    advertise-client-urls: http://10.104.0.1:2379
    listen-client-urls: http://10.104.0.1:2379
    ```
2. Start the `etcd` service to apply the changes on `node1`. Note that `enable --now` both enables and starts the unit:

    ```{.bash data-prompt="$"}
    $ sudo systemctl enable --now etcd
    $ sudo systemctl status etcd
    ```

3. Check the etcd cluster members on `node1`:

    ```{.bash data-prompt="$"}
    $ sudo etcdctl member list --write-out=table --endpoints=http://10.104.0.1:2379
    ```

    Sample output:

    ```{.text .no-copy}
    +------------------+---------+-------+------------------------+------------------------+------------+
    |        ID        | STATUS  | NAME  |       PEER ADDRS       |      CLIENT ADDRS      | IS LEARNER |
    +------------------+---------+-------+------------------------+------------------------+------------+
    | 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://10.104.0.1:2379 |      false |
    +------------------+---------+-------+------------------------+------------------------+------------+
    ```
4. Add `node2` to the cluster. Run the following command on `node1`:

    ```{.bash data-prompt="$"}
    $ sudo etcdctl member add node2 --peer-urls=http://10.104.0.2:2380
    ```
    ??? example "Sample output"

        ```{.text .no-copy}
        Added member named node2 with ID 10042578c504d052 to cluster

        etcd_NAME="node2"
        etcd_INITIAL_CLUSTER="node2=http://10.104.0.2:2380,node1=http://10.104.0.1:2380"
        etcd_INITIAL_CLUSTER_STATE="existing"
        ```
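The pattern behind the per-node configuration files can be sketched as follows: `initial-cluster` grows with each added member, and `initial-cluster-state` flips from `new` to `existing`. The `render` helper is hypothetical and only prints YAML-like text using the names and IPs from this guide; it does not touch etcd.

```python
# Hypothetical helper illustrating how the etcd settings differ per node.
# The first member bootstraps a new cluster; later members join an
# existing one and must list every member known so far.

NODES = {"node1": "10.104.0.1", "node2": "10.104.0.2", "node3": "10.104.0.3"}

def render(name, members):
    ip = NODES[name]
    state = "new" if len(members) == 1 else "existing"
    cluster = ",".join(f"{m}=http://{NODES[m]}:2380" for m in members)
    return (
        f"name: '{name}'\n"
        f"initial-cluster-state: {state}\n"
        f"initial-cluster: {cluster}\n"
        f"initial-advertise-peer-urls: http://{ip}:2380\n"
        f"advertise-client-urls: http://{ip}:2379"
    )

print(render("node2", ["node1", "node2"]))
```

Compare the printed settings with the `node2` configuration file in the next section.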
### Configure `node2`

1. Create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.

    ```yaml title="/etc/etcd/etcd.conf.yaml"
    name: 'node2'
    initial-cluster-token: PostgreSQL_HA_Cluster_1
    initial-cluster-state: existing
    initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380
    data-dir: /var/lib/etcd
    initial-advertise-peer-urls: http://10.104.0.2:2380
    listen-peer-urls: http://10.104.0.2:2380
    advertise-client-urls: http://10.104.0.2:2379
    listen-client-urls: http://10.104.0.2:2379
    ```
2. Start the `etcd` service to apply the changes on `node2`:

    ```{.bash data-prompt="$"}
    $ sudo systemctl enable --now etcd
    $ sudo systemctl status etcd
    ```
### Configure `node3`

1. Add `node3` to the cluster. **Run the following command on `node1`**:

    ```{.bash data-prompt="$"}
    $ sudo etcdctl member add node3 --peer-urls=http://10.104.0.3:2380
    ```

2. On `node3`, create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.

    ```yaml title="/etc/etcd/etcd.conf.yaml"
    name: 'node3'
    initial-cluster-token: PostgreSQL_HA_Cluster_1
    initial-cluster-state: existing
    initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
    data-dir: /var/lib/etcd
    initial-advertise-peer-urls: http://10.104.0.3:2380
    listen-peer-urls: http://10.104.0.3:2380
    advertise-client-urls: http://10.104.0.3:2379
    listen-client-urls: http://10.104.0.3:2379
    ```
3. Start the `etcd` service to apply the changes:

    ```{.bash data-prompt="$"}
    $ sudo systemctl enable --now etcd
    $ sudo systemctl status etcd
    ```

4. Check the etcd cluster members:

    ```{.bash data-prompt="$"}
    $ sudo etcdctl member list
    ```
    ??? example "Sample output"

        ```{.text .no-copy}
        2d346bd3ae7f07c4: name=node2 peerURLs=http://10.104.0.2:2380 clientURLs=http://10.104.0.2:2379 isLeader=false
        8bacb519ebdee8db: name=node3 peerURLs=http://10.104.0.3:2380 clientURLs=http://10.104.0.3:2379 isLeader=false
        c5f52ea2ade25e1b: name=node1 peerURLs=http://10.104.0.1:2380 clientURLs=http://10.104.0.1:2379 isLeader=true
        ```
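If you need to act on this output in a script, the key=value format shown above is easy to parse. This sketch assumes that exact format (printed by older etcdctl versions); etcd 3.5 prints a table by default, where `--write-out=json` is the more robust choice.

```python
# Parse the `etcdctl member list` key=value output shown above and
# report which member is the leader. Purely illustrative.

sample = """\
2d346bd3ae7f07c4: name=node2 peerURLs=http://10.104.0.2:2380 clientURLs=http://10.104.0.2:2379 isLeader=false
8bacb519ebdee8db: name=node3 peerURLs=http://10.104.0.3:2380 clientURLs=http://10.104.0.3:2379 isLeader=false
c5f52ea2ade25e1b: name=node1 peerURLs=http://10.104.0.1:2380 clientURLs=http://10.104.0.1:2379 isLeader=true
"""

def find_leader(output):
    for line in output.splitlines():
        # Drop the member ID, then split the remaining key=value fields.
        fields = dict(f.split("=", 1) for f in line.split()[1:])
        if fields.get("isLeader") == "true":
            return fields["name"]
    return None

print(find_leader(sample))  # node1
```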
@@ -0,0 +1,76 @@
# Initial setup for high availability

This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni. It relies on the provided [architecture](ha-architecture.md) for high availability.

## Preconditions

1. This is an example deployment where etcd runs on the same hosts as Patroni and PostgreSQL, and there is a single dedicated HAProxy host. Alternatively, etcd can run on a different set of nodes.

    If etcd is deployed on the same host as Patroni and PostgreSQL, we recommend a separate disk system for etcd and PostgreSQL for performance reasons.
2. For this setup, we use the nodes that have the following IP addresses:

    | Node name     | Public IP address | Internal IP address |
    |---------------|-------------------|---------------------|
    | node1         | 157.230.42.174    | 10.104.0.7          |
    | node2         | 68.183.177.183    | 10.104.0.2          |
    | node3         | 165.22.62.167     | 10.104.0.8          |
    | HAProxy-demo  | 134.209.111.138   | 10.104.0.6          |
!!! note

    We recommend not exposing the hosts where Patroni, etcd, and PostgreSQL run to public networks due to security risks. Use firewalls, virtual networks, subnets, or the like to protect the database hosts from any kind of attack.
## Initial setup

It's not necessary to set up name resolution, but it makes the whole setup more readable and less error-prone. Here, instead of configuring DNS, we use local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other's names and allow their seamless communication.

1. Set the hostname for the nodes. Run the following command on each node, changing the node name to `node1`, `node2`, and `node3`, respectively:

    ```{.bash data-prompt="$"}
    $ sudo hostnamectl set-hostname node1
    ```

2. Modify the `/etc/hosts` file of each PostgreSQL node to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes:
    === "node1"

        ```text hl_lines="3 4"
        # Cluster IP and names
        10.104.0.1 node1
        10.104.0.2 node2
        10.104.0.3 node3
        ```

    === "node2"

        ```text hl_lines="2 4"
        # Cluster IP and names
        10.104.0.1 node1
        10.104.0.2 node2
        10.104.0.3 node3
        ```

    === "node3"

        ```text hl_lines="2 3"
        # Cluster IP and names
        10.104.0.1 node1
        10.104.0.2 node2
        10.104.0.3 node3
        ```

    === "HAproxy-demo"

        The HAProxy instance should have the name resolution for all the three nodes in its `/etc/hosts` file. Add the following lines at the end of the file:

        ```text hl_lines="4 5 6"
        # Cluster IP and names
        10.104.0.6 HAProxy-demo
        10.104.0.1 node1
        10.104.0.2 node2
        10.104.0.3 node3
        ```
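The `/etc/hosts` snippets above all share the same node table, so they can also be generated rather than typed by hand. This is a purely illustrative sketch that prints the block of text; it does not modify any system file.

```python
# Build the "# Cluster IP and names" block used above from a node table,
# so the identical snippet can be appended on every host.

nodes = {
    "HAProxy-demo": "10.104.0.6",
    "node1": "10.104.0.1",
    "node2": "10.104.0.2",
    "node3": "10.104.0.3",
}

def hosts_block(entries):
    lines = ["# Cluster IP and names"]
    lines += [f"{ip:<12} {name}" for name, ip in entries.items()]
    return "\n".join(lines)

print(hosts_block(nodes))
```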
@@ -0,0 +1,106 @@
# Install the software

## Install Percona Distribution for PostgreSQL

Run the following commands as root or with `sudo` privileges.
=== "On Debian / Ubuntu"

    1. Disable the upstream `postgresql-{{pgversion}}` package.

    2. Install the `percona-release` repository management tool:

        --8<-- "percona-release-apt.md"

    3. Enable the repository:

        ```{.bash data-prompt="$"}
        $ sudo percona-release setup ppg{{pgversion}}
        ```

    4. Install the Percona Distribution for PostgreSQL package:

        ```{.bash data-prompt="$"}
        $ sudo apt install percona-postgresql-{{pgversion}}
        ```
=== "On RHEL and derivatives"

    1. Check the [platform specific notes](../yum.md#for-percona-distribution-for-postgresql-packages).

    2. Install the `percona-release` repository management tool:

        --8<-- "percona-release-yum.md"

    3. Enable the repository:

        ```{.bash data-prompt="$"}
        $ sudo percona-release setup ppg{{pgversion}}
        ```

    4. Install the Percona Distribution for PostgreSQL package:

        ```{.bash data-prompt="$"}
        $ sudo yum install percona-postgresql{{pgversion}}-server
        ```
!!! important

    **Don't** initialize the cluster or start the `postgresql` service. The cluster initialization and setup are handled by Patroni during the bootstrapping stage.
## Install Patroni, etcd, pgBackRest

=== "On Debian / Ubuntu"

    1. Install some Python and auxiliary packages to help with Patroni and etcd:

        ```{.bash data-prompt="$"}
        $ sudo apt install python3-pip python3-dev binutils
        ```

    2. Install the etcd, Patroni, and pgBackRest packages:

        ```{.bash data-prompt="$"}
        $ sudo apt install percona-patroni \
          etcd etcd-server etcd-client \
          percona-pgbackrest
        ```

    3. Stop and disable all installed services:

        ```{.bash data-prompt="$"}
        $ sudo systemctl stop {etcd,patroni,postgresql}
        $ sudo systemctl disable {etcd,patroni,postgresql}
        ```

    4. Even though Patroni can use an existing PostgreSQL installation, remove the data directory to force it to initialize a new PostgreSQL cluster instance:

        ```{.bash data-prompt="$"}
        $ sudo systemctl stop postgresql
        $ sudo rm -rf /var/lib/postgresql/{{pgversion}}/main
        ```
=== "On RHEL and derivatives"

    1. Install some Python and auxiliary packages to help with Patroni and etcd:

        ```{.bash data-prompt="$"}
        $ sudo yum install python3-pip python3-devel binutils
        ```

    2. Install the etcd, Patroni, and pgBackRest packages. Check the [platform specific notes for Patroni](../yum.md#for-percona-patroni-package):

        ```{.bash data-prompt="$"}
        $ sudo yum install percona-patroni \
          etcd python3-python-etcd \
          percona-pgbackrest
        ```

    3. Stop and disable all installed services:

        ```{.bash data-prompt="$"}
        $ sudo systemctl stop {etcd,patroni,postgresql}
        $ sudo systemctl disable {etcd,patroni,postgresql}
        ```
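A quick way to reason about the "stop and disable" step is to check every unit's enablement state. This sketch assumes the output convention of `systemctl is-enabled <unit>` (a single word such as `enabled` or `disabled`); the query is mocked here so the sketch is self-contained, while on a real host you would run the command per unit and compare its stdout.

```python
# Verify that the services Patroni will manage are all disabled,
# using a mocked state lookup in place of `systemctl is-enabled`.

UNITS = ("etcd", "patroni", "postgresql")

def all_disabled(state_of):
    return all(state_of(u) == "disabled" for u in UNITS)

# Mocked states after running the systemctl disable commands above.
mock = {"etcd": "disabled", "patroni": "disabled", "postgresql": "disabled"}
print(all_disabled(mock.get))  # True
```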