diff --git a/docs/_images/PPG_links.png b/docs/_images/PPG_links.png
deleted file mode 100644
index 9ba9add26..000000000
Binary files a/docs/_images/PPG_links.png and /dev/null differ
diff --git a/docs/_images/diagrams/HA-basic.svg b/docs/_images/diagrams/HA-basic.svg
new file mode 100644
index 000000000..d47d87be8
--- /dev/null
+++ b/docs/_images/diagrams/HA-basic.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/docs/_images/diagrams/ha-architecture-patroni.png b/docs/_images/diagrams/ha-architecture-patroni.png
deleted file mode 100644
index 0f18b0d61..000000000
Binary files a/docs/_images/diagrams/ha-architecture-patroni.png and /dev/null differ
diff --git a/docs/_images/diagrams/ha-overview-backup.svg b/docs/_images/diagrams/ha-overview-backup.svg
new file mode 100644
index 000000000..03b06cda1
--- /dev/null
+++ b/docs/_images/diagrams/ha-overview-backup.svg
@@ -0,0 +1,3 @@
+
+
+
\ No newline at end of file
diff --git a/docs/_images/diagrams/ha-overview-failover.svg b/docs/_images/diagrams/ha-overview-failover.svg
new file mode 100644
index 000000000..ea77da45c
--- /dev/null
+++ b/docs/_images/diagrams/ha-overview-failover.svg
@@ -0,0 +1,3 @@
+
+
+
+[Diagram text labels: PostgreSQL Primary, PostgreSQL Replicas, Replication, Failover]
\ No newline at end of file
diff --git a/docs/_images/diagrams/ha-overview-load-balancer.svg b/docs/_images/diagrams/ha-overview-load-balancer.svg
new file mode 100644
index 000000000..318ede1ed
--- /dev/null
+++ b/docs/_images/diagrams/ha-overview-load-balancer.svg
@@ -0,0 +1,3 @@
+
+
+
+[Diagram text labels: Client, Load balancing proxy, PostgreSQL Primary, PostgreSQL Replicas, Replication, Failover]
\ No newline at end of file
diff --git a/docs/_images/diagrams/ha-overview-replication.svg b/docs/_images/diagrams/ha-overview-replication.svg
new file mode 100644
index 000000000..114320498
--- /dev/null
+++ b/docs/_images/diagrams/ha-overview-replication.svg
@@ -0,0 +1,4 @@
+
+
+
+
+[Diagram text labels: PostgreSQL Primary, PostgreSQL Replicas, Replication]
\ No newline at end of file
diff --git a/docs/_images/diagrams/ha-recommended.svg b/docs/_images/diagrams/ha-recommended.svg
new file mode 100644
index 000000000..4fe393fa6
--- /dev/null
+++ b/docs/_images/diagrams/ha-recommended.svg
@@ -0,0 +1,3 @@
+
+
+
+[Diagram text labels: Application, Read/write, Read Only, Proxy Layer (HAProxy-Node1, HAProxy-Node2, HAProxy-Node3, PMM Client), Database layer (Primary, Replica 1, Replica 2), per-node components (PostgreSQL, Patroni, ETCD, PMM Client, watchdog), Stream Replication, DCS Layer (ETCD-Node1, ETCD-Node2, ETCD-Node3), PMM Server, pgBackRest (Backup Server)]
\ No newline at end of file
diff --git a/docs/_images/diagrams/patroni-architecture.png b/docs/_images/diagrams/patroni-architecture.png
deleted file mode 100644
index 20729d3c4..000000000
Binary files a/docs/_images/diagrams/patroni-architecture.png and /dev/null differ
diff --git a/docs/apt.md b/docs/apt.md
index 455b95652..05f119366 100644
--- a/docs/apt.md
+++ b/docs/apt.md
@@ -140,12 +140,17 @@ Run all the commands in the following sections as root or using the `sudo` comma
Install `pg_gather`
-
```{.bash data-prompt="$"}
$ sudo apt install percona-pg-gather
```
- Some extensions require additional setup in order to use them with Percona Distribution for PostgreSQL. For more information, refer to [Enabling extensions](enable-extensions.md).
+ Install `pgvector`
+
+ ```{.bash data-prompt="$"}
+    $ sudo apt install percona-postgresql-{{pgversion}}-pgvector
+ ```
+
+ Some extensions require additional setup in order to use them with Percona Distribution for PostgreSQL. For more information, refer to [Enabling extensions](enable-extensions.md).
### Start the service
diff --git a/docs/css/design.css b/docs/css/design.css
index 14f9728b6..f4531cdd3 100644
--- a/docs/css/design.css
+++ b/docs/css/design.css
@@ -142,6 +142,16 @@
--md-typeset-table-color: hsla(var(--md-hue),0%,100%,0.25)
}
+[data-md-color-scheme="percona-light"] img[src$="#only-dark"],
+[data-md-color-scheme="percona-light"] img[src$="#gh-dark-mode-only"] {
+ display: none; /* Hide dark images in light mode */
+}
+
+[data-md-color-scheme="percona-dark"] img[src$="#only-light"],
+[data-md-color-scheme="percona-dark"] img[src$="#gh-light-mode-only"] {
+ display: none; /* Hide light images in dark mode */
+}
+
/* Typography */
.md-typeset {
@@ -269,6 +279,7 @@
vertical-align: baseline;
padding: 0 0.2em 0.1em;
border-radius: 0.15em;
+ white-space: pre-wrap; /* Ensure long lines wrap */
}
.md-typeset .highlight code span,
.md-typeset code,
diff --git a/docs/docker.md b/docs/docker.md
index 88ab7cd89..82079add5 100644
--- a/docs/docker.md
+++ b/docs/docker.md
@@ -27,6 +27,7 @@ For more information about using Docker, see the [Docker Docs :octicons-link-ext
| `percona-pgaudit{{pgversion}}_set_user`| An additional layer of logging and control when unprivileged users must escalate themselves to superuser or object owner roles in order to perform needed maintenance tasks.|
| `percona-pg_repack{{pgversion}}`| rebuilds PostgreSQL database objects.|
| `percona-wal2json{{pgversion}}` | a PostgreSQL logical decoding JSON output plugin.|
+ | `percona-pgvector` | A vector similarity search for PostgreSQL|
## Start the container {.power-number}
@@ -97,6 +98,71 @@ Where:
`tag-multi` is the tag specifying the version you need. For example, `{{dockertag}}-multi`. The `multi` part of the tag serves to identify the architecture (x86_64 or ARM64) and pull the respective image.
* `address` is the network address where your database container is running. Use 127.0.0.1, if the database container is running on the local machine/host.
+## Enable encryption
+
+The Percona Distribution for PostgreSQL Docker image includes the `pg_tde` extension to provide data encryption. You must explicitly enable `pg_tde` when you start the container.
+
+Here's how to do this:
+{.power-number}
+
+1. Start the container with the `ENABLE_PG_TDE=1` environment variable:
+
+ ```{.bash data-prompt="$"}
+ $ docker run --name container-name -e ENABLE_PG_TDE=1 -e POSTGRES_PASSWORD=sUpers3cRet -d percona/percona-distribution-postgresql:{{dockertag}}-multi
+ ```
+
+ where:
+
+ * `container-name` is the name you assign to your container
+    * `ENABLE_PG_TDE=1` adds `pg_tde` to `shared_preload_libraries` and enables the custom storage manager
+ * `POSTGRES_PASSWORD` is the superuser password
+
+
+2. Connect to the container and start the interactive `psql` session:
+
+ ```{.bash data-prompt="$"}
+ $ docker exec -it container-name psql
+ ```
+
+ ??? example "Sample output"
+
+ ```{.text .no-copy}
+ psql ({{dockertag}} - Percona Server for PostgreSQL {{dockertag}}.1)
+ Type "help" for help.
+
+ postgres=#
+ ```
+
+3. Create the extension in the database where you want to encrypt data. This requires superuser privileges.
+
+ ```sql
+ CREATE EXTENSION pg_tde;
+ ```
+
+4. Configure a key provider. In this sample configuration intended for testing and development purposes, we use a local keyring provider.
+
+ For production use, set up an external key management store and configure an external key provider. Refer to the [Setup :octicons-link-external-16:](https://percona.github.io/pg_tde/main/setup.html#key-provider-configuration) chapter in the `pg_tde` documentation.
+
+ :material-information: Warning: This example is for testing purposes only:
+
+ ```sql
+ SELECT pg_tde_add_key_provider_file('file-keyring','/tmp/pg_tde_test_local_keyring.per');
+ ```
+
+5. Add a principal key:
+
+ ```sql
+ SELECT pg_tde_set_principal_key('test-db-master-key','file-keyring');
+ ```
+
+ The key is autogenerated. You are ready to use data encryption.
+
+6. Create a table with encryption enabled. Pass the `USING tde_heap` clause to the `CREATE TABLE` command:
+
+ ```sql
+    CREATE TABLE <table_name> (<column_name> <data_type>) USING tde_heap;
+ ```
+
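+To confirm which access method a table uses, you can query the PostgreSQL catalogs. The following is a sketch that assumes the container name from the examples above and a hypothetical table named `mytable`:
+
+```{.bash data-prompt="$"}
+# show the access method of the table; tde_heap means the table uses the encrypted storage
+$ docker exec -it container-name psql -c \
+  "SELECT c.relname, am.amname FROM pg_class c JOIN pg_am am ON am.oid = c.relam WHERE c.relname = 'mytable';"
+```
+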
## Enable `pg_stat_monitor`
To enable the `pg_stat_monitor` extension after launching the container, do the following:
diff --git a/docs/enable-extensions.md b/docs/enable-extensions.md
index ebc6449cd..8e334eb7d 100644
--- a/docs/enable-extensions.md
+++ b/docs/enable-extensions.md
@@ -137,6 +137,14 @@ wal_level = logical
Start / restart the server to apply the changes.
+## pgvector
+
+To get started, enable the extension for the database where you want to use it:
+
+```sql
+CREATE EXTENSION vector;
+```
+
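+Once the extension is created, you can define vector columns and run similarity searches. The following is a minimal sketch; the database `mydb`, the table `items`, and the sample values are placeholders, so adjust the connection options for your environment:
+
+```{.bash data-prompt="$"}
+# create a table with a 3-dimensional vector column, add sample rows,
+# and return the nearest neighbor by Euclidean distance (the <-> operator)
+$ psql -d mydb \
+  -c "CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));" \
+  -c "INSERT INTO items (embedding) VALUES ('[1,2,3]'), ('[4,5,6]');" \
+  -c "SELECT id, embedding FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 1;"
+```
+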
## Next steps
[Connect to PostgreSQL :material-arrow-right:](connect.md){.md-button}
diff --git a/docs/how-to.md b/docs/how-to.md
deleted file mode 100644
index 86acdd79e..000000000
--- a/docs/how-to.md
+++ /dev/null
@@ -1,75 +0,0 @@
-# How to
-
-## How to configure etcd nodes simultaneously
-
-!!! note
-
- We assume you have a deeper knowledge of how etcd works. Otherwise, refer to the configuration where you add etcd nodes one by one.
-
-Instead of adding `etcd` nodes one by one, you can configure and start all nodes in parallel.
-
-1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
-
- === "node1"
-
- ```yaml title="/etc/etcd/etcd.conf.yaml"
- name: 'node1'
- initial-cluster-token: PostgreSQL_HA_Cluster_1
- initial-cluster-state: new
- initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
- data-dir: /var/lib/etcd
- initial-advertise-peer-urls: http://10.104.0.1:2380
- listen-peer-urls: http://10.104.0.1:2380
- advertise-client-urls: http://10.104.0.1:2379
- listen-client-urls: http://10.104.0.1:2379
- ```
-
- === "node2"
-
- ```yaml title="/etc/etcd/etcd.conf.yaml"
- name: 'node2'
- initial-cluster-token: PostgreSQL_HA_Cluster_1
- initial-cluster-state: new
- initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380
- data-dir: /var/lib/etcd
- initial-advertise-peer-urls: http://10.104.0.2:2380
- listen-peer-urls: http://10.104.0.2:2380
- advertise-client-urls: http://10.104.0.2:2379
- listen-client-urls: http://10.104.0.2:2379
- ```
-
- === "node3"
-
- ```yaml title="/etc/etcd/etcd.conf.yaml"
- name: 'node1'
- initial-cluster-token: PostgreSQL_HA_Cluster_1
- initial-cluster-state: new
- initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380, node3=http://10.104.0.3:2380
- data-dir: /var/lib/etcd
- initial-advertise-peer-urls: http://10.104.0.3:2380
- listen-peer-urls: http://10.104.0.3:2380
- advertise-client-urls: http://10.104.0.3:2379
- listen-client-urls: http://10.104.0.3:2379
- ```
-
-2. Enable and start the `etcd` service on all nodes:
-
- ```{.bash data-prompt="$"}
- $ sudo systemctl enable --now etcd
- ```
-
- During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail by a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created.
-
-3. Check the etcd cluster members. Connect to one of the nodes and run the following command:
-
- ```{.bash data-prompt="$"}
- $ sudo etcdctl member list
- ```
-
- The output resembles the following:
-
- ```
- 2d346bd3ae7f07c4: name=node2 peerURLs=http://10.104.0.2:2380 clientURLs=http://10.104.0.2:2379 isLeader=false
- 8bacb519ebdee8db: name=node3 peerURLs=http://10.104.0.3:2380 clientURLs=http://10.104.0.3:2379 isLeader=false
- c5f52ea2ade25e1b: name=node1 peerURLs=http://10.104.0.1:2380 clientURLs=http://10.104.0.1:2379 isLeader=true
- ```
diff --git a/docs/minor-upgrade.md b/docs/minor-upgrade.md
index 61f06bb29..2ffa4a457 100644
--- a/docs/minor-upgrade.md
+++ b/docs/minor-upgrade.md
@@ -24,6 +24,24 @@ Minor upgrade of Percona Distribution for PostgreSQL includes the following step
Before the upgrade, [update the `percona-release` :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/percona-release.html#updating-percona-release-to-the-latest-version) utility to the latest version. This is required to install the new version packages of Percona Distribution for PostgreSQL.
+## Before you start
+
+1. [Update the `percona-release` :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/percona-release.html#updating-percona-release-to-the-latest-version) utility to the latest version. This is required to install the new version packages of Percona Distribution for PostgreSQL.
+
+2. Starting with version 17.2.1, `pg_tde` is part of the Percona Server for PostgreSQL package. If you installed `pg_tde` from its dedicated package, do the following to avoid conflicts during the upgrade:
+
+    * Drop the extension using the `DROP EXTENSION` command with the `CASCADE` parameter.
+
+        :material-alert: Warning: The `CASCADE` parameter drops all tables that were created in the database with `pg_tde` enabled, as well as all objects that depend on the encrypted tables (for example, foreign keys in non-encrypted tables that reference encrypted ones).
+
+        ```sql
+        DROP EXTENSION pg_tde CASCADE;
+        ```
+
+    * Uninstall the `percona-postgresql-17-pg-tde` package for Debian/Ubuntu or the `percona-pg_tde_17` package for RHEL and derivatives (a quick check for the package and the extension is sketched below).
+
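+Before you drop the extension, you may want to confirm that the dedicated package is actually installed and see whether the extension is created in a given database. The following is a sketch; it assumes local peer authentication for the `postgres` operating system user, and `<database>` is a placeholder:
+
+```{.bash data-prompt="$"}
+# check for the dedicated pg_tde package
+$ dpkg -l | grep percona-postgresql-17-pg-tde     # Debian and Ubuntu
+$ rpm -qa | grep percona-pg_tde_17                # RHEL and derivatives
+
+# check whether the extension is created in a given database
+$ sudo -u postgres psql -d <database> -c "SELECT extname, extversion FROM pg_extension WHERE extname = 'pg_tde';"
+```
+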
+## Procedure
+
Run **all** commands as root or via **sudo**:
{.power-number}
diff --git a/docs/release-notes-v17.2.md b/docs/release-notes-v17.2.md
new file mode 100644
index 000000000..4558c7a39
--- /dev/null
+++ b/docs/release-notes-v17.2.md
@@ -0,0 +1,48 @@
+# Percona Distribution for PostgreSQL 17.2.1 ({{date.17_2}})
+
+--8<-- "release-notes-intro.md"
+
+This release of Percona Distribution for PostgreSQL is based on Percona Server for PostgreSQL 17.2.1 - a binary-compatible, open source, drop-in replacement of [PostgreSQL Community 17.2](https://www.postgresql.org/docs/17/release-17-2.html).
+
+## Release Highlights
+
+* This release includes fixes for [CVE-2024-10978](https://www.postgresql.org/support/security/CVE-2024-10978/) and for certain PostgreSQL extensions that break because they depend on the modified Application Binary Interface (ABI). These regressions were introduced in PostgreSQL 17.1, 16.5, 15.9, 14.14, 13.17, and 12.21. For this reason, the release of Percona Distribution for PostgreSQL 17.1.1 has been skipped.
+* Percona Distribution for PostgreSQL includes [`pgvector` :octicons-link-external-16:](https://github.com/pgvector/pgvector) - an open source extension that enables you to use PostgreSQL as a vector database. It brings the vector data type and vector operations (mainly similarity search) to PostgreSQL. You can install `pgvector` from repositories or tarballs, and it is also available as a Docker image.
+* The new version of the `pg_tde` extension features index encryption and support for storing encryption keys in KMIP-compatible servers. These features come with the Beta version of the `tde_heap` access method. Learn more in the [pg_tde release notes :octicons-link-external-16:](https://percona.github.io/pg_tde/main/release-notes/release-notes.html).
+* The `pg_tde` extension itself is now a part of the Percona Server for PostgreSQL server package and a Docker image. If you installed the extension before, from its individual package, uninstall it first to avoid conflicts during the upgrade. See the [Minor Upgrade of Percona Distribution for PostgreSQL](minor-upgrade.md#preconditions) for details.
+For how to run `pg_tde` in Docker, check the [Enable encryption](docker.md#enable-encryption) section in the documentation.
+* Percona Distribution for PostgreSQL now statically links `llvmjit.so` library for Red Hat Enterprise Linux 8 and 9 and compatible derivatives. This resolves the conflict between the LLVM version required by Percona Distribution for PostgreSQL and the one supplied with the operating system. This also enables you to use the LLVM modules supplied with the operating system for other software you require.
+* Percona Monitoring and Management (PMM) 2.43.2 is now compatible with `pg_stat_monitor` 2.1.0 to monitor PostgreSQL 17.
+
+------------------------------------------------------------------------------
+
+
+The following is the list of extensions available in Percona Distribution for PostgreSQL.
+
+| Extension | Version | Description |
+| ------------------- | -------------- | ---------------------------- |
+| [etcd](https://etcd.io/)| 3.5.16 | A distributed, reliable key-value store for setting up highly available Patroni clusters |
+|[HAProxy :octicons-link-external-16:](http://www.haproxy.org/) | 2.8.11 | a high-availability and load-balancing solution |
+| [Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/) | 4.0.3 | a HA (High Availability) solution for PostgreSQL |
+| [PgAudit :octicons-link-external-16:](https://www.pgaudit.org/) | 17.0 | provides detailed session or object audit logging via the standard logging facility provided by PostgreSQL |
+| [pgAudit set_user :octicons-link-external-16:](https://github.com/pgaudit/set_user)| 4.1.0 | provides an additional layer of logging and control when unprivileged users must escalate themselves to superusers or object owner roles in order to perform needed maintenance tasks.|
+| [pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) | 2.54.0 | a backup and restore solution for PostgreSQL |
+|[pgBadger :octicons-link-external-16:](https://github.com/darold/pgbadger) | 12.4 | a fast PostgreSQL Log Analyzer.|
+|[PgBouncer :octicons-link-external-16:](https://www.pgbouncer.org/) |1.23.1 | a lightweight connection pooler for PostgreSQL|
+| [pg_gather :octicons-link-external-16:](https://github.com/jobinau/pg_gather)| v28 | an SQL script for running the diagnostics of the health of PostgreSQL cluster |
+| [pgpool2 :octicons-link-external-16:](https://git.postgresql.org/gitweb/?p=pgpool2.git;a=summary) | 4.5.4 | a middleware between PostgreSQL server and client for high availability, connection pooling and load balancing.|
+| [pg_repack :octicons-link-external-16:](https://github.com/reorg/pg_repack) | 1.5.1 | rebuilds PostgreSQL database objects |
+| [pg_stat_monitor :octicons-link-external-16:](https://github.com/percona/pg_stat_monitor)|{{pgsmversion}} | collects and aggregates statistics for PostgreSQL and provides histogram information.|
+|[pgvector :octicons-link-external-16:](https://github.com/pgvector/pgvector)| v0.8.0 | A vector similarity search for PostgreSQL|
+| [PostGIS :octicons-link-external-16:](https://github.com/postgis/postgis) | 3.3.7 | a spatial extension for PostgreSQL.|
+| [PostgreSQL Common :octicons-link-external-16:](https://salsa.debian.org/postgresql/postgresql-common)| 265 | PostgreSQL database-cluster manager. It provides a structure under which multiple versions of PostgreSQL may be installed and/or multiple clusters maintained at one time.|
+|[wal2json :octicons-link-external-16:](https://github.com/eulerto/wal2json) |2.6 | a PostgreSQL logical decoding JSON output plugin|
+
+For Red Hat Enterprise Linux 8 and 9 and compatible derivatives, Percona Distribution for PostgreSQL also includes the following packages:
+
+* `llvm` 17.0.6 packages. This fixes compatibility issues with LLVM from upstream.
+* supplemental `python3-etcd` 0.4.5 packages, which can be used for setting up Patroni clusters.
+
+Percona Distribution for PostgreSQL is also shipped with the [libpq](https://www.postgresql.org/docs/17/libpq.html) library. It contains "a set of
+library functions that allow client programs to pass queries to the PostgreSQL
+backend server and to receive the results of these queries."
\ No newline at end of file
diff --git a/docs/release-notes.md b/docs/release-notes.md
index a35d71acb..d9908b262 100644
--- a/docs/release-notes.md
+++ b/docs/release-notes.md
@@ -1,3 +1,4 @@
# Percona Distribution for PostgreSQL release notes
-* [Percona Distribution for PostgreSQL 17](release-notes-v17.0.md) ({{date.17_0}})
+* [Percona Distribution for PostgreSQL 17.2.1](release-notes-v17.2.md) ({{date.17_2}})
+* [Percona Distribution for PostgreSQL 17.0.1](release-notes-v17.0.md) ({{date.17_0}})
diff --git a/docs/solutions/dr-pgbackrest-setup.md b/docs/solutions/dr-pgbackrest-setup.md
index af548538e..74cba57e0 100644
--- a/docs/solutions/dr-pgbackrest-setup.md
+++ b/docs/solutions/dr-pgbackrest-setup.md
@@ -239,7 +239,7 @@ log-level-console=info
log-level-file=debug
[prod_backup]
-pg1-path=/var/lib/postgresql/14/main
+pg1-path=/var/lib/postgresql/{{pgversion}}/main
```
diff --git a/docs/solutions/etcd-info.md b/docs/solutions/etcd-info.md
new file mode 100644
index 000000000..390fdbbe3
--- /dev/null
+++ b/docs/solutions/etcd-info.md
@@ -0,0 +1,60 @@
+# ETCD
+
+`etcd` is one of the key components in a high-availability architecture, so it's important to understand how it works.
+
+`etcd` is a distributed key-value consensus store that helps applications store and manage cluster configuration data and perform distributed coordination of a PostgreSQL cluster.
+
+`etcd` runs as a cluster of nodes that communicate with each other to maintain a consistent state. The primary node in the cluster is called the "leader", and the remaining nodes are the "followers".
+
+## How `etcd` works
+
+Each node in the cluster stores data in a structured format and keeps a copy of the same data to ensure redundancy and fault tolerance. When you write data to `etcd`, the change is sent to the leader node, which then replicates it to the other nodes in the cluster. This ensures that all nodes remain synchronized and maintain data consistency.
+
+When a client wants to change data, it sends the request to the leader. The leader accepts the writes and proposes this change to the followers. The followers vote on the proposal. If a majority of followers agree (including the leader), the change is committed, ensuring consistency. The leader then confirms the change to the client.
+
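+To see this flow in practice, you can write and read a key with `etcdctl` from any cluster member: the write is forwarded to the leader, replicated, and then becomes visible on every node. This is a sketch that assumes `etcdctl` uses the v3 API (the default in recent versions); the key and value are arbitrary examples:
+
+```{.bash data-prompt="$"}
+# write a key (the request is forwarded to the leader and replicated)
+$ etcdctl put demo-key "demo-value"
+
+# read the key back from any member
+$ etcdctl get demo-key
+```
+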
+## Leader election
+
+An `etcd` cluster can have only one leader node at a time. The leader is responsible for receiving client requests, proposing changes, and ensuring they are replicated to the followers. When an `etcd` cluster starts, or if the current leader fails, the nodes hold an election to choose a new leader. Each node waits for a random amount of time before sending a vote request to other nodes, and the first node to get a majority of votes becomes the new leader. The cluster remains available as long as a majority of nodes (quorum) are still running.
+
+### How many members to have in a cluster
+
+The recommended approach is to deploy an odd-sized cluster (e.g., 3, 5, or 7 nodes). The odd number of nodes ensures that there is always a majority of nodes available to make decisions and keep the cluster running smoothly. This majority is crucial for maintaining consistency and availability, even if one node fails. For a cluster with `n` members, the majority is `(n/2)+1`.
+
+To better illustrate this concept, take an example of clusters with 3 nodes and 4 nodes. In a 3-node cluster, if one node fails, the remaining 2 nodes still form a majority (2 out of 3), and the cluster continues to operate. In a 4-node cluster, the majority is 3 nodes. If one node fails, the remaining 3 nodes still form a majority, but if two nodes fail, the remaining 2 nodes (2 out of 4) are not enough, and the cluster stops functioning. In other words, a 4-node cluster tolerates only a single node failure, the same as a 3-node cluster, while requiring an extra member.
+
+## `etcd` Raft consensus
+
+The heart of `etcd`'s reliability is the Raft consensus algorithm. Raft ensures that all nodes in the cluster agree on the same data. This ensures a consistent view of the data, even if some nodes are unavailable or experiencing network issues.
+
+An example of the Raft's role in `etcd` is the situation when there is no majority in the cluster. If a majority of nodes can't communicate (for example, due to network partitions), no new leader can be elected, and no new changes can be committed. This prevents the system from getting into an inconsistent state. The system waits for the network to heal and a majority to be re-established. This is crucial for data integrity.
+
+## etcd logs and performance considerations
+
+`etcd` keeps a detailed log of every change made to the data. These logs are essential for several reasons: they maintain a consistent state across nodes, enable fault tolerance and leader elections, and provide an audit trail. For example, if a node fails, it can use the logs to catch up with the other nodes and restore its data. The logs also provide a history of all changes, which can be useful for debugging and security analysis if needed.
+
+### Slow disk performance
+
+`etcd` is very sensitive to disk I/O performance. Writing to the logs is a frequent operation and will be slow if the disk is slow. This can lead to timeouts, delayed consensus, instability, and even data loss. In extreme cases, slow disk performance can cause a leader to fail health checks, triggering unnecessary leader elections. Always use fast, reliable storage for `etcd`.
+
+### Slow or high-latency networks
+
+Communication between `etcd` nodes is critical. A slow or unreliable network can cause delays in replicating data, increasing the risk of stale reads. It can also trigger premature timeouts, making leader elections happen more frequently or, in some cases, delaying them, which impacts performance and stability. Also keep in mind that if nodes cannot reach each other in a timely manner, the cluster may lose quorum and become unavailable.
+
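+A few `etcdctl` commands help you spot disk and network problems early. This is a sketch that assumes the v3 API and that the client endpoints are reachable from the node where you run the commands:
+
+```{.bash data-prompt="$"}
+# show the leader, Raft term, and DB size for every member
+$ etcdctl endpoint status --cluster -w table
+
+# verify that every member answers health checks in time
+$ etcdctl endpoint health --cluster
+
+# run the built-in basic performance check
+$ etcdctl check perf
+```
+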
+## etcd Locks
+
+`etcd` provides a distributed locking mechanism, which helps applications coordinate actions across multiple nodes and control access to shared resources, preventing conflicts. Locks ensure that only one process can hold a resource at a time, avoiding race conditions and inconsistencies. Patroni is an example of an application that uses `etcd` locks for primary election control in the PostgreSQL cluster.
+
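+You can observe this behavior with the `etcdctl lock` command, which acquires a named lock and holds it until the command exits. As a sketch (assuming the v3 API; the lock name is an arbitrary example), run the following on one node, then run the same command on another node and note that it blocks until the first one releases the lock:
+
+```{.bash data-prompt="$"}
+# acquire the lock and hold it until you press Ctrl+C
+$ etcdctl lock demo-lock
+```
+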
+### Deployment considerations
+
+We recommend deploying `etcd` on separate hosts for the following reasons:
+
+* Both PostgreSQL and `etcd` are highly dependent on I/O, and running them on the same host may cause performance issues.
+
+* Higher resilience. If one or even two PostgreSQL nodes crash, the `etcd` cluster remains healthy and can trigger a new primary election.
+
+* Scalability and better performance. You can scale the `etcd` cluster separately from PostgreSQL based on the load and thus achieve better performance.
+
+Note that separate deployment increases the complexity of the infrastructure and requires additional maintenance effort. Also, pay close attention to the network configuration to eliminate the latency that might occur due to the communication between `etcd` and Patroni nodes over the network.
+
+
+
\ No newline at end of file
diff --git a/docs/solutions/ha-architecture.md b/docs/solutions/ha-architecture.md
new file mode 100644
index 000000000..e8c1abe1a
--- /dev/null
+++ b/docs/solutions/ha-architecture.md
@@ -0,0 +1,60 @@
+# Architecture
+
+In the [overview of high availability](high-availability.md), we discussed the required components to achieve high-availability.
+
+Our recommended minimal approach to a highly available deployment is a three-node PostgreSQL cluster with cluster management and failover mechanisms, a load balancer, and a backup/restore solution.
+
+The following diagram shows this architecture with the tools we recommend. If the cost and the number of nodes is a constraint, refer to the [Bare-minimum architecture](#bare-minimum-architecture) section.
+
+
+
+## Components
+
+The components in this architecture are:
+
+### Database layer
+
+- PostgreSQL nodes bearing the user data.
+
+- Patroni - an automatic failover system. Patroni requires and uses the Distributed Configuration Store to store the cluster configuration, health and status.
+
+- watchdog - a mechanism that resets the whole system if it does not receive a keepalive heartbeat within a specified timeframe. This adds an additional failsafe layer in case the usual Patroni split-brain protection mechanisms fail.
+
+### DCS layer
+
+- etcd - a Distributed Configuration Store. It stores the state of the PostgreSQL cluster and handles the election of a new primary. An odd number of nodes (minimum three) is required so that there is always a majority to agree on updates to the cluster state.
+
+### Load balancing layer
+
+- HAProxy - the load balancer and the single point of entry to the cluster for client applications. A minimum of two instances is required for redundancy.
+
+- keepalived - a high-availability and failover solution for HAProxy. It provides a virtual IP (VIP) address for HAProxy and prevents it from becoming a single point of failure by failing over the services to an operational instance.
+
+- (Optional) pgbouncer - a connection pooler for PostgreSQL. The aim of pgbouncer is to lower the performance impact of opening new connections to PostgreSQL.
+
+### Services layer
+
+- pgBackRest - the backup and restore solution for PostgreSQL. It should also be redundant to eliminate a single point of failure.
+
+- (Optional) Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster
+
+## Bare-minimum architecture
+
+There may be constraints that prevent you from using the [recommended reference architecture](#architecture), such as the number of available servers or the cost of additional hardware. You can still achieve high availability with a minimum of two database nodes and three `etcd` instances. The following diagram shows this architecture:
+
+
+
+Using such architecture has the following limitations:
+
+* This setup only protects against a single node failure, either a database node or an etcd node. Losing more than one node results in a read-only database.
+* The application must be able to connect to multiple database nodes and fail over to the new primary in the case of an outage.
+* The application must act as the load balancer. It must be able to determine read/write and read-only requests and distribute them across the cluster.
+* The `pgBackRest` component is optional but highly recommended for disaster recovery. To eliminate a single point of failure, it should also be redundant, but we're not discussing redundancy in this solution. [Contact us](https://www.percona.com/about/contact) to discuss it if this is a requirement for you.
+
+## Additional reading
+
+[How components work together](ha-components.md){.md-button}
+
+## Next steps
+
+[Deployment - initial setup](ha-init-setup.md){.md-button}
\ No newline at end of file
diff --git a/docs/solutions/ha-components.md b/docs/solutions/ha-components.md
new file mode 100644
index 000000000..cbfce0598
--- /dev/null
+++ b/docs/solutions/ha-components.md
@@ -0,0 +1,50 @@
+# How components work together
+
+This document explains how components of the proposed [high-availability architecture](ha-architecture.md) work together.
+
+## Database and DCS layers
+
+Let's start with the database and DCS layers as they are interconnected and work closely together.
+
+Every database node hosts PostgreSQL and Patroni instances.
+
+Each PostgreSQL instance in the cluster maintains consistency with other members through streaming replication. Streaming replication is asynchronous by default, meaning that the primary does not wait for the secondaries to acknowledge the receipt of the data to consider the transaction complete.
+
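+You can check the replication state at any time from the primary node. This is a sketch that assumes local access as the `postgres` operating system user:
+
+```{.bash data-prompt="$"}
+# run on the primary: one row per connected replica
+$ sudo -u postgres psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
+```
+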
+Each Patroni instance runs on top of and manages its own PostgreSQL instance. This means that Patroni starts and stops PostgreSQL and manages its configuration.
+
+Patroni is also responsible for creating and managing the PostgreSQL cluster. It performs the initial cluster initialization and monitors the cluster state. To do so, Patroni relies on and uses the Distributed Configuration Store (DCS), represented by `etcd` in our architecture.
+
+Though Patroni supports various Distributed Configuration Stores like ZooKeeper, etcd, Consul or Kubernetes, we recommend and support `etcd` as the most popular DCS due to its simplicity, consistency and reliability.
+
+Note that the PostgreSQL cluster and Patroni cluster are the same thing, and we will use these names interchangeably.
+
+When you start Patroni, it writes the cluster configuration information in `etcd`. During the initial cluster initialization, Patroni uses the `etcd` locking mechanism to ensure that only one instance becomes the primary. This mechanism ensures that only a single process can hold a resource at a time avoiding race conditions and inconsistencies.
+
+You start Patroni instances one by one so the first instance acquires the lock with a lease in `etcd` and becomes the primary PostgreSQL node. The other instances join the primary as replicas, waiting for the lock to be released.
+
+If the current primary node crashes, its lease on the lock in `etcd` expires. The lock is automatically released after its expiration time. `etcd` then starts a new election, and a standby node attempts to acquire the lock to become the new primary.
+
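+To see which role a node currently holds, you can query Patroni's REST API or PostgreSQL itself. The sketch below assumes that the REST API listens on the default port 8008 (the same port used by the HAProxy configuration in this solution) and that you can connect locally as the `postgres` operating system user:
+
+```{.bash data-prompt="$"}
+# returns HTTP 200 only on the current primary
+$ curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8008/primary
+
+# returns f on the primary and t on replicas
+$ sudo -u postgres psql -c "SELECT pg_is_in_recovery();"
+```
+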
+Patroni uses `etcd` for more than locking. It also uses `etcd` to store the current state of the cluster, ensuring that all nodes are aware of the latest changes.
+
+Another important component is the watchdog. It runs on each database node. The purpose of watchdog is to prevent split-brain scenarios, where multiple nodes might mistakenly think they are the primary node. The watchdog monitors the node's health by receiving periodic "keepalive" signals from Patroni. If these signals stop due to a crash, high system load or any other reason, the watchdog resets the node to ensure it does not cause inconsistencies.
+
+## Load balancing layer
+
+This layer consists of HAProxy and keepalived.
+
+HAProxy acts as a single point of entry to your cluster for client applications. It accepts all requests from client applications and distributes the load evenly across the cluster nodes. It can route read/write requests to the primary and read-only requests to the secondary nodes. This behavior is defined within HAProxy configuration. To determine the current primary node, HAProxy queries the Patroni REST API.
+
+HAProxy also serves as the connection pooler. It manages a pool of reusable database connections to optimize performance and resource usage. Instead of creating and closing a new connection for every database request, HAProxy maintains a set of open connections that can be shared among multiple clients.
+
+HAProxy must also be redundant. You need a minimum of two HAProxy instances (one active and one standby) to eliminate a single point of failure and be able to perform a failover. This is where keepalived comes in.
+
+Keepalived is the failover tool for HAProxy. It provides the virtual IP address (VIP) for HAProxy and monitors its state. When the current active HAProxy node is down, it transfers the VIP to the remaining node and fails over the services there.
+
+## Services layer
+
+Finally, the services layer is represented by `pgBackRest` and PMM.
+
+`pgBackRest` is deployed as a separate backup server and also as agents on every database node. `pgBackRest` takes backups from one of the secondary nodes and archives WAL from the primary. By communicating with its agents, `pgBackRest` determines the current primary PostgreSQL node.
+
+The monitoring solution is optional but nice to have. It enables you to monitor the health of your high-availability architecture, receive timely alerts should performance issues occur and proactively react to them.
+
diff --git a/docs/solutions/ha-etcd-config.md b/docs/solutions/ha-etcd-config.md
new file mode 100644
index 000000000..68fc52b5c
--- /dev/null
+++ b/docs/solutions/ha-etcd-config.md
@@ -0,0 +1,167 @@
+# Etcd setup
+
+In our solutions, we use etcd distributed configuration store. [Refresh your knowledge about etcd](ha-components.md#etcd).
+
+## Install etcd
+
+Install etcd on all PostgreSQL nodes: `node1`, `node2` and `node3`.
+
+=== ":material-debian: On Debian / Ubuntu"
+
+ 1. Install etcd:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install etcd etcd-server etcd-client
+ ```
+
+    2. Stop and disable etcd:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl stop etcd
+ $ sudo systemctl disable etcd
+ ```
+
+=== ":material-redhat: On RHEL and derivatives"
+
+
+ 1. Install etcd.
+
+ ```{.bash data-prompt="$"}
+        $ sudo yum install etcd python3-python-etcd
+ ```
+
+    2. Stop and disable etcd:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl stop etcd
+        $ sudo systemctl disable etcd
+ ```
+
+!!! note
+
+ If you [installed etcd from tarballs](../tarball.md), you must first [enable it](../enable-extensions.md#etcd) before configuring it.
+
+## Configure etcd
+
+To get started with an `etcd` cluster, you need to bootstrap it. This means setting up the initial configuration and starting the etcd nodes so they can form a cluster. There are the following bootstrapping mechanisms:
+
+* Static - when the IP addresses of the cluster nodes are known ahead of time
+* Discovery service - for cases when the IP addresses of the cluster nodes are not known ahead of time
+
+Since we know the IP addresses of the nodes, we will use the static method. For using the discovery service, please refer to the [etcd documentation :octicons-link-external-16:](https://etcd.io/docs/v3.5/op-guide/clustering/#etcd-discovery){:target="_blank"}.
+
+We will configure and start all etcd nodes in parallel. This can be done either by modifying each node's configuration file or by using command line options. Use the method that you prefer.
+
+### Method 1. Modify the configuration file
+
+1. Create the etcd configuration file on every node. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
+
+ === "node1"
+
+ ```yaml title="/etc/etcd/etcd.conf.yaml"
+ name: 'node1'
+ initial-cluster-token: PostgreSQL_HA_Cluster_1
+ initial-cluster-state: new
+ initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+ data-dir: /var/lib/etcd
+ initial-advertise-peer-urls: http://10.104.0.1:2380
+ listen-peer-urls: http://10.104.0.1:2380
+ advertise-client-urls: http://10.104.0.1:2379
+ listen-client-urls: http://10.104.0.1:2379
+ ```
+
+ === "node2"
+
+ ```yaml title="/etc/etcd/etcd.conf.yaml"
+ name: 'node2'
+ initial-cluster-token: PostgreSQL_HA_Cluster_1
+ initial-cluster-state: new
+        initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+ data-dir: /var/lib/etcd
+ initial-advertise-peer-urls: http://10.104.0.2:2380
+ listen-peer-urls: http://10.104.0.2:2380
+ advertise-client-urls: http://10.104.0.2:2379
+ listen-client-urls: http://10.104.0.2:2379
+ ```
+
+ === "node3"
+
+ ```yaml title="/etc/etcd/etcd.conf.yaml"
+ name: 'node3'
+ initial-cluster-token: PostgreSQL_HA_Cluster_1
+ initial-cluster-state: new
+        initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
+ data-dir: /var/lib/etcd
+ initial-advertise-peer-urls: http://10.104.0.3:2380
+ listen-peer-urls: http://10.104.0.3:2380
+ advertise-client-urls: http://10.104.0.3:2379
+ listen-client-urls: http://10.104.0.3:2379
+ ```
+
+2. Enable and start the `etcd` service on all nodes:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl enable --now etcd
+ $ sudo systemctl status etcd
+ ```
+
+    During the node start, etcd searches for other cluster nodes defined in the configuration. If the other nodes are not yet running, the start may fail with a quorum timeout. This is expected behavior. Try starting all nodes again at the same time for the etcd cluster to be created.
+
+--8<-- "check-etcd.md"
+
+### Method 2. Start etcd nodes with command line options
+
+1. On each etcd node, set the environment variables for the cluster members, the cluster token and state:
+
+ ```
+ TOKEN=PostgreSQL_HA_Cluster_1
+ CLUSTER_STATE=new
+ NAME_1=node1
+ NAME_2=node2
+ NAME_3=node3
+ HOST_1=10.104.0.1
+ HOST_2=10.104.0.2
+ HOST_3=10.104.0.3
+ CLUSTER=${NAME_1}=http://${HOST_1}:2380,${NAME_2}=http://${HOST_2}:2380,${NAME_3}=http://${HOST_3}:2380
+ ```
+
+2. Start each etcd node in parallel using the following command:
+
+ === "node1"
+
+ ```{.bash data-prompt="$"}
+ THIS_NAME=${NAME_1}
+ THIS_IP=${HOST_1}
+ etcd --data-dir=data.etcd --name ${THIS_NAME} \
+ --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+ --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+ --initial-cluster ${CLUSTER} \
+ --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &
+ ```
+
+ === "node2"
+
+ ```{.bash data-prompt="$"}
+ THIS_NAME=${NAME_2}
+ THIS_IP=${HOST_2}
+ etcd --data-dir=data.etcd --name ${THIS_NAME} \
+ --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+ --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+ --initial-cluster ${CLUSTER} \
+ --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &
+ ```
+
+ === "node3"
+
+ ```{.bash data-prompt="$"}
+ THIS_NAME=${NAME_3}
+ THIS_IP=${HOST_3}
+ etcd --data-dir=data.etcd --name ${THIS_NAME} \
+ --initial-advertise-peer-urls http://${THIS_IP}:2380 --listen-peer-urls http://${THIS_IP}:2380 \
+ --advertise-client-urls http://${THIS_IP}:2379 --listen-client-urls http://${THIS_IP}:2379 \
+ --initial-cluster ${CLUSTER} \
+ --initial-cluster-state ${CLUSTER_STATE} --initial-cluster-token ${TOKEN} &
+ ```
+
+--8<-- "check-etcd.md"
\ No newline at end of file
diff --git a/docs/solutions/ha-haproxy.md b/docs/solutions/ha-haproxy.md
new file mode 100644
index 000000000..b58096110
--- /dev/null
+++ b/docs/solutions/ha-haproxy.md
@@ -0,0 +1,440 @@
+# Configure HAProxy
+
+HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application accesses the HAProxy URL and sends its read/write requests there. Behind the scenes, HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, provide different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.
+
+This way, a client application doesn't know what node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected.
+
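+For example, a client can reach the cluster through HAProxy as follows. This is a sketch: replace `<haproxy-address>` with the HAProxy host or virtual IP address and use your own database user:
+
+```{.bash data-prompt="$"}
+# read/write session, routed to the current primary
+$ psql -h <haproxy-address> -p 5000 -U postgres
+
+# read-only session, balanced across the replicas
+$ psql -h <haproxy-address> -p 5001 -U postgres
+```
+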
+To eliminate a single point of failure for HAProxy, we use `keepalived` - the failover tool for it.
+
+## HAProxy setup
+
+1. Install HAProxy on the HAProxy nodes: `HAProxy1`, `HAProxy2` and `HAProxy3`:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install percona-haproxy
+ ```
+
+2. The HAProxy configuration file path is: `/etc/haproxy/haproxy.cfg`. Specify the following configuration in this file for every node.
+
+ ```
+ global
+ maxconn 100
+
+ defaults
+ log global
+ mode tcp
+ retries 2
+ timeout client 30m
+ timeout connect 4s
+ timeout server 30m
+ timeout check 5s
+
+ listen stats
+ mode http
+ bind *:7000
+ stats enable
+ stats uri /
+
+ listen primary
+ bind *:5000
+ option httpchk /primary
+ http-check expect status 200
+ default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+ server node1 node1:5432 maxconn 100 check port 8008
+ server node2 node2:5432 maxconn 100 check port 8008
+ server node3 node3:5432 maxconn 100 check port 8008
+
+ listen standbys
+ balance roundrobin
+ bind *:5001
+ option httpchk /replica
+ http-check expect status 200
+ default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+ server node1 node1:5432 maxconn 100 check port 8008
+ server node2 node2:5432 maxconn 100 check port 8008
+ server node3 node3:5432 maxconn 100 check port 8008
+ ```
+
+
+ HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately.
+
+3. Restart HAProxy:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl restart haproxy
+ ```
+
+4. Check the HAProxy logs to see if there are any errors:
+
+ ```{.bash data-prompt="$"}
+ $ sudo journalctl -u haproxy.service -n 100 -f
+ ```
+
+## Keepalived setup
+
+1. Install `keepalived` on all HAProxy nodes:
+
+ === ":material-debian: On Debian and Ubuntu"
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install keepalived
+ ```
+
+    === ":material-redhat: On RHEL and derivatives"
+
+ ```{.bash data-prompt="$"}
+ $ sudo dnf install keepalived
+ ```
+
+2. Create a script to check the state of the primary node. The script makes an HTTP request to the primary Patroni node, checks whether the response status code is `200`, and exits with a success code (0) if it is. If the status code is anything other than 200, it exits with a failure code (1). Create the `chk_primary.sh` file with the following contents:
+
+ ```bash title="chk_primary.sh"
+ #!/bin/bash
+
+ RET=`/usr/bin/curl -s -o /dev/null -w "%{http_code}" http://localhost:8008/primary`
+
+ if [[ $RET -eq "200" ]]
+ then
+ exit 0
+ fi
+
+ exit 1
+ ```
+
+3. Make the script executable:
+
+ ```{.bash data-prompt="$"}
+ $ sudo chmod +x /path/to/chk_primary.sh
+ ```
+
+4. The path to the `keepalived` configuration file is `/etc/keepalived/keepalived.conf`. Configure the primary and secondary HAProxy nodes separately.
+
+ Edit the `/etc/keepalived/keepalived.conf` configuration file. Specify the following information:
+
+    * `vrrp_instance string` - the name of the Patroni cluster, `CLUSTER_1` in our case.
+ * `interface` - The interface where Patroni nodes reside
+ * `unicast_src_ip` - The IP address of the HAProxy node you currently configure.
+ * `unicast_peer` - The IP address of the remaining HAProxy nodes
+ * `virtual_ipaddress` - A public IP address of HAProxy, its subnet and the interface where it resides
+ * `vrrp_script chk_patroni` - The path to the `chk_primary.sh` script
+
+ === "Primary HAProxy (HAProxy1)"
+
+ ```ini
+ global_defs {
+ process_names
+
+ enable_script_security # Check that the script can only be edited by root
+
+ script_user root # Systemctl does only work with root
+
+ vrrp_version 3 # Using the latest protocol version allows for dynamic_interfaces
+
+        # vrrp_min_garp true         # After switching to MASTER state 5 gratuitous arp (garp) packets are sent and
+                                     # after 5 seconds another 5 garp packets are sent (for the switches to update the arp table).
+                                     # This option disables the second batch of 5 garp packets (not necessary with modern switches)
+ }
+
+ vrrp_script chk_patroni {
+ script "/usr/local/bin/chk_primary.sh"
+
+ # script "/usr/bin/killall -0 haproxy" # Sending the zero signal returns OK (0) if the process or process group ID exists,
+                                             # otherwise, it returns ERR (-1) and sets errno to ESRCH.
+                                             # Note that the kill(2) man page states that error checking is still performed,
+                                             # meaning it will return an error (-1) and set errno to EPERM if:
+                                             # - The target process doesn't exist
+ # - The target process exists but the sending process does not have enough permissions to send it a signal
+
+ # script "/usr/bin/systemctl is-active --quiet haproxy" # The more intelligent way of checking the haproxy process
+ # Simpler way of checking haproxy process:
+ # script "/usr/bin/killall -0 haproxy"
+
+        fall 2                       # 2 failures are required to consider the check failed
+
+ rise 2 # 2 OKs required to consider the process up after failure
+
+ interval 1 # check every X seconds
+
+        weight -10                   # reduce the priority by 10 if the check fails
+ }
+
+ vrrp_instance CLUSTER_1 {
+ state MASTER # Initial state, MASTER|BACKUP
+ # MASTER on haproxy1, BACKUP on haproxy2, BACKUP on haproxy3, etc
+ # NOTE that if the priority is 255, then the instance will transition immediately
+ # to MASTER if state MASTER is specified; otherwise the instance will
+ # wait between 3 and 4 advert intervals before it can transition,
+ # depending on the priority
+
+ interface eth1 # interface for inside_network, bound by vrrp.
+ # Note: if using unicasting, the interface can be omitted as long
+ # as the unicast addresses are not IPv6 link local addresses (this is
+ # necessary, for example, if using asymmetric routing).
+ # If the interface is omitted, then all VIPs and eVIPs should specify
+ # the interface they are to be configured on, otherwise they will be
+ # added to the default interface.
+
+ virtual_router_id 99 # Needs to be the same value in all nodes of the same cluster
+ # HOWEVER, each cluster needs to have an UNIQUE ID
+
+ priority 95 # The higher the priority the higher the chance to be promoted to MASTER
+ advert_int 1 # Specify the VRRP Advert interval in seconds
+
+ # authentication { # Non compliant but good to have with unicast
+ # auth_type PASS
+ # auth_pass passw123
+ # }
+
+ unicast_src_ip 10.104.0.6 # The default IP for binding vrrpd is the primary IP
+ # on the defined interface. If you want to hide the location of vrrpd,
+ # use this IP as src_addr for multicast or unicast vrrp packets.
+
+ unicast_peer { # Do not send VRRP adverts over a VRRP multicast group.
+ # Instead it sends adverts to the following list of
+ # ip addresses using unicast. It can be cool to use
+ # the VRRP FSM and features in a networking
+ # environment where multicast is not supported!
+ # IP addresses specified can be IPv4 as well as IPv6.
+ # If min_ttl and/or max_ttl are specified, the TTL/hop limit
+ # of any received packet is checked against the specified
+ # TTL range, and is discarded if it is outside the range.
+ # Specifying min_ttl or max_ttl turns on check_unicast_src.
+ 10.104.0.5
+ 10.104.0.3
+ }
+
+ unicast_fault_no_peer # It is not possible to operate in unicast mode without any peers.
+ # Until v2.2.4 keepalived would silently operate in multicast mode
+ # if no peers were specified but a unicast keyword had been specified.
+                                     # Using this keyword stops defaulting to multicast if no peers are
+ # specified and puts the VRRP instance into fault state.
+
+ virtual_ipaddress {
+ 134.209.111.138/24 brd + dev eth1 label eth1:0
+ }
+
+ track_script { # Check that haproxy is up
+ chk_patroni
+ }
+ }
+ ```
+
+ === "HAProxy2"
+
+ ```ini
+ global_defs {
+ process_names
+
+ enable_script_security # Check that the script can only be edited by root
+
+ script_user root # Systemctl does only work with root
+
+ vrrp_version 3 # Using the latest protocol version allows for dynamic_interfaces
+
+        # vrrp_min_garp true         # After switching to MASTER state 5 gratuitous arp (garp) packets are sent and
+                                     # after 5 seconds another 5 garp packets are sent (for the switches to update the arp table).
+                                     # This option disables the second batch of 5 garp packets (not necessary with modern switches)
+ }
+
+ vrrp_script chk_patroni {
+ script "/usr/local/bin/chk_primary.sh"
+
+ # script "/usr/bin/killall -0 haproxy" # Sending the zero signal returns OK (0) if the process or process group ID exists,
+                                             # otherwise, it returns ERR (-1) and sets errno to ESRCH.
+                                             # Note that the kill(2) man page states that error checking is still performed,
+                                             # meaning it will return an error (-1) and set errno to EPERM if:
+                                             # - The target process doesn't exist
+ # - The target process exists but the sending process does not have enough permissions to send it a signal
+
+ # script "/usr/bin/systemctl is-active --quiet haproxy" # The more intelligent way of checking the haproxy process
+ # Simpler way of checking haproxy process:
+ # script "/usr/bin/killall -0 haproxy"
+
+        fall 2                       # 2 failures are required to consider the check failed
+
+ rise 2 # 2 OKs required to consider the process up after failure
+
+ interval 1 # check every X seconds
+
+        weight -10                   # reduce the priority by 10 if the check fails
+ }
+
+ vrrp_instance CLUSTER_1 {
+ state BACKUP # Initial state, MASTER|BACKUP
+ # MASTER on haproxy1, BACKUP on haproxy2, BACKUP on haproxy3, etc
+ # NOTE that if the priority is 255, then the instance will transition immediately
+ # to MASTER if state MASTER is specified; otherwise the instance will
+ # wait between 3 and 4 advert intervals before it can transition,
+ # depending on the priority
+
+ interface eth1 # interface for inside_network, bound by vrrp.
+ # Note: if using unicasting, the interface can be omitted as long
+ # as the unicast addresses are not IPv6 link local addresses (this is
+ # necessary, for example, if using asymmetric routing).
+ # If the interface is omitted, then all VIPs and eVIPs should specify
+ # the interface they are to be configured on, otherwise they will be
+ # added to the default interface.
+
+ virtual_router_id 99 # Needs to be the same value in all nodes of the same cluster
+ # HOWEVER, each cluster needs to have an UNIQUE ID
+
+ priority 95 # The higher the priority the higher the chance to be promoted to MASTER
+ advert_int 1 # Specify the VRRP Advert interval in seconds
+
+ # authentication { # Non compliant but good to have with unicast
+ # auth_type PASS
+ # auth_pass passw123
+ # }
+
+ unicast_src_ip 10.104.0.5 # The default IP for binding vrrpd is the primary IP
+ # on the defined interface. If you want to hide the location of vrrpd,
+ # use this IP as src_addr for multicast or unicast vrrp packets.
+
+ unicast_peer { # Do not send VRRP adverts over a VRRP multicast group.
+ # Instead it sends adverts to the following list of
+ # ip addresses using unicast. It can be cool to use
+ # the VRRP FSM and features in a networking
+ # environment where multicast is not supported!
+ # IP addresses specified can be IPv4 as well as IPv6.
+ # If min_ttl and/or max_ttl are specified, the TTL/hop limit
+ # of any received packet is checked against the specified
+ # TTL range, and is discarded if it is outside the range.
+ # Specifying min_ttl or max_ttl turns on check_unicast_src.
+ 10.104.0.6
+ 10.104.0.3
+ }
+
+ unicast_fault_no_peer # It is not possible to operate in unicast mode without any peers.
+ # Until v2.2.4 keepalived would silently operate in multicast mode
+ # if no peers were specified but a unicast keyword had been specified.
+                                     # Using this keyword stops defaulting to multicast if no peers are
+ # specified and puts the VRRP instance into fault state.
+
+ virtual_ipaddress {
+ 134.209.111.138/24 brd + dev eth1 label eth1:0
+ }
+
+ track_script { # Check that haproxy is up
+ chk_patroni
+ }
+ }
+ ```
+
+ === "HAProxy3"
+
+ ```ini
+ global_defs {
+ process_names
+
+ enable_script_security # Check that the script can only be edited by root
+
+ script_user root # Systemctl does only work with root
+
+ vrrp_version 3 # Using the latest protocol version allows for dynamic_interfaces
+
+        # vrrp_min_garp true         # After switching to MASTER state 5 gratuitous arp (garp) packets are sent and
+                                     # after 5 seconds another 5 garp packets are sent (for the switches to update the arp table).
+                                     # This option disables the second batch of 5 garp packets (not necessary with modern switches)
+ }
+
+ vrrp_script chk_patroni {
+ script "/usr/local/bin/chk_primary.sh"
+
+ # script "/usr/bin/killall -0 haproxy" # Sending the zero signal returns OK (0) if the process or process group ID exists,
+                                             # otherwise, it returns ERR (-1) and sets errno to ESRCH.
+                                             # Note that the kill(2) man page states that error checking is still performed,
+                                             # meaning it will return an error (-1) and set errno to EPERM if:
+                                             # - The target process doesn't exist
+ # - The target process exists but the sending process does not have enough permissions to send it a signal
+
+ # script "/usr/bin/systemctl is-active --quiet haproxy" # The more intelligent way of checking the haproxy process
+ # Simpler way of checking haproxy process:
+ # script "/usr/bin/killall -0 haproxy"
+
+        fall 2                       # 2 failures are required to consider the check failed
+
+ rise 2 # 2 OKs required to consider the process up after failure
+
+ interval 1 # check every X seconds
+
+        weight -10                   # reduce the priority by 10 if the check fails
+ }
+
+ vrrp_instance CLUSTER_1 {
+ state BACKUP # Initial state, MASTER|BACKUP
+ # MASTER on haproxy1, BACKUP on haproxy2, BACKUP on haproxy3, etc
+ # NOTE that if the priority is 255, then the instance will transition immediately
+ # to MASTER if state MASTER is specified; otherwise the instance will
+ # wait between 3 and 4 advert intervals before it can transition,
+ # depending on the priority
+
+ interface eth1 # interface for inside_network, bound by vrrp.
+ # Note: if using unicasting, the interface can be omitted as long
+ # as the unicast addresses are not IPv6 link local addresses (this is
+ # necessary, for example, if using asymmetric routing).
+ # If the interface is omitted, then all VIPs and eVIPs should specify
+ # the interface they are to be configured on, otherwise they will be
+ # added to the default interface.
+
+ virtual_router_id 99 # Needs to be the same value in all nodes of the same cluster
+ # HOWEVER, each cluster needs to have an UNIQUE ID
+
+ priority 95 # The higher the priority the higher the chance to be promoted to MASTER
+ advert_int 1 # Specify the VRRP Advert interval in seconds
+
+ # authentication { # Not compliant with the VRRP RFC, but useful with unicast
+ # auth_type PASS
+ # auth_pass passw123
+ # }
+
+ unicast_src_ip 10.104.0.3 # The default IP for binding vrrpd is the primary IP
+ # on the defined interface. If you want to hide the location of vrrpd,
+ # use this IP as src_addr for multicast or unicast vrrp packets.
+
+ unicast_peer { # Do not send VRRP adverts over a VRRP multicast group.
+ # Instead it sends adverts to the following list of
+ # ip addresses using unicast. This makes it possible to use
+ # the VRRP FSM and features in a networking
+ # environment where multicast is not supported.
+ # IP addresses specified can be IPv4 as well as IPv6.
+ # If min_ttl and/or max_ttl are specified, the TTL/hop limit
+ # of any received packet is checked against the specified
+ # TTL range, and is discarded if it is outside the range.
+ # Specifying min_ttl or max_ttl turns on check_unicast_src.
+ 10.104.0.6
+ 10.104.0.5
+ }
+
+ unicast_fault_no_peer # It is not possible to operate in unicast mode without any peers.
+ # Until v2.2.4 keepalived would silently operate in multicast mode
+ # if no peers were specified but a unicast keyword had been specified.
+ # Using this keyword stops defaulting to multicast if no peers are
+ # specified and puts the VRRP instance into fault state.
+
+ virtual_ipaddress {
+ 134.209.111.138/24 brd + dev eth1 label eth1:0
+ }
+
+ track_script { # Check that haproxy is up
+ chk_patroni
+ }
+ }
+ ```
+
+5. Start `keepalived`:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl start keepalived
+ ```
+
+6. Check the `keepalived` status:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl status keepalived
+ ```
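+
+ Optionally, check which node currently holds the virtual IP address. This is a minimal check that assumes the interface (`eth1`) and the virtual IP address used in the example configuration above; adjust them to your environment:
+
+ ```{.bash data-prompt="$"}
+ $ ip addr show eth1 | grep 134.209.111.138
+ ```
+
+ The node in the MASTER state shows the virtual IP address assigned to the interface. The BACKUP nodes show no output.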
+
+Congratulations! You have successfully configured your HAProxy solution. Now you can proceed to testing it.
+
+## Next steps
+
+[Test Patroni PostgreSQL cluster](ha-test.md){.md-button}
\ No newline at end of file
diff --git a/docs/solutions/ha-init-setup.md b/docs/solutions/ha-init-setup.md
new file mode 100644
index 000000000..b2761e271
--- /dev/null
+++ b/docs/solutions/ha-init-setup.md
@@ -0,0 +1,79 @@
+# Initial setup for high availability
+
+This guide provides instructions on how to set up a highly available PostgreSQL cluster with Patroni. It follows the recommended [architecture](ha-architecture.md) for high availability.
+
+## Considerations
+
+1. This is an example deployment where etcd runs on the same host machines as Patroni and PostgreSQL, while HAProxy runs on dedicated hosts. Alternatively, etcd can run on a different set of nodes.
+
+ If etcd is deployed on the same host machine as Patroni and PostgreSQL, a separate disk system for etcd and PostgreSQL is recommended for performance reasons.
+
+2. For this setup, we will use the nodes that have the following IP addresses:
+
+
+ | Node name | Public IP address | Internal IP address
+ |---------------|-------------------|--------------------
+ | node1 | 157.230.42.174 | 10.104.0.7
+ | node2 | 68.183.177.183 | 10.104.0.2
+ | node3 | 165.22.62.167 | 10.104.0.8
+ | HAProxy1 | 112.209.126.159 | 10.104.0.6
+ | HAProxy2 | 134.209.111.138 | 10.104.0.5
+ | HAProxy3 | 134.60.204.27 | 10.104.0.3
+ | backup | 97.78.129.11 | 10.104.0.9
+
+
+!!! important
+
+ We recommend not exposing the hosts where Patroni, etcd, and PostgreSQL are running to public networks due to security risks. Use firewalls, virtual networks, subnets, or the like to protect the database hosts from attacks.
+
+## Configure name resolution
+
+It’s not necessary to have name resolution, but it makes the whole setup more readable and less error prone. Here, instead of configuring a DNS, we use a local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other’s names and allow their seamless communication.
+
+Run the following commands on each node.
+
+1. Set the hostname for each node. Change the node name to `node1`, `node2`, `node3`, `HAProxy1`, `HAProxy2`, `HAProxy3` and `backup`, respectively:
+
+ ```{.bash data-prompt="$"}
+ $ sudo hostnamectl set-hostname node1
+ ```
+
+2. Modify the `/etc/hosts` file of each node to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes:
+
+ ```text
+ # Cluster IP and names
+
+ 10.104.0.7 node1
+ 10.104.0.2 node2
+ 10.104.0.8 node3
+ 10.104.0.6 HAProxy1
+ 10.104.0.5 HAProxy2
+ 10.104.0.3 HAProxy3
+ 10.104.0.9 backup
+ ```
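+
+3. Optionally, verify that name resolution works. For example, check that `node2` resolves to its internal IP address (the command below assumes the sample IP addresses used in this guide):
+
+ ```{.bash data-prompt="$"}
+ $ getent hosts node2
+ ```
+
+ The output should show `10.104.0.2 node2`.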
+
+## Configure Percona repository
+
+To install the software from Percona, you need to subscribe to Percona repositories. To do this, you need `percona-release`, the Percona repository management tool.
+
+Run the following commands on each node as the root user or with `sudo` privileges.
+
+1. Install `percona-release`
+
+ === ":material-debian: On Debian and Ubuntu"
+
+ --8<-- "percona-release-apt.md"
+
+ === ":material-redhat: On RHEL and derivatives"
+
+ --8<-- "percona-release-yum.md"
+
+2. Enable the repository:
+
+ ```{.bash data-prompt="$"}
+ $ sudo percona-release setup ppg{{pgversion}}
+ ```
+
+## Next steps
+
+[Install Percona Distribution for PostgreSQL](ha-install-postgres.md){.md-button}
\ No newline at end of file
diff --git a/docs/solutions/ha-measure.md b/docs/solutions/ha-measure.md
new file mode 100644
index 000000000..59c4129c5
--- /dev/null
+++ b/docs/solutions/ha-measure.md
@@ -0,0 +1,22 @@
+# Measuring high availability
+
+The need for high availability is determined by the business requirements, potential risks, and operational limitations (e.g. the more components you add to your infrastructure, the more complex and time-consuming it is to maintain).
+
+The level of high availability depends on the following:
+
+* how much downtime you can bear without negatively impacting your users, and
+* how much data loss you can tolerate during a system outage.
+
+Availability is measured by establishing a measurement time frame and dividing the time the system was actually available by that time frame. This ratio will rarely equal one, which corresponds to 100% availability. At Percona, we don’t consider a solution to be highly available if it is not at least 99%, or "two nines", available.
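+
+For a quick sanity check of the table below, you can compute the allowed downtime for a given availability target yourself. The following is a minimal example that uses a year of 365.25 days, consistent with the values in the table:
+
+```{.bash data-prompt="$"}
+$ awk 'BEGIN { sla=99.99; minutes=365.25*24*60; printf "Allowed downtime per year: %.2f minutes\n", (1 - sla/100) * minutes }'
+```
+
+For a 99.99% target this prints about 52.60 minutes per year, matching the “four nines” row.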
+
+The following table shows the amount of downtime for each level of availability from two to five nines.
+
+| Availability % | Downtime per year | Downtime per month | Downtime per week | Downtime per day |
+|--------------------------|-------------------|--------------------|-------------------|-------------------|
+| 99% (“two nines”) | 3.65 days | 7.31 hours | 1.68 hours | 14.40 minutes |
+| 99.5% (“two nines five”) | 1.83 days | 3.65 hours | 50.40 minutes | 7.20 minutes |
+| 99.9% (“three nines”) | 8.77 hours | 43.83 minutes | 10.08 minutes | 1.44 minutes |
+| 99.95% (“three nines five”) | 4.38 hours | 21.92 minutes | 5.04 minutes | 43.20 seconds |
+| 99.99% (“four nines”) | 52.60 minutes | 4.38 minutes | 1.01 minutes | 8.64 seconds |
+| 99.995% (“four nines five”) | 26.30 minutes | 2.19 minutes | 30.24 seconds | 4.32 seconds |
+| 99.999% (“five nines”) | 5.26 minutes | 26.30 seconds | 6.05 seconds | 864.00 milliseconds |
diff --git a/docs/solutions/ha-patroni.md b/docs/solutions/ha-patroni.md
new file mode 100644
index 000000000..2413c88c3
--- /dev/null
+++ b/docs/solutions/ha-patroni.md
@@ -0,0 +1,352 @@
+# Patroni setup
+
+## Install Percona Distribution for PostgreSQL and Patroni
+
+Run the following commands as root or with `sudo` privileges on `node1`, `node2` and `node3`.
+
+=== "On Debian / Ubuntu"
+
+ 1. Disable the upstream `postgresql-{{pgversion}}` package.
+
+ 2. Install Percona Distribution for PostgreSQL package
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install percona-postgresql-{{pgversion}}
+ ```
+
+ 3. Install some Python and auxiliary packages to help with Patroni
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install python3-pip python3-dev binutils
+ ```
+
+ 4. Install Patroni
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install percona-patroni
+ ```
+
+ 5. Stop and disable all installed services:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl stop {patroni,postgresql}
+ $ sudo systemctl disable {patroni,postgresql}
+ ```
+
+ 6. Even though Patroni can use an existing Postgres installation, our recommendation for a **new cluster that has no data** is to remove the data directory. This forces Patroni to initialize a new Postgres cluster instance.
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl stop postgresql
+ $ sudo rm -rf /var/lib/postgresql/{{pgversion}}/main
+ ```
+
+=== "On RHEL and derivatives"
+
+ 1. Install Percona Distribution for PostgreSQL package
+
+ ```{.bash data-prompt="$"}
+ $ sudo yum install percona-postgresql{{pgversion}}-server
+ ```
+
+ 2. Check the [platform specific notes for Patroni](../yum.md#for-percona-distribution-for-postgresql-packages)
+
+ 3. Install some Python and auxiliary packages to help with Patroni and etcd
+
+ ```{.bash data-prompt="$"}
+ $ sudo yum install python3-pip python3-devel binutils
+ ```
+
+ 4. Install Patroni
+
+ ```{.bash data-prompt="$"}
+ $ sudo yum install percona-patroni
+ ```
+
+ 5. Stop and disable all installed services:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl stop {patroni,postgresql-{{pgversion}}}
+ $ sudo systemctl disable {patroni,postgresql-{{pgversion}}}
+ ```
+
+ !!! important
+
+ **Don't** initialize the cluster or start the `postgresql` service. The cluster initialization and setup are handled by Patroni during the bootstrapping stage.
+
+## Configure Patroni
+
+Run the following commands on all nodes. You can do this in parallel:
+
+### Create environment variables
+
+Environment variables simplify the config file creation:
+
+1. Node name:
+
+ ```{.bash data-prompt="$"}
+ $ export NODE_NAME=`hostname -f`
+ ```
+
+2. Node IP:
+
+ ```{.bash data-prompt="$"}
+ $ export NODE_IP=`getent hosts $(hostname -f) | awk '{ print $1 }' | grep -v '127.0.1.1'`
+ ```
+
+ * Check that the correct IP address is defined:
+
+ ```{.bash data-prompt="$"}
+ $ echo $NODE_IP
+ ```
+
+ ??? admonition "Sample output `node1`"
+
+ ```{text .no-copy}
+ 10.104.0.7
+ ```
+
+ If you have multiple IP addresses defined on your server and the environment variable contains the wrong one, you can manually redefine it. For example, run the following command for `node1`:
+
+ ```{.bash data-prompt="$"}
+ $ NODE_IP=10.104.0.7
+ ```
+
+3. Create variables to store the paths to the PostgreSQL data directory and binaries. Check the actual paths to the `data` and `bin` folders on your operating system and adjust the variables accordingly:
+
+ === ":material-debian: Debian and Ubuntu"
+
+ ```bash
+ DATA_DIR="/var/lib/postgresql/{{pgversion}}/main"
+ PG_BIN_DIR="/usr/lib/postgresql/{{pgversion}}/bin"
+ ```
+
+ === ":material-redhat: RHEL and derivatives"
+
+ ```bash
+ DATA_DIR="/var/lib/pgsql/data/"
+ PG_BIN_DIR="/usr/pgsql-{{pgversion}}/bin"
+ ```
+
+4. Define the Patroni namespace and the cluster scope (name):
+
+ ```bash
+ NAMESPACE="percona_lab"
+ SCOPE="cluster_1"
+ ```
+
+### Create the directories required by Patroni
+
+Create the directory to store the configuration file and make it owned by the `postgres` user.
+
+```{.bash data-prompt="$"}
+$ sudo mkdir -p /etc/patroni/
+$ sudo chown -R postgres:postgres /etc/patroni/
+```
+
+### Patroni configuration file
+
+Run the following command on every node to create the `/etc/patroni/patroni.yml` configuration file with the following contents:
+
+```bash
+echo "
+namespace: ${NAMESPACE}
+scope: ${SCOPE}
+name: ${NODE_NAME}
+
+restapi:
+ listen: 0.0.0.0:8008
+ connect_address: ${NODE_IP}:8008
+
+etcd3:
+ host: ${NODE_IP}:2379
+
+bootstrap:
+ # this section will be written into Etcd:///config after initializing new cluster
+ dcs:
+ ttl: 30
+ loop_wait: 10
+ retry_timeout: 10
+ maximum_lag_on_failover: 1048576
+
+ postgresql:
+ use_pg_rewind: true
+ use_slots: true
+ parameters:
+ wal_level: replica
+ hot_standby: "on"
+ wal_keep_size: 160MB
+ max_wal_senders: 5
+ max_replication_slots: 10
+ wal_log_hints: "on"
+ logging_collector: 'on'
+ max_wal_size: '10GB'
+ archive_mode: "on"
+ archive_timeout: 600s
+ archive_command: "cp -f %p /home/postgres/archived/%f"
+
+ # some desired options for 'initdb'
+ initdb: # Note: It needs to be a list (some options need values, others are switches)
+ - encoding: UTF8
+ - data-checksums
+
+ pg_hba: # Add following lines to pg_hba.conf after running 'initdb'
+ - host replication replicator 127.0.0.1/32 trust
+ - host replication replicator 0.0.0.0/0 md5
+ - host all all 0.0.0.0/0 md5
+ - host all all ::0/0 md5
+
+ # Some additional users that need to be created after initializing the new cluster
+ users:
+ admin:
+ password: qaz123
+ options:
+ - createrole
+ - createdb
+ percona:
+ password: qaz123
+ options:
+ - createrole
+ - createdb
+
+postgresql:
+ cluster_name: cluster_1
+ listen: 0.0.0.0:5432
+ connect_address: ${NODE_IP}:5432
+ data_dir: ${DATA_DIR}
+ bin_dir: ${PG_BIN_DIR}
+ pgpass: /tmp/pgpass0
+ authentication:
+ replication:
+ username: replicator
+ password: replPasswd
+ superuser:
+ username: postgres
+ password: qaz123
+ parameters:
+ unix_socket_directories: "/var/run/postgresql/"
+ create_replica_methods:
+ - basebackup
+ basebackup:
+ checkpoint: 'fast'
+
+tags:
+ nofailover: false
+ noloadbalance: false
+ clonefrom: false
+ nosync: false
+" | sudo tee /etc/patroni/patroni.yml
+```
+
+??? admonition "Patroni configuration file"
+
+ Let’s take a moment to understand the contents of the `patroni.yml` file.
+
+ The first section provides the details of the node and its connection ports. After that, we have the `etcd` service and its port details.
+
+ Following these, the `bootstrap` section contains the PostgreSQL configuration and the settings that Patroni applies only once, when it initializes the cluster for the first time.
+
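+Optionally, you can validate the configuration file before starting the service. Recent Patroni versions ship a built-in validator; the following is a minimal check, assuming the `patroni` binary is in your `PATH`:
+
+```{.bash data-prompt="$"}
+$ sudo patroni --validate-config /etc/patroni/patroni.yml
+```
+
+A clean run means the file is syntactically valid; it does not guarantee that the values (IP addresses, paths, credentials) match your environment.
+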
+### Systemd configuration
+
+1. Check that the systemd unit file `percona-patroni.service` is created in `/etc/systemd/system`. If it exists, skip this step.
+
+ If it's **not created**, create it manually and specify the following contents within:
+
+ ```ini title="/etc/systemd/system/percona-patroni.service"
+ [Unit]
+ Description=Runners to orchestrate a high-availability PostgreSQL
+ After=syslog.target network.target
+
+ [Service]
+ Type=simple
+
+ User=postgres
+ Group=postgres
+
+ # Start the patroni process
+ ExecStart=/bin/patroni /etc/patroni/patroni.yml
+
+ # Send HUP to reload from patroni.yml
+ ExecReload=/bin/kill -s HUP $MAINPID
+
+ # only kill the patroni process, not its children, so it will gracefully stop postgres
+ KillMode=process
+
+ # Give a reasonable amount of time for the server to start up/shut down
+ TimeoutSec=30
+
+ # Do not restart the service if it crashes, we want to manually inspect database on failure
+ Restart=no
+
+ [Install]
+ WantedBy=multi-user.target
+ ```
+
+2. Make `systemd` aware of the new service:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl daemon-reload
+ ```
+
+3. Make sure you have the configuration file and the `systemd` unit file created on every node.
+
+### Start Patroni
+
+Now it's time to start Patroni. You need to run the following commands on all nodes, but **not in parallel**.
+
+1. Start Patroni on `node1` first, wait for the service to come up, and then proceed with the other nodes one by one, always waiting for them to sync with the primary node:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl enable --now percona-patroni
+ ```
+
+ When Patroni starts, it initializes PostgreSQL (because the service is not currently running and the data directory is empty) following the directives in the bootstrap section of the configuration file.
+
+2. Check the service to see if there are errors:
+
+ ```{.bash data-prompt="$"}
+ $ sudo journalctl -fu percona-patroni
+ ```
+
+ A common error is Patroni complaining about the lack of proper entries in the `pg_hba.conf` file. If you see such errors, you must manually add or fix the entries in that file and then restart the service.
+
+ Changing the `patroni.yml` file and restarting the service will not have any effect here because the bootstrap section specifies the configuration to apply when PostgreSQL is first started in the node. It will not repeat the process even if the Patroni configuration file is modified and the service is restarted.
+
+ If Patroni has started properly, you should be able to locally connect to a PostgreSQL node using the following command:
+
+ ```{.bash data-prompt="$"}
+ $ sudo psql -U postgres
+
+ psql ({{dockertag}})
+ Type "help" for help.
+
+ postgres=#
+ ```
+
+3. When all nodes are up and running, you can check the cluster status using the following command:
+
+ ```{.bash data-prompt="$"}
+ $ sudo patronictl -c /etc/patroni/patroni.yml list
+ ```
+
+ The output resembles the following:
+
+ ??? admonition "Sample output node1"
+
+ ```{.text .no-copy}
+ + Cluster: cluster_1 (7440127629342136675) -----+----+-------+
+ | Member | Host | Role | State | TL | Lag in MB |
+ +--------+------------+---------+-----------+----+-----------+
+ | node1 | 10.0.100.1 | Leader | running | 1 | |
+ +--------+------------+---------+-----------+----+-----------+
+ ```
+
+ ??? admonition "Sample output node3"
+
+ ```{.text .no-copy}
+ + Cluster: cluster_1 (7440127629342136675) -----+----+-------+
+ | Member | Host | Role | State | TL | Lag in MB |
+ +--------+------------+---------+-----------+----+-----------+
+ | node1 | 10.0.100.1 | Leader | running | 1 | |
+ | node2 | 10.0.100.2 | Replica | streaming | 1 | 0 |
+ | node3 | 10.0.100.3 | Replica | streaming | 1 | 0 |
+ +--------+------------+---------+-----------+----+-----------+
+ ```
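+
+ You can also query the Patroni REST API on any node to check its role and state. This is a minimal example; it assumes the REST API listens on port 8008, as configured in `patroni.yml` above, and uses `node1`'s IP address:
+
+ ```{.bash data-prompt="$"}
+ $ curl -s http://10.104.0.7:8008/patroni
+ ```
+
+ The JSON output includes the node's `role` and `state` fields.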
\ No newline at end of file
diff --git a/docs/solutions/ha-setup-apt.md b/docs/solutions/ha-setup-apt.md
index 197027aad..9bfa864f9 100644
--- a/docs/solutions/ha-setup-apt.md
+++ b/docs/solutions/ha-setup-apt.md
@@ -28,10 +28,10 @@ This guide provides instructions on how to set up a highly available PostgreSQL
It’s not necessary to have name resolution, but it makes the whole setup more readable and less error prone. Here, instead of configuring a DNS, we use a local name resolution by updating the file `/etc/hosts`. By resolving their hostnames to their IP addresses, we make the nodes aware of each other’s names and allow their seamless communication.
-1. Run the following command on each node. Change the node name to `node1`, `node2` and `node3` respectively:
+1. Run the following command on each node. Change the node name to `node1`, `node2`, `node3` and `HAProxy-demo` respectively:
```{.bash data-prompt="$"}
- $ sudo hostnamectl set-hostname node-1
+ $ sudo hostnamectl set-hostname node1
```
2. Modify the `/etc/hosts` file of each PostgreSQL node to include the hostnames and IP addresses of the remaining nodes. Add the following at the end of the `/etc/hosts` file on all nodes:
@@ -78,11 +78,14 @@ It’s not necessary to have name resolution, but it makes the whole setup more
### Install the software
+
Run the following commands on `node1`, `node2` and `node3`:
1. Install Percona Distribution for PostgreSQL
- * [Install `percona-release` :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/installing.html).
+ * Install `percona-release`.
+
+ --8<-- "percona-release-apt.md"
* Enable the repository:
@@ -111,7 +114,7 @@ Run the following commands on `node1`, `node2` and `node3`:
```{.bash data-prompt="$"}
$ sudo systemctl stop {etcd,patroni,postgresql}
- $ systemctl disable {etcd,patroni,postgresql}
+ $ sudo systemctl disable {etcd,patroni,postgresql}
```
5. Even though Patroni can use an existing Postgres installation, remove the data directory to force it to initialize a new Postgres cluster instance.
@@ -227,7 +230,7 @@ The `etcd` cluster is first started in one node and then the subsequent nodes ar
2. On `node3`, create the configuration file. You can edit the sample configuration file `/etc/etcd/etcd.conf.yaml` or create your own one. Replace the node names and IP addresses with the actual names and IP addresses of your nodes.
```yaml title="/etc/etcd/etcd.conf.yaml"
- name: 'node1'
+ name: 'node3'
initial-cluster-token: PostgreSQL_HA_Cluster_1
initial-cluster-state: existing
initial-cluster: node1=http://10.104.0.1:2380,node2=http://10.104.0.2:2380,node3=http://10.104.0.3:2380
@@ -401,32 +404,32 @@ Run the following commands on all nodes. You can do this in parallel:
```ini title="/etc/systemd/system/patroni.service"
[Unit]
- Description=Runners to orchestrate a high-availability PostgreSQL
- After=syslog.target network.target
+ Description=Runners to orchestrate a high-availability PostgreSQL
+ After=syslog.target network.target
- [Service]
- Type=simple
+ [Service]
+ Type=simple
- User=postgres
- Group=postgres
+ User=postgres
+ Group=postgres
- # Start the patroni process
- ExecStart=/bin/patroni /etc/patroni/patroni.yml
+ # Start the patroni process
+ ExecStart=/bin/patroni /etc/patroni/patroni.yml
- # Send HUP to reload from patroni.yml
- ExecReload=/bin/kill -s HUP $MAINPID
+ # Send HUP to reload from patroni.yml
+ ExecReload=/bin/kill -s HUP $MAINPID
- # only kill the patroni process, not its children, so it will gracefully stop postgres
- KillMode=process
+ # only kill the patroni process, not its children, so it will gracefully stop postgres
+ KillMode=process
- # Give a reasonable amount of time for the server to start up/shut down
- TimeoutSec=30
+ # Give a reasonable amount of time for the server to start up/shut down
+ TimeoutSec=30
- # Do not restart the service if it crashes, we want to manually inspect database on failure
- Restart=no
+ # Do not restart the service if it crashes, we want to manually inspect database on failure
+ Restart=no
- [Install]
- WantedBy=multi-user.target
+ [Install]
+ WantedBy=multi-user.target
```
4. Make systemd aware of the new service:
diff --git a/docs/solutions/haproxy-info.md b/docs/solutions/haproxy-info.md
new file mode 100644
index 000000000..8b890e36a
--- /dev/null
+++ b/docs/solutions/haproxy-info.md
@@ -0,0 +1,51 @@
+# HAProxy
+
+HAProxy (High Availability Proxy) is a powerful, open-source load balancer and
+proxy server used to improve the performance and reliability of web services by
+distributing network traffic across multiple servers. It is widely used to enhance the scalability, availability, and reliability of web applications by balancing client requests among backend servers.
+
+The HAProxy architecture is optimized to move data as fast as possible with as
+few operations as possible. It focuses on CPU cache efficiency by keeping
+connections on the same CPU for as long as possible.
+
+## How HAProxy works
+
+HAProxy operates as a reverse proxy, which means it accepts client requests and distributes them to one or more backend servers using the configured load-balancing algorithm. This ensures efficient use of server resources and prevents any single server from becoming overloaded.
+
+- **Client request processing**:
+
+ 1. A client application connects to HAProxy instead of directly to the server.
+ 2. HAProxy analyzes the request and determines which server to route it to for further processing.
+ 3. HAProxy forwards the request to the server selected by the load-balancing algorithm defined in its configuration, such as round robin or least connections.
+ 4. HAProxy receives the response from the server and forwards it back to the client.
+ 5. After sending the response, HAProxy either closes the connection or keeps it open, depending on the configuration.
+
+- **Load balancing**: HAProxy distributes incoming traffic using various algorithms such as round-robin, least connections, and IP hash.
+- **Health checks**: HAProxy continuously monitors the health of backend servers to ensure requests are only routed to healthy servers.
+- **SSL termination**: HAProxy offloads SSL/TLS encryption and decryption, reducing the workload on backend servers.
+- **Session persistence**: HAProxy ensures that requests from the same client are routed to the same server for session consistency.
+- **Traffic management**: HAProxy supports rate limiting, request queuing, and connection pooling for optimal resource utilization.
+- **Security**: HAProxy supports SSL/TLS, IP filtering, and integration with Web Application Firewalls (WAF).
+
+## Role in an HA Patroni cluster
+
+HAProxy plays a crucial role in managing PostgreSQL high availability in a Patroni cluster. Patroni is an open-source tool that automates PostgreSQL cluster management, including failover and replication. HAProxy acts as a load balancer and proxy, distributing client connections across the cluster nodes.
+
+Client applications connect to HAProxy, which transparently forwards their requests to the appropriate PostgreSQL node. This ensures that clients always connect to the active primary node without needing to know the cluster's internal state and topology.
+
+HAProxy monitors the health of PostgreSQL nodes using Patroni's API and routes traffic to the primary node. If the primary node fails, Patroni promotes a secondary node to be the new primary, and HAProxy updates its routing to reflect the change. You can configure HAProxy to route write requests to the primary node and read requests to the secondary nodes.
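+
+The following minimal sketch illustrates this routing approach. It is not the complete configuration used in this solution: the listener name, port, and node names are assumptions for illustration. The health check relies on the Patroni REST API (port 8008), whose `/primary` endpoint returns HTTP 200 only on the current primary node:
+
+```ini
+listen primary
+    bind *:5000
+    mode tcp
+    option httpchk GET /primary
+    http-check expect status 200
+    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+    server node1 node1:5432 maxconn 100 check port 8008
+    server node2 node2:5432 maxconn 100 check port 8008
+    server node3 node3:5432 maxconn 100 check port 8008
+```
+
+A similar listener can use the `/replica` endpoint to balance read-only traffic across the standby nodes.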
+
+## Redundancy for HAProxy
+
+Using a single HAProxy node in your deployment creates a single point of failure: if HAProxy goes down, clients lose connection to the cluster. To eliminate this risk, add redundancy by setting up multiple HAProxy instances with a failover mechanism. This ensures that if one HAProxy instance fails, another takes over, maintaining high availability.
+
+To make this happen, you need the following:
+
+1. Configure a virtual IP address that can be moved between HAProxy instances.
+
+2. Install and configure Keepalived, a failover mechanism for HAProxy. Keepalived monitors the health of HAProxy instances.
+If the primary HAProxy instance fails, Keepalived moves the virtual IP address to a backup instance.
+
+3. Ensure that HAProxy configurations are synchronized across all instances to maintain consistency.
+
diff --git a/docs/solutions/high-availability.md b/docs/solutions/high-availability.md
index f79e3a1b5..6b4d4846c 100644
--- a/docs/solutions/high-availability.md
+++ b/docs/solutions/high-availability.md
@@ -1,95 +1,97 @@
# High Availability in PostgreSQL with Patroni
-PostgreSQL has been widely adopted as a modern, high-performance transactional database. A highly available PostgreSQL cluster can withstand failures caused by network outages, resource saturation, hardware failures, operating system crashes or unexpected reboots. Such cluster is often a critical component of the enterprise application landscape, where [four nines of availability :octicons-link-external-16:](https://en.wikipedia.org/wiki/High_availability#Percentage_calculation) is a minimum requirement.
+Whether you are a small startup or a big enterprise, downtime of your services may cause severe consequences, such as loss of customers, impact on your reputation, and penalties for not meeting the Service Level Agreements (SLAs). That’s why ensuring a highly-available deployment is crucial.
-There are several methods to achieve high availability in PostgreSQL. This solution document provides [Patroni](#patroni) - the open-source extension to facilitate and manage the deployment of high availability in PostgreSQL.
+But what does high availability mean? And how do you achieve it? This document answers these questions.
-??? admonition "High availability methods"
+After reading this document, you will learn the following:
- There are several native methods for achieving high availability with PostgreSQL:
+* [what is high availability](#what-is-high-availability)
+* the recommended [reference architecture](ha-architecture.md) to achieve it
+* how to deploy it using our step-by-step deployment guides for each component. The deployment instructions focus on the minimalistic approach to high availability that we recommend. They also explain how to deploy additional components that you can add as your infrastructure grows.
+* how to verify that your high availability deployment works as expected, providing replication and failover with the [testing guidelines](ha-test.md)
- - shared disk failover,
- - file system replication,
- - trigger-based replication,
- - statement-based replication,
- - logical replication,
- - Write-Ahead Log (WAL) shipping, and
- - [streaming replication](#streaming-replication)
+## What is high availability
+High availability is the ability of a system to operate continuously without interruption of services. During an outage, the system must be able to transfer services from the failed database node to one of the remaining nodes.
- ## Streaming replication
+### How to achieve it?
- Streaming replication is part of Write-Ahead Log shipping, where changes to the WALs are immediately made available to standby replicas. With this approach, a standby instance is always up-to-date with changes from the primary node and can assume the role of primary in case of a failover.
+The short answer is: add redundancy to your deployment, eliminate single points of failure (SPOF), and have a mechanism to transfer services from a failed member to a healthy one.
+For the long answer, let's break it down into steps.
- ### Why native streaming replication is not enough
+#### Step 1. Replication
- Although the native streaming replication in PostgreSQL supports failing over to the primary node, it lacks some key features expected from a truly highly-available solution. These include:
+First, you should have more than one copy of your data. This means you need several instances of your database, where one is the primary instance that accepts reads and writes. The other instances are replicas: they must have an up-to-date copy of the data from the primary and remain in sync with it. They may also accept reads to offload the primary.
+You typically deploy these instances on separate servers or nodes. The minimum number of database nodes is two: one primary and one replica.
- * No consensus-based promotion of a “leader” node during a failover
- * No decent capability for monitoring cluster status
- * No automated way to bring back the failed primary node to the cluster
- * A manual or scheduled switchover is not easy to manage
+The recommended deployment is a three-instance cluster consisting of one primary and two replica nodes. The replicas receive the data via the replication mechanism.
- To address these shortcomings, there are a multitude of third-party, open-source extensions for PostgreSQL. The challenge for a database administrator here is to select the right utility for the current scenario.
+
- Percona Distribution for PostgreSQL solves this challenge by providing the [Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/) extension for achieving PostgreSQL high availability.
+PostgreSQL natively supports logical and streaming replication. For high availability, we recommend streaming replication because it happens in near real time, minimizing the delay between the primary and replica nodes.
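+
+For example, once streaming replication is set up, you can see the connected replicas and their state by querying the `pg_stat_replication` view on the primary:
+
+```{.bash data-prompt="$"}
+$ sudo -u postgres psql -c "SELECT client_addr, state, sync_state FROM pg_stat_replication;"
+```
+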
-## Patroni
+#### Step 2. Failover
-[Patroni :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/) is a template for you to create your own customized, high-availability solution using Python and - for maximum accessibility - a distributed configuration store like ZooKeeper, etcd, Consul or Kubernetes.
+Next, you may have a situation when the primary node is down or not responding. The reasons vary: hardware or network issues, software failures, power outages, or scheduled maintenance. In this case, you must have a way to know about it and to transfer operations from the primary node to one of the secondaries. This process is called failover.
-### Key benefits of Patroni:
+
-* Continuous monitoring and automatic failover
-* Manual/scheduled switchover with a single command
-* Built-in automation for bringing back a failed node to cluster again.
-* REST APIs for entire cluster configuration and further tooling.
-* Provides infrastructure for transparent application failover
-* Distributed consensus for every action and configuration.
-* Integration with Linux watchdog for avoiding split-brain syndrome.
+You can do a manual failover. It suits environments where downtime does not immediately impact operations or revenue. However, it requires dedicated personnel and may lead to additional downtime.
+Another option is automated failover, which significantly minimizes downtime and is less error-prone than a manual one. Automated failover can be accomplished by adding an open-source failover tool to your deployment.
-!!! admonition "See also"
+#### Step 3. Load balancer
- - [Patroni documentation :octicons-link-external-16:](https://patroni.readthedocs.io/en/latest/SETTINGS.html#settings)
+Instead of a single node, you now have a cluster. How do you enable users to connect to the cluster and ensure they always reach the correct node, especially when the primary node changes?
- - Percona Blog:
+One option is to configure DNS resolution for the cluster nodes. A drawback here is that the primary node still handles all requests. As your system grows, so does the load, which may overload the primary node and degrade performance.
- - [PostgreSQL HA with Patroni: Your Turn to Test Failure Scenarios :octicons-link-external-16:](https://www.percona.com/blog/2021/06/11/postgresql-ha-with-patroni-your-turn-to-test-failure-scenarios/)
+You can write your application to send read/write requests to the primary and read-only requests to the secondary nodes. This requires significant programming experience.
-## Architecture layout
+
-The following diagram shows the architecture of a three-node PostgreSQL cluster with a single-leader node.
+Another option is to use a load-balancing proxy. Instead of connecting directly to the IP address of the primary node, which can change during a failover, you use a proxy that acts as a single point of entry for the entire cluster. This proxy provides the IP address visible to user applications. It also knows which node is currently the primary and directs all incoming write requests to it. At the same time, it can distribute read requests among the replicas to evenly spread the load and improve performance.
-
+To eliminate the load balancer as a single point of failure, deploy at least two instances of it for redundancy. The instances share a public IP address that can "float" from one instance to another in case of a failure. To track the load balancers' state and transfer the IP address to the active instance, you also need a failover solution for the load balancers.
-### Components
+The use of a load balancer is optional if your application implements this logic itself, but it is highly recommended.
-The components in this architecture are:
+#### Step 4. Backups
-- PostgreSQL nodes
-- Patroni - a template for configuring a highly available PostgreSQL cluster.
+Even with replication and failover mechanisms in place, it’s crucial to have regular backups of your data. Backups provide a safety net for catastrophic failures that affect both the primary and replica nodes. While replication ensures data is synchronized across multiple nodes, it does not protect against data corruption, accidental deletions, or malicious attacks that can affect all nodes.
-- etcd - a Distributed Configuration store that stores the state of the PostgreSQL cluster.
+
-- HAProxy - the load balancer for the cluster and is the single point of entry to client applications.
+Having regular backups ensures that you can restore your data to a previous state, preserving data integrity and availability even in the worst-case scenarios. Store your backups in separate, secure locations and regularly test them to ensure that you can quickly and accurately restore them when needed. This additional layer of protection is essential to maintaining continuous operation and minimizing data loss.
-- pgBackRest - the backup and restore solution for PostgreSQL
+A backup tool is optional but highly recommended for recovering from data corruption.
-- Percona Monitoring and Management (PMM) - the solution to monitor the health of your cluster
+As a result, you end up with the following components for a minimalistic highly-available deployment:
-### How components work together
+* A PostgreSQL cluster of at least two nodes with replication configured between them. The recommended minimalistic cluster has three nodes.
+* A solution to manage the cluster and perform automatic failover when the primary node is down.
+* (Optional but recommended) A load-balancing proxy that provides a single point of entry to your cluster and distributes the load across cluster nodes. You need at least two instances of a load-balancing proxy and a failover tool to eliminate a single point of failure.
+* (Optional but recommended) A backup and restore solution to protect data against loss and corruption.
-Each PostgreSQL instance in the cluster maintains consistency with other members through streaming replication. Each instance hosts Patroni - a cluster manager that monitors the cluster health. Patroni relies on the operational etcd cluster to store the cluster configuration and sensitive data about the cluster health there.
+Optionally, you can add a monitoring tool to observe the health of your deployment, receive alerts about performance issues, and react to them in a timely manner.
-Patroni periodically sends heartbeat requests with the cluster status to etcd. etcd writes this information to disk and sends the response back to Patroni. If the current primary fails to renew its status as leader within the specified timeout, Patroni updates the state change in etcd, which uses this information to elect the new primary and keep the cluster up and running.
+### What tools to use?
-The connections to the cluster do not happen directly to the database nodes but are routed via a connection proxy like HAProxy. This proxy determines the active node by querying the Patroni REST API.
+The PostgreSQL ecosystem offers many tools for high availability, but choosing the right ones can be challenging. At Percona, we have carefully selected and tested open-source tools to ensure they work well together and help you achieve high availability.
+
+In our [reference architecture](ha-architecture.md) section we recommend a combination of open-source tools, focusing on a minimalistic three-node PostgreSQL cluster.
+
+Note that the tools are recommended but not mandatory. You can use your own solutions and alternatives if they better meet your business needs. However, in this case, we cannot guarantee their compatibility and smooth operation.
+
+### Additional reading
+
+[Measuring high availability](ha-measure.md){.md-button}
## Next steps
-[Deploy on Debian or Ubuntu](ha-setup-apt.md){.md-button}
-[Deploy on RHEL or derivatives](ha-setup-yum.md){.md-button}
+[Architecture](ha-architecture.md){.md-button}
+
diff --git a/docs/solutions/patroni-info.md b/docs/solutions/patroni-info.md
new file mode 100644
index 000000000..c22ce21ab
--- /dev/null
+++ b/docs/solutions/patroni-info.md
@@ -0,0 +1,74 @@
+# Patroni
+
+Patroni is an open-source tool designed to manage and automate the high availability (HA) of PostgreSQL clusters. It ensures that your PostgreSQL database remains available even in the event of hardware failures, network issues or other disruptions. Patroni achieves this by using distributed consensus stores like ETCD, Consul, or ZooKeeper to manage cluster state and automate failover processes. We'll use [`etcd`](etcd-info.md) in our architecture.
+
+## Key benefits of Patroni for high availability
+
+- Automated failover and promotion of a new primary in case of a failure.
+- Prevention of split-brain scenarios (where two nodes both believe they are the primary).
+- Simplified management of PostgreSQL clusters across multiple data centers.
+- Self-healing via automatic restarts of failed PostgreSQL instances or reinitialization of broken replicas.
+- Integration with tools like `pgBackRest`, `HAProxy`, and monitoring systems for a complete HA solution.
+
+## How Patroni works
+
+Patroni uses the `etcd` distributed consensus store to coordinate the state of a PostgreSQL cluster for the following operations:
+
+1. Cluster state management:
+
+ - After a user installs and configures Patroni, Patroni takes over the PostgreSQL service administration and configuration;
+ - Patroni maintains the cluster state data such as PostgreSQL configuration, information about which node is the primary and which are replicas, and their health status.
+ - Patroni manages PostgreSQL configuration files such as `postgresql.conf` and `pg_hba.conf` dynamically, ensuring consistency across the cluster.
+ - A Patroni agent runs on each cluster node and communicates with `etcd` and other nodes.
+
+2. Primary node election:
+
+ - Patroni initiates a primary election process after the cluster is initialized;
+ - Patroni initiates a failover process if the primary node fails;
+ - When the old primary is recovered, it rejoins the cluster as a new replica;
+ - Every new node added to the cluster joins it as a new replica;
+ - `etcd` ensures that only one node is elected as the new primary, preventing split-brain scenarios.
+
+3. Automatic failover:
+
+ - If the primary node becomes unavailable, Patroni initiates a new primary election process with the most up-to-date replicas;
+ - When a node is elected it is automatically promoted to primary;
+ - Patroni updates the `etcd` consensus store and reconfigures the remaining replicas to follow the new primary.
+
+4. Health checks:
+
+ - Patroni continuously monitors the health of all PostgreSQL instances;
+ - If a node fails or becomes unreachable, Patroni takes corrective actions by restarting PostgreSQL or initiating a failover process.
+
+## Split-brain prevention
+
+Split-brain is an issue that occurs when two or more nodes believe they are the primary, leading to data inconsistencies. Patroni prevents split-brain by using an `etcd` distributed locking mechanism. The primary node holds a leader lock in `etcd`. If the lock is lost (for example, due to network partitioning), the node demotes itself to a replica.
+
+One important aspect of how Patroni works is that it requires a quorum (the majority) of nodes to agree on the cluster state, preventing isolated nodes from becoming a primary. The quorum strengthens Patroni's capabilities of preventing split-brain.
+
+## Watchdog
+
+Patroni can use a watchdog mechanism to improve resilience. But what is a watchdog?
+
+A watchdog is a mechanism that ensures a system can recover from critical failures. In the context of Patroni, a watchdog is used to forcibly restart the node and terminate a failed primary node to prevent split-brain scenarios.
+
+While Patroni itself is designed for high availability, a watchdog provides an extra layer of protection against system-level failures that Patroni might not be able to detect, such as kernel panics or hardware lockups. If the entire operating system becomes unresponsive, Patroni might not be able to function correctly. The watchdog operates independently so it can detect that the server is unresponsive and reset it, bringing it back to a known good state.
+
+A watchdog adds an extra layer of safety because it helps protect against scenarios where the `etcd` consensus store is unavailable or network partitions occur.
+
+There are two types of watchdogs:
+
+ - Hardware watchdog: A physical device that reboots the server if the operating system becomes unresponsive.
+ - Software watchdog: A software-based mechanism that monitors the system and takes corrective actions (e.g., killing processes or rebooting the node).
+
+Most cloud servers nowadays use a software watchdog.
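+
+If you decide to enable it, the watchdog is configured in the `watchdog` section of `patroni.yml`. The following is a minimal sketch, assuming the standard Linux software watchdog device `/dev/watchdog`; the values are illustrative, not the configuration used later in this solution:
+
+```yaml
+watchdog:
+  mode: automatic    # use the watchdog if available, but don't fail if it isn't
+  device: /dev/watchdog
+  safety_margin: 5   # seconds kept in reserve so the watchdog can reset the node before the leader lock expires
+```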
+
+## Integration with other tools
+
+Patroni integrates well with other tools to create a comprehensive high-availability solution. In our architecture, such tools are:
+
+* HAProxy to load-balance traffic, directing it to the primary and replica nodes,
+* pgBackRest to provide robust backup and restore,
+* PMM for monitoring.
+
+Patroni provides hooks that allow you to customize its behavior. You can use hooks to execute custom scripts or commands at various stages of the Patroni lifecycle, such as before and after a failover, or when a new instance joins the cluster. This way you can integrate Patroni with other systems and automate various tasks. For example, use a hook to update the monitoring system when a failover occurs.
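+
+A minimal sketch of such a hook, configured as a Patroni callback in `patroni.yml`; the script path and name are assumptions for illustration:
+
+```yaml
+postgresql:
+  callbacks:
+    # Patroni invokes the script with the action (on_role_change), the new role, and the cluster name as arguments
+    on_role_change: /usr/local/bin/notify_monitoring.sh
+```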
\ No newline at end of file
diff --git a/docs/solutions/pgbackrest-info.md b/docs/solutions/pgbackrest-info.md
new file mode 100644
index 000000000..34f675061
--- /dev/null
+++ b/docs/solutions/pgbackrest-info.md
@@ -0,0 +1,35 @@
+# PgBackRest
+
+`pgBackRest` is an advanced backup and restore tool designed specifically for PostgreSQL databases. `pgBackRest` emphasizes simplicity, speed, and scalability. Its architecture is focused on minimizing the time and resources required for both backup and restoration processes.
+
+`pgBackRest` uses a custom protocol, which allows for more flexibility compared to traditional tools like `tar` and `rsync` and limits the types of connections that are required to perform a backup, thereby increasing security. `pgBackRest` is a simple, but feature-rich, reliable backup and restore system that can seamlessly scale up to the largest databases and workloads.
+
+## Key features of `pgBackRest`
+
+1. **Full, differential, and incremental backups (at file or block level)**: `pgBackRest` supports various types of backups, including full, differential, and incremental, providing efficient storage and recovery options. Block-level backups save space by only copying the parts of files that have changed.
+
+2. **Point-in-Time recovery (PITR)**: `pgBackRest` enables restoring a PostgreSQL database to a specific point in time, crucial for disaster recovery scenarios.
+
+3. **Parallel backup and restore**: `pgBackRest` can perform backups and restores in parallel, utilizing multiple CPU cores to significantly reduce the time required for these operations.
+
+4. **Local or remote operation**: A custom protocol allows `pgBackRest` to backup, restore, and archive locally or remotely via TLS/SSH with minimal configuration. This allows for flexible deployment options.
+
+5. **Backup rotation and archive expiration**: You can set retention policies to manage backup rotation and WAL archive expiration automatically.
+
+6. **Backup integrity and verification**: `pgBackRest` performs integrity checks on backup files, ensuring they are consistent and reliable for recovery.
+
+7. **Backup resume**: `pgBackRest` can resume an interrupted backup from the point where it was stopped. Files that were already copied are compared with the checksums in the manifest to ensure integrity. This operation can take place entirely on the repository host, therefore, it reduces load on the PostgreSQL host and saves time since checksum calculation is faster than compressing and retransmitting data.
+
+8. **Delta restore**: This feature allows pgBackRest to quickly apply incremental changes to an existing database, reducing restoration time.
+
+9. **Compression and encryption**: `pgBackRest` offers options for compressing and encrypting backup data, enhancing security and reducing storage requirements.
+
+## How `pgBackRest` works
+
+To make backups and restores, you need a backup server and the `pgBackRest` agents running on the database nodes. The backup server has the information about a PostgreSQL cluster: where it is located, how to back it up, and where to store backup files. This information is defined within a configuration section called a *stanza*.
+
+The storage location where `pgBackRest` stores backup data and WAL archives is called the repository. It can be a local directory, a remote server, or a cloud storage service like AWS S3, S3-compatible storage, or Azure Blob Storage. `pgBackRest` supports up to 4 repositories, allowing for redundancy and flexibility in backup storage.
+
+When you create a stanza, it initializes the repository and prepares it for storing backups. During the backup process, `pgBackRest` reads the data from the PostgreSQL cluster and writes it to the repository. It also performs integrity checks and compresses the data if configured.
+
+Similarly, during the restore process, `pgBackRest` reads the backup data from the repository and writes it to the PostgreSQL data directory. It also verifies the integrity of the restored data.
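+
+For illustration, a typical stanza-based workflow on the backup server looks like the sketch below. The stanza name `cluster_1` matches the one used in this solution, but the commands here are a simplified example rather than the full setup procedure:
+
+```{.bash data-prompt="$"}
+$ sudo -u postgres pgbackrest --stanza=cluster_1 stanza-create
+$ sudo -u postgres pgbackrest --stanza=cluster_1 check
+$ sudo -u postgres pgbackrest --stanza=cluster_1 --type=full backup
+$ sudo -u postgres pgbackrest info
+```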
\ No newline at end of file
diff --git a/docs/solutions/pgbackrest.md b/docs/solutions/pgbackrest.md
index 7697f4e19..6d9a50e88 100644
--- a/docs/solutions/pgbackrest.md
+++ b/docs/solutions/pgbackrest.md
@@ -1,38 +1,34 @@
# pgBackRest setup
-[pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) is a backup tool used to perform PostgreSQL database backup, archiving, restoration, and point-in-time recovery. While it can be used for local backups, this procedure shows how to deploy a [pgBackRest server running on a dedicated host :octicons-link-external-16:](https://pgbackrest.org/user-guide-rhel.html#repo-host) and how to configure PostgreSQL servers to use it for backups and archiving.
+[pgBackRest :octicons-link-external-16:](https://pgbackrest.org/) is a backup tool used to perform PostgreSQL database backup, archiving, restoration, and point-in-time recovery.
+
+In our solution, we deploy a [pgBackRest server on a dedicated host :octicons-link-external-16:](https://pgbackrest.org/user-guide-rhel.html#repo-host) and also deploy pgBackRest on the PostgreSQL servers. Then we configure the PostgreSQL servers to use it for backups and archiving.
You also need a backup storage to store the backups. It can either be a remote storage such as AWS S3, S3-compatible storages or Azure blob storage, or a filesystem-based one.
-## Configure backup server
+## Preparation
-To make things easier when working with some templates, run the commands below as the root user. Run the following command to switch to the root user:
-
-```{.bash data-prompt="$"}
-$ sudo su -
-```
+Make sure to complete the [initial setup](ha-init-setup.md) steps.
+
+## Install pgBackRest
-### Install pgBackRest
+Install `pgBackRest` on the following nodes: `node1`, `node2`, `node3`, and `backup`.
-1. Enable the repository with [percona-release :octicons-link-external-16:](https://www.percona.com/doc/percona-repo-config/index.html)
+=== ":material-debian: On Debian/Ubuntu"
```{.bash data-prompt="$"}
- $ percona-release setup ppg-{{pgversion}}
+ $ sudo apt install percona-pgbackrest
```
-2. Install pgBackRest package
+=== ":material-redhat: On RHEL/derivatives"
- === ":material-debian: On Debian/Ubuntu"
-
- ```{.bash data-prompt="$"}
- $ apt install percona-pgbackrest
- ```
+ ```{.bash data-prompt="$"}
+ $ sudo yum install percona-pgbackrest
+ ```
- === ":material-redhat: On RHEL/derivatives"
+## Configure a backup server
- ```{.bash data-prompt="$"}
- $ yum install percona-pgbackrest
- ```
+Do the following steps on the `backup` node.
### Create the configuration file
@@ -40,9 +36,9 @@ $ sudo su -
```{.bash data-prompt="$"}
export SRV_NAME="bkp-srv"
- export NODE1_NAME="node-1"
- export NODE2_NAME="node-2"
- export NODE3_NAME="node-3"
+ export NODE1_NAME="node1"
+ export NODE2_NAME="node2"
+ export NODE3_NAME="node3"
export CA_PATH="/etc/ssl/certs/pg_ha"
```
@@ -53,9 +49,9 @@ $ sudo su -
This directory is usually created during pgBackRest's installation process. If it's not there already, create it as follows:
```{.bash data-prompt="$"}
- $ mkdir -p /var/lib/pgbackrest
- $ chmod 750 /var/lib/pgbackrest
- $ chown postgres:postgres /var/lib/pgbackrest
+ $ sudo mkdir -p /var/lib/pgbackrest
+ $ sudo chmod 750 /var/lib/pgbackrest
+ $ sudo chown postgres:postgres /var/lib/pgbackrest
```
3. The default `pgBackRest` configuration file location is `/etc/pgbackrest/pgbackrest.conf`, but some systems continue to use the old path, `/etc/pgbackrest.conf`, which remains a valid alternative. If the former is not present in your system, create the latter.
@@ -63,15 +59,15 @@ $ sudo su -
Access the file's parent directory (either `cd /etc/` or `cd /etc/pgbackrest/`), and make a backup copy of it:
```{.bash data-prompt="$"}
- $ cp pgbackrest.conf pgbackrest.conf.bak
+ $ sudo cp pgbackrest.conf pgbackrest.conf.orig
```
- Then use the following command to create a basic configuration file using the environment variables we created in a previous step:
+4. Then use the following command to create a basic configuration file using the environment variables we created in a previous step. This example command adds the configuration file at the path `/etc/pgbackrest.conf`. Make sure to specify the correct path for the configuration file on your system:
=== ":material-debian: On Debian/Ubuntu"
```
- cat < pgbackrest.conf
+ echo "
[global]
# Server repo details
@@ -146,13 +142,14 @@ $ sudo su -
pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg3-host-ca-file=${CA_PATH}/ca.crt
pg3-socket-path=/var/run/postgresql
- EOF
+
+ " | sudo tee /etc/pgbackrest.conf
```
=== ":material-redhat: On RHEL/derivatives"
```
- cat < pgbackrest.conf
+ echo "
[global]
# Server repo details
@@ -201,7 +198,7 @@ $ sudo su -
pg1-host=${NODE1_NAME}
pg1-host-port=8432
pg1-port=5432
- pg1-path=/var/lib/pgsql/{{pgversion}}/data
+ pg1-path=/var/lib/postgresql/{{pgversion}}/main
pg1-host-type=tls
pg1-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg1-host-key-file=${CA_PATH}/${SRV_NAME}.key
@@ -211,7 +208,7 @@ $ sudo su -
pg2-host=${NODE2_NAME}
pg2-host-port=8432
pg2-port=5432
- pg2-path=/var/lib/pgsql/{{pgversion}}/data
+ pg2-path=/var/lib/postgresql/{{pgversion}}/main
pg2-host-type=tls
pg2-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg2-host-key-file=${CA_PATH}/${SRV_NAME}.key
@@ -221,55 +218,69 @@ $ sudo su -
pg3-host=${NODE3_NAME}
pg3-host-port=8432
pg3-port=5432
- pg3-path=/var/lib/pgsql/{{pgversion}}/data
+ pg3-path=/var/lib/postgresql/{{pgversion}}/main
pg3-host-type=tls
pg3-host-cert-file=${CA_PATH}/${SRV_NAME}.crt
pg3-host-key-file=${CA_PATH}/${SRV_NAME}.key
pg3-host-ca-file=${CA_PATH}/ca.crt
pg3-socket-path=/var/run/postgresql
- EOF
+
+ " | sudo tee /etc/pgbackrest.conf
```
*NOTE*: The option `backup-standby=y` above indicates the backups should be taken from a standby server. If you are operating with a primary only, or if your secondaries are not configured with `pgBackRest`, set this option to `n`.
### Create the certificate files
+
+Run the following commands as the root user or with `sudo` privileges.
1. Create the folder to store the certificates:
```{.bash data-prompt="$"}
- $ mkdir -p ${CA_PATH}
+ $ sudo mkdir -p /etc/ssl/certs/pg_ha
+ ```
+
+2. Create the environment variable to simplify further configuration
+
+ ```{.bash data-prompt="$"}
+ $ export CA_PATH="/etc/ssl/certs/pg_ha"
```
-
-2. Create the certificates and keys
+
+3. Create the CA certificate and key
+
+ ```{.bash data-prompt="$"}
+ $ sudo openssl req -new -x509 -days 365 -nodes -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key -subj "/CN=root-ca"
+ ```
+
+4. Create the certificate and key for the backup server
```{.bash data-prompt="$"}
- $ openssl req -new -x509 -days 365 -nodes -out ${CA_PATH}/ca.crt -keyout ${CA_PATH}/ca.key -subj "/CN=root-ca"
+ $ sudo openssl req -new -nodes -out ${CA_PATH}/${SRV_NAME}.csr -keyout ${CA_PATH}/${SRV_NAME}.key -subj "/CN=${SRV_NAME}"
```
-3. Create the certificate for the backup and the PostgreSQL servers
+5. Create the certificates and keys for each PostgreSQL node
```{.bash data-prompt="$"}
- $ for node in ${SRV_NAME} ${NODE1_NAME} ${NODE2_NAME} ${NODE3_NAME}
- do
- openssl req -new -nodes -out ${CA_PATH}/$node.csr -keyout ${CA_PATH}/$node.key -subj "/CN=$node";
- done
+ $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE1_NAME}.csr -keyout ${CA_PATH}/${NODE1_NAME}.key -subj "/CN=${NODE1_NAME}"
+ $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE2_NAME}.csr -keyout ${CA_PATH}/${NODE2_NAME}.key -subj "/CN=${NODE2_NAME}"
+ $ sudo openssl req -new -nodes -out ${CA_PATH}/${NODE3_NAME}.csr -keyout ${CA_PATH}/${NODE3_NAME}.key -subj "/CN=${NODE3_NAME}"
```
-4. Sign the certificates with the `root-ca` key
+6. Sign all certificates with the `root-ca` key
```{.bash data-prompt="$"}
- $ for node in ${SRV_NAME} ${NODE1_NAME} ${NODE2_NAME} ${NODE3_NAME}
- do
- openssl x509 -req -in ${CA_PATH}/$node.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/$node.crt;
- done
+ $ sudo openssl x509 -req -in ${CA_PATH}/${SRV_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${SRV_NAME}.crt
+ $ sudo openssl x509 -req -in ${CA_PATH}/${NODE1_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE1_NAME}.crt
+ $ sudo openssl x509 -req -in ${CA_PATH}/${NODE2_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE2_NAME}.crt
+ $ sudo openssl x509 -req -in ${CA_PATH}/${NODE3_NAME}.csr -days 365 -CA ${CA_PATH}/ca.crt -CAkey ${CA_PATH}/ca.key -CAcreateserial -out ${CA_PATH}/${NODE3_NAME}.crt
```
7. Remove temporary files, set ownership of the remaining files to the `postgres` user, and restrict their access:
```{.bash data-prompt="$"}
- $ rm -f ${CA_PATH}/*.csr
- $ chown postgres:postgres -R ${CA_PATH}
- $ chmod 0600 ${CA_PATH}/*
+ $ sudo rm -f ${CA_PATH}/*.csr
+ $ sudo chown postgres:postgres -R ${CA_PATH}
+ $ sudo chmod 0600 ${CA_PATH}/*
```
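Optionally, verify that each signed certificate validates against the generated CA before copying the files to the nodes. The following command checks the certificate of `node1`; repeat it for the other nodes and for the backup server:

```{.bash data-prompt="$"}
$ sudo openssl verify -CAfile ${CA_PATH}/ca.crt ${CA_PATH}/${NODE1_NAME}.crt
```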
### Create the `pgbackrest` daemon service
@@ -295,30 +306,35 @@ $ sudo su -
WantedBy=multi-user.target
```
-2. Reload, start, and enable the service
+2. Make `systemd` aware of the new service:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl daemon-reload
+ ```
+3. Enable and start the `pgbackrest` service:
+
```{.bash data-prompt="$"}
- $ systemctl daemon-reload
- $ systemctl start pgbackrest.service
- $ systemctl enable pgbackrest.service
+ $ sudo systemctl enable --now pgbackrest.service
```
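You can verify that the service is up and running before you proceed:

```{.bash data-prompt="$"}
$ sudo systemctl status pgbackrest.service
```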
## Configure database servers
Run the following commands on `node1`, `node2`, and `node3`.
-1. Install pgBackRest package
+1. Install the `pgBackRest` package:
=== ":material-debian: On Debian/Ubuntu"
```{.bash data-prompt="$"}
- $ apt install percona-pgbackrest
+ $ sudo apt install percona-pgbackrest
```
=== ":material-redhat: On RHEL/derivatives"
```{.bash data-prompt="$"}
- $ yum install percona-pgbackrest
+ $ sudo yum install percona-pgbackrest
+ ```
2. Export environment variables to simplify the config file creation:
@@ -331,23 +347,29 @@ Run the following commands on `node1`, `node2`, and `node3`.
3. Create the certificates folder:
```{.bash data-prompt="$"}
- $ mkdir -p ${CA_PATH}
+ $ sudo mkdir -p ${CA_PATH}
```
4. Copy the `.crt`, `.key` certificate files and the `ca.crt` file from the backup server where they were created to every respective node. Then change the ownership to the `postgres` user and restrict their access. Use the following commands to achieve this:
```{.bash data-prompt="$"}
- $ scp ${SRV_NAME}:${CA_PATH}/{$NODE_NAME.crt,$NODE_NAME.key,ca.crt} ${CA_PATH}/
- $ chown postgres:postgres -R ${CA_PATH}
- $ chmod 0600 ${CA_PATH}/*
+ $ sudo scp ${SRV_NAME}:${CA_PATH}/{$NODE_NAME.crt,$NODE_NAME.key,ca.crt} ${CA_PATH}/
+ $ sudo chown postgres:postgres -R ${CA_PATH}
+ $ sudo chmod 0600 ${CA_PATH}/*
```
-5. Edit or create the configuration file which, as explained above, can be either at the `/etc/pgbackrest/pgbackrest.conf` or `/etc/pgbackrest.conf` path:
+5. Back up the existing configuration file. Depending on your system, its path is either `/etc/pgbackrest/pgbackrest.conf` or `/etc/pgbackrest.conf`. Adjust the following command accordingly:
+
+ ```{.bash data-prompt="$"}
+ $ sudo cp /etc/pgbackrest.conf /etc/pgbackrest.conf.orig
+ ```
+
+6. Create the new configuration file. The following example writes it to `/etc/pgbackrest.conf`. Make sure to specify the path that is correct for your system:
=== ":material-debian: On Debian/Ubuntu"
```ini title="pgbackrest.conf"
- cat < pgbackrest.conf
+ echo "
[global]
repo1-host=${SRV_NAME}
repo1-host-user=postgres
@@ -370,14 +392,14 @@ Run the following commands on `node1`, `node2`, and `node3`.
[cluster_1]
pg1-path=/var/lib/postgresql/{{pgversion}}/main
- EOF
+ " | sudo tee /etc/pgbackrest.conf
```
=== ":material-redhat: On RHEL/derivatives"
```ini title="pgbackrest.conf"
- cat < pgbackrest.conf
+ echo "
[global]
repo1-host=${SRV_NAME}
repo1-host-user=postgres
@@ -400,10 +422,10 @@ Run the following commands on `node1`, `node2`, and `node3`.
[cluster_1]
pg1-path=/var/lib/pgsql/{{pgversion}}/data
- EOF
+ " | sudo tee /etc/pgbackrest.conf
```
-6. Create the pgbackrest `systemd` unit file at the path `/etc/systemd/system/pgbackrest.service`
+7. Create the `pgbackrest` `systemd` unit file at the path `/etc/systemd/system/pgbackrest.service`:
```ini title="/etc/systemd/system/pgbackrest.service"
[Unit]
@@ -424,64 +446,73 @@ Run the following commands on `node1`, `node2`, and `node3`.
WantedBy=multi-user.target
```
-7. Reload, start, and enable the service
+8. Reload `systemd`, then enable and start the service:
```{.bash data-prompt="$"}
- $ systemctl daemon-reload
- $ systemctl start pgbackrest
- $ systemctl enable pgbackrest
+ $ sudo systemctl daemon-reload
+ $ sudo systemctl enable --now pgbackrest
```
The pgBackRest daemon listens on port `8432` by default:
```{.bash data-prompt="$"}
- $ netstat -taunp
- Active Internet connections (servers and established)
- Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
- tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd
- tcp 0 0 0.0.0.0:8432 0.0.0.0:* LISTEN 40224/pgbackrest
+ $ netstat -taunp | grep '8432'
```
-8. If you are using Patroni, change its configuration to use `pgBackRest` for archiving and restoring WAL files. Run this command only on one node, for example, on `node1`:
+ ??? admonition "Sample output"
+
+ ```{text .no-copy}
+ Active Internet connections (servers and established)
+ Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
+ tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd
+ tcp 0 0 0.0.0.0:8432 0.0.0.0:* LISTEN 40224/pgbackrest
+ ```
+
+9. If you are using Patroni, change its configuration to use `pgBackRest` for archiving and restoring WAL files. Run this command only on one node, for example, on `node1`:
```{.bash data-prompt="$"}
$ patronictl -c /etc/patroni/patroni.yml edit-config
```
-
- === ":material-debian: On Debian/Ubuntu"
- ```yaml title="/etc/patroni/patroni.yml"
- postgresql:
- (...)
- parameters:
- (...)
- archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/{{pgversion}}/main/pg_wal/%f
- (...)
- recovery_conf:
- (...)
- restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p
- (...)
- ```
-
- === ":material-redhat: On RHEL/derivatives"
+ This command opens the cluster configuration in an editor.
+
+10. Change the configuration as follows:
+
+ ```yaml title="/etc/patroni/patroni.yml"
+ postgresql:
+ parameters:
+ archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/{{pgversion}}/main/pg_wal/%f
+ archive_mode: true
+ archive_timeout: 600s
+ hot_standby: true
+ logging_collector: 'on'
+ max_replication_slots: 10
+ max_wal_senders: 5
+ max_wal_size: 10GB
+ wal_keep_segments: 10
+ wal_level: logical
+ wal_log_hints: true
+ recovery_conf:
+ recovery_target_timeline: latest
+ restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f "%p"
+ use_pg_rewind: true
+ use_slots: true
+ retry_timeout: 10
+ slots:
+ percona_cluster_1:
+ type: physical
+ ttl: 30
+ ```
- ```yaml title="/etc/patroni/patroni.yml"
- postgresql:
- (...)
- parameters:
- archive_command: pgbackrest --stanza=cluster_1 archive-push /var/lib/pgsql/{{pgversion}}/data/pg_wal/%f
- (...)
- recovery_conf:
- restore_command: pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f %p
- (...)
- ```
- Reload the changed configurations:
+11. Reload the changed configuration. Provide the cluster name or the node name to the following command. In this example, the cluster name is `cluster_1`:
```{.bash data-prompt="$"}
- $ patronictl -c /etc/patroni/postgresql.yml reload
+ $ patronictl -c /etc/patroni/patroni.yml reload cluster_1
```
+ It may take a while to reload the new configuration.
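+
+ To verify that the cluster members are healthy and run with the new configuration, list them. The configuration file path is the same one used above:
+
+ ```{.bash data-prompt="$"}
+ $ patronictl -c /etc/patroni/patroni.yml list
+ ```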
+
:material-information: Note: When configuring a PostgreSQL server that is not managed by Patroni to archive/restore WALs from the `pgBackRest` server, edit the server's main configuration file directly and adjust the `archive_command` and `restore_command` variables as shown above.
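For reference, a minimal sketch of the corresponding `postgresql.conf` entries for such a standalone server may look like this (the paths follow the Debian/Ubuntu layout used in this example):

```ini title="postgresql.conf"
archive_mode = on
archive_command = 'pgbackrest --stanza=cluster_1 archive-push /var/lib/postgresql/{{pgversion}}/main/pg_wal/%f'
# restore_command is used only when the server is running in recovery
restore_command = 'pgbackrest --config=/etc/pgbackrest.conf --stanza=cluster_1 archive-get %f "%p"'
```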
## Create backups
diff --git a/docs/solutions/postgis.md b/docs/solutions/postgis.md
index 00c0b59f7..ebba7f775 100644
--- a/docs/solutions/postgis.md
+++ b/docs/solutions/postgis.md
@@ -1,7 +1,5 @@
# Spatial data manipulation
-!!! admonition "Version added: 15.3"
-
Organizations dealing with spatial data need to store it somewhere and manipulate it. PostGIS is the open source extension for PostgreSQL that allows doing just that. It adds support for storing the spatial data types such as:
* Geographical data like points, lines, polygons, GPS coordinates that can be mapped on a sphere.
diff --git a/docs/yum.md b/docs/yum.md
index 4cd57fcfe..fc3bd429a 100644
--- a/docs/yum.md
+++ b/docs/yum.md
@@ -347,6 +347,12 @@ $ sudo yum -y install curl
$ sudo yum install percona-pgpool-II-pg{{pgversion}}
```
+ Install the `pgvector` package suite:
+
+ ```{.bash data-prompt="$"}
+ $ sudo yum install percona-pgvector_{{pgversion}} percona-pgvector_{{pgversion}}-debuginfo percona-pgvector_{{pgversion}}-debugsource percona-pgvector_{{pgversion}}-llvmjit
+ ```
+
Some extensions require additional setup in order to use them with Percona Distribution for PostgreSQL. For more information, refer to [Enabling extensions](enable-extensions.md).
### Start the service
diff --git a/ha-haproxy.md b/ha-haproxy.md
new file mode 100644
index 000000000..743494f6e
--- /dev/null
+++ b/ha-haproxy.md
@@ -0,0 +1,67 @@
+# Configure HAProxy
+
+HAProxy is the load balancer and the single point of entry to your PostgreSQL cluster for client applications. A client application connects to the HAProxy URL and sends its read/write requests there. Behind the scenes, HAProxy routes write requests to the primary node and read requests to the secondaries in a round-robin fashion so that no secondary instance is unnecessarily loaded. To make this happen, you define different ports in the HAProxy configuration file. In this deployment, writes are routed to port 5000 and reads to port 5001.
+
+This way, a client application doesn’t need to know which node in the underlying cluster is the current primary. HAProxy sends connections to a healthy node (as long as there is at least one healthy node available) and ensures that client application requests are never rejected.
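+
+For example, a client can open a read/write session through port 5000 and a read-only session through port 5001. The following commands are an illustration only; the `postgres` user is an assumption, and `HAProxy-demo` is the node where HAProxy runs in this deployment:
+
+```{.bash data-prompt="$"}
+$ psql -h HAProxy-demo -p 5000 -U postgres -c 'SELECT pg_is_in_recovery();'  # routed to the primary, returns "f"
+$ psql -h HAProxy-demo -p 5001 -U postgres -c 'SELECT pg_is_in_recovery();'  # routed to a replica, returns "t"
+```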
+
+1. Install HAProxy on the `HAProxy-demo` node:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install percona-haproxy
+ ```
+
+2. Specify the following configuration in the HAProxy configuration file `/etc/haproxy/haproxy.cfg`:
+
+ ```
+ global
+ maxconn 100
+
+ defaults
+ log global
+ mode tcp
+ retries 2
+ timeout client 30m
+ timeout connect 4s
+ timeout server 30m
+ timeout check 5s
+
+ listen stats
+ mode http
+ bind *:7000
+ stats enable
+ stats uri /
+
+ listen primary
+ bind *:5000
+ option httpchk /primary
+ http-check expect status 200
+ default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+ server node1 node1:5432 maxconn 100 check port 8008
+ server node2 node2:5432 maxconn 100 check port 8008
+ server node3 node3:5432 maxconn 100 check port 8008
+
+ listen standbys
+ balance roundrobin
+ bind *:5001
+ option httpchk /replica
+ http-check expect status 200
+ default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
+ server node1 node1:5432 maxconn 100 check port 8008
+ server node2 node2:5432 maxconn 100 check port 8008
+ server node3 node3:5432 maxconn 100 check port 8008
+ ```
+
+
+ HAProxy will use the REST APIs hosted by Patroni to check the health status of each PostgreSQL node and route the requests appropriately.
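+
+ You can reproduce this health check manually. For example, the `/primary` endpoint of the Patroni REST API on `node1` returns HTTP `200` only when that node is the current primary (the node name and port `8008` follow the configuration above):
+
+ ```{.bash data-prompt="$"}
+ $ curl -s -o /dev/null -w '%{http_code}\n' http://node1:8008/primary
+ ```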
+
+3. Restart HAProxy:
+
+ ```{.bash data-prompt="$"}
+ $ sudo systemctl restart haproxy
+ ```
+
+4. Check the HAProxy logs to see if there are any errors:
+
+ ```{.bash data-prompt="$"}
+ $ sudo journalctl -u haproxy.service -n 100 -f
+ ```
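+
+5. Optionally, check that HAProxy reports the backends as healthy on its statistics page, which the configuration above serves on port `7000`. Open `http://<HAProxy-node>:7000` in a browser, or query it from the HAProxy node itself:
+
+ ```{.bash data-prompt="$"}
+ $ curl -s http://localhost:7000
+ ```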
\ No newline at end of file
diff --git a/mkdocs-base.yml b/mkdocs-base.yml
index f0ea54a63..8a8a3537f 100644
--- a/mkdocs-base.yml
+++ b/mkdocs-base.yml
@@ -53,7 +53,8 @@ theme:
- content.tabs.link
- content.action.edit
- content.action.view
- - content.code.copy
+ - content.code.copy
+ - content.tooltips
extra_css:
- https://unicons.iconscout.com/release/v3.0.3/css/line.css
@@ -173,10 +174,21 @@ nav:
- Solutions:
- Overview: solutions.md
- High availability:
- - 'High availability': 'solutions/high-availability.md'
- - 'Deploying on Debian or Ubuntu': 'solutions/ha-setup-apt.md'
- - 'Deploying on RHEL or derivatives': 'solutions/ha-setup-yum.md'
- - solutions/pgbackrest.md
+ - 'Overview': 'solutions/high-availability.md'
+ - solutions/ha-measure.md
+ - 'Architecture': solutions/ha-architecture.md
+ - Components:
+ - 'ETCD': 'solutions/etcd-info.md'
+ - 'Patroni': 'solutions/patroni-info.md'
+ - 'HAProxy': 'solutions/haproxy-info.md'
+ - 'pgBackRest': 'solutions/pgbackrest-info.md'
+ - solutions/ha-components.md
+ - Deployment:
+ - 'Initial setup': 'solutions/ha-init-setup.md'
+ - 'etcd setup': 'solutions/ha-etcd-config.md'
+ - 'Patroni setup': 'solutions/ha-patroni.md'
+ - solutions/pgbackrest.md
+ - 'HAProxy setup': 'solutions/ha-haproxy.md'
- solutions/ha-test.md
- Backup and disaster recovery:
- 'Overview': 'solutions/backup-recovery.md'
diff --git a/snippets/check-etcd.md b/snippets/check-etcd.md
new file mode 100644
index 000000000..1bd516fd2
--- /dev/null
+++ b/snippets/check-etcd.md
@@ -0,0 +1,47 @@
+3. Check the etcd cluster members using `etcdctl`. Ensure that `etcdctl` interacts with etcd using API version 3 and knows which nodes, or endpoints, to communicate with. Define this information as environment variables. Run the following commands on one of the nodes:
+
+ ```
+ export ETCDCTL_API=3
+ HOST_1=10.104.0.1
+ HOST_2=10.104.0.2
+ HOST_3=10.104.0.3
+ ENDPOINTS=$HOST_1:2379,$HOST_2:2379,$HOST_3:2379
+ ```
+
+4. List the cluster members and output the result as a table:
+
+ ```{.bash data-prompt="$"}
+ $ sudo etcdctl --endpoints=$ENDPOINTS -w table member list
+ ```
+
+ ??? example "Sample output"
+
+ ```
+ +------------------+---------+-------+------------------------+----------------------------+------------+
+ | ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+ +------------------+---------+-------+------------------------+----------------------------+------------+
+ | 4788684035f976d3 | started | node2 | http://10.104.0.2:2380 | http://192.168.56.102:2379 | false |
+ | 67684e355c833ffa | started | node3 | http://10.104.0.3:2380 | http://192.168.56.103:2379 | false |
+ | 9d2e318af9306c67 | started | node1 | http://10.104.0.1:2380 | http://192.168.56.101:2379 | false |
+ +------------------+---------+-------+------------------------+----------------------------+------------+
+ ```
+
+5. To check which node is currently the leader, use the following command:
+
+ ```{.bash data-prompt="$"}
+ $ sudo etcdctl --endpoints=$ENDPOINTS -w table endpoint status
+ ```
+
+ ??? example "Sample output"
+
+ ```{.text .no-copy}
+ +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+ | ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+ +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+ | 10.104.0.1:2379 | 9d2e318af9306c67 | 3.5.16 | 20 kB | true | false | 2 | 10 | 10 | |
+ | 10.104.0.2:2379 | 4788684035f976d3 | 3.5.16 | 20 kB | false | false | 2 | 10 | 10 | |
+ | 10.104.0.3:2379 | 67684e355c833ffa | 3.5.16 | 20 kB | false | false | 2 | 10 | 10 | |
+ +-----------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
+ ```
+ ```
+
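+ You can also check the overall health of each cluster member:
+
+ ```{.bash data-prompt="$"}
+ $ sudo etcdctl --endpoints=$ENDPOINTS endpoint health
+ ```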
+
\ No newline at end of file
diff --git a/snippets/percona-release-apt.md b/snippets/percona-release-apt.md
new file mode 100644
index 000000000..c3a80d194
--- /dev/null
+++ b/snippets/percona-release-apt.md
@@ -0,0 +1,24 @@
+1. Install the `curl` download utility if it's not installed already:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt update
+ $ sudo apt install curl
+ ```
+
+2. Download the `percona-release` repository package:
+
+ ```{.bash data-prompt="$"}
+ $ curl -O https://repo.percona.com/apt/percona-release_latest.generic_all.deb
+ ```
+
+3. Install the downloaded repository package and its dependencies using `apt`:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt install gnupg2 lsb-release ./percona-release_latest.generic_all.deb
+ ```
+
+4. Refresh the local cache to update the package information:
+
+ ```{.bash data-prompt="$"}
+ $ sudo apt update
+ ```
\ No newline at end of file
diff --git a/snippets/percona-release-yum.md b/snippets/percona-release-yum.md
new file mode 100644
index 000000000..05d669385
--- /dev/null
+++ b/snippets/percona-release-yum.md
@@ -0,0 +1,5 @@
+Run the following command as the `root` user or with `sudo` privileges:
+
+```{.bash data-prompt="$"}
+$ sudo yum install -y https://repo.percona.com/yum/percona-release-latest.noarch.rpm
+```
\ No newline at end of file
diff --git a/variables.yml b/variables.yml
index 9342bac61..90156ad3e 100644
--- a/variables.yml
+++ b/variables.yml
@@ -1,14 +1,15 @@
# PG Variables set for HTML output
# See also mkdocs.yml plugins.with-pdf.cover_subtitle and output_path
-release: 'release-notes-v17.0'
+release: 'release-notes-v17.2'
dockertag: '17.0'
pgversion: '17'
pgsmversion: '2.1.0'
-pspgversion: '17.0.1'
+pspgversion: '17.2.1'
date:
+ 17_2: 2024-12-26
17_0: 2024-10-03