PostgreSQL Master and Slave Configuration Guide: Implementing Streaming Replication

PostgreSQL’s streaming replication is a powerful feature that allows real-time data replication from a primary database server (master) to one or more standby servers (slaves). This guide will take you through the steps required to set up streaming replication, including the configuration of the primary node, creation of a replication user, and the setup of the standby node.

Configuring the Primary Node

The primary node requires specific configuration changes to enable streaming replication. Below are the steps to configure the primary node:

1. Modify postgresql.conf

To enable streaming replication, you’ll need to adjust several parameters in the postgresql.conf file located in your data directory:

wal_level = replica               /* Enables streaming replication */

archive_mode = on                 /* Enables the archive process */

archive_command = 'cp %p /data/PRIMARY_LIVE_16/archive/%f'  /* Archive directory setup */

max_wal_senders = 1               /* Number of parallel WAL senders */

wal_keep_segments = 30            /* Maintain 30 WAL segments in the pg_xlog directory */

Explanation:

wal_level = replica: This setting ensures that sufficient information is written to the WAL (Write-Ahead Log) to support streaming replication.
archive_mode = on: Enables WAL archiving, allowing for the archiving of completed WAL files.
archive_command: Defines the command to archive completed WAL files, storing them in the specified directory.
max_wal_senders: Limits the number of concurrent connections for WAL sending.
wal_keep_segments: Retains a specified number of WAL segments in the pg_xlog directory to ensure smooth replication.

2. Create a Replication User

A dedicated replication user is required to manage replication tasks. You can create this user with the following SQL command:

CREATE ROLE replica_user WITH REPLICATION LOGIN PASSWORD 'replica_user';

This command creates a user with the necessary replication privileges, which will be used for streaming replication.

3. Restart the Primary Server

After making the configuration changes, restart the primary server to apply them:

pg_ctl -D /path/to/data restart

Make sure to replace /path/to/data with the actual path to your data directory.

Setting Up the Standby Node

With the primary node configured, the next step is to set up the standby node to start receiving replicated data.

1. Create a Backup for the Standby Node

Use the pg_basebackup utility to create a base backup from the primary server. This backup will be used to initialize the standby server:

pg_basebackup -p 5432 -D /data/standby/main -U replica_user -W -P -R

Options Explained:

-P: Displays progress information during the backup process.
-W: Prompts for the replication user’s password.
-R: Automatically writes the recovery.conf file required for the standby server to start in recovery mode.

2. Configure Standby Node Settings

After creating the backup, update the configuration on the standby node by editing the recovery.conf file:

standby_mode = 'on'

primary_conninfo = 'host=192.168.206.130 port=5433 user=replica_user password=replica_user'

restore_command = 'cp /data/standby_16/archive/%f %p'

Explanation:

standby_mode: Enables the standby mode on the server, allowing it to act as a replica.
primary_conninfo: Specifies the connection details (host, port, user, password) to the primary server.
restore_command: Defines how the standby server retrieves archived WAL files from the specified directory.

3. Verify the Standby Server

To ensure that the standby server is correctly set up and in recovery mode, run the following query:

SELECT pg_is_in_recovery();

If the result is true, the standby server is successfully in recovery mode and ready to replicate data from the primary server.

4. Additional Backup (Optional)

For redundancy, you can create another backup using the following command:

pg_basebackup -p 5432 -D /data/PRIMARY_LIVE_16/standby_backup -U replica_user -W -P -R

This step is optional but recommended for environments where high availability is critical.