Five tips on Postgres logical decoding

Five tips on Postgres logical decoding

Logical decoding is a mechanism that enables users to stream changes on Postgres as a sequence of logical operations like INSERTs, UPDATEs, and DELETEs. This is useful for applications that need to keep an external data store synchronized with Postgres. For example, replicating data in Postgres to Data Warehouses for analytics.

In this blog, I'll walk through 5 interesting logical decoding concepts worth knowing about:

  1. Slot Growth - Under the covers, a logical replication slot captures changes in the Postgres Write-Ahead Log (WAL) and streams them in a human-readable format to the client. A common issue posed by logical decoding is unexpected replication slot growth. This could risk filling up storage and causing server crashes. Slot growth mostly happens when the consumer application (a.k.a. client) that reads changes from a replication slot lags or halts. You could monitor slot activity and its growth to detect if the client is lagging or is halted:

    1. Monitoring the state of the slot: The query below indicates if the slot is active or not. If the active column is predominantly showing true, it means that the consumer app is constantly consuming the changes. Else, it means that the app is halted.

       SELECT slot_name,active FROM pg_replication_slots ;
      
    2. Monitoring the size of the slot - If the size of the slot is constantly growing, it could indicate that the consumer app is not catching up with the rate of ingestion on Postgres. Below

       SELECT slot_name, 
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), 
       confirmed_flush_lsn)) 
       AS replication_lag_bytes 
       FROM pg_replication_slots;
      

      There could be other reasons incl. long-running transactions, the client forgetting to flush the slot periodically, etc. causing replication slot growth. More on the causes, preventive measures and mitigation steps for replication slot growth in the next blog!

  2. Large transactions can lead to Slot growth and this can be avoided - By default, a large in-progress transaction cannot be consumed until the transaction finishes. This could potentially increase the size of the slot. To resolve this concern, starting from Postgres 14, the START_REPLICATION command enables clients to consume large in-flight transactions (see proto_version). However, it is the responsibility of the client to reliably handle changes from the in-flight transaction.

  3. PUBLICATION to filter logical decoding changes: When reading changes from a logical replication slot, you can use a PUBLICATION as a filter, allowing you to selectively capture the changes you're interested in. This feature acts as a sieve, enabling filtering based on specific tables, schemas, types of changes (like INSERT, UPDATE, or DELETE), and even column values (introduced in Postgres 15). Crucially, this filtering occurs server-side, reducing the volume of data transmitted over the network.

     /*Create a publication that 
     publishes all changes in two tables*/
     CREATE PUBLICATION mypublication 
     FOR TABLE users, departments;
    
     /*Create a publication that publishes 
     all changes from active departments*/
     CREATE PUBLICATION active_departments FOR TABLE 
     departments WHERE (active IS TRUE);
    
     /*Create a publication that only 
     publishes INSERT operations in one table:*/
     CREATE PUBLICATION insert_only FOR TABLE mydata
         WITH (publish = 'insert');
    
  4. logical_decoding_work_mem - In Postgres 13, a new setting called logical_decoding_work_mem was introduced, controlling how much memory a logical replication connection (wal sender process) can use to decode changes. By default, it's set at 64MB, but if your workload involves large transactions or high concurrency, this might not be sufficient. When that happens, the decoding process might overflow to disk, slowing down logical decoding. Increasing logical_decoding_work_mem, especially if you have available memory, can significantly boost decoding performance. Keep an eye on pg_stat_activity to see if your wal_sender process is struggling with IO wait events, and consider tuning logical_decoding_work_mem. This blog captures a more detailed summary of how this setting works.

  5. Logical decoding via standbys in Postgres 16 - Before Postgres 16, creating logical replication slots on standbys (read-replicas) wasn't possible. However, Postgres 16 brings a significant change—you can now create replication slots on standby! This allows clients to consume changes from the replica rather than the primary server, offering several benefits:

    1. Eases the load on the primary server.

    2. Eliminates the risk of replicating slot growth impacting the primary.

    3. Making logical decoding resilient to failovers.

Closing statements..

At PeerDB, we enable you to easily replicate data from Postgres to Data Warehouses, Queues and Storage. Behind the scenes, we rely on Postgres' logical decoding to make this happen. To ensure that we provide a fast and simple experience to users, we've been delving into the intricacies of logical decoding and trying to keep up with its evolution. This blog captures a few of our learnings. Hope you enjoyed reading the blog!

If you want to give PeerDB a try, these links should prove useful: :)

  1. Quickstart

  2. PeerDB's Github repo

  3. Join PeerDB's Slack community

  4. PeerDB docs