feat(cluster): expose replicationSlots on the CNPG Cluster #877

Open
philippemnoel wants to merge 1 commit into cloudnative-pg:main from paradedb:feat/replication-slots-failover

@philippemnoel (Contributor)

What

  • Exposes CloudNativePG Cluster.spec.replicationSlots on the cluster chart through cluster.replicationSlots.
  • Documents the logical decoding slot failover settings required for CDC tools (e.g. Debezium, Artie).
  • Extends the non-default-configuration chainsaw test to render and assert the new block plus the matching PostgreSQL parameters.

Why

CDC tools (Debezium, Artie, etc.) create and consume a logical replication slot on the current primary. Without slot synchronization, a CNPG switchover or failover can leave that logical slot behind on the old primary, which then becomes a replica. The CDC client can no longer reconnect cleanly, and the abandoned slot can continue to retain WAL.

CloudNativePG can coordinate logical decoding slot synchronization across HA instances via spec.replicationSlots, but the cluster chart did not previously surface that field, so chart users had no way to enable it without bypassing the chart.

Requirements for CDC failover

  • CloudNativePG operator and CRDs 1.27+.
  • PostgreSQL 17+ for native failover slots (PostgreSQL 18 is covered).
  • Set cluster.replicationSlots.highAvailability.synchronizeLogicalDecoding: true.
  • Set cluster.postgresql.parameters.hot_standby_feedback: "on".
  • Set cluster.postgresql.parameters.sync_replication_slots: "on".
  • Ensure the CDC client creates or alters its logical slot with failover = true; CNPG cannot move a normal non-failover logical slot.
  • Before planned failovers, verify the target standby has the logical slot with synced = true, temporary = false, and invalidation_reason IS NULL.
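
The three chart settings listed above can be combined in a single values file. A minimal sketch, assuming the key paths nest exactly as written in the bullets; it covers only the chart side, so the CDC client must still create its slot with failover = true:

```yaml
# values.yaml fragment (illustrative; key paths taken from the requirements above)
cluster:
  replicationSlots:
    highAvailability:
      # Ask CNPG to synchronize user-created logical decoding slots to standbys
      synchronizeLogicalDecoding: true
  postgresql:
    parameters:
      # Both parameters are quoted so they render as strings on the Cluster CR
      hot_standby_feedback: "on"
      sync_replication_slots: "on"
```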

Backwards compatibility

cluster.replicationSlots defaults to {}, and the template uses {{- with ... }} so the block is omitted entirely when unset. Existing deployments render identically.
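
The omit-when-unset behaviour can be sketched as follows. This is an illustrative fragment, not necessarily the chart's actual template: the `.Values.cluster.replicationSlots` path follows from the value name, while the `toYaml`/`nindent` rendering and the indentation level are assumptions.

```yaml
# templates/cluster.yaml (hypothetical excerpt)
spec:
  # `with` skips its body when the value is empty ({}), so nothing is rendered by default
  {{- with .Values.cluster.replicationSlots }}
  replicationSlots:
    {{- toYaml . | nindent 4 }}
  {{- end }}
```

Because an empty map is falsy in Helm templates, the default `{}` produces no `replicationSlots` key at all rather than an empty one.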

Tests

  • helm lint charts/cluster
  • helm template test charts/cluster --show-only templates/cluster.yaml (no replicationSlots rendered by default)
  • helm template test charts/cluster --show-only templates/cluster.yaml --values charts/cluster/test/postgresql-cluster-configuration/01-non_default_configuration_cluster.yaml (renders replicationSlots verbatim)
  • The chainsaw postgresql-cluster-configuration test now asserts the rendered replicationSlots block.

Surfaces CloudNativePG's `spec.replicationSlots` on the cluster chart so
chart users can enable synchronization of user-created logical
replication slots between the primary and standbys. With PostgreSQL 17+
failover slots, this lets CDC consumers (e.g. Debezium) survive a CNPG
failover without losing the slot.

To make logical decoding slots survive failover, users must:
- enable `cluster.replicationSlots.highAvailability.synchronizeLogicalDecoding`
- set `cluster.postgresql.parameters.hot_standby_feedback: "on"`
- set `cluster.postgresql.parameters.sync_replication_slots: "on"`
- create the CDC client's logical slot with `failover = true`

Requires CloudNativePG 1.27+ and PostgreSQL 17+ for native failover
slots. The block is omitted entirely when the value is empty, so
existing deployments are unaffected.

The non-default-configuration chainsaw test is extended with a
`replicationSlots` block and the matching PostgreSQL parameters, and
asserts they appear verbatim on the rendered Cluster CR.

Signed-off-by: Philippe Noël <philippemnoel@gmail.com>

dosubot (bot) added the size:XS label (this PR changes 0-9 lines, ignoring generated files) on May 18, 2026