standby node stuck in catchingup state when replication slot lost on primary node #1031
rhicks0614 asked this question in Q&A
Hi,
I'm trying out pg_auto_failover version 2.1.2, using two nodes (postgresql-14).
I'm additionally setting max_slot_wal_keep_size, due to disk space limitations.
After stopping the standby node, the primary node went to "wait_primary" as expected.
Eventually the replication slot's wal_status goes to "lost", because max_slot_wal_keep_size is set.
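The slot status mentioned above can be read from the standard pg_replication_slots view on the primary. Below is a minimal sketch of that check, not the exact query from the original post: the names SLOT_STATUS_SQL, lost_slots, and example_rows are hypothetical, and the example rows are made up for illustration.

```python
# Query against the pg_replication_slots view. PostgreSQL 13 and later
# expose the wal_status and safe_wal_size columns there.
SLOT_STATUS_SQL = """
SELECT slot_name, active, wal_status, safe_wal_size
  FROM pg_replication_slots
"""


def lost_slots(rows):
    """Return the names of slots whose wal_status is 'lost'.

    rows: iterable of (slot_name, active, wal_status, safe_wal_size)
    tuples, in the column order of SLOT_STATUS_SQL.
    """
    return [name for (name, _active, wal_status, _safe) in rows
            if wal_status == "lost"]


# Hypothetical rows for illustration only, not taken from a real cluster.
example_rows = [
    ("slot_a", False, "lost", None),
    ("slot_b", True, "reserved", 1024),
]
print(lost_slots(example_rows))  # prints ['slot_a']
```

The rows can come from any PostgreSQL driver; with psycopg2, for example, `cursor.execute(SLOT_STATUS_SQL)` followed by `cursor.fetchall()` returns tuples in this shape. A slot reported as "lost" means the WAL it needs has already been removed, so a standby using that slot cannot resume streaming from it.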
After restarting the standby, I observe that it is stuck in the "catchingup" state according to "pg_autoctl show state".
In the postgres logs, the standby keeps trying to start streaming again, even though the WAL segment it needs has been removed (lost).
Is there some way to make it give up and fall back to pg_basebackup to recover?
Thanks!