MDEV-18983 Port rpl_semi_sync_master_wait_for_slave_count from MySQL #4037

ParadoxV5 · 2025-05-07T02:42:30Z

The Jira issue number for this PR is: MDEV-18983

Description

rpl_semi_sync_master_wait_for_slave_count is a long-requested variable specifying the minimum number of acknowledging replicas before a semi-synchronous transaction can complete on the primary.
The default count is 1 to match the default behavior of “complete when any one replica acknowledges”.
When the count equals the number of replicas, every replica must acknowledge, and we have fully synchronous replication at home.

Consequently, it also indirectly controls how many replicas are required to keep semi-synchronous replication active when rpl_semi_sync_master_wait_no_slave is set to disable immediately (i.e. auto-revert to asynchronous).
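The interaction between the three settings can be sketched as a single predicate. This is only an illustration: the struct, its field names and `should_revert_to_async()` are hypothetical, and the rule itself is an assumption inferred from the description above, not the body of the patch's `is_no_slave()`.

```cpp
#include <cassert>

// Hypothetical mirror of the three settings involved; names are illustrative.
struct Semi_sync_state
{
  unsigned clients;         // connected semi-sync replicas
  unsigned wait_for_count;  // rpl_semi_sync_master_wait_for_slave_count
  bool wait_no_slave;       // rpl_semi_sync_master_wait_no_slave
};

// Assumed rule: revert to async only when waiting without replicas is
// disabled and fewer replicas are connected than must acknowledge.
static bool should_revert_to_async(const Semi_sync_state &s)
{
  assert(s.wait_for_count >= 1);  // the variable's valid range starts at 1
  return !s.wait_no_slave && s.clients < s.wait_for_count;
}
```

With a count of 1, the predicate reduces to the previous “no replicas left” check.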

Release Notes

TODO
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.

https://mariadb.com/kb/en/semisynchronous-replication
- rpl_semi_sync_master_wait_for_slave_count
- rpl_semi_sync_master_wait_no_slave

How can this PR be tested?

TODO: Basic, Grant and Functional tests for the new variable

PR quality check

This is a new feature or a refactoring, and the PR is based against the main branch.
~~This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.~~
I checked the CODING_STANDARDS.md file and my PR conforms to this where appropriate.
For any trivial modifications to the PR, I am ok with the reviewer making the changes themselves.

ParadoxV5

The feature itself should now be implemented. While it’s ready for review apart from testing, I would prefer that we first refactor our code to resolve the hindrances I encountered.

ParadoxV5 · 2025-05-13T19:23:04Z

sql/semisync_master.cc

  DBUG_ASSERT(rpl_semi_sync_master_clients > 0);
-  if (!(--rpl_semi_sync_master_clients) && !rpl_semi_sync_master_wait_no_slave)
+  --rpl_semi_sync_master_clients;
+  if (is_no_slave())


This will conflict with (and does not supersede) MDEV-36359: Patch NULL deref after disabling semi-sync primary #3931.

The `--` decrement cannot be inlined inside DBUG_ASSERT, as non-DBUG builds skip the entire expression.
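A minimal sketch of the pitfall, using a stand-in macro rather than the real DBUG_ASSERT. `SKETCH_ASSERT` and both function names are hypothetical; the macro only imitates an assertion that is compiled out.

```cpp
#include <cassert>  // the standard assert() behaves the same way under NDEBUG

// Stand-in for an assertion macro in a non-debug build:
// the argument is discarded without being evaluated.
#define SKETCH_ASSERT(expr) ((void) 0)

// The decrement is hidden inside the assertion, so it never runs here.
static unsigned decrement_inside_assert(unsigned clients)
{
  SKETCH_ASSERT(clients-- > 0);
  return clients;
}

// The assertion only reads; the decrement is its own statement.
static unsigned decrement_outside_assert(unsigned clients)
{
  SKETCH_ASSERT(clients > 0);
  --clients;
  return clients;
}
```

Starting from 2, the first function still returns 2 while the second returns 1.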

ParadoxV5 · 2025-05-13T19:24:14Z

sql/semisync_master.cc

@@ -68,6 +69,14 @@ static ulonglong timespec_to_usec(const struct timespec *ts)
  return (ulonglong) ts->tv_sec * TIME_MILLION + ts->tv_nsec / TIME_THOUSAND;
 }

+/** @return Should we revert to async because there not enough slaves? */
+static bool is_no_slave()


Note from #3931:

The code for “automatic revert to async” is not an active function, but rather passive behaviors of various functions when the conditions are met.

ParadoxV5 · 2025-05-13T19:33:42Z

sql/semisync_master.cc

+    entry= m_active_tranxs->get_tranx_node(log_file_name, log_file_pos);
+    if (entry && ++(entry->acks) >= rpl_semi_sync_master_wait_for_slave_count)
+    {
+      /* Remove all active transaction nodes before this point. */
+      m_active_tranxs->clear_active_tranx_nodes(log_file_name, log_file_pos,
+                                                signal_waiting_transaction);
+      if (m_active_tranxs->is_empty())
+        m_wait_file_name_inited= false;
+    }


report_reply_binlog() is somehow responsible for three distinct subfeatures:

1. Turn Semi-Sync back on if it is not, probably in response to a late ACK post-timeout.
   * Is this even the correct place for this sub-“feature”? rpl_semi_sync_master_wait_for_slave_count doesn’t even care about it.
2. Update Semi-Sync’s ‘current’ position.
   * dump_start() also calls report_reply_binlog(), yet I believe it only needs this subfeature. The other two subfeatures require this `entry &&` condition to defend against this call.
3. clear_active_tranx_nodes(): flush and clear transactions (plural?) before and including the specified position to allow them to complete.
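The ACK-counting part of the excerpt can be illustrated in isolation. This is a simplified sketch, not the server's data structure: `Pending_tranx`, `Ack_tracker` and `on_ack()` are hypothetical names, and a `std::list` stands in for Active_tranx.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <list>
#include <string>

// Hypothetical, simplified pending transaction.
struct Pending_tranx
{
  std::string log_file;
  std::uint64_t log_pos;
  unsigned acks;
};

struct Ack_tracker
{
  std::list<Pending_tranx> pending;  // oldest first
  unsigned wait_for_count= 1;

  // Record one ACK; return how many transactions were released.
  std::size_t on_ack(const std::string &file, std::uint64_t pos)
  {
    assert(wait_for_count >= 1);
    for (auto it= pending.begin(); it != pending.end(); ++it)
    {
      if (it->log_file != file || it->log_pos != pos)
        continue;
      if (++it->acks < wait_for_count)
        return 0;  // not enough replicas have acknowledged yet
      // Release this transaction and everything before it.
      auto end= std::next(it);
      std::size_t released=
        static_cast<std::size_t>(std::distance(pending.begin(), end));
      pending.erase(pending.begin(), end);
      return released;
    }
    return 0;  // no such entry, e.g. a position reported at dump start
  }
};
```

The final `return 0` plays the role of the `entry &&` guard: a position with no matching entry neither counts an ACK nor releases anything.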

ParadoxV5 · 2025-05-13T19:57:39Z

sql/semisync_master.cc

+Tranx_node *Active_tranx::find_acked_tranx_node()
+{
+  Tranx_node *new_front;
+  for (Tranx_node *entry= m_trx_front; entry; entry= entry->next)


Repl_semi_sync_master::refresh_wait_for_slave_count(uint32 server_id) calls this to iteratively find the transaction(s) that can now pass with a lowered rpl_semi_sync_master_wait_for_slave_count.
Then it calls report_reply_binlog(), which includes clear_active_tranx_nodes(), which iteratively finds those transactions again.
Yes, both iterations are linear.
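For illustration, the first of those linear scans might look like the sketch below. It assumes that everything up to the last node with enough ACKs can be released; the body of `find_acked_tranx_node()` is not shown in the excerpt, so `Node` and `last_satisfied()` are hypothetical.

```cpp
#include <cassert>

// Hypothetical singly linked node; only the fields this sketch needs.
struct Node
{
  unsigned acks;
  Node *next;
};

// Walk from the front and remember the last node whose ACK count already
// meets the (lowered) threshold; nullptr if no node qualifies.
static Node *last_satisfied(Node *front, unsigned threshold)
{
  Node *found= nullptr;
  for (Node *n= front; n; n= n->next)
    if (n->acks >= threshold)
      found= n;
  return found;
}
```

A caller would then release every node up to and including the returned one, which is where the second scan comes from today.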

While the infrequently-used clear_active_tranx_nodes() can use lambdas to merge with this iteration, its call is mixed within report_reply_binlog().
This, along with is_no_slave() & Active_tranx::get_tranx_node(), and how implementing fix_rpl_semi_sync_master_wait_for_slave_count requires new methods in two layers, are signs that Semi-Sync is due for refactoring.

Active_tranx is a naturally-ordered, block-allocating, hashtable with linked list, specialized for managing … its elements.

P.S. It uses neither C++ std::set nor our in-house HASH.

ParadoxV5 · 2025-05-13T20:02:37Z

sql/sys_vars.cc

+static bool fix_rpl_semi_sync_master_wait_for_slave_count
+  (sys_var *self, THD *thd, enum_var_type type)
+{
+  mysql_mutex_unlock(&LOCK_global_system_variables);
+    repl_semisync_master.refresh_wait_for_slave_count(thd->variables.server_id);


Ideally, the refresh is only required when the requirement decreases.
However, it seems that the Sys.Var. system has a limitation that the variable must have an ‘old’ duplicate that an ON_UPDATE callback can compare to tell whether it was decreased.

Intriguingly, other Semi-Sync Primary variables are duplicated and updated through ON_UPDATE even though they don’t have special handling.
This is likely a remnant of Semi-Sync’s past as a plugin.
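The ‘old duplicate’ workaround mentioned above could be sketched as follows. All names are hypothetical, the counter merely stands in for the call to refresh_wait_for_slave_count(), and returning `false` for success is an assumption about the callback convention.

```cpp
#include <cassert>

static unsigned wait_for_count= 1;  // value written by the variable framework
static unsigned applied_count= 1;   // hypothetical 'old' duplicate
static unsigned refresh_calls= 0;   // stand-in for the actual refresh

// Hypothetical update callback: refresh only when the requirement decreased.
static bool on_update_sketch()
{
  const bool decreased= wait_for_count < applied_count;
  applied_count= wait_for_count;
  if (decreased)
    ++refresh_calls;
  return false;  // assumed to mean success
}
```

Raising the value from 1 to 3 triggers no refresh; lowering it from 3 to 2 triggers exactly one.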

ParadoxV5 · 2025-05-13T20:14:12Z

sql/sys_vars.cc

+       "The number of slaves that need to acknowledge that they have received "
+       "a transaction before the transaction can complete on the master",
+       GLOBAL_VAR(rpl_semi_sync_master_wait_for_slave_count),
+       CMD_LINE(REQUIRED_ARG), VALID_RANGE(1, 0xFFFF),


This range matches MySQL’s.

bnestere · 2025-05-13T21:55:49Z

Hi @ParadoxV5 !

Thanks for submitting the PR! A couple quick questions:

Why did you not include any tests? Regardless of whether or not we refactor, we'd need MTR tests for the feature (which wouldn't change whether or not a refactor is included).
Do you have a specific refactor design in mind? I agree the semi-sync code-base is spaghetti-adjacent, and I'm open to improvements.

ParadoxV5 · 2025-05-13T22:37:11Z

Hi @bnestere,

> Why did you not include any tests? Regardless of whether or not we refactor, we'd need MTR tests for the feature (which wouldn't change whether or not a refactor is included).

It is simply because I have not begun writing them. I wanted to share those thoughts with the team early.

> Do you have a specific refactor design in mind? I agree the semi-sync code-base is spaghetti-adjacent, and I'm open to improvements.

Besides improving the inconveniences I had, I don’t have exact designs in mind, especially when also considering adjacent topics such as proofing for #3931 and coherence with async.

bnestere · 2025-05-14T16:22:35Z

> It is simply because I have not begun writing them. I wanted to share those thoughts with the team early.

Ah ok. Agreed that early feedback is ideal :) Overall, I'd say your patch aligns with what I had in mind for the feature, though I didn't do a detailed review. On the general early-feedback process: if you're just looking for design-level feedback (which it seems you are doing here), it should be ok to just write up or sketch out the idea of the implementation (preferably before starting implementation, so as to save time in case the design changes). Then later on, when you share code, I'd say to at least provide a minimal test to show the code works (even for early feedback).

> Besides improving the inconveniences I had, I don’t have exact designs in mind, especially when also considering adjacent topics such as proofing for #3931 and coherence with async.

Ok. My thoughts on the current semi-sync implementation aren't that the implementation is necessarily wrong, just that it is hard to understand the details of what is happening, and when. I wonder if this doesn't really even need that large of a refactoring, but perhaps just some small encapsulations, similar to what you did in this patch, to improve readability. Or at least, it would be a start (as you've already started doing :)). Perhaps if there is time left-over at the end of the sprint, we can address it.

ParadoxV5 · 2025-05-14T16:27:12Z

> … just that it is hard to understand the details of what is happening, and when.

The “what” is straightforward to encapsulate, but the “when”… 🤔

ParadoxV5 added the MariaDB Corporation and Replication (“Patches involved in replication”) labels May 7, 2025

ParadoxV5 added 3 commits May 12, 2025 18:52

Deduplicate code in sql/semisync_master.cc

6e4fb08

* Create reüsable functions `is_no_slave()` & `Active_tranx::get_tranx_node()`
* Replace `Active_tranx::is_thd_waiter()` with equivalent method `is_tranx_end_pos()`

MDEV-18983 Port rpl_semi_sync_master_wait_for_slave_count from MySQL

6556570

WIP

MDEV-18983 WIP 2

705b878

ParadoxV5 force-pushed the mdev-18983 branch from 6cd2824 to 705b878 Compare May 13, 2025 00:52

MDEV-18983 WIP 3

d82f4b6

fix_rpl_semi_sync_master_wait_for_slave_count

ParadoxV5 requested a review from bnestere May 13, 2025 19:15
