-
-
Notifications
You must be signed in to change notification settings - Fork 1.8k
MDEV-18983 Port rpl_semi_sync_master_wait_for_slave_count from MySQL #4037
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
* Create reüsable functions `is_no_slave()` & `Active_tranx::get_tranx_node()` * Replace `Active_tranx::is_thd_waiter()` with equivalent method `is_tranx_end_pos()`
fix_rpl_semi_sync_master_wait_for_slave_count
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The feature should be implemented. While it’s ready for review besides testing, I would prefer we first refactor our code to solve the hindrances I encountered.
DBUG_ASSERT(rpl_semi_sync_master_clients > 0); | ||
if (!(--rpl_semi_sync_master_clients) && !rpl_semi_sync_master_wait_no_slave) | ||
--rpl_semi_sync_master_clients; | ||
if (is_no_slave()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This will conflict (and does not supersede) MDEV-36359: Patch NULL deref after disabling semi-sync primary #3931.
--
cannot be inlined insideDBUG_ASSERT
as non-DBUG builds skip the entire expression.
@@ -68,6 +69,14 @@ static ulonglong timespec_to_usec(const struct timespec *ts) | |||
return (ulonglong) ts->tv_sec * TIME_MILLION + ts->tv_nsec / TIME_THOUSAND; | |||
} | |||
|
|||
/** @return Should we revert to async because there not enough slaves? */ | |||
static bool is_no_slave() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from #3931:
The code for “automatic revert to async” is not an active function, but rather passive behaviors of various functions when the conditions are met.
entry= m_active_tranxs->get_tranx_node(log_file_name, log_file_pos); | ||
if (entry && ++(entry->acks) >= rpl_semi_sync_master_wait_for_slave_count) | ||
{ | ||
/* Remove all active transaction nodes before this point. */ | ||
m_active_tranxs->clear_active_tranx_nodes(log_file_name, log_file_pos, | ||
signal_waiting_transaction); | ||
if (m_active_tranxs->is_empty()) | ||
m_wait_file_name_inited= false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
report_reply_binlog()
is somehow responsible for three distinct subfeatures:
- turn Semi-Sync back on if it is not, probably in response to a late ACK post-timeout
- Is this even the correct place for this sub-“feature”?
rpl_semi_sync_master_wait_for_slave_count
doesn’t even care about it.
- Is this even the correct place for this sub-“feature”?
- update Semi-Sync’s ‘current’ position
dump_start()
also callsreport_reply_binlog()
, yet I believe it’s only needs this subfeature.
The other two subfeatures require thisentry &&
condition to defend against this call.
clear_active_tranx_nodes()
: flush and clear transactions (plural?) before and including the specified position to allow them to complete
Tranx_node *Active_tranx::find_acked_tranx_node() | ||
{ | ||
Tranx_node *new_front; | ||
for (Tranx_node *entry= m_trx_front; entry; entry= entry->next) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repl_semi_sync_master::refresh_wait_for_slave_count(uint32 server_id)
calls this to iteratively find the transaction(s) that can now pass with a lowered rpl_semi_sync_master_wait_for_slave_count
.
Then it calls
report_reply_binlog()
which includes clear_active_tranx_nodes()
which iteratively finds those transactions again.
Yes, both iterations are linear.
While the infrequently-used clear_active_tranx_nodes()
can use lambdas to merge with this iteration, its call is mixed within report_reply_binlog()
.
This, along with is_no_slave()
& Active_tranx::get_tranx_node()
, and how implementing fix_rpl_semi_sync_master_wait_for_slave_count
requires new methods in two layers, are signs that Semi-Sync is due for refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Active_tranx
is a naturally-ordered, block-allocating, hashtable with linked list, specialized for managing … its elements.
P.S. It uses neither C++ std::set
nor our in-house HASH
.
static bool fix_rpl_semi_sync_master_wait_for_slave_count | ||
(sys_var *self, THD *thd, enum_var_type type) | ||
{ | ||
mysql_mutex_unlock(&LOCK_global_system_variables); | ||
repl_semisync_master.refresh_wait_for_slave_count(thd->variables.server_id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, the refresh is only required when the requirement decreases.
However, it seems that the Sys.Var. system has a limitation that the variable must have an ‘old’ duplicate that an ON_UPDATE
callback can compare to tell whether it was decreased.
Intriguingly, other Semi-Sync Primary variables are duplicated and updated through ON_UPDATE
even though they don’t have special handling.
This is likely a remnant of Semi-Sync’s past as a plugin.
"The number of slaves that need to acknowledge that they have received " | ||
"a transaction before the transaction can complete on the master", | ||
GLOBAL_VAR(rpl_semi_sync_master_wait_for_slave_count), | ||
CMD_LINE(REQUIRED_ARG), VALID_RANGE(1, 0xFFFF), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
matches MySQL
Hi @ParadoxV5 ! Thanks for submitting the PR! A couple quick questions:
|
Hi @bnestere,
It is simply because I have not begun writing them. I wanted to share those thoughts with the team early.
Besides improving the inconveniences I had, I don’t have exact designs in mind, especially when also considering adjacent topics such as proofing for #3931 and coherence with async. |
Ah ok. Agreed that early feedback is ideal :) Overall, I'd say your patch aligns with what I had in mind for the feature, though I didn't do a detailed review. To the general early-feedback process, if you're just looking for design-level feedback (which it seems you are doing here), it should be ok to just write-up/sketch out the idea of the implementation (preferably before starting implementation, as to save time in-case the design is changed). Then later on when you share code, I'd say to at least provide a minimal test to show the code works (even for early feedback).
Ok. My thoughts on the current semi-sync implementation aren't that the implementation is necessarily wrong, just that it is hard to understand the details of what is happening, and when. I wonder if this doesn't really even need that large of a refactoring, but perhaps just some small encapsulations, similar to what you did in this patch, to improve readability. Or at least, it would be a start (as you've already started doing :)). Perhaps if there is time left-over at the end of the sprint, we can address it. |
The “what” is straightforward to encapsulate, but the “when”… 🤔 |
Description
rpl_semi_sync_master_wait_for_slave_count
is a long-requested variable specifying the minimum number of acknowledging replicas before a semi-synchronous transaction can complete on the primary.The default count is
1
to match the default behavior of “complete when any one replica acknowledges”.When this matches the number of replicas, it requires all replicas to acknowledge, and we have full-synchronous replication at home.
Consequently, its purpose also leads it to indirectly control the number of replicas required to keep semi-synchronous replication active when
rpl_semi_sync_master_wait_no_slave
is set to disable immediately (i.e. auto-revert to asynchronous).Release Notes
TODO
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.
rpl_semi_sync_master_wait_for_slave_count
rpl_semi_sync_master_wait_no_slave
How can this PR be tested?
TODO: Basic, Grant and Functional tests for the new variable
PR quality check
main
branch.This is a bug fix, and the PR is based against the earliest maintained branch in which the bug can be reproduced.