MDEV-36760, MDEV-36781, MDEV-36759: Fix performance regressions after MDEV-29445#4042
sanja-byelkin merged 6 commits into 10.11 on May 21, 2025
There were two issues with the test:
1. A race between a wait_condition.inc and a status variable, where the
status variable Rpl_semi_sync_master_status could be ON before the
semi-sync connection finished establishing, resulting in
Rpl_semi_sync_master_clients showing 0 (instead of 1). To fix this,
we instead wait for Rpl_semi_sync_master_clients to be 1
before proceeding.
2. Another race between a wait_condition.inc and a status variable,
where the wait_condition waited for a process_list command of
'BINLOG DUMP' to disappear to infer that the binlog dump thread had
been killed, after which we verified that the semi-sync state was
correct using status variables. However, the 'BINLOG DUMP' command is
overridden with a killed status before the semi-sync tear-down
happens, so we could see invalid values. The fix for
this is to change the wait_condition to instead wait until the
connection with the replication user is gone, because that connection
stays through the binlog dump thread tear-down life-cycle.
log_t::append_prepare_wait(): Do not attempt to read log_sys.write_lsn because it is not protected by log_sys.latch but by write_lock, which we cannot hold here. The assertion could fail if log_t::write_buf() is executing concurrently, and it has not yet executed log_write_buf() or updated log_sys.write_lsn. Fixes up commit acd071f (MDEV-21923)
In commit b692342 (MDEV-29445) we started to specify the MAP_POPULATE flag for allocating the InnoDB buffer pool. This would cause a lot of time to be spent on __mm_populate() inside the Linux kernel, such as 16 seconds to pre-fault or commit innodb_buffer_pool_size=64G. Let us revert to the previous way of allocating the buffer pool at startup.
Note: An attempt to increase the buffer pool size by SET GLOBAL innodb_buffer_pool_size (up to innodb_buffer_pool_size_max) will invoke my_virtual_mem_commit(), which will use MAP_POPULATE to zero-fill and prefault the requested additional memory area, blocking buf_pool.mutex.
Before MDEV-29445 we allocated the InnoDB buffer pool by invoking mmap(2) once (via my_large_malloc()). After the change, we would invoke mmap(2) twice, first via my_virtual_mem_reserve() and then via my_virtual_mem_commit(). Outside Microsoft Windows, we are reverting to my_large_malloc()-like allocation.
my_virtual_mem_reserve(): Define only for Microsoft Windows. Other platforms should invoke my_large_virtual_alloc() and update_malloc_size() instead of my_virtual_mem_reserve() and my_virtual_mem_commit().
my_large_virtual_alloc(): Define only outside Microsoft Windows. Specify neither MAP_NORESERVE nor MAP_POPULATE, to preserve compatibility with my_large_malloc(). Were MAP_POPULATE specified, the mmap() system call would be significantly slower, for example 18 seconds to reserve 64 GiB upfront.
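The difference between the two allocation modes can be illustrated with a minimal sketch. It is not the MariaDB implementation: sketch_alloc() and sketch_free() are hypothetical names, and the only real interfaces used are mmap(2), munmap(2) and the Linux-specific MAP_POPULATE flag. With populate=false the kernel faults pages in lazily on first access, which is the behaviour the commit restores at startup.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the MariaDB source.
// Maps `size` bytes of zero-filled anonymous memory.
// populate=true asks the kernel to pre-fault every page inside mmap()
// (MAP_POPULATE, Linux only); populate=false leaves the pages to be
// faulted in lazily on first access.
void *sketch_alloc(std::size_t size, bool populate)
{
  assert(size != 0);
  int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_POPULATE
  if (populate)
    flags |= MAP_POPULATE;
#else
  (void) populate; // the flag is unavailable on this platform
#endif
  void *p = mmap(nullptr, size, PROT_READ | PROT_WRITE, flags, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical counterpart that releases a mapping made by sketch_alloc().
void sketch_free(void *p, std::size_t size)
{
  if (p)
    munmap(p, size);
}
```

Both modes return memory that reads as zero; only the time at which the page faults are paid differs, which is why the populated variant becomes expensive for very large mappings.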
buf_buddy_shrink(): Properly cover the case when KEY_BLOCK_SIZE corresponds to the innodb_page_size, that is, the ROW_FORMAT=COMPRESSED page frame is directly allocated from the buffer pool, not via the binary buddy allocator.
buf_LRU_check_size_of_non_data_objects(): Avoid a crash when the buffer pool is being shrunk.
buf_pool_t::shrink(): Abort if over 95% of the shrunk buffer pool would be occupied by the adaptive hash index or record locks.
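The 95% rule for buf_pool_t::shrink() can be written as a small predicate. This is only a sketch under assumptions: the function name and both parameters are hypothetical, the sizes are taken to be page counts, and the actual check in the source may be expressed differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical predicate, not the MariaDB implementation.
// target_pages:   size of the buffer pool after shrinking, in pages
// non_data_pages: pages occupied by the adaptive hash index or record locks
// Returns true if more than 95% of the shrunk pool would hold non-data
// objects. Integer arithmetic avoids floating point; overflow for
// unrealistically large page counts is not handled in this sketch.
bool sketch_shrink_must_abort(std::uint64_t target_pages,
                              std::uint64_t non_data_pages)
{
  assert(target_pages != 0);
  return non_data_pages * 100 > target_pages * 95;
}
```

For example, shrinking to 100 pages while 96 pages hold non-data objects would be refused, whereas exactly 95 such pages would still be allowed under this reading of "over 95%".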
In commit b692342 (MDEV-29445) some hash tables were accidentally created with the minimum size (101 entries) instead of correctly deriving the size from the initial innodb_buffer_pool_size. This led to very long hash bucket chains, which are very slow to traverse.
ut_find_prime(): Assert that the size is nonzero in order to catch this type of regression in the future.
innodb_init_params(): Do not bother reading buf_pool.curr_size() when it is known to be 0.
srv_start(): Correctly initialize srv_lock_table_size to 5 times buf_pool.curr_size(), that is, the buffer pool size in pages, between invoking buf_pool.create() and lock_sys.create().
btr_search_enable(), dict_sys_t::create(), dict_sys_t::resize(): Correctly refer to buf_pool.curr_pool_size(), that is, innodb_buffer_pool_size in bytes, when calculating the hash table size. In MDEV-29445 the expressions buf_pool_get_curr_size() were accidentally replaced with buf_pool.curr_size().
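Why an undersized table hurts can be sketched as follows. All names here are hypothetical, the prime search is a simplified stand-in for ut_find_prime() rather than its real algorithm, and the divisor is an illustrative parameter, not a constant taken from the source.

```cpp
#include <cassert>
#include <cstdint>

// Trial-division primality test; adequate for a sketch.
static bool sketch_is_prime(std::uint64_t n)
{
  if (n < 2)
    return false;
  for (std::uint64_t d = 2; d <= n / d; d++)
    if (n % d == 0)
      return false;
  return true;
}

// Simplified stand-in for ut_find_prime(): the smallest prime >= n.
// The assertion mirrors the idea of rejecting a zero size outright.
std::uint64_t sketch_find_prime(std::uint64_t n)
{
  assert(n != 0);
  while (!sketch_is_prime(n))
    n++;
  return n;
}

// Derive a hash table size from a sizing input such as the buffer pool
// size in bytes. Passing a value in the wrong unit (pages instead of
// bytes) or a value that is still 0 yields a far too small table.
std::uint64_t sketch_hash_cells(std::uint64_t sizing_input,
                                std::uint64_t divisor)
{
  assert(divisor != 0);
  return sketch_find_prime(sizing_input / divisor);
}

// Average bucket chain length for `entries` elements spread evenly.
double sketch_avg_chain(std::uint64_t entries, std::uint64_t cells)
{
  assert(cells != 0);
  return static_cast<double>(entries) / static_cast<double>(cells);
}
```

With 101 cells, about a million entries give chains of roughly ten thousand elements each, while a table sized from the buffer pool size in bytes keeps the average chain length close to or below one.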
mariadb-DebarunBanerjee approved these changes on May 13, 2025
I am done with the review of the following 6 commits.
- MDEV-36663: Testcase fixup: Ok, trivial patch.
- MDEV-36760: log_t::append_prepare_wait(): Ok, I reviewed it in pull/4038.
- MDEV-36780: InnoDB buffer pool reserves all assigned memory: Ok, but this could have an impact and needs testing to ensure that we are back to the previous level.
- MDEV-36781: Assertion i < BUF_BUDDY_SIZES: Ok, trivial and debug-specific.
- MDEV-36759: Huge performance drop: This seems to be fixing the real regression. Great catch, and the fix looks correct.
grooverdan referenced this pull request in MariaDB/mariadb-docker on May 14, 2025
Description
Fix several performance regressions due to #3107. See the individual commits.
Release Notes
Several performance regressions were fixed.
How can this PR be tested?
For an example, see MDEV-36759.