Skip to content

Conversation

@tenfyzhong
Copy link
Collaborator

What problem does this PR solve?

This PR addresses the issue where the KeyspaceManager was not periodically updating its cached keyspace metadata. This could lead to outdated information being used by various components of TiCDC, potentially causing inconsistencies or errors.

This PR introduces a periodic update mechanism for the KeyspaceManager to ensure that the cached keyspace metadata is always up-to-date.

Issue Number: close #2731

What is changed and how it works?

The core changes involve refactoring the KeyspaceManager to:

  • Introduce a Manager interface: This provides a cleaner abstraction for interacting with keyspace management functionalities.
  • Implement a Run() method: This method starts a background goroutine that periodically calls updatePeriodicity.
  • Implement updatePeriodicity(): This method uses a time.Ticker to trigger updates at a defined interval (updateDuration).
  • Implement an update() method: This method is responsible for fetching the latest keyspace metadata from PD and updating the internal cache. It uses a mutex (mu) to prevent concurrent updates and ensure that only one update process runs at a time.
  • Modify LoadKeyspace and GetKeyspaceByID: These methods now first attempt to load from the cache and only fetch from PD if the keyspace is not found in the cache or if a periodic update has occurred.
  • Add forceLoadKeyspace: A new internal method to explicitly load keyspace metadata from PD, used by the periodic update mechanism.
  • Update NewKeyspaceManager to NewManager: Reflects the interface change.
  • Update all usages of KeyspaceManager: All instances now use the Manager interface.
  • Add pkg/keyspace/keyspace_manager_classic_test.go and pkg/keyspace/keyspace_manager_next_gen_test.go: These new test files cover the behavior of the KeyspaceManager in both classic and next-gen TiCDC modes.
  • Add pkg/pdutil/api_client_mock.go: This mock is generated to support testing the new KeyspaceManager functionality.
  • Add pkg/keyspace/keyspace_manager.go: This file contains the refactored KeyspaceManager implementation.
  • Add pkg/pdutil/api_client.go: This file contains the PDAPIClient interface and its mock.
  • Modify server.go: The server.go file now calls keyspaceManager.Run() after initializing the KeyspaceManager.
  • Update Makefile: The UT_PACKAGES_OTHERS variable is updated to include the new pkg/keyspace/... path.

Check List

Tests

  • Unit test

Questions

Will it cause performance regression or break compatibility?

This change introduces a background goroutine for periodic updates, which consumes minimal resources. The refactoring of the KeyspaceManager interface and its usage is backward-compatible as the core functionality remains the same.

Do you need to update user documentation, design documentation or monitoring documentation?

No user-facing documentation changes are immediately required. Design documentation related to keyspace management might benefit from an update to reflect the periodic update mechanism.

Release note

- None

- Rename the `KeyspaceManager` interface to `Manager`.
- Rename the `keyspaceManager` struct to `manager`.
- Update all usages of `KeyspaceManager` to `Manager` across the codebase.
- Update the constructor function from `NewKeyspaceManager` to `NewManager`.

Signed-off-by: tenfyzhong <[email protected]>
- Initialize a ticker to periodically update keyspace metadata.
- Implement a background goroutine that fetches and updates keyspace metadata at regular intervals.
- Ensure that updates only occur in non-classic kernel environments.
- Stop the ticker when the manager is closed to prevent resource leaks.

Signed-off-by: tenfyzhong <[email protected]>
- Define `periodicUpdateKeyspace` constant for the ticker interval.
- Update the ticker initialization to use the new constant.

Signed-off-by: tenfyzhong <[email protected]>
- Add GoDoc comments to explain the purpose of each method in the Manager interface.
- This improves code readability and maintainability by clarifying the expected behavior of the interface.

Signed-off-by: tenfyzhong <[email protected]>
- The mutex lock was being acquired and released on every ticker tick, which is inefficient.
- Moved the mutex lock to be acquired once before the loop and released after the loop.
- This change improves performance by reducing lock contention.

Signed-off-by: tenfyzhong <[email protected]>
- Add mock for PDAPIClient to facilitate testing of components that interact with PD.
- Update generate-mock.sh to include the new mock generation.

Signed-off-by: tenfyzhong <[email protected]>
- Add a `Run` method to the `keyspace.Manager` interface and implementation.
- This method initializes a ticker and starts a goroutine for periodic updates of keyspace metadata.
- The `updatePeriodicity` function now correctly handles locking and updating keyspace metadata.
- The `NewManager` function no longer calls `update()` directly.
- The `server.go` file now calls `keyspaceManager.Run()` after creating the manager.
- Add `pkg/keyspace` to `UT_PACKAGES_OTHERS` in the Makefile.
- Add a new test file `pkg/keyspace/keyspace_manager_test.go` to cover the `update` functionality.

Signed-off-by: tenfyzhong <[email protected]>
- Add unit tests for the `GetKeyspaceByID` and `LoadKeyspace` methods of the `manager` struct in the `nextgen` build mode.
- These tests cover scenarios where keyspace metadata is fetched from the PD API client and then cached locally.
- Ensure that subsequent calls to these methods retrieve data from the local cache.
- Add corresponding tests for the `classic` build mode, which do not interact with the PD API client.

Signed-off-by: tenfyzhong <[email protected]>
@ti-chi-bot ti-chi-bot bot added the release-note-none Denotes a PR that doesn't merit a release note. label Oct 20, 2025
@ti-chi-bot
Copy link

ti-chi-bot bot commented Oct 20, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign sdojjy for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 20, 2025
@gemini-code-assist
Copy link

Summary of Changes

Hello @tenfyzhong, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the KeyspaceManager by introducing a robust mechanism for periodic metadata updates. The core change involves refactoring the existing KeyspaceManager into a more abstract Manager interface, which now includes functionality to run a background goroutine for regularly fetching the latest keyspace information from PD. This ensures that all components relying on keyspace metadata consistently operate with up-to-date information, thereby resolving potential inconsistencies caused by stale cached data. The changes are designed to be backward-compatible and introduce minimal resource overhead.

Highlights

  • Periodic Keyspace Metadata Updates: The KeyspaceManager now supports periodic updates of its cached keyspace metadata, preventing the use of outdated information and improving consistency.
  • Refactored Keyspace Manager Interface: The KeyspaceManager has been refactored into a new Manager interface, providing a cleaner abstraction and improving the overall design.
  • Background Goroutine for Updates: A new Run() method starts a background goroutine that uses a time.Ticker to trigger regular updates of keyspace metadata from PD.
  • Enhanced Caching Logic: The LoadKeyspace and GetKeyspaceByID methods now prioritize loading from the local cache and only fetch from PD if the data is not found or a periodic update has occurred, ensuring efficient access while maintaining freshness.
  • Comprehensive Testing and Mocks: New unit tests have been added for both classic and next-gen TiCDC modes, along with a generated mock for the PDAPIClient, to ensure the reliability of the new functionality.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

- Clarify the purpose of the Run method in the Manager interface.

Signed-off-by: tenfyzhong <[email protected]>
@tenfyzhong
Copy link
Collaborator Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the KeyspaceManager to support periodic updates of keyspace metadata, which is a great improvement for keeping the cache up-to-date. The introduction of the Manager interface and the background update mechanism are well-implemented. My review focuses on the new implementation in pkg/keyspace/keyspace_manager.go, where I've identified a couple of areas for improvement to enhance robustness and efficiency, specifically regarding concurrency handling during cache updates and resource cleanup in the Close method.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively refactors the KeyspaceManager to support periodic updates of its cache, which is a great improvement for keeping metadata fresh. The introduction of the Manager interface and the separation of concerns are well done. My review focuses on further improving the concurrency handling and robustness of the new implementation. I've pointed out a potential performance regression due to the removal of double-checked locking, a possible panic in the Close method, and a couple of suggestions to improve code clarity and idiomatic Go usage. Overall, this is a solid refactoring with a few areas for refinement.


k.storageMap = make(map[string]kv.Storage)

k.ticker.Stop()

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The Close method calls k.ticker.Stop() without checking if k.ticker is nil. If Run() is not called before Close(), k.ticker will be nil, causing a panic. It's important to make Close() safe to call even if the manager hasn't been fully started.

	if k.ticker != nil {
		k.ticker.Stop()
	}

Comment on lines 205 to 209
mu := &sync.Mutex{}

for range k.ticker.C {
k.update(mu)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The sync.Mutex for preventing overlapping updates is created as a local variable in updatePeriodicity and passed as a pointer to update. While this works, it's more idiomatic and clearer to make the mutex a member of the manager struct. This would make the ownership and scope of the lock explicit.

I suggest the following refactoring:

  1. Add updateMu sync.Mutex to the manager struct.
  2. Change updatePeriodicity to call k.update() without arguments in the loop.
  3. Change update's signature to func (k *manager) update() and use k.updateMu.TryLock() inside it.

Comment on lines +224 to +226
for _, keyspace := range k.keyspaceMap {
keyspaces = append(keyspaces, keyspace.Name)
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This loop iterates over the values (*keyspacepb.KeyspaceMeta) of k.keyspaceMap just to get the keyspace names. It's more direct and slightly more efficient to iterate over the map's keys directly.

	for name := range k.keyspaceMap {
		keyspaces = append(keyspaces, name)
	}

- Removed external mutex from `update` function.
- Added internal `updateMu` to `manager` struct.
- Updated `update` function to use `k.updateMu` for locking.
- Added nil check for ticker in `Close` function.
- Updated tests to reflect the internal mutex usage.

Signed-off-by: tenfyzhong <[email protected]>
- Rename `forceLoadKeyspace` to `ForceLoadKeyspace` for clarity and consistency with naming conventions.
- Update the `LoadKeyspace` method to explicitly call `ForceLoadKeyspace` when the local cache is not found, ensuring data is always fetched from the PD.
- Modify the `KeyspaceCheckerMiddleware` to use `ForceLoadKeyspace` to ensure the latest keyspace metadata is loaded.
- Update tests to reflect the new method name and behavior.

Signed-off-by: tenfyzhong <[email protected]>
- The keyspace state check in `KeyspaceCheckerMiddleware` was redundant.
- The state check has been moved to the `CreateChangefeed` handler in `api/v2/changefeed.go`.
- This change simplifies the middleware and ensures the check is performed at the correct level of the API.

Signed-off-by: tenfyzhong <[email protected]>
- Add validation to `ResumeChangefeed` to ensure the keyspace is enabled.
- Add validation to `UpdateChangefeed` to ensure the keyspace is enabled.

Signed-off-by: tenfyzhong <[email protected]>
- Renamed `MockKeyspaceManager` to `MockManager` to better reflect its purpose.
- Updated type references in middleware and maintainer tests.
- Added `ForceLoadKeyspace` method to the mock to align with the interface change.

Signed-off-by: tenfyzhong <[email protected]>
@tenfyzhong
Copy link
Collaborator Author

/test all

- Add a comment to `unit_test_pkg` target explaining how to specify a package.
- Add a comment to `unit_test_pkg_next_gen` target explaining how to specify a package.

Signed-off-by: tenfyzhong <[email protected]>
@tenfyzhong
Copy link
Collaborator Author

/test all

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-none Denotes a PR that doesn't merit a release note. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

next-gen: Update the local cached keyspace meta periodicity in the keyspaceManager

1 participant