br/pkg/streamhelper: fix flaky TestGCServiceSafePoint assertion (#66755) #66889

Open
ti-chi-bot wants to merge 2 commits into pingcap:release-8.5 from ti-chi-bot:cherry-pick-66755-to-release-8.5

Conversation

@ti-chi-bot
Member

@ti-chi-bot ti-chi-bot commented Mar 11, 2026

This is an automated cherry-pick of #66755

What problem does this PR solve?

Issue Number: close #66731

Problem Summary:

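TestGCServiceSafePoint in br/pkg/streamhelper is flaky because, after the task is removed, it waits for serviceGCSafePoint != 0. In this test setup the checkpoint cp can be 1, so the expected safe point cp-1 is legitimately 0; in that case the condition never holds and the test fails intermittently even though the behavior is correct.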

What changed and how does it work?

  • Updated TestGCServiceSafePoint to wait only for serviceGCSafePointDeleted after unregistering the task.
  • Kept the existing earlier assertion serviceGCSafePoint == cp-1, so the safe point value is still verified.

This removes the nondeterministic != 0 condition while preserving the intended behavior checks.
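
The difference between the two wait conditions can be sketched as follows. This is a minimal illustration, not the test's actual code: `fakeState`, `oldCondition`, and `newCondition` are hypothetical names, and the old condition is assumed to be the conjunction of the nonzero check and the deletion check.

```go
package main

import "fmt"

// fakeState is a hypothetical stand-in for the two fake-cluster fields
// that the wait condition inspects.
type fakeState struct {
	serviceGCSafePoint        uint64
	serviceGCSafePointDeleted bool
}

// oldCondition models the removed wait condition (assumed shape):
// it also requires the safe point to be nonzero.
func oldCondition(s fakeState) bool {
	return s.serviceGCSafePoint != 0 && s.serviceGCSafePointDeleted
}

// newCondition models the updated wait condition: only deletion matters.
func newCondition(s fakeState) bool {
	return s.serviceGCSafePointDeleted
}

func main() {
	// With cp == 1 the registered safe point is cp-1 == 0, which is valid.
	cp := uint64(1)
	s := fakeState{serviceGCSafePoint: cp - 1, serviceGCSafePointDeleted: true}

	fmt.Println(oldCondition(s), newCondition(s)) // false true
}
```

With cp == 1 the old condition can never become true, so a loop polling it would run until its timeout, while the new condition is satisfied as soon as the deletion is recorded.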

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No need to test
    • I checked and no code files have been changed.
    • Test-only assertion update for flaky behavior; no production logic changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

None

Summary by CodeRabbit

  • Tests
    • Improved test coverage for garbage collection safe point state management, with enhanced assertions to verify initialization and deletion behavior in backup operations.

@ti-chi-bot ti-chi-bot added contribution This PR is from a community contributor. ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR. labels Mar 11, 2026
@coderabbitai

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

This PR addresses a flaky test by adding explicit state tracking to the test infrastructure. A new serviceGCSafePointSet boolean field is introduced to the fake cluster struct and set when the GC safe point is updated, allowing the test to verify this state was properly set while also simplifying the final wait condition.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test state tracking: br/pkg/streamhelper/basic_lib_for_test.go | Added serviceGCSafePointSet bool field to fakeCluster struct; field is set to true when BlockGCUntil updates the safe point. |
| Test assertion refinement: br/pkg/streamhelper/advancer_test.go | Updated TestGCServiceSafePoint to assert that serviceGCSafePointSet is true after OnTick, and removed the serviceGCSafePoint != 0 requirement from the final wait condition, now only checking serviceGCSafePointDeleted. |
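
The state-tracking idea described above can be sketched roughly like this. It is a trimmed-down illustration under assumptions, not the code in basic_lib_for_test.go: the struct here carries only the fields named in the walkthrough, and `BlockGCUntil` uses a simplified, hypothetical signature without a context or error return.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeCluster is a reduced, hypothetical version of the test double.
type fakeCluster struct {
	mu                    sync.Mutex
	serviceGCSafePoint    uint64
	serviceGCSafePointSet bool
}

// BlockGCUntil records the safe point and flags that it was set.
// Simplified signature (assumption): the real method may differ.
func (f *fakeCluster) BlockGCUntil(at uint64) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.serviceGCSafePoint = at
	f.serviceGCSafePointSet = true
}

func main() {
	f := &fakeCluster{}
	cp := uint64(1)
	f.BlockGCUntil(cp - 1)

	f.mu.Lock()
	// The flag distinguishes "set to 0" from "never set".
	fmt.Println(f.serviceGCSafePoint, f.serviceGCSafePointSet) // 0 true
	f.mu.Unlock()
}
```

The boolean is what makes a zero safe point observable: the numeric field alone cannot tell a registered value of 0 apart from the zero value of an untouched struct.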

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Possibly related PRs

Suggested labels

lgtm

Suggested reviewers

  • YuJuncen
  • Leavrth

Poem

🐰 A test that flickers in the night,
Now watches flags to shine so bright,
With state we track and conditions clear,
The flaky dance will disappear! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the specific test being fixed (TestGCServiceSafePoint) and the problem being addressed (flaky assertion). |
| Description check | ✅ Passed | The description follows the template with issue number, problem summary, explanation of changes, and test checklist completed appropriately. |
| Linked Issues check | ✅ Passed | The PR directly addresses issue #66731 by eliminating the nondeterministic serviceGCSafePoint != 0 condition and verifying safe point registration/deletion deterministically. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing the flaky test assertion; no production logic modifications or unrelated changes are present. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
br/pkg/streamhelper/advancer_test.go (1)

210-213: Minor: Locking consistency.

Line 210 reads serviceGCSafePoint without the mutex, while lines 211-213 acquire the lock for serviceGCSafePointSet. Since both fields are protected by the same mutex in BlockGCUntil, consider consolidating both reads under a single lock for consistency:

♻️ Optional: Consolidate assertions under lock
 	req.NoError(adv.OnTick(ctx))
-	req.Equal(env.serviceGCSafePoint, cp-1)
 	env.fakeCluster.mu.Lock()
+	req.Equal(env.serviceGCSafePoint, cp-1)
 	req.True(env.serviceGCSafePointSet)
 	env.fakeCluster.mu.Unlock()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@br/pkg/streamhelper/advancer_test.go` around lines 210 - 213, The test reads
env.serviceGCSafePoint without holding the cluster mutex while later assertions
lock env.fakeCluster.mu; to make locking consistent, acquire env.fakeCluster.mu
before reading env.serviceGCSafePoint and keep the check of
env.serviceGCSafePointSet inside the same critical section (i.e., wrap the
req.Equal(env.serviceGCSafePoint, cp-1) and req.True(env.serviceGCSafePointSet)
under a single env.fakeCluster.mu.Lock()/Unlock() block), matching how
BlockGCUntil protects those fields.
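
One way to keep both reads inside a single critical section is a small snapshot helper. The sketch below is hypothetical: `snapshot` and `setSafePoint` are illustrative names that do not appear in the PR, and the struct is reduced to the two fields under discussion.

```go
package main

import (
	"fmt"
	"sync"
)

// fakeCluster is a reduced, hypothetical version of the test double.
type fakeCluster struct {
	mu                    sync.Mutex
	serviceGCSafePoint    uint64
	serviceGCSafePointSet bool
}

// setSafePoint is a hypothetical writer that updates both fields under the lock.
func (f *fakeCluster) setSafePoint(at uint64) {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.serviceGCSafePoint = at
	f.serviceGCSafePointSet = true
}

// snapshot is a hypothetical helper that reads both fields in one
// critical section, so the pair is always mutually consistent.
func (f *fakeCluster) snapshot() (uint64, bool) {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.serviceGCSafePoint, f.serviceGCSafePointSet
}

func main() {
	f := &fakeCluster{}

	// Write from another goroutine, as a background advancer might.
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		f.setSafePoint(41)
	}()
	wg.Wait()

	sp, set := f.snapshot()
	fmt.Println(sp, set) // 41 true
}
```

A test could then assert on the two returned values without touching the mutex directly, which keeps the locking rule in one place.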

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 51e25915-e1eb-49e6-9336-0f35d4887a7f

📥 Commits

Reviewing files that changed from the base of the PR and between 1c68d4c and 693f338.

📒 Files selected for processing (2)
  • br/pkg/streamhelper/advancer_test.go
  • br/pkg/streamhelper/basic_lib_for_test.go

@codecov

codecov bot commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (release-8.5@1c68d4c). Learn more about missing BASE report.

Additional details and impacted files
@@               Coverage Diff                @@
##             release-8.5     #66889   +/-   ##
================================================
  Coverage               ?   55.5390%           
================================================
  Files                  ?       1819           
  Lines                  ?     652779           
  Branches               ?          0           
================================================
  Hits                   ?     362547           
  Misses                 ?     263318           
  Partials               ?      26914           
| Flag | Coverage Δ |
|---|---|
| integration | 39.2159% <ø> (?) |
| unit | 64.9688% <ø> (?) |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Components | Coverage Δ |
|---|---|
| dumpling | 52.9954% <0.0000%> (?) |
| parser | ∅ <0.0000%> (?) |
| br | 63.9343% <0.0000%> (?) |

@ti-chi-bot ti-chi-bot bot added cherry-pick-approved Cherry pick PR approved by release team. and removed do-not-merge/cherry-pick-not-approved labels Mar 16, 2026
@ti-chi-bot ti-chi-bot bot added approved needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 17, 2026
@ti-chi-bot

ti-chi-bot bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Leavrth, YuJuncen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added lgtm and removed needs-1-more-lgtm Indicates a PR needs 1 more LGTM. labels Mar 17, 2026
@ti-chi-bot

ti-chi-bot bot commented Mar 17, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-03-17 00:55:10.415032832 +0000 UTC m=+232037.502690369: ☑️ agreed by Leavrth.
  • 2026-03-17 05:37:17.811053442 +0000 UTC m=+248964.898710979: ☑️ agreed by YuJuncen.

@v01dstar
Contributor

/retest

Labels

approved cherry-pick-approved Cherry pick PR approved by release team. contribution This PR is from a community contributor. lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. type/cherry-pick-for-release-8.5 This PR is cherry-picked to release-8.5 from a source PR.

4 participants