fix(ci_visibility): git multi-process issues #18156

Open
gnufede wants to merge 10 commits into main from gnufede/fix-git-warning-issues

@gnufede gnufede commented May 19, 2026

Description

Fixes two sources of spurious git warning logs that customers see when using the pytest plugin under pytest-xdist (or other multi-process) environments.

Problem 1: shallow lock contention
Multiple xdist workers each call git fetch --update-shallow during startup. They all compete for .git/shallow.lock, causing repeated fatal: Unable to create '.../.git/shallow.lock': File exists warnings. Fixed by adding _call_git_with_lock_retry() in git.py, which retries up to 5 times with exponential back-off + jitter when lock contention is detected.
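The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `run_with_lock_retry`, `GitResult`, the marker string, and the base delay are hypothetical names and values. Only the limit of 5 retries and the use of exponential back-off with jitter come from the description.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable

# Hypothetical marker: the PR matches a specific lock-error string whose
# exact text is not shown in this conversation.
LOCK_ERROR_MARKER = "shallow.lock': File exists"
MAX_RETRIES = 5           # from the PR description
BASE_DELAY_SECONDS = 0.1  # assumed value, for illustration only


@dataclass
class GitResult:
    """Hypothetical stand-in for the result of a git subprocess call."""

    return_code: int
    stderr: str = ""


def run_with_lock_retry(
    call: Callable[[], GitResult],
    sleep: Callable[[float], None] = time.sleep,
) -> GitResult:
    """Run `call`; retry only when stderr indicates lock contention."""
    result = call()  # initial attempt, kept outside the retry loop
    for attempt in range(MAX_RETRIES):
        if result.return_code == 0 or LOCK_ERROR_MARKER not in result.stderr:
            return result  # success, or an unrelated error: no retry
        # Exponential back-off (0.1s, 0.2s, 0.4s, ...) plus random jitter so
        # that workers do not all wake up and collide again at the same time.
        delay = BASE_DELAY_SECONDS * (2 ** attempt)
        sleep(delay + random.uniform(0, delay))
        result = call()
    return result  # last result once the retries are exhausted
```

Passing `sleep` as a parameter is a design choice of this sketch: it lets a unit test replace the real delay with a recorder, so the retry paths can be exercised without waiting.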

Problem 2: merge-base called on a shallow repo
git merge-base was called inside get_env_tags(), which runs at plugin startup before any unshallowing. On shallow clones the commits aren't locally available yet, so the call always fails silently. Fixed by removing the call from get_env_tags() and deferring it to SessionManager.upload_git_data(), after unshallowing has completed. The standalone helper get_pr_base_commit_sha() that was the only caller is removed as dead code.

Testing

New tests added:

  • TestGitLockRetry — unit tests for _call_git_with_lock_retry: success path, non-lock errors skip retry, lock error retries then succeeds, retry exhaustion returns last failure,
    unshallow_repository uses the retry helper, get_merge_base does not (it's read-only).
  • TestUpdatePrMergeBase — unit tests for SessionManager._update_pr_merge_base: skips when SHA already set, skips when inputs are missing, sets the tag when both SHAs are present, skips when
    merge-base returns empty, and two wiring tests verifying it is called from upload_git_data() in both shallow and non-shallow scenarios.
  • Updated TestGitUnshallow assertions to reflect that _call_git_with_lock_retry now passes input_string as an explicit second argument to _call_git.
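The skip conditions listed for TestUpdatePrMergeBase suggest logic along the following lines. This is a minimal sketch, assuming a plain dict of tags and a caller-supplied merge-base function: `update_pr_merge_base`, its parameters, and the `get_merge_base` callable signature are hypothetical and do not reproduce the actual SessionManager code. Only the tag name git.pull_request.base_branch_sha appears in this PR.

```python
from typing import Callable, Dict, Optional

# Tag name as it appears in this PR's commit messages.
BASE_BRANCH_SHA_TAG = "git.pull_request.base_branch_sha"


def update_pr_merge_base(
    tags: Dict[str, str],
    base_ref_sha: Optional[str],
    head_sha: Optional[str],
    get_merge_base: Callable[[str, str], str],
) -> None:
    """Set the base-branch SHA tag when it can be computed; otherwise leave tags untouched."""
    if tags.get(BASE_BRANCH_SHA_TAG):
        return  # SHA already set: nothing to do
    if not base_ref_sha or not head_sha:
        return  # one of the inputs is missing
    merge_base = get_merge_base(base_ref_sha, head_sha)
    if not merge_base:
        return  # merge-base returned empty: the optional tag stays absent
    tags[BASE_BRANCH_SHA_TAG] = merge_base
```

Each early return maps to one of the listed test cases, and the tag is written in exactly one place, which keeps it optional in the same way the Risks section describes.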

No regression tests (real shallow clone + full plugin run) were added; that would require a more involved fixture setup.

Risks

Low. The changes are confined to the pytest plugin internals and only affect CI Visibility data collection, not any tracer hot paths.

  • The retry logic only activates when a specific lock-error string is matched; all other errors fall through immediately as before.
  • PULL_REQUEST_BASE_BRANCH_SHA was already an optional tag — if merge-base fails it is simply absent, same as before.
  • Removing get_pr_base_commit_sha is safe; grep confirms it had no callers outside its own definition.

gnufede and others added 4 commits May 19, 2026 10:25
…vironments

In multi-process (xdist) runs each worker invokes git commands concurrently.
Two issues caused spurious warnings:

1. Simultaneous `git fetch --update-shallow` calls compete for
   `.git/shallow.lock`.  Fix: retry up to 5 times with exponential
   back-off + jitter via a new `_call_git_with_lock_retry()` helper.

2. `git merge-base` was called inside `get_env_tags()`, before the
   repository was unshallowed, so the required commits were not yet
   available locally.  Fix: defer the call to `upload_git_data()`,
   after any unshallowing has completed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Merge-base is now computed directly via Git.get_merge_base() inside
SessionManager._update_pr_merge_base(), so this standalone wrapper is
dead code.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

cit-pr-commenter-54b7da Bot commented May 19, 2026

Codeowners resolved as

ddtrace/testing/internal/session_manager.py                             @DataDog/ci-app-libraries
tests/testing/mocks.py                                                  @DataDog/ci-app-libraries

datadog-official Bot commented May 19, 2026

⚠️ Warnings

🚦 5 Pipeline jobs failed

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 10/17

TypeError: '>=' not supported between instances of 'Mock' and 'tuple' in tests/testing/internal/session_manager.py:455

🧪 1 Test failed

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.14] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 12/17

TypeError: '>=' not supported between instances of 'Mock' and 'tuple' in tests/testing/internal/session_manager.py:455

🧪 1 Test failed

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.11] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 16/17

TypeError: '>=' not supported between instances of 'Mock' and 'tuple' in tests/testing/internal/session_manager.py:455

🧪 1 Test failed

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.14] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

View all 5 failed jobs.

🧪 9 Tests failed in 9 jobs

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 11/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.11] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 13/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.9] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 14/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.14] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 15/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.12] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 17/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.10] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 3/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.13] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 4/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.10] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 5/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.12] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

DataDog/apm-reliability/dd-trace-py | ci_visibility/testing 6/17

TestUploadGitDataSkipping::test_git_upload_proceeds_in_online_mode[py3.13] from test_bazel_offline_session_manager.py
'>=' not supported between instances of 'Mock' and 'tuple'

ℹ️ Info

No other issues found

❄️ No new flaky tests detected

🔄 Datadog retried 6 tests - 0 passed on retry

🔗 Commit SHA: 2923a32

@gnufede gnufede changed the title from "Gnufede/fix git warning issues" to "fix(ci_visibility): git warning issues" May 19, 2026
gnufede and others added 2 commits May 19, 2026 12:07
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pull the initial call outside the loop so the loop body is only retry
logic, eliminating the unreachable return and making control flow obvious.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@gnufede gnufede changed the title from "fix(ci_visibility): git warning issues" to "fix(ci_visibility): git multi-process issues" May 19, 2026
@gnufede gnufede marked this pull request as ready for review May 19, 2026 13:10
@gnufede gnufede requested review from a team as code owners May 19, 2026 13:10
@gnufede gnufede requested review from brettlangdon and wconti27 May 19, 2026 13:10

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 06626a1ff0

Comment thread ddtrace/testing/internal/session_manager.py
…its known

_update_pr_merge_base() was called after the "all commits already in
backend, skip pack upload" early return, so PR sessions where
search_commits found every recent commit already known never populated
git.pull_request.base_branch_sha.

Move the merge-base update before the commits_not_in_backend == 0 check
so it runs on every path where unshallowing has already completed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@brettlangdon brettlangdon left a comment

release note lgtm

…ock TypeError

Unshallowing was moved before the commits_not_in_backend==0 early return,
which caused tests that mock all commits as known to hit
git.get_git_version() >= (2, 27, 0) with a Mock object and raise TypeError.

Gate the is_shallow_repository() / get_git_version() block on
len(commits_not_in_backend) > 0, restoring the original invariant that
unshallowing only runs when there are commits to upload, while keeping
_update_pr_merge_base() before the early return.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
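
The TypeError described in the commit message above can be reproduced with the standard library alone. The snippet below is an illustrative sketch, not a test from this PR: the `git` object is a bare `unittest.mock.Mock` standing in for the shared Git mock, and configuring return values is shown only as one way to make the expression well-defined.

```python
from unittest.mock import Mock

git = Mock()

# An unconfigured Mock returns another Mock from every method call. That
# return value is truthy, so `and` does not short-circuit, and the Mock
# returned by get_git_version() cannot be ordered against a tuple.
try:
    git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0)
    raised = False
except TypeError:
    raised = True

# With is_shallow_repository() returning False, the version check is
# never evaluated.
git.is_shallow_repository.return_value = False
short_circuited = git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0)

# With a real version tuple, the comparison is an ordinary tuple comparison.
git.is_shallow_repository.return_value = True
git.get_git_version.return_value = (2, 30, 1)
compared = git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0)
```
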

gnufede commented May 21, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 838253fc64

  return

- if git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0):
+ if len(commits_not_in_backend) > 0 and git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0):
[P1] Unshallow before computing merge-base on shallow repos

When commits_not_in_backend is empty, this new guard skips unshallowing even if the checkout is shallow, but _update_pr_merge_base() still runs later and calls git merge-base. In shallow CI clones that do not contain the base ancestry, get_merge_base() fails and now emits a warning, so git.pull_request.base_branch_sha is not populated in exactly the “all commits already known” path this change intended to support.

…nown

On shallow clones where all recent commits are already in the backend
(commits_not_in_backend == 0), unshallowing was skipped, so
_update_pr_merge_base() called git merge-base without the required
ancestry and emitted a spurious warning.

Remove the commits_not_in_backend > 0 guard from the outer shallow check
so the repo is always unshallowed first. Keep the re-check gated on
commits_not_in_backend > 0 since there is nothing to re-query when all
commits are already known.

Add is_shallow_repository.return_value = False to get_mock_git_instance()
so tests that use the shared Git mock don't enter the shallow block and
hit get_git_version() >= (2, 27, 0) on a Mock object (TypeError).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
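
The ordering that the last commit message describes (unshallow first, then merge-base, then the early return) can be sketched as below. This is a simplified model under stated assumptions, not the real SessionManager.upload_git_data(): `FakeGit`, `upload_git_data_sketch`, and `upload_packs` are hypothetical names, and the backend re-query is reduced to a comment.

```python
from typing import List, Tuple


class FakeGit:
    """Hypothetical stand-in for the plugin's Git helper; records call order."""

    def __init__(self, shallow: bool, version: Tuple[int, int, int] = (2, 40, 0)):
        self.shallow = shallow
        self.version = version
        self.calls: List[str] = []

    def is_shallow_repository(self) -> bool:
        return self.shallow

    def get_git_version(self) -> Tuple[int, int, int]:
        return self.version

    def unshallow_repository(self) -> None:
        self.calls.append("unshallow")
        self.shallow = False

    def get_merge_base(self) -> None:
        self.calls.append("merge-base")

    def upload_packs(self, commits: List[str]) -> None:
        self.calls.append("upload")


def upload_git_data_sketch(git: FakeGit, commits_not_in_backend: List[str]) -> None:
    # 1. Unshallow whenever the repo is shallow, even if every commit is known.
    if git.is_shallow_repository() and git.get_git_version() >= (2, 27, 0):
        git.unshallow_repository()
        if commits_not_in_backend:
            pass  # the real code re-queries the backend here; omitted in this sketch
    # 2. Merge-base runs after unshallowing and before the early return.
    git.get_merge_base()
    # 3. Nothing to upload: stop here, with the merge-base already computed.
    if not commits_not_in_backend:
        return
    git.upload_packs(commits_not_in_backend)
```

Recording the call order on the fake makes the invariant easy to assert: for a shallow repo with no unknown commits, the expected sequence is unshallow followed by merge-base, with no upload.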