-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: resolve Git Anonymous Resolver excessive memory usage #8677
base: main
Are you sure you want to change the base?
Conversation
Switch git resolver from go-git library to use git binary. The go-git library does not resolve deltas efficiently, and as a result is not recommended to be used to clone repositories with full-depth. In one example RemoteResolutionRequest targeting a repository which summed 145Mb, configuring the resolution timeout to 10 minutes and giving the resolver to have 25Gb of memory, the resolver pod was OOM killed after ~6 minutes. Additionally, since go-git's delta resolution does not accept any contexts, the time required and memory used during resolving a large repository will not be cutoff when the context is canceled, impacting the resolver's performance for other concurrent remote resolutions. Since the git resolver can be provided a revision which is not tracked at any ref in the repository, and because go-git only supports fetching fully-qualified refs, go-git does not support fetching arbitrary revisions. Therefore, in order to guarantee the requested revision is fetched, if we continue to use the go-git library we must fetch all revisions. Switching to the git binary enables the git resolver to take advantage of the git-fetch's support for fetching arbitrary revisions. Note that if the revision is not at any ref head, fetching the revision does depend on the git server enabling uploadpack.allowReachableSHA1InWant. Resolves tektoncd#8652 See also: https://git-scm.com/docs/protocol-capabilities#_allow_reachable_sha1_in_want NOTE: This feature is enabled and supported in Github and Gitlab but not Gitea: go-gitea/gitea#11958
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
_, err = repo.execGit(ctx, "clone", repo.url, tmpDir, "--depth=1", "--no-checkout") | ||
if err != nil { | ||
return nil, cleanupFunc, err | ||
} | ||
|
||
_, err = repo.execGit(ctx, "fetch", "origin", repo.revision, "--depth=1") | ||
if err != nil { | ||
return nil, cleanupFunc, err | ||
} | ||
|
||
_, err = repo.execGit(ctx, "checkout", "FETCH_HEAD") | ||
if err != nil { | ||
return nil, cleanupFunc, err | ||
} | ||
|
||
revisionSha, err := repo.execGit(ctx, "rev-list", "-n1", "HEAD") | ||
if err != nil { | ||
return nil, cleanupFunc, err | ||
} | ||
repo.revision = strings.TrimSpace(string(revisionSha)) | ||
|
||
return &repo, cleanupFunc, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I initially simply did a full git clone
and then simply did git checkout
. However doing a shallow clone with --no-checkout
, fetch
ing the revision, and then checking the revision out improved the time and space performance significantly. Using the same repository mentioned in the original issue as a benchmark, using a shallow clone reduced the clone time from ~110 seconds by a factor of ten. Using --no-checkout
similarly reduced the disk space and time slightly (before the necessary revision was checked-out)
The following is the coverage report on the affected files.
|
@@ -29,7 +29,7 @@ spec: | |||
description: Username to be used to login to the container registry | |||
default: "_json_key" | |||
- name: releaseAsLatest | |||
description: Whether to tag and publish this release as Pipelines' latest | |||
description: Whether to tag and publish this release as Pipelines latest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not a grammatical error, I just noticed it threw off the syntax highlighting of my editor and both versions seemed equally correct. I am happy to revert if needed.
@@ -55,7 +55,7 @@ function ko_resolve() { | |||
github.com/tektoncd/pipeline/cmd/nop: ghcr.io/tektoncd/pipeline/github.com/tektoncd/pipeline/combined-base-image:latest | |||
github.com/tektoncd/pipeline/cmd/workingdirinit: ghcr.io/tektoncd/pipeline/github.com/tektoncd/pipeline/combined-base-image:latest | |||
|
|||
github.com/tektoncd/pipeline/cmd/git-init: cgr.dev/chainguard/git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The git-init
package no longer exists.
/kind feat |
@aThorp96: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/kind feature |
Cc @vdemeester |
beta e2e tests are failing |
@waveywaves yeah, hoping to get to addressing the e2e test failures today or tomorrow. |
@aThorp96: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Switch git resolver from go-git library to use git binary.
The go-git library does not resolve deltas efficiently, and as a result is not recommended to be used to clone repositories with full-depth. In one example RemoteResolutionRequest targeting a repository which summed 145Mb, configuring the resolution timeout to 10 minutes and giving the resolver to have 25Gb of memory, the resolver pod was OOM killed after ~6 minutes. Additionally, since go-git's delta resolution does not accept any contexts, the time required and memory used during resolving a large repository will not be cutoff when the context is canceled, impacting the resolver's performance for other concurrent remote resolutions.
Since the git resolver can be provided a commit sha which is not tracked at any ref/head, and because go-git only supports fetching fully-qualified refs, go-git does not support fetching arbitrary revisions. Therefore, in order to guarantee the requested revision is fetched, if we continue to use the go-git library, we must fetch all revisions.
Switching to the git binary significantly improves the time and memory performance of the git resolver (by multiple orders of magnitude, in my testing). Using the git binary also enables the resolver to take advantage of shallow fetching/cloning due to the git-fetch's support for fetching arbitrary revisions, improving both the time and memory performance by another order of magnitude. Note that if the revision is not at any ref head or tag, fetching the revision does depend on the git server enabling uploadpack.allowReachableSHA1InWant. This feature is enabled and supported in Github and Gitlab but not Gitea: go-gitea/gitea#11958
See also: https://git-scm.com/docs/protocol-capabilities#_allow_reachable_sha1_in_want
Resolves #8652
Changes
Submitter Checklist
As the author of this PR, please check off the items in this checklist:
/kind <type>
. Valid types are bug, cleanup, design, documentation, feature, flake, misc, question, tepRelease Notes