-
Notifications
You must be signed in to change notification settings - Fork 2
Backward compatibility hack for Nix < 2.20 Git inputs #275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Taken from NixOS#13993. Co-authored-by: Josh Spence <[email protected]> Co-authored-by: Graham Dennis <[email protected]>
WalkthroughPropagates computed Input fingerprints earlier, introduces GitAccessorOptions (including applyFilters) through Git accessor APIs and state, applies optional git blob filtering during reads, updates fingerprint construction via a helper, and adds backward-compatibility tests for NAR hash behavior. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Fetch caller
participant Scheme as GitInputScheme
participant Repo as GitRepo / GitSourceAccessor
participant Libgit as libgit2
Caller->>Scheme: request accessor/read (input, rev, options)
Scheme->>Repo: getAccessor(rev, options, displayPrefix)
Repo->>Repo: construct GitSourceAccessor(state.oid <- rev, state.options <- options)
Scheme->>Scheme: compute fingerprint via makeFingerprint(input, rev)
Scheme->>Scheme: set Input.cachedFingerprint <- fingerprint
Caller->>Repo: readBlob(oid)
Repo->>Libgit: git_blob_lookup(oid)
Libgit-->>Repo: raw blob
alt state.options.applyFilters == true
Repo->>Libgit: git_blob_filter(raw, commit/attrs)
Libgit-->>Repo: filtered blob
Repo-->>Caller: filtered content
else
Repo-->>Caller: raw content
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/libfetchers/fetchers.cc (1)
327-333: RedundantcachedFingerprintassignment
Input::getFingerprint(store)already setscachedFingerprintinternally before returning, so assigningcachedFingerprint = accessor->fingerprint;is redundant and can be dropped for clarity.src/libfetchers/include/nix/fetchers/git-utils.hh (1)
91-96: NewapplyFiltersflag is reasonable; consider documenting intentExtending
GitRepo::getAccessorwithbool applyFilters = falsekeeps existing callers unchanged and matches the implementation inGitRepoImpl. Since this flag reintroduces Git blob filters (for the compat path), it would help future readers to document here that:
applyFilters = falseis the default, reproducible behavior, andapplyFilters = trueis only for legacy NAR‑hash compatibility.src/libfetchers/git-utils.cc (1)
748-769: GitSourceAccessor filtering path looks correct; consider tightening attr scope with verified flagThe new
applyFiltersplumbing inGitSourceAccessor:
- Stores the commit/tree OID and the
applyFiltersflag inState.- Uses
git_blob_filterwithattr_commit_id = state->oidandGIT_BLOB_FILTER_ATTRIBUTES_FROM_COMMITwhenapplyFiltersis true.- Leaves behavior unchanged when
applyFiltersis false, and continues to short‑circuit via git‑LFS smudging whenlfsFetchis active.That matches the intended "only when explicitly asked" behavior and is safe for the legacy EOL path.
To avoid host‑specific/global attributes affecting this special‑case compat behavior, consider adding the
GIT_FILTER_NO_SYSTEM_ATTRIBUTESflag (equivalent toGIT_ATTR_CHECK_NO_SYSTEM), which is available in libgit2:opts.flags = GIT_FILTER_ATTRIBUTES_FROM_COMMIT | GIT_FILTER_NO_SYSTEM_ATTRIBUTES;This prevents loading
/etc/gitattributesduring blob filtering, keeping the scope tight to commit‑provided attributes only.Also applies to: 795-817
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
src/libfetchers/fetchers.cc(1 hunks)src/libfetchers/git-utils.cc(4 hunks)src/libfetchers/git.cc(5 hunks)src/libfetchers/include/nix/fetchers/git-utils.hh(1 hunks)tests/functional/fetchGit.sh(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/libfetchers/include/nix/fetchers/git-utils.hh (2)
src/libfetchers/git-utils.cc (9)
rev(350-390)rev(350-350)rev(392-397)rev(392-392)rev(524-524)rev(555-555)rev(557-562)rev(632-703)rev(632-632)src/libutil/include/nix/util/source-accessor.hh (1)
displayPrefix(164-164)
src/libfetchers/git.cc (2)
src/libfetchers/include/nix/fetchers/fetchers.hh (11)
input(215-215)input(217-217)input(222-222)input(224-228)input(238-241)input(238-238)input(253-256)store(158-158)store(176-176)store(243-246)store(243-243)src/libfetchers/include/nix/fetchers/git-utils.hh (5)
rev(31-31)rev(33-33)rev(85-85)rev(91-96)rev(111-111)
src/libfetchers/git-utils.cc (1)
src/libfetchers/include/nix/fetchers/git-utils.hh (6)
rev(31-31)rev(33-33)rev(85-85)rev(91-96)rev(111-111)ref(38-38)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build_x86_64-linux / build
- GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (3)
src/libfetchers/git.cc (1)
641-645: Centralizing fingerprint construction looks goodFactoring fingerprint construction into
makeFingerprintand reusing it ingetFingerprint(including the;d=dirty-suffix case) keeps the fingerprint semantics consistent and easier to maintain. No correctness issues spotted.Also applies to: 977-999
tests/functional/fetchGit.sh (1)
314-338: CRLF / narHash compatibility tests are well‑scopedThe new tests correctly exercise:
- Legacy
narHashmatching only with filters applied (plus warning mentioning the new hash),- The updated hash matching the unfiltered content,
- A deliberately wrong
narHashyielding the 102 “NAR hash mismatch” failure.Quoting and expected contents (
"\r\n"vs"\n") look correct.src/libfetchers/git-utils.cc (1)
555-563: applyFilters propagation through GitRepoImpl accessorsThe extended signatures for:
getRawAccessor(const Hash & rev, bool smudgeLfs = false, bool applyFilters = false), andgetAccessor(const Hash & rev, bool exportIgnore, std::string displayPrefix, bool smudgeLfs = false, bool applyFilters = false)correctly pass
applyFiltersdown intoGitSourceAccessor. Existing callers (e.g.treeHashToNarHash,getSubmodules) continue to use the defaultfalse, while the new compat logic ingit.ccexplicitly setstrueonly where needed.Also applies to: 1339-1355
Before Nix 2.20, we used git, which applies Git filters (in particular doing end-of-line conversion based on .gitattributes). In 2.20, we switched to libgit2 and stopped applying filters, which is probably better for reproducibility. However, that breaks existing lock files / fetchTree calls for Git inputs that use those filters, since it invalidates the NAR hash. So as a backward compatibility hack, we now check the NAR hash computed over the Git tree without filtering applied. If there is a hash mismatch, we try again *with* filtering. If that succeeds, we print a warning and return the filtered tree.
428f45a to
04b1b1e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
src/libfetchers/git.cc (1)
793-818: Path format issue: UseCanonPath(".gitattributes")without leading slashLine 796 uses
CanonPath("/.gitattributes")with a leading slash, which is inconsistent with the established pattern in the codebase (e.g.,CanonPath(".gitmodules")in git-utils.cc:1381). This inconsistency may causepathExiststo return false even when.gitattributesexists, preventing the backward compatibility logic from triggering.Apply this diff:
- if (accessor->pathExists(CanonPath("/.gitattributes"))) { + if (accessor->pathExists(CanonPath(".gitattributes"))) {
🧹 Nitpick comments (1)
tests/functional/fetchGit.sh (1)
329-329: Clarify the purpose of_NIX_TEST_BARF_ON_UNCACHEABLEThe test sets
_NIX_TEST_BARF_ON_UNCACHEABLE=1without explanation. Please add a comment explaining why this environment variable is needed for this test.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/libfetchers/git.cc(5 hunks)tests/functional/fetchGit.sh(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build_aarch64-darwin / build
- GitHub Check: build_x86_64-linux / build
🔇 Additional comments (6)
src/libfetchers/git.cc (4)
19-19: LGTM!The include is necessary for the
fetchToStore2function used in the backward compatibility logic below.
641-645: LGTM!Good refactoring to extract fingerprint construction into a reusable helper method, eliminating duplication.
980-980: LGTM!Correctly updated to use the new
makeFingerprinthelper method.
996-997: LGTM!Correctly updated to use the new
makeFingerprinthelper method for the dirty workdir case.tests/functional/fetchGit.sh (2)
314-328: LGTM!The test setup correctly creates a scenario where Git filters affect content (LF vs CRLF), which is exactly what the backward compatibility logic handles.
331-340: Test scenarios effectively cover the backward compatibility logicThe three test cases properly validate:
- Pre-Nix 2.20 behavior (filters applied, CRLF content)
- Post-Nix 2.20 behavior (no filters, LF content)
- Error handling for mismatched hashes
Note: If the
CanonPathissue flagged in git.cc is not fixed, Test 1 would fail because the backward compatibility logic wouldn't trigger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/libfetchers/fetchers.cc (2)
327-347: Fingerprint caching in makeStoreAccessor is slightly redundant but harmless
getFingerprint(store)already populatescachedFingerprintinsideInput, so the explicitcachedFingerprint = accessor->fingerprint;write is effectively a no‑op refresh. It’s fine as is; if you want to trim dead writes, you could safely drop that line without changing behavior.
358-365: New fingerprint propagation between accessor and Input result is reasonable; minor enhancement possibleThe new logic that:
- prefers
accessor->getFingerprint(CanonPath::root).secondto fillresult.cachedFingerprint, and- otherwise computes via
result.getFingerprint(store)and assigns it toaccessor->fingerprintis a good way to keep accessors and
Input’s cached fingerprint in sync.If
getFingerprint(CanonPath::root)may compute the fingerprint on demand rather than reading a cached field, you might also setaccessor->fingerprintfrom that value to avoid recomputation later:- if (auto fp = accessor->getFingerprint(CanonPath::root).second) - result.cachedFingerprint = *fp; - else - accessor->fingerprint = result.getFingerprint(store); + if (auto fp = accessor->getFingerprint(CanonPath::root).second) { + accessor->fingerprint = *fp; + result.cachedFingerprint = *fp; + } else { + accessor->fingerprint = result.getFingerprint(store); + }Not required for correctness, but could reduce duplicate work depending on
SourceAccessorimplementations.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
src/libfetchers-tests/git-utils.cc(1 hunks)src/libfetchers/fetchers.cc(2 hunks)src/libfetchers/git-utils.cc(8 hunks)src/libfetchers/git.cc(7 hunks)src/libfetchers/github.cc(1 hunks)src/libfetchers/include/nix/fetchers/git-utils.hh(2 hunks)src/libfetchers/tarball.cc(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (5)
src/libfetchers/git.cc (2)
src/libfetchers/include/nix/fetchers/fetchers.hh (11)
input(215-215)input(217-217)input(222-222)input(224-228)input(238-241)input(238-238)input(253-256)store(158-158)store(176-176)store(243-246)store(243-243)src/libfetchers/git-utils.cc (13)
rev(350-390)rev(350-350)rev(392-397)rev(392-392)rev(524-524)rev(555-555)rev(558-558)rev(629-700)rev(629-629)settings(702-716)settings(702-702)makeFingerprint(741-744)makeFingerprint(741-741)
src/libfetchers/tarball.cc (1)
src/libfetchers/github.cc (16)
settings(35-111)settings(36-36)settings(127-135)settings(127-127)settings(182-204)settings(182-183)settings(206-213)settings(207-207)settings(215-228)settings(215-216)settings(236-236)settings(238-238)settings(246-316)settings(246-246)settings(318-339)settings(319-319)
src/libfetchers/include/nix/fetchers/git-utils.hh (2)
src/libfetchers/git-utils.cc (10)
rev(350-390)rev(350-350)rev(392-397)rev(392-392)rev(524-524)rev(555-555)rev(558-558)rev(629-700)rev(629-629)wd(561-561)src/libutil/include/nix/util/source-accessor.hh (1)
displayPrefix(164-164)
src/libfetchers/fetchers.cc (2)
src/libfetchers/github.cc (2)
store(350-356)store(350-350)src/libfetchers/include/nix/fetchers/fetchers.hh (4)
store(158-158)store(176-176)store(243-246)store(243-243)
src/libfetchers/git-utils.cc (3)
src/libfetchers/include/nix/fetchers/git-utils.hh (10)
rev(31-31)rev(40-40)rev(42-42)rev(94-94)rev(101-101)rev(116-116)wd(103-104)path(38-38)path(85-85)ref(47-47)src/libutil/include/nix/util/source-accessor.hh (1)
displayPrefix(164-164)src/libfetchers/git.cc (5)
getAccessor(959-980)getAccessor(960-960)makeFingerprint(641-648)makeFingerprint(641-641)path(399-405)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build_x86_64-linux / build
- GitHub Check: build_aarch64-darwin / build
🔇 Additional comments (5)
src/libfetchers/include/nix/fetchers/git-utils.hh (2)
25-32: GitAccessorOptions struct looks coherent and backwards‑compatible by defaultCentralizing
exportIgnore,smudgeLfs, andapplyFiltersintoGitAccessorOptionswith all‑false defaults preserves prior behavior at call sites that now pass{}. ThemakeFingerprintdeclaration here also makes sense as the natural place to incorporate these flags into the fingerprint.
100-104: Updated GitRepo::getAccessor signatures align with the new options typeSwitching both
getAccessoroverloads to takeconst GitAccessorOptions &is consistent with the new struct and keeps the API surface clean. As long as allGitRepoimplementors are updated accordingly (which compilation should catch), this change looks good.src/libfetchers-tests/git-utils.cc (1)
94-96: Test call site correctly adapted to GitAccessorOptionsUsing
{}here to constructGitAccessorOptionspreserves the old behavior (exportIgnore == falseand no filters) while matching the new API. No changes to test semantics.src/libfetchers/tarball.cc (1)
133-141: Tarball cache accessor call correctly migrated to options objectPassing
{}as theGitAccessorOptionswhen opening the tarball cache keeps the previous “no export‑ignore / no filters” behavior while conforming to the new signature. This is an appropriate default for tarball‑derived trees.src/libfetchers/github.cc (1)
323-331: Git forge tarball accessor updated consistently with tarball.ccThe switch to
getAccessor(tarballInfo.treeHash, {}, …)mirrors the tarball path and preserves prior behavior (no filters / export‑ignore) for cached archive trees. This keeps the GitArchive input scheme aligned with the tarball cache API.
| warn( | ||
| "Git input '%s' specifies a NAR hash '%s' that was created by Nix < 2.20.\n" | ||
| "Nix >= 2.20 does not apply Git filters and `export-ignore` by default, which changes the NAR hash.\n" | ||
| "Please update the NAR hash to '%s'.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: update this to recommend re-locking with a newer Nix version, and include a link to docs.determinate.systems on other approaches for extending backwards compatibility
|
Should we change the fetcher-cache version to avoid issues when using mixed Nix versions? |
90bb354 to
ec9252d
Compare
|
Closing this for #278. |
Motivation
Before Nix 2.20, we used the
gitCLI, which applies Git filters (in particular doing end-of-line conversion based on .gitattributes) and appliesexport-ignore. In 2.20, we switched tolibgit2and stopped doing those things, which is probably better for reproducibility. However, this breaks existing lock files /fetchTreecalls for Git inputs that rely on those features, since it invalidates the NAR hash.So as a backward compatibility hack, we now check the NAR hash computed over the Git tree without filters and
export-ignoreapplied. If there is a hash mismatch, we try again with filters andexport-ignore. If that succeeds, we print a warning and return the latter tree.Context
Summary by CodeRabbit
New Features
Bug Fixes
Tests
✏️ Tip: You can customize this high-level summary in your review settings.