Skip to content

feat(gl): sanctioned iCaptcha client flow + secure git lifecycle#138

Merged
kevincodex1 merged 2 commits into
mainfrom
feat/icaptcha-client
Jul 5, 2026
Merged

feat(gl): sanctioned iCaptcha client flow + secure git lifecycle#138
kevincodex1 merged 2 commits into
mainfrom
feat/icaptcha-client

Conversation

@kevincodex1

@kevincodex1 kevincodex1 commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Builds the client side of the iCaptcha gate so a legitimate agent completes the full git lifecycle (register → repo create → push → clone) transparently — no manual headers, no GITLAWB_ICAPTCHA_PROOF env hack. Server enforcement is unchanged.

Motivation & context

The node enforces iCaptcha on writes (create/fork/register), but gl/git-remote-gitlawb had no native support — an agent testing the lifecycle hit a chain of failures (silent version drift → 401, then 403 iCaptcha proof required, then sub mismatch, plus misleading repo info stubs). This implements the sanctioned flow: solve the challenge and retry automatically.
Closes #

Kind of change

  • Bug fix
  • Feature
  • Security fix
  • Docs
  • Tests / CI
  • Refactor (no behavior change)
  • Breaking or protocol change (issue required first)

Summary

Builds the client side of the iCaptcha gate so a legitimate agent completes the full git lifecycle (register → repo create → push → clone) transparently — no manual headers, no GITLAWB_ICAPTCHA_PROOF env hack. Server enforcement is unchanged.

Motivation & context

The node enforces iCaptcha on writes (create/fork/register), but gl/git-remote-gitlawb had no native support — an agent testing the lifecycle hit a chain of failures (silent version drift → 401, then 403 iCaptcha proof required, then sub mismatch, plus misleading repo info stubs). This implements the sanctioned flow: solve the challenge and retry automatically.
Closes #

Kind of change

  • Bug fix
  • Feature
  • Security fix
  • Docs
  • Tests / CI
  • Refactor (no behavior change)
  • Breaking or protocol change (issue required first)

What changed

  • New crate icaptcha-client — deterministic solvers (arithmetic/algebra/sequence) + obtain_proof() challenge→solve→answer loop; requesterId = caller DID (so proof sub matches the signer).
  • gl (crates/gl)http.rs: unified signed writes that, on a 403 iCaptcha challenge (detected via new x-icaptcha-url/x-icaptcha-level headers), solve and retry with x-icaptcha-proof (bounded retry for the ~5 min TTL); removes the env hack; actionable hint on 401 not_an_agent. repo.rs: repo info/clone return a real 404 instead of a stub with a placeholder owner DID. doctor.rs: gl-vs-node version drift + iCaptcha reachability checks.
  • git-remote-gitlawb — same shared client; push stays RFC-9421 signed, with a 403-iCaptcha solve/retry safety net; clone/fetch unsigned; missing-repo 404 kept.
  • gitlawb-node — the 403 icaptcha_proof_required response now advertises the service URL + required level as JSON fields and x-icaptcha-url/x-icaptcha-level headers (mirrors the existing human_detected pattern). The sub == authenticated-DID, expiry, and level checks are unchanged.
  • README — full lifecycle docs (requesterId == DID rule, proof TTL).

How a reviewer can verify

cargo fmt --all --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace            # needs Postgres (DATABASE_URL); node still rejects missing/expired/wrong-sub proofs
cargo test -p icaptcha-client     # deterministic solvers, no DB
# behavioural: repo-not-found is a clean 404, not a stub
cargo run -p gl -- repo info zNoSuchOwner/definitely-not-real --node https://node.gitlawb.com

Before you request review

- [x] Scope is one logical change; no unrelated churn
- [x] cargo test --workspace passes locally (against Postgres)
- [x] New behavior is covered by tests (solvers; node proof-rejection tests retained)
- [x] cargo fmt --all and cargo clippy --workspace --all-targets -- -D warnings are clean
- [x] Commit titles use Conventional Commits
- [x] Docs / .env.example updated if behavior or config changed (README lifecycle)
- [x] Checked existing PRs so this isn't a duplicate

Protocol & signing impact

- [x] Touches DID / did:key, Ed25519 / RFC 9421 signatures, UCAN, ref certs, or P2P wire formats
- [ ] Discussed in an issue before implementation
- [x] Backward-compatible with existing nodes and previously signed history

The only wire additions are additive, backward-compatible: the x-icaptcha-url/x-icaptcha-level response headers + JSON fields on an existing 403, and the client echoing the existing x-icaptcha-proof header. No change to signing coverage or proof verification; old clients/nodes interoperate unchanged.

Notes for reviewers

- Push is intentionally signed-only (owner signature is the gate) — not per-push iCaptcha-gated; auto-create-on-push was deferred.
- The proof stays an unsigned header for now; binding it into the RFC-9421 signature is noted as future hardening.
- Depends on the node advertising the x-icaptcha-* headers (included here) — deploy node + clients together.

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * Added support for iCaptcha-protected write actions, with automatic challenge handling and retry.
  * Expanded `gl doctor` to check iCaptcha connectivity and surface setup warnings.

* **Bug Fixes**
  * Improved repo cloning and info errors with clearer, more specific messages.
  * Tightened handling of proof reuse and expired challenges during writes.

* **Documentation**
  * Added guidance for using iCaptcha-enabled nodes, including signature requirements and optional setup steps.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

@github-actions github-actions Bot added the needs-issue PR has no linked issue label Jul 1, 2026
@github-actions

github-actions Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Thanks for the contribution. A couple of things will help us review this faster:

  • Link the issue this addresses (Closes #123). For protocol changes, open an issue first.

See CONTRIBUTING.md. Update the PR and these notes will clear automatically.

@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5e1ed1a5-833a-4978-b909-b3cc1d8487ee

📥 Commits

Reviewing files that changed from the base of the PR and between 4aba938 and 83f3c74.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • Cargo.toml
  • README.md
  • crates/gitlawb-node/src/error.rs
  • crates/gitlawb-node/src/icaptcha.rs
  • crates/gl/Cargo.toml
  • crates/gl/src/doctor.rs
  • crates/gl/src/http.rs
  • crates/gl/src/repo.rs
  • crates/icaptcha-client/Cargo.toml
  • crates/icaptcha-client/src/lib.rs
  • crates/icaptcha-client/src/solvers.rs
✅ Files skipped from review due to trivial changes (3)
  • crates/gl/Cargo.toml
  • crates/icaptcha-client/Cargo.toml
  • README.md
🚧 Files skipped from review as they are similar to previous changes (8)
  • crates/gitlawb-node/src/error.rs
  • Cargo.toml
  • crates/gl/src/http.rs
  • crates/gl/src/doctor.rs
  • crates/gl/src/repo.rs
  • crates/gitlawb-node/src/icaptcha.rs
  • crates/icaptcha-client/src/solvers.rs
  • crates/icaptcha-client/src/lib.rs

📝 Walkthrough

Walkthrough

Adds a new icaptcha-client crate implementing deterministic challenge solvers and a blocking proof-obtaining flow, integrates it into gl's NodeClient for automatic challenge/retry on signed writes, restructures the node's IcaptchaProofRequired error into a structured 403 response, updates gl doctor/repo commands, and documents the flow in README.

Changes

iCaptcha proof flow

Layer / File(s) Summary
icaptcha-client crate core: config, types, proof flow
crates/icaptcha-client/Cargo.toml, crates/icaptcha-client/src/lib.rs
New crate exposing IcaptchaCfg, Challenge, Solver, and obtain_proof, which requests a challenge, tries local/callback solvers then interactive prompt, submits answers, and handles Passed/Continue/Failed outcomes; includes URL trust allowlisting, API-key gating, and error-body sanitization.
Deterministic local solvers
crates/icaptcha-client/src/solvers.rs
Adds arithmetic, algebra, and sequence solvers plus an exact-integer-square-root helper and unit tests, dispatched via solve(challenge_type, prompt).
Server-side structured proof-required error
crates/gitlawb-node/src/error.rs, crates/gitlawb-node/src/icaptcha.rs
Converts AppError::IcaptchaProofRequired to a struct with message/url/level, returns 403 with x-icaptcha-* headers/JSON body, and updates ProofGuard::consume/reject_error to populate it via a new url_and_level() helper.
gl NodeClient signed write retry
crates/gl/Cargo.toml, crates/gl/src/http.rs
Adds icaptcha-client dependency and reworks post/put/delete to route through send_signed/send_once, retrying signed requests with an x-icaptcha-proof header on 403 challenges (bounded by MAX_ICAPTCHA_RETRIES) and hinting on 401 human_detected responses.
gl doctor and repo command updates
crates/gl/src/doctor.rs, crates/gl/src/repo.rs
Adds a gl doctor iCaptcha connectivity check and CLI-version warning, and updates resolve_owner_did/cmd_clone/cmd_info to require local identity and surface explicit HTTP error statuses.
Workspace wiring and docs
Cargo.toml, README.md
Registers icaptcha-client as a workspace member and documents the signature + iCaptcha proof requirements for public-node writes.

Estimated code review effort: 4 (Complex) | ~60 minutes

Sequence Diagram(s)

sequenceDiagram
  participant gl as gl NodeClient
  participant Node
  participant IcaptchaClient

  gl->>Node: signed write request (send_once)
  Node-->>gl: 403 Forbidden + x-icaptcha-url/level
  gl->>IcaptchaClient: obtain_proof(cfg)
  IcaptchaClient->>Node: request_challenge
  Node-->>IcaptchaClient: Challenge
  IcaptchaClient->>IcaptchaClient: solvers::solve / interactive_prompt
  IcaptchaClient->>Node: submit_answer
  Node-->>IcaptchaClient: Passed (proof token)
  IcaptchaClient-->>gl: proof token
  gl->>Node: retry signed request with x-icaptcha-proof
  Node-->>gl: 200 OK
Loading

Possibly related PRs

  • Gitlawb/node#108: Introduces the original IcaptchaProofRequired error and ProofGuard::consume replay handling that this PR restructures into a struct-based variant.
  • Gitlawb/node#125: Modifies the same ProofGuard::consume/verification flow in crates/gitlawb-node/src/icaptcha.rs at overlapping code points.

Suggested labels: sev:medium, kind:security, subsystem:api

Suggested reviewers: jatmn, beardthelion

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding a sanctioned iCaptcha client flow for git operations.
Description check ✅ Passed The description follows the template well with summary, motivation, change type, implementation details, verification, and review notes.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/icaptcha-client

Comment @coderabbitai help to get the list of available commands.

@beardthelion beardthelion added crate:git-remote git-remote-gitlawb — the git remote helper crate:gl gl — the contributor CLI crate:node gitlawb-node — the serving node and REST API kind:feature New capability or surface labels Jul 1, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (1)
crates/git-remote-gitlawb/src/main.rs (1)

309-309: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Avoid hard-coding the helper version in the User-Agent.

Line 309 now advertises 0.4.0, while the refs request in this function still uses another hard-coded value. Use env!("CARGO_PKG_VERSION") via a shared constant so version-drift checks don’t go stale.

♻️ Proposed refactor
+const USER_AGENT: &str = concat!("git/2.0 git-remote-gitlawb/", env!("CARGO_PKG_VERSION"));
+
 ...
-            .header("User-Agent", "git/2.0 git-remote-gitlawb/0.4.0");
+            .header("User-Agent", USER_AGENT);

Also reuse USER_AGENT for the /info/refs request in the same function.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/git-remote-gitlawb/src/main.rs` at line 309, The User-Agent string is
hard-coded with a stale helper version, and the same version value is duplicated
for the /info/refs request. Update the `main` function to use a shared
`USER_AGENT` constant based on `env!("CARGO_PKG_VERSION")`, and reuse that
constant for both the Git HTTP request and the refs request so the version stays
consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/git-remote-gitlawb/src/main.rs`:
- Around line 340-342: The remote helper path in main should not call
icaptcha_client::obtain_proof with a fallback that can reach interactive_prompt,
since this code runs over the Git protocol stream. Update the proof acquisition
flow in the icaptcha challenge handling block to use a non-interactive proof
source only, and if no proof can be obtained, return an actionable error instead
of prompting on stdin.
- Around line 329-333: The push request in main currently clones the entire
request_body before sending, which adds unnecessary peak-memory overhead; update
the send path to use a shared body representation instead of duplicating the
pack. Refactor the request_body handling in main around req.body(...).send() to
use a cloneable shared type such as bytes::Bytes so retries still work without
copying the full payload.

In `@crates/gl/src/doctor.rs`:
- Around line 207-231: The iCaptcha doctor check is using a different endpoint
source than the signed-write retry flow, so the warning points users at the
wrong fix. Update the iCaptcha probe in doctor.rs (the
`NodeClient::new(...).get("/v1/pubkey")` check) to align with the retry path
that uses the `x-icaptcha-url` header and `IcaptchaCfg::new`, and adjust the
warning text to reference that header-driven URL instead of
`GITLAWB_ICAPTCHA_URL`.

In `@crates/gl/src/repo.rs`:
- Around line 304-329: The bare-name branch in cmd_clone is building the repo
URL from kp.did().to_string(), which produces a full DID instead of the short
owner segment used elsewhere. Update cmd_clone to reuse resolve_owner_did (or
the same keypair-to-owner logic used by cmd_info and resolve_owner_repo_pair) so
the constructed gitlawb:// URL uses the short form consistently. Keep the
existing explicit owner/name path unchanged, and ensure bare-name cloning
matches the format expected by git-remote-gitlawb and the related tests.

In `@README.md`:
- Around line 217-219: Clarify the `git push` wording in README.md so it does
not imply an absolute “no per-push challenge” guarantee. Update the sentence in
the access-policy section near the `create / fork / register` and `git push`
description to state that pushes are signed-only by default while the helper may
still retry on 403 iCaptcha responses, using the same policy terms already used
elsewhere in the document.

---

Nitpick comments:
In `@crates/git-remote-gitlawb/src/main.rs`:
- Line 309: The User-Agent string is hard-coded with a stale helper version, and
the same version value is duplicated for the /info/refs request. Update the
`main` function to use a shared `USER_AGENT` constant based on
`env!("CARGO_PKG_VERSION")`, and reuse that constant for both the Git HTTP
request and the refs request so the version stays consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4ee1649b-ba1c-4095-a96e-671cf41600d0

📥 Commits

Reviewing files that changed from the base of the PR and between 8d96cab and 4aba938.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • Cargo.toml
  • README.md
  • crates/git-remote-gitlawb/Cargo.toml
  • crates/git-remote-gitlawb/src/main.rs
  • crates/gitlawb-node/src/error.rs
  • crates/gitlawb-node/src/icaptcha.rs
  • crates/gl/Cargo.toml
  • crates/gl/src/doctor.rs
  • crates/gl/src/http.rs
  • crates/gl/src/repo.rs
  • crates/icaptcha-client/Cargo.toml
  • crates/icaptcha-client/src/lib.rs
  • crates/icaptcha-client/src/solvers.rs

Comment thread crates/git-remote-gitlawb/src/main.rs Outdated
Comment on lines +329 to +333
// Clone the pack so the body survives a possible iCaptcha retry. (Push
// is signed-only today, so the retry path is essentially never taken.)
let resp = req
.body(request_body.clone())
.send()

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 Performance & Scalability | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
rg -n 'reqwest\s*=|bytes\s*=' Cargo.toml crates -g 'Cargo.toml'

Repository: Gitlawb/node

Length of output: 586


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== Cargo.toml excerpts =="
sed -n '1,120p' Cargo.toml

echo
echo "== git-remote-gitlawb main.rs around request body =="
sed -n '280,380p' crates/git-remote-gitlawb/src/main.rs

echo
echo "== sign_request definition =="
rg -n 'fn sign_request|sign_request\(' crates/git-remote-gitlawb/src/main.rs

Repository: Gitlawb/node

Length of output: 6122


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "== reqwest body API docs from local cargo registry if available =="
python3 - <<'PY'
import os, glob, textwrap, pathlib, sys
home = pathlib.Path.home()
candidates = list(home.glob('.cargo/registry/src/*/reqwest-0.12.*'))
print("\n".join(str(p) for p in candidates[:20]))
PY

echo
echo "== search local registry for Bytes body support =="
rg -n 'impl From<Bytes>|From<Bytes>|body<B: Into<Body>>|struct Body' ~/.cargo/registry/src -g '*.rs' | head -n 80

Repository: Gitlawb/node

Length of output: 345


Avoid cloning the full pack body here.
request_body.clone() duplicates the entire pack before every send, so large pushes pay the extra peak-memory cost even on the first attempt. A shared body type like bytes::Bytes would avoid the copy and still allow retries.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/git-remote-gitlawb/src/main.rs` around lines 329 - 333, The push
request in main currently clones the entire request_body before sending, which
adds unnecessary peak-memory overhead; update the send path to use a shared body
representation instead of duplicating the pack. Refactor the request_body
handling in main around req.body(...).send() to use a cloneable shared type such
as bytes::Bytes so retries still work without copying the full payload.

Comment thread crates/git-remote-gitlawb/src/main.rs Outdated
Comment thread crates/gl/src/doctor.rs
Comment on lines +207 to +231
// ── 4b. iCaptcha capability ───────────────────────────────────────────
// Gated writes (repo create / register / fork) auto-solve a challenge at the
// iCaptcha service; check it's reachable so the failure mode is obvious.
let icaptcha_url = std::env::var("GITLAWB_ICAPTCHA_URL")
.unwrap_or_else(|_| icaptcha_client::DEFAULT_URL.to_string());
match NodeClient::new(&icaptcha_url, None).get("/v1/pubkey").await {
Ok(resp) if resp.status().is_success() => {
checks.push(Check::pass("iCaptcha", format!("{icaptcha_url} reachable")));
}
Ok(resp) => {
checks.push(Check::warn(
"iCaptcha",
format!("{icaptcha_url} returned HTTP {}", resp.status()),
"gated writes (repo create / register) may fail until iCaptcha is reachable",
));
}
Err(e) => {
checks.push(Check::warn(
"iCaptcha",
format!("{icaptcha_url} unreachable: {e}"),
"set GITLAWB_ICAPTCHA_URL or check connectivity — repo create / register solve a challenge there",
));
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check whether GITLAWB_ICAPTCHA_URL is read anywhere besides doctor.rs
rg -n 'GITLAWB_ICAPTCHA_URL' --type=rust

Repository: Gitlawb/node

Length of output: 150


🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf '== files ==\n'
git ls-files | rg '(^crates/gl/src/(doctor|http)\.rs$|icaptcha|git-remote-gitlawb|remote-gitlawb|x-icaptcha-url|IcaptchaCfg)'

printf '\n== icaptcha symbols ==\n'
rg -n 'IcaptchaCfg|x-icaptcha-url|ICAPTCHA|GITLAWB_ICAPTCHA_API_KEY|GITLAWB_ICAPTCHA_URL' crates gl . --type=rust

Repository: Gitlawb/node

Length of output: 9784


Use the same iCaptcha endpoint as the retry path. The doctor check reads GITLAWB_ICAPTCHA_URL, but the signed-write retry flow derives iCaptcha from the x-icaptcha-url response header and IcaptchaCfg::new only reads GITLAWB_ICAPTCHA_API_KEY. Telling users to set this env var won’t fix gated-write failures; point the warning at the header-driven URL instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/gl/src/doctor.rs` around lines 207 - 231, The iCaptcha doctor check is
using a different endpoint source than the signed-write retry flow, so the
warning points users at the wrong fix. Update the iCaptcha probe in doctor.rs
(the `NodeClient::new(...).get("/v1/pubkey")` check) to align with the retry
path that uses the `x-icaptcha-url` header and `IcaptchaCfg::new`, and adjust
the warning text to reference that header-driven URL instead of
`GITLAWB_ICAPTCHA_URL`.

Comment thread crates/gl/src/repo.rs
Comment thread README.md

@jatmn jatmn left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found issues that need to be addressed before this is ready.

Findings

  • [P2] Complete CodeRabbit's request to keep the remote helper non-interactive
    crates/git-remote-gitlawb/src/main.rs:342
    CodeRabbit's thread is still unresolved, and the current push retry path calls icaptcha_client::obtain_proof(&cfg, None). Passing None lets the client fall back to interactive_prompt, which reads from process stdin. In the remote helper, stdin is the Git remote-helper protocol stream, so an unexpected challenge type can block waiting for input or consume protocol bytes instead of returning a clean Git error. Please complete that review request by using a non-interactive proof path in git-remote-gitlawb and surfacing an actionable failure when the proof cannot be solved automatically.

  • [P2] Complete CodeRabbit's request to preserve the short owner form for bare clone names
    crates/gl/src/repo.rs:311
    CodeRabbit's bare-name clone comment is still valid. cmd_info and resolve_owner_repo_pair resolve owner-less repo names through resolve_owner_did, which returns the short key segment, but cmd_clone now uses kp.did().to_string() directly. That makes gl repo clone myrepo build gitlawb://did:key:z.../myrepo while the adjacent commands and helper path shape use gitlawb://z.../myrepo. Please complete the review request by reusing resolve_owner_did or the same short-owner derivation for the bare-name branch.

  • [P2] Complete CodeRabbit's request to avoid copying the full push pack
    crates/git-remote-gitlawb/src/main.rs:332
    The retry loop clones request_body before every send, so even the normal first push attempt duplicates the entire pack in memory. Large pushes are exactly where this helper needs predictable memory use, and the previous code avoided that copy by moving the body once. Please complete CodeRabbit's request by using a shared cloneable body representation, such as bytes::Bytes, so retries can reuse the body without duplicating the pack bytes.

  • [P3] Complete CodeRabbit's request to align gl doctor with header-driven iCaptcha discovery
    crates/gl/src/doctor.rs:210
    The write retry path derives the iCaptcha endpoint from the node's x-icaptcha-url header and IcaptchaCfg::new; the only env var that path reads is GITLAWB_ICAPTCHA_API_KEY. gl doctor instead probes GITLAWB_ICAPTCHA_URL or the default URL and tells users to set that variable, which will not fix a gated-write failure against a node advertising a different iCaptcha service. Please complete the review request by checking or describing the same header-driven endpoint the retry path actually uses.

  • [P3] Complete CodeRabbit's request to fix the push/iCaptcha documentation drift
    README.md:217
    The README says git push has "no per-push challenge", but this PR adds a git-remote-gitlawb 403 challenge/retry path for push. Even if push is signed-only by policy today, the absolute wording contradicts the implementation and will mislead users debugging a node that does return a challenge. Please complete the review request by documenting that pushes are signed by default while the helper can retry if the node returns an iCaptcha challenge.

  • [P3] Link the protocol discussion required by the contribution guide
    PR description
    The PR body marks DID/signing/wire-format impact and notes new x-icaptcha-* response headers plus client x-icaptcha-proof behavior, but the issue link is still Closes # and the protocol-change checklist item for prior discussion is unchecked. CONTRIBUTING.md asks contributors to open an issue first for protocol-level changes. Please link the issue or discussion that covers this wire change before this lands, or split/adjust the scope if the protocol direction has not been agreed yet.

@beardthelion beardthelion left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core flow is right: structured iCaptcha discovery mirroring human_detected, sub-bound proofs, node gate untouched, CI green. Two security issues in the new icaptcha-client crate that the earlier reviews did not reach, plus a consolidation of the open push-path threads.

Findings

  • [P1] Sanitize and length-bound the iCaptcha response before it reaches the terminal
    crates/icaptcha-client/src/lib.rs:133
    On a non-2xx from the iCaptcha service, resp.text().unwrap_or_default() (and AnswerResult::Failed { reason } at :113) goes raw into bail!, which prints through main() -> Result in both gl and git-remote-gitlawb. The service URL is the node's x-icaptcha-url, and nodes are untrusted here (a MITM on the default http transport qualifies), so these are attacker-influenceable bytes reaching the TTY with control chars intact. That is the CWE-150 terminal-escape class #137 fixed in http_error_message, reintroduced in the new crate. Driving a hostile endpoint through obtain_proof confirmed it: raw ESC bytes land in the error and a 50KB body is echoed unbounded. Strip C0/C1 controls and cap the excerpt (same shape as #137), and bound the read.

  • [P2] Do not hand credentials or trust to a node-chosen iCaptcha URL
    crates/icaptcha-client/src/lib.rs:126
    x-icaptcha-url becomes the base URL with no scheme or host check, then request_challenge/submit_answer POST the caller DID and, when GITLAWB_ICAPTCHA_API_KEY is set, that key as a bearer token to it on the first hop. In the repro a hostile node captured Authorization: Bearer <key> directly, no redirect involved (reqwest already strips Authorization on cross-host redirects, so that path is closed). It is also a client-side SSRF primitive. Require https and pin or allowlist the host, and do not attach the API key to a node-discovered origin. With a key configured this is effectively P1 exfiltration.

  • [P2] Remove the dead iCaptcha retry loop from the push path
    crates/git-remote-gitlawb/src/main.rs:319
    This is the root cause under the two open main.rs threads (clone-the-pack and interactive-prompt). git_receive_pack (crates/gitlawb-node/src/api/repos.rs:789) never calls verify_request; only create/fork/register do. So a push 403 is always a plain Forbidden with no x-icaptcha-* headers, and the loop bails on the first pass without ever solving (icaptcha_cfg_from_headers returns None on a headerless 403). The loop is dead, yet it reintroduces request_body.clone() on every push and carries the latent interactive_prompt stdin read. Deleting it and restoring the moved body resolves both threads at once and keeps the helper non-interactive, rather than adding a bytes::Bytes body plus a non-interactive solver for a path that cannot fire.

  • [P3] Concur on the two open doctor/README threads
    crates/gl/src/doctor.rs:207, README.md:219
    gl doctor probes GITLAWB_ICAPTCHA_URL, which nothing in the solve path reads (IcaptchaCfg::new reads only GITLAWB_ICAPTCHA_API_KEY), and it hits /v1/pubkey while the client uses /v1/challenge and /v1/answer. The README "no per-push challenge" line becomes accurate again once the dead loop above is removed. Both already raised by CodeRabbit and jatmn; the only add is that removing the loop moots the README wording.

On the DID-format thread (CodeRabbit repo.rs:329, seconded as jatmn's J2): I ran this end to end and it is not a break. git clone gitlawb://did:key:z6Mk.../repo (full DID) and the short form both dispatch to the helper and produce a byte-identical node request (GET /z6Mk.../repo/info/refs... in both), because parse_gitlawb_url reduces either authority to the same short owner before any node contact. Full-DID is also what gl repo create (repo.rs:261) and gl repo info (repo.rs:362) emit and what the README documents, so bare-name clone matches them; the short form is used only for REST API paths, never a gitlawb:// URL. Reusing resolve_owner_did here is harmless but would make the emitted clone URL differ from what create prints, so I would leave it. jatmn's protocol-discussion-link ask (J6) still stands on its own.

kevincodex1 and others added 2 commits July 5, 2026 18:55
Builds the client side of the iCaptcha gate so a legitimate agent passes the
challenge transparently, with no manual headers and no GITLAWB_ICAPTCHA_PROOF
env var. Server enforcement is unchanged.

New crate `icaptcha-client` (blocking, shared by gl + the git helper):
- deterministic solvers for the computational challenge types (arithmetic,
  algebra, sequence) matching the iCaptcha generators; anagram/logic/LLM fall
  back to a solver hook / interactive prompt.
- obtain_proof(): POST /v1/challenge (requesterId = caller DID), solve, POST
  /v1/answer, handle escalation, return the signed proof. Optional bearer auth
  via GITLAWB_ICAPTCHA_API_KEY.

gl (crates/gl):
- http.rs: unify post/put/delete through one signed sender that, on a 403
  iCaptcha challenge (detected via x-icaptcha-url / x-icaptcha-level headers),
  solves it and retries the same signed request with the x-icaptcha-proof
  header (bounded retry absorbs the ~5 min proof TTL). Removes the env hack.
  Emits an actionable hint on 401 not_an_agent (old-CLI / unregistered).
- repo.rs: `repo info` checks status and returns a real 404 instead of a stub
  card with `?` fields; `repo info`/`clone` resolve the owner from the local
  identity or an explicit owner/name — never the node's own DID.
- doctor.rs: warn on gl-vs-node version drift; add an iCaptcha reachability
  check.

git-remote-gitlawb: same shared client — push stays signed (RFC 9421); on a 403
iCaptcha (safety net; push is signed-only, not gated) it solves and retries.
Clone/fetch unsigned; missing repo keeps its clear 404.

gitlawb-node: the 403 icaptcha_proof_required response now advertises the
service url + required level as JSON fields and x-icaptcha-url / x-icaptcha-level
headers (mirroring the human_detected pattern) so clients discover them instead
of scraping the message. The sub == authenticated-DID and expiry/level checks
are unchanged — enforcement is not weakened.

README: documents the full lifecycle (identity -> register -> repo create with
auto-iCaptcha -> push -> clone), the requesterId == DID rule, and proof TTL.

Tested: solver unit tests; full workspace green against Postgres (node still
rejects missing/expired/wrong-subject proofs).
Rebased onto main (resolving the git-remote-gitlawb/main.rs conflict from #119)
and addressed the review findings:

- [beardthelion P2 / jatmn P2 / CodeRabbit] Removed the dead iCaptcha retry loop
  from the push path. git_receive_pack never calls verify_request (only
  create/fork/register are iCaptcha-gated), so a push 403 never carries
  x-icaptcha-* headers and the loop could never solve. Taking main's push path
  (via #119) deletes it, which also resolves the pack-clone (retries no longer
  copy the whole pack) and the interactive-prompt-over-git-stream threads, and
  restores the sanitized info/refs error path. git-remote-gitlawb no longer
  depends on icaptcha-client.

- [beardthelion P1] Sanitize and length-bound iCaptcha error bodies before they
  reach the terminal. The origin is only as trusted as the node that advertised
  it, so its non-2xx bodies (and Failed{reason}) are attacker-influenceable; they
  went raw into bail! (CWE-150, the #137 class). Now bounded-read + C0/C1-stripped
  + capped, same shape as #137.

- [beardthelion P2] Do not trust or hand credentials to a node-chosen iCaptcha
  URL. x-icaptcha-url was used as the base with no scheme/host check and the API
  key was sent to it on the first hop (SSRF + key exfiltration). resolve_solver_url
  now honors an advertised URL only when https + host-allowlisted (public default
  or the operator's GITLAWB_ICAPTCHA_URL), else falls back to the trusted origin;
  the API key is attached only to the operator's own configured origin.

- [jatmn P2 / CodeRabbit] gl repo clone with a bare name now derives the short
  owner key via resolve_owner_did, matching the other commands' gitlawb://z.../name
  URL shape instead of the colon-bearing full DID.

Skipped (with reason): jatmn P3 (doctor GITLAWB_ICAPTCHA_URL) is resolved by the
P2 change — that env var is now read by the write path, so the doctor's probe is
accurate. CodeRabbit User-Agent-version nit lived in the push code reverted to
main. README push wording kept ("signed-only, no per-push challenge") — now
accurate since the retry path is gone; the API-key bullet was updated for the new
trust model.

Tests: added sanitize + resolve_solver_url unit tests; full workspace green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@kevincodex1 kevincodex1 force-pushed the feat/icaptcha-client branch from 4aba938 to 83f3c74 Compare July 5, 2026 11:04
@kevincodex1

Copy link
Copy Markdown
Contributor Author

@coderabbitai please do a full review. Approve if its all good

@coderabbitai

coderabbitai Bot commented Jul 5, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail
gh pr view 138 --repo Gitlawb/node --json title,body,additions,deletions,changedFiles,files -q '.'
echo "---DIFF---"
gh pr diff 138 --repo Gitlawb/node > /tmp/pr138.diff
wc -l /tmp/pr138.diff

Length of output: 7406


🏁 Script executed:

#!/bin/bash
sed -n '1,1278p' /tmp/pr138.diff

Length of output: 49843


🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '1,60p' crates/gitlawb-node/src/error.rs

Length of output: 1847


🏁 Script executed:

#!/bin/bash
set -euo pipefail
sed -n '195,225p' crates/gl/src/repo.rs

Length of output: 1452


Review summary

I went through the full diff (all 12 changed files) rather than relying solely on the AI summary. Overall this is a solid, well-tested implementation:

Strengths

  • icaptcha-client: URL/key-trust logic (resolve_solver_url) correctly allowlists only the operator-configured or default public host, and only forwards the GITLAWB_ICAPTCHA_API_KEY to that trusted origin — a hostile node can't redirect the solve or exfiltrate the key. Good test coverage for these cases.
  • Error-body handling (error_excerpt/sanitize_excerpt) properly bounds and strips control characters from attacker-influenceable text before it reaches the terminal.
  • solvers.rs deterministic solvers (arithmetic/algebra/sequence) are bounded (MAX_ROUNDS, division-by-zero and non-integer checks in solve_algebra) and covered by tests including negative solutions and multiple sequence patterns.
  • gl/src/http.rs's send_signed/send_once refactor cleanly unifies POST/PUT/DELETE, bounds iCaptcha retries via MAX_ICAPTCHA_RETRIES, and correctly ignores 403s that don't carry x-icaptcha-* headers (so it doesn't swallow genuine forbidden responses).
  • crates/gitlawb-node/src/error.rs/icaptcha.rs: the struct-variant refactor and header-attaching IntoResponse change compiles cleanly (imports for Json/json! already present), and the unreachable!() arm is fine since the variant is always intercepted earlier.
  • repo.rs: removing the "fall back to node's own DID" behavior for resolve_owner_did/cmd_clone is a real correctness/security improvement (previously could silently attribute a nonexistent repo to the node's DID).

One minor nit (non-blocking)

In crates/gl/src/repo.rs, Line 209-210, the old doc comment line was left in place above the new one, so the two now contradict each other:

/// Derive the short DID key segment from a keypair, or fall back to the node's DID.
/// Resolve the owner short-DID for a bare repo name from the LOCAL identity.
/// Never falls back to the node's own DID (that produced bogus "owned by the
/// node" results for repos that don't exist) — if there's no local identity the
/// caller must pass an explicit `owner/name`.
async fn resolve_owner_did(_node: &str, dir: Option<&std::path::Path>) -> Result<String> {

The first line should be deleted since it describes the old (now-removed) fallback behavior that the new lines explicitly say no longer happens.

Nothing else stood out as a functional or security issue. This is a nit, not a blocker, so I'm approving.

[approve]


✏️ Learnings added
Learnt from: kevincodex1
URL: https://github.com/Gitlawb/node/pull/138

Timestamp: 2026-07-05T11:31:38.399Z
Learning: In crates/gl/src/repo.rs, `resolve_owner_did` (and `cmd_clone`) intentionally never fall back to the node's own DID when resolving a bare repo name's owner — a local identity (keypair) is required, or the caller must pass an explicit `owner/name`. This was a deliberate change (PR `#138` in Gitlawb/node) to avoid attributing nonexistent repos to the node's DID.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@kevincodex1 kevincodex1 merged commit 06388ec into main Jul 5, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

crate:git-remote git-remote-gitlawb — the git remote helper crate:gl gl — the contributor CLI crate:node gitlawb-node — the serving node and REST API kind:feature New capability or surface needs-issue PR has no linked issue

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants