perf: eliminate triple-collect and double-pass in uphill routing by Basedfloppa · Pull Request #4247 · freenet/freenet-core

Basedfloppa · 2026-05-25T00:22:31Z

Closes items 1 and 3 from #4243.

Why

select_uphill_hop is on the CONNECT hot path — it runs on every uphill routing decision when a peer is at terminus but cannot accept. The function had two unnecessary allocation patterns that add latency proportional to candidate count:

Triple-collect chain (scored → best_candidates → partition): Two intermediate Vec allocations before the final partition. The best_candidates Vec served only as a filtering step with no semantic need to materialize.
Double-pass reliability scoring (filter → map → fold → filter): Reliability was computed per-candidate in a .map(), then re-iterated in a .fold() to find the maximum. Since iteration is lazy until .collect(), this is a second pass over the entire candidate set.

Each allocation and pass is cheap in isolation, but on the hot path — especially with many candidates — they compound into measurable GC pressure and cache misses.

What

Issue 1 — Triple-collect chain (item 1 from #4243)

Chain filter().map() directly into .partition(), eliminating the intermediate Vec<PeerKeyLocation> (best_candidates). The scored Vec becomes the only intermediate allocation (necessary because we need the reliability-score data before partitioning).

Before: scored: Vec<(f64, PKL)> → filter().map() → best_candidates: Vec<PKL> → partition()
After: scored: Vec<(f64, PKL)> → filter().map() → partition() (one less Vec)

Issue 3 — Single-pass reliability scoring (item 3 from #4243)

Merge the fold pass into the initial filter_map iterator by tracking best_reliability inline via best_reliability.max(reliability). The filter_map now serves triple duty: recency filtering, reliability scoring, and best-reliability tracking.

Before: filter() + map() to collect scored → iter().map().fold() for max
After: filter_map() with inline best_reliability tracking (single pass)

Correctness

All three changes (recency filter, reliability scoring, partition logic) preserve the exact same semantics:

best_reliability = max over all reliability scores — same result whether computed inline or via fold
filter(|(r, _)| best - *r < tol) operates on the same scored Vec in both cases
partition() into close/far by distance threshold — unchanged
Final router.select_peer() call — unchanged

Testing

cargo fmt --check — clean
cargo clippy --all-targets -- -D warnings — clean
cargo test -p freenet --lib -- operations::connect — 60/60 passed
cargo test -p freenet --lib — 2594/2596 passed; 2 pre-existing failures (IPv6-unavailable, flaky delegate test — both unrelated)

Issue 1 — Triple-collect chain: chain filter+map directly into partition() inside select_uphill_hop, eliminating the intermediate Vec<PeerKeyLocation> allocation (best_candidates). Issue 3 — Single-pass reliability scoring: merge the separate reliability-scoring fold pass into the initial filter_map by tracking best_reliability inline with best_reliability.max(reliability). Closes first two items from freenet#4243.

github-actions · 2026-05-25T00:24:24Z

I now have enough context to write the review.

Rule Review: Missing boundary regression test for k=0 guard

Rules checked: git-workflow.md, code-style.md, testing.md, operations.md
Files reviewed: 2 (crates/core/src/operations/connect.rs, crates/core/src/router.rs)

Warnings

crates/core/src/router.rs:549–553 — The fix: commit adds a k > 0 guard to prevent select_nth_unstable_by(usize::MAX, ...) when consider_n_closest_peers = 0 with a non-empty peer list. No test covers this scenario. The existing test_select_k_best_empty_candidates (line 1359) sets considering_n_closest_peers(5) with empty peers, so k = 5.min(0) = 0 — but peer_distances.len() is also 0, meaning the old k < peer_distances.len() guard would have been false anyway, and the panic path is never reached in that test. A dedicated test with considering_n_closest_peers(0) and a non-empty peer list is the missing regression that would have caught (and now guards against) this specific panic. (rule: testing.md — boundary values, zero)

Info

crates/core/src/router.rs:548 — Comment ends with See PR #4247 review: https://github.com/freenet/freenet-core/pull/4247. Per code-style rules, comments must not reference the current PR/task — they rot. The useful part of the comment (explaining why k > 0 is the correct guard rather than k == 0) should stand alone without the URL. (rule: code-style.md)
Commit 58385372 — Subject line fix skip select_nth_unstable_by(0, ...) is missing the conventional-commit prefix (fix: + space before description). (rule: git-workflow.md)
PR title and commits 9da80af4 / 4fc752ff use perf: prefix. The project's git-workflow lists valid prefixes as feat|fix|docs|refactor|test|build; perf: is not among them. Consider using refactor: or adding perf: to the documented list. (rule: git-workflow.md)

Rule review against .claude/rules/. WARNING findings block merge. ⚠️ 1 warning(s) — fix or add review-override label

Issue 2 — Vec capacity hints: pre-allocate scored/fallbacks with candidates.len() and expired with forward_attempts.len() to avoid geometric reallocations on the CONNECT hot path. Issue 6 — Partial sort: use select_nth_unstable_by in select_k_best_peers to find the k closest peers in O(n) instead of O(n log n) full sort. k = consider_n_closest_peers (default 25) is typically 2-4× smaller than the candidate pool. Part of freenet#4243.

Basedfloppa · 2026-05-25T00:38:53Z

Closes items 1, 2, 3, and 6 from #4243.

Why

Four allocator-level hot spots identified in the routing engine, concentrated in
select_uphill_hop (CONNECT uphill routing), select_k_best_peers (peer
selection), and expire_forward_attempts.

Triple-collect chain (`select_uphill_hop`)

scored → best_candidates → partition: two intermediate Vec allocations
before the final partition. The best_candidates Vec existed only to be
consumed by .partition() on the next line.

Double-pass reliability scoring (`select_uphill_hop`)

Reliability computed per-candidate in .map(), then re-iterated in .fold()
to find the maximum. Since iteration is lazy until .collect(), this is a
second pass over the entire candidate set.

Vec::new() without capacity hints (connect hot path)

scored, fallbacks, and expired Vecs grew from 0 capacity via geometric
doubling (0→1→2→4→8→16→32→64), causing 4–6 reallocations each despite the
final size being known in advance.

Full sort where partial sort suffices (`select_k_best_peers`)

sort_by_key is O(n log n) but only top k = 25 elements are needed.
select_nth_unstable_by partitions around the k-th element in O(n), then only
the truncated k elements are sorted (O(k log k)).

What

Issue 1 — Triple-collect chain

Chain filter().map() directly into .partition(), eliminating the
intermediate best_candidates Vec.

Before: scored: Vec<(f64, PKL)> → filter().map() → best_candidates: Vec<PKL> → partition()

After: scored: Vec<(f64, PKL)> → filter().map() → partition()
(one less Vec)

Issue 2 — Vec capacity hints

Location	Before	After
`select_next_hop` — `scored`	`Vec::new()`	`Vec::with_capacity(candidates.len())`
`select_next_hop` — `fallbacks`	`Vec::new()`	`Vec::with_capacity(candidates.len())`
`expire_forward_attempts` — `expired`	`Vec::new()`	`Vec::with_capacity(self.forward_attempts.len())`

Issue 3 — Single-pass reliability scoring

Merge the fold pass into the initial filter_map by tracking
best_reliability inline via best_reliability.max(reliability). The
filter_map now serves triple duty: recency filtering, reliability scoring,
and best-reliability tracking.

Issue 6 — Partial sort

Replace sort_by_key (O(n log n)) with select_nth_unstable_by (O(n)) for
finding the k closest peers, then sort only the truncated k.

What's intentionally excluded from this PR

Item	Reason
4 — LRU cache	Requires careful invalidation design; estimator state changes on every `add_event()`. Worth a focused PR.
5 — Router write-lock	Architectural (channel-based handoff). ~1-2 days. Marginal gain outside dense gateways.
7 — PeerKeyLocation clones	The 3→2 reduction conflicts with the return value ownership; function returns `peer` after insertion. Deeper redesign needed.

Testing

cargo fmt / cargo clippy --all-targets -- -D warnings — clean
cargo test -p freenet --lib -- operations::connect — 60/60 passed
cargo test -p freenet --lib -- router — 87/87 passed
Full lib suite — 2594/2596 passed; 2 pre-existing failures
(IPv6-unavailable, flaky delegate test — both unrelated)

Basedfloppa · 2026-05-25T00:42:13Z

Why items 4, 5, and 7 are excluded from this PR

Item 4 — LRU cache for `predict_routing_outcome`

Problem: predict_routing_outcome(peer, target) is recomputed for every
(peer, target) pair on every routing decision. Multi-hop connect: 30
candidates × 3 hops = 90 predictions instead of 30 unique.

Why not here: The cache would need invalidation on every add_event()
because the internal estimators (isotonic regression, renegade ML predictor)
change their state with each event. A naive TTL-based cache could serve stale
predictions, causing the router to systematically prefer peers whose scores
are no longer accurate.

Two viable approaches exist:

Per-batch cache: Clear the entire cache at the start of each
select_k_best_peers call. Only deduplicates repeated prediction calls
within a single routing decision — no staleness risk.
Write-through cache: Invalidate specific (peer, target) entries when
add_event receives an event for that peer. Complex but preserves
cross-call reuse.

Both approaches need careful benchmarking to confirm the ~60% hit rate
estimate from the issue. Worth a focused PR with a benchmark harness.

Item 5 — Router write lock blocks routing readers

Problem: router.write().add_event(event) holds an exclusive RwLock
write guard for ~5–15 µs (2–5 isotonic estimator updates + ML prediction).
During that window, all concurrent router.read() calls (routing decisions)
are blocked. On a dense gateway (500+ peers, 50+ ops/s) this causes ~2–3%
collision rate with rare 10–50 µs latency spikes.

Why not here: The fix requires routing events through a channel to a
background writer task, so readers never contend with writers. This is an
architectural change touching the event loop structure and the synchronisation
boundary between OpManager and Router — estimated at 1–2 days of work.
The gain is only measurable on dense gateways; typical nodes see ~0.004%
collision (negligible). A separate PR with #[bench] coverage makes sense.

Item 7 — PeerKeyLocation cloned 3× in `forward_to_peer`

Problem: forward_to_peer clones peer three times: once for
self.forwarded_to, once as the HashMap key in forward_attempts, and once
for the ForwardAttempt.peer field. The issue suggests cutting to two clones
by moving peer into the HashMap key and cloning from forwarded_to for the
field.

Why not here: The function returns (PeerKeyLocation, ConnectRequest),
and the returned PeerKeyLocation is consumed by callers (e.g. stored in
actions.forward, verified in test assertions). Moving peer into
forward_attempts.insert(peer, ...) means it's no longer available for the
return tuple. The suggested "3→2" pattern leaves a use-after-move error.

To actually eliminate a clone, one would need to either:

Change the return type to a reference (&PeerKeyLocation), propagating the
lifetime change through Actions and all call sites, or
Store an Arc<PeerKeyLocation> so that clones are reference-counted.

Both are deeper API changes than the scope of this PR. The 3 clones amount to
~144 bytes + ~0.3 µs per hop — barely measurable (~12 KB/hr on a busy
gateway). Code hygiene issue, not a performance bottleneck.

fix skip select_nth_unstable_by(0, ...)

5838537

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

perf: eliminate triple-collect and double-pass in uphill routing#4247

perf: eliminate triple-collect and double-pass in uphill routing#4247
Basedfloppa wants to merge 3 commits into
freenet:mainfrom
Basedfloppa:perf/routing-engine-triple-collect

Basedfloppa commented May 25, 2026

Uh oh!

github-actions Bot commented May 25, 2026 •

edited

Loading

Uh oh!

Basedfloppa commented May 25, 2026

Uh oh!

Basedfloppa commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

Basedfloppa commented May 25, 2026

Why

What

Issue 1 — Triple-collect chain (item 1 from #4243)

Issue 3 — Single-pass reliability scoring (item 3 from #4243)

Correctness

Testing

Related

Uh oh!

github-actions Bot commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rule Review: Missing boundary regression test for k=0 guard

Warnings

Info

Uh oh!

Basedfloppa commented May 25, 2026

Why

Triple-collect chain (select_uphill_hop)

Double-pass reliability scoring (select_uphill_hop)

Vec::new() without capacity hints (connect hot path)

Full sort where partial sort suffices (select_k_best_peers)

What

Issue 1 — Triple-collect chain

Issue 2 — Vec capacity hints

Issue 3 — Single-pass reliability scoring

Issue 6 — Partial sort

What's intentionally excluded from this PR

Testing

Uh oh!

Basedfloppa commented May 25, 2026

Why items 4, 5, and 7 are excluded from this PR

Item 4 — LRU cache for predict_routing_outcome

Item 5 — Router write lock blocks routing readers

Item 7 — PeerKeyLocation cloned 3× in forward_to_peer

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented May 25, 2026 •

edited

Loading

Triple-collect chain (`select_uphill_hop`)

Double-pass reliability scoring (`select_uphill_hop`)

Full sort where partial sort suffices (`select_k_best_peers`)

Item 4 — LRU cache for `predict_routing_outcome`

Item 7 — PeerKeyLocation cloned 3× in `forward_to_peer`