Harden HTTP relay failover with sequential routing and per-node timeouts #556
+202
−20
Problem
- http://google.com/) blocked the chain and the relay never advanced.
- `dev_router:preprocess/3` flattened multi-node peer metadata, so downstream processors and relays couldn't replay the original list of candidate nodes or rewrite their URIs consistently.
- `hb_http` ignored `{error, Reason}` tuples returned by the HTTP client, making it impossible for callers to detect failures and attempt the next peer.

Solution
- `TargetMod5`) and pass it, along with node-specific HTTP options/timeouts, to a new `relay_nodes_in_order/6`. Each peer is tried with its own `http-timeout`, enforced via `relay_request_with_timeout/2`, so slow peers are abandoned and the relay advances deterministically.
- Preserve the `<<"nodes">>` list (and normalized URIs) inside `dev_router:preprocess/3`, ensuring multi-node peers remain intact when the request is re-dispatched through [email protected].
- Update `hb_http` to propagate errors from `hb_http_client`, allowing callers to treat client failures just like HTTP-level failures.

Changes
`src/dev_relay.erl`
- Use `hb_maps:get(<<"method">>, TargetMod5, ...)` for outbound requests and add helpers (`relay_nodes_in_order/6`, `relay_request_with_timeout/2`, `peer_http_opts/3`, `peer_timeout/3`) to merge per-node options, enforce timeouts, and document the behavior.
- Update `relay_failover_test/0` with explicit per-node `<<"http-timeout">>` settings (10 s for Google, 2 s for the invalid host, 5 s for the local peer) and explanatory comments.

`src/dev_router.erl`
- Preserve the `<<"nodes">>` structure when preprocessing requests, normalizing each node's URI to point at `user-path` instead of collapsing to a single peer.

`src/hb_http.erl`
- Capture `Res = hb_http_client:request(...)` and add a `process_response/7` clause that handles `{error, Reason}` by returning `{error, {http_request_failed, Reason}}`.

This ensures multi-node relays respect per-peer timeouts, keep the full peer list intact, and bubble up HTTP client failures so the next candidate is tried automatically.
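
For readers who want to see the failover pattern in isolation, the following is a minimal sketch of sequential routing with a per-node timeout. It is not the code from this PR: the module name `relay_sketch`, the functions `try_in_order/1` and `with_timeout/2`, and the `{Name, RequestFun, TimeoutMs}` node shape are all hypothetical, and the real `relay_nodes_in_order/6` and `relay_request_with_timeout/2` may be structured quite differently. The sketch assumes each request is a zero-arity fun returning `{ok, Response}` or `{error, Reason}`.

```erlang
%% Hypothetical sketch; not the implementation in src/dev_relay.erl.
-module(relay_sketch).
-export([try_in_order/1, with_timeout/2]).

%% Try each {Name, RequestFun, TimeoutMs} in list order and return the
%% first success, or every collected failure if no node answers.
try_in_order(Nodes) ->
    try_in_order(Nodes, []).

try_in_order([], Errors) ->
    {error, {all_nodes_failed, lists:reverse(Errors)}};
try_in_order([{Name, RequestFun, TimeoutMs} | Rest], Errors) ->
    case with_timeout(RequestFun, TimeoutMs) of
        {ok, Response} ->
            {ok, Name, Response};
        {error, Reason} ->
            %% An error tuple is what lets the loop advance to the next node.
            try_in_order(Rest, [{Name, Reason} | Errors]);
        Other ->
            try_in_order(Rest, [{Name, {unexpected_reply, Other}} | Errors])
    end.

%% Run Fun in a monitored worker process and abandon it after TimeoutMs.
with_timeout(Fun, TimeoutMs) ->
    Parent = self(),
    Ref = make_ref(),
    {Pid, MonRef} = spawn_monitor(fun() -> Parent ! {Ref, Fun()} end),
    receive
        {Ref, Result} ->
            erlang:demonitor(MonRef, [flush]),
            Result;
        {'DOWN', MonRef, process, Pid, Reason} ->
            {error, {worker_crashed, Reason}}
    after TimeoutMs ->
        %% Kill the slow worker, wait for it to die, then drop any reply
        %% that raced with the timeout so the mailbox stays clean.
        exit(Pid, kill),
        receive {'DOWN', MonRef, process, Pid, _} -> ok end,
        receive {Ref, _Late} -> ok after 0 -> ok end,
        {error, timeout}
    end.
```

Calling `relay_sketch:try_in_order/1` with a node that sleeps longer than its timeout, then a node that returns `{error, nxdomain}`, then a node that returns `{ok, <<"hi">>}` yields the third node's response. Running each request in its own process means a stalled peer can be killed without blocking the caller, which is one way to get the deterministic advance described above.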