feat(transport): add transport-aware health check endpoints by dunglas · Pull Request #1201 · dunglas/mercure

dunglas · 2026-04-15T16:28:36Z

Expose readiness and liveness probes via the Caddy admin API so Kubernetes (and other orchestrators) can detect when a hub's transport connection is actually healthy, not just that the process is running.

New TransportHealthChecker optional interface (Ready / Live) that transports can implement to report their state.
New admin.api.mercure_health Caddy module exposing /mercure/health/{ready,live} aggregate endpoints and /mercure/health/{name}/{ready,live} per-hub endpoints on the admin API (port 2019).
Caddyfile name directive for identifying hubs in per-hub endpoints and future metrics labels (defaults to "default").
Helm chart: transport-aware probes enabled by default, with fallback to the legacy /healthz on HTTP port when healthCheck.enabled=false. Admin port always exposed; metrics.port kept for backward compat.
/healthz on the HTTP port is deprecated (only checks that Caddy is running, not transport connectivity).

Transports that do not implement the interface (Bolt, Local) are considered always-healthy; the Enterprise transports (Redis, Postgres, Kafka, Pulsar) implement it in the Mercure Enterprise repository.

Expose readiness and liveness probes via the Caddy admin API so Kubernetes (and other orchestrators) can detect when a hub's transport connection is actually healthy, not just that the process is running. - New TransportHealthChecker optional interface (Ready / Live) that transports can implement to report their state. - New admin.api.mercure_health Caddy module exposing /mercure/health/{ready,live} aggregate endpoints and /mercure/health/{name}/{ready,live} per-hub endpoints on the admin API (port 2019). - Caddyfile name directive for identifying hubs in per-hub endpoints and future metrics labels (defaults to "default"). - Helm chart: transport-aware probes enabled by default, with fallback to the legacy /healthz on HTTP port when healthCheck.enabled=false. Admin port always exposed; metrics.port kept for backward compat. - /healthz on the HTTP port is deprecated (only checks that Caddy is running, not transport connectivity). Transports that do not implement the interface (Bolt, Local) are considered always-healthy; the Enterprise transports (Redis, Postgres, Kafka, Pulsar) implement it in the Mercure Enterprise repository. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Adds transport-aware readiness/liveness health checks exposed via the Caddy admin API so orchestrators can distinguish “process is up” from “transport is healthy”, and updates the Helm chart/docs to use these endpoints (with /healthz marked deprecated).

Changes:

Introduces TransportHealthChecker (Ready/Live) and a new Caddy admin module admin.api.mercure_health exposing /mercure/health/* endpoints.
Adds a name directive for Mercure hubs to support per-hub health endpoints.
Updates Helm chart probes to use the admin health endpoints by default and updates docs/Caddyfiles to deprecate /healthz.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
transport.go	Adds optional transport health-check interface (`Ready`/`Live`).
caddy/health.go	Implements admin API health endpoints aggregating per-hub/per-transport status.
caddy/health_test.go	Adds coverage for health endpoints (OK/404/405/unhealthy).
caddy/mercure.go	Stores hub metadata (name/transport) for health aggregation and parses `name` directive.
charts/mercure/values.yaml	Adds `adminPort` and new `healthCheck` values; deprecates metrics.port usage.
charts/mercure/templates/deployment.yaml	Adds admin container port and switches probes to `/mercure/health/*` when enabled.
charts/mercure/templates/service.yaml	Routes metrics Service to the admin container port.
docs/hub/config.md	Documents new admin API health endpoints and deprecates `/healthz`.
Caddyfile	Marks `/healthz` as deprecated in comments.
dev.Caddyfile	Marks `/healthz` as deprecated in comments.
examples/chat/chart/mercure-example-chat/README.md	Formatting update to values table (helm-docs output).
.github/workflows/lint.yml	Workflow config tweak (per diff).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

The Caddy admin API binds to localhost:2019 by default for security, so httpGet probes from the kubelet fail with connection refused (the kubelet connects to the pod IP, not localhost inside the container). Switch to exec probes that run curl from inside the container, which matches the pattern already used by the preStop hook. Also update the documentation to reflect this and note the alternative (binding admin to 0.0.0.0, with the security caveat). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Return 404 when a per-hub health endpoint is queried for a hub name that doesn't exist, instead of silently responding 200 OK (which hid typos and misconfigurations). - Don't expose internal transport error details in the HTTP response body; log them server-side and return a generic error message to avoid leaking connection details if admin is exposed beyond localhost. - Chart: use adminPort (with metrics.port fallback) for preStop URL and metrics Service port so everything stays aligned when adminPort is customized. - values.yaml: fix helm-docs annotation marker on adminPort. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

dunglas · 2026-04-15T23:33:02Z

A few of the Copilot review threads on this PR look already addressed in the current head (e76fceb) but are still marked unresolved — flagging them here so they can be closed out:

docs/hub/config.md#L107 — "admin API binds to localhost" note. Addressed: the diff already adds the "These endpoints are exposed on the admin API port, not the main HTTP port. The Caddy admin API binds to localhost:2019 by default for security…" paragraph plus the admin 0.0.0.0:2019 alternative.
caddy/health.go#L126 — 404 for unknown hub names. Addressed: checkTransports now tracks matched and returns errHubNotFound, which handleHealth translates to http.StatusNotFound. Also covered by TestHealthEndpointUnknownHub.
charts/mercure/templates/deployment.yaml#L113 — preStop hook hitting hardcoded :2019. Addressed: preStop now posts to http://localhost:{{ .Values.adminPort | default .Values.metrics.port }}/stop.

Separately, a likely root cause for the lint-test (ct install) failure: with healthCheck.enabled: true as the default, the probes use exec with curl, but caddy:2-alpine doesn't ship curl (only ca-certificates, libcap, mailcap). BusyBox wget is available though, consistent with the docker-compose example you already put in docs/hub/config.md. I've pushed a suggested fix to claude/fix-pr-1201-curl-to-wget — switches both probes and the preStop hook to wget --spider, and regenerates charts/mercure/README.md so the new adminPort / healthCheck.* values show up in the values table. Feel free to cherry-pick or ignore.

Generated by Claude Code

…pine The caddy:2-alpine base image ships ca-certificates, libcap and mailcap but does not include curl. With healthCheck.enabled=true (the new default) the curl-based exec probes would always fail readiness and liveness, blocking ct install and any real deployment. The preStop hook had the same latent issue but failed silently. Switch both probes and the preStop hook to BusyBox wget (which is included in alpine), matching the wget-based Docker healthcheck already recommended in docs/hub/config.md. Also regenerate charts/mercure/README.md so the values table documents the new adminPort and healthCheck.* fields.

The caddy:2-alpine base image ships wget, not curl. Update the Kubernetes and Docker Compose examples in the docs accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings April 15, 2026 16:28

Copilot started reviewing on behalf of dunglas April 15, 2026 16:29 View session

Copilot AI reviewed Apr 15, 2026

View reviewed changes

dunglas and others added 2 commits April 15, 2026 18:39

claude and others added 2 commits April 16, 2026 11:22

fix(docs): use wget in health check examples and fix prettier formatting

0e2afb2

The caddy:2-alpine base image ships wget, not curl. Update the Kubernetes and Docker Compose examples in the docs accordingly. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(transport): add transport-aware health check endpoints#1201

feat(transport): add transport-aware health check endpoints#1201
dunglas wants to merge 5 commits intomainfrom
feat/transport-health

dunglas commented Apr 15, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dunglas commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

dunglas commented Apr 15, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dunglas commented Apr 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants