Skip to content

feat: browseruse CLI#3722

Open
kinjalh wants to merge 4 commits into
masterfrom
browser-cli-tool
Open

feat: browseruse CLI#3722
kinjalh wants to merge 4 commits into
masterfrom
browser-cli-tool

Conversation

@kinjalh

@kinjalh kinjalh commented Jun 15, 2026

Copy link
Copy Markdown
Member

Test Plan

Test environment:
https://console.kinjal-gitgud.onplural.sh/

Checklist

  • I have added a meaningful title and summary to convey the impact of this PR to a user.
  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have deployed the agent to a test environment and verified that it works as expected (required only when changing agent code).

Plural Flow: console

@kinjalh kinjalh added the enhancement New feature or request label Jun 15, 2026
@greptile-apps

greptile-apps Bot commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR introduces a browser-use CLI MCP sidecar for agent runtimes, allowing agents to interact with a headless browser via structured MCP tools (mcp__browser__*) instead of ad-hoc Playwright/Selenium scripts. A new browserEnabled field propagates from the AgentRuntime CRD spec through the Elixir schema, GraphQL layer, and Go client to control whether the sidecar and related tool permissions are provisioned.

  • Adds a new browseruse init container (restartPolicy: Always) that runs browser-use --mcp against the browserless CDP sidecar and exposes port 8082; wired into Claude, Codex, Gemini, and OpenCode tool configurations when enabled.
  • Adds a browser_enabled column to agent_runtimes (with default: false) and threads it through the full stack (migration, Ecto schema, GraphQL, OpenAPI, Go models, UI).
  • Adds a CD workflow to build and publish the pluralsh/browser-use Docker image, and updates all three system prompt templates to conditionally include browser MCP usage guidance.

Confidence Score: 3/5

The core logic is sound and well-tested, but the browser sidecar has no startup probe — meaning the browseruse container will race against an unready CDP endpoint and restart-loop on every new pod until the browser is actually serving.

The browser sidecar container has no startup probe, so Kubernetes marks it ready immediately. The browseruse container starts right after and tries to connect to CDP on port 3000, but the browser is not necessarily serving yet — causing browseruse to fail its own startup probe repeatedly and restart-loop on every enabled pod. The hardcoded image tag will also drift silently from the CD pipeline whenever a new version is published.

go/deployment-operator/internal/controller/agentrun_pod.go — the enableBrowser and enableBrowserUse functions need a startup probe on the browser sidecar to prevent restart loops.

Important Files Changed

Filename Overview
go/deployment-operator/internal/controller/agentrun_pod.go Adds enableBrowserUse() to append a browser-use MCP sidecar init container; browser sidecar has no startup probe so browseruse can race on CDP port 3000, and the new container has no resource requests/limits; tag is hardcoded to 0.1.0
go/deployment-operator/internal/controller/agentrun_pod_test.go Good test coverage for the new sidecar; port 8082 is hardcoded as a string literal instead of referencing the constant
.github/workflows/deployment-operator-cd-browser-use.yaml New CD pipeline for the browser-use sidecar image; BROWSER_USE_DEFAULT_VERSION matches the hardcoded tag in the controller but the two must be kept in sync manually
go/deployment-operator/dockerfiles/browser-use/Dockerfile Clean Dockerfile for the browser-use MCP sidecar; runs as non-root (UID 65532), uses tini as init, installs browser-use at a pinned version
go/deployment-operator/pkg/common/mcp.go Adds BrowserUseMCPServerPort (8082) and BrowserUseMCPServerURL constants for the browser-use MCP server
priv/repo/migrations/20260615032400_add_agent_runtime_browser_enabled.exs Adds browser_enabled boolean column to agent_runtimes with NOT NULL default false; safe migration
lib/console/schema/agent_runtime.ex Adds browser_enabled field with default false; correctly included in @Valid changeset fields
go/deployment-operator/pkg/agentrun-harness/tool/claude/claude.go Conditionally registers the browser MCP server and allows browser tools when BrowserEnabled; consistent with existing pattern
go/deployment-operator/pkg/agentrun-harness/tool/codex/codex.go Adds browser MCP server entry with TrustPolicy always when BrowserEnabled; consistent with the plural MCP entry
go/deployment-operator/pkg/agentrun-harness/tool/gemini/gemini.go Passes BrowserEnabled and BrowserMCPURL into the settings template; straightforward
go/deployment-operator/pkg/agentrun-harness/tool/opencode/opencode.go Passes BrowserEnabled and BrowserMCPURL into the OpenCode config template; consistent pattern

Comments Outside Diff (2)

  1. go/deployment-operator/internal/controller/agentrun_pod.go, line 562-599 (link)

    P1 Browser sidecar has no startup probe — browseruse races against an unready CDP endpoint

    enableBrowser appends the browser sidecar with no startup or readiness probe, so Kubernetes considers it "started" the moment the container process launches. enableBrowserUse is called immediately after, and since the browseruse sidecar's startup probe only checks port 8082 (its own MCP listener), it will fail until browser-use --mcp successfully connects to CDP on port 3000 — which requires the browser container to be ready first. In practice this means browseruse repeatedly crashes its startup probe (30 × 2 s = 60 s window), gets killed, and restarts in a loop until the browser sidecar is actually serving CDP. Adding a TCPSocket startup probe on defaultContainerBrowserServerPort (3000) to the browser container in enableBrowser would let Kubernetes gate the browseruse start until the browser is actually ready.

  2. go/deployment-operator/internal/controller/agentrun_pod.go, line 424-426 (link)

    P2 Hardcoded image tag will drift from the CD pipeline

    defaultBrowseruseTag = "0.1.0" is duplicated in the CI workflow (BROWSER_USE_DEFAULT_VERSION: 0.1.0). Every time the CD pipeline bumps the browser-use image version, this constant must also be updated manually in the operator code or all runtimes will keep pulling the old image. Consider reading the tag from a configurable field (e.g. the AgentRuntime spec or an operator-level config map) rather than a compile-time constant.

    Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Reviews (1): Last reviewed commit: "browseruse CLI" | Re-trigger Greptile

Comment on lines +615 to +640
pod.Spec.InitContainers = append(pod.Spec.InitContainers, corev1.Container{
Name: browseruseContainerName,
Image: image,
SecurityContext: ensureDefaultContainerSecurityContext(nil),
RestartPolicy: lo.ToPtr(corev1.ContainerRestartPolicyAlways),
Args: []string{
"--mcp",
"--cdp-url", fmt.Sprintf("http://127.0.0.1:%d", defaultContainerBrowserServerPort),
"--port", fmt.Sprintf("%d", common.BrowserUseMCPServerPort),
},
Ports: []corev1.ContainerPort{
{
Name: "browseruse",
ContainerPort: int32(common.BrowserUseMCPServerPort),
},
},
StartupProbe: &corev1.Probe{
ProbeHandler: corev1.ProbeHandler{
TCPSocket: &corev1.TCPSocketAction{
Port: intstr.FromInt32(int32(common.BrowserUseMCPServerPort)),
},
},
PeriodSeconds: 2,
FailureThreshold: 30,
},
})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No resource requests/limits on the browseruse sidecar

The container spec omits Resources, leaving the pod's QoS class potentially degraded and giving the scheduler no signal for bin-packing. The browser sidecar (enableBrowser) lets callers pass resource overrides via browserConfig.Container.Resources, but enableBrowserUse provides no equivalent. At minimum, sensible default requests should be set so the pod doesn't land in the BestEffort class alongside the browser and DinD sidecars.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Comment on lines +350 to +353
assert.Contains(t, bu.Args, "--port")
assert.Contains(t, bu.Args, "8082")

if assert.Len(t, bu.Ports, 1) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The test hardcodes the string "8082" for the --port arg instead of deriving it from common.BrowserUseMCPServerPort. If the constant ever changes, this assertion silently passes while the probe and port declaration still use the old value.

Suggested change
assert.Contains(t, bu.Args, "--port")
assert.Contains(t, bu.Args, "8082")
if assert.Len(t, bu.Ports, 1) {
assert.Contains(t, bu.Args, "--port")
assert.Contains(t, bu.Args, fmt.Sprintf("%d", common.BrowserUseMCPServerPort))
if assert.Len(t, bu.Ports, 1) {

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@kinjalh kinjalh changed the title browseruse CLI feat: browseruse CLI Jun 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant