RSM-3143: research — DLA integration via pi-coding-agent bridge#3478
Draft
Poliuk wants to merge 24 commits into
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Set PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in Buildkite (pipeline, release-build-and-distribute, code-freeze release pipeline), the publish-npm-package GitHub Actions workflow, and the `install:bundle` npm script in apps/cli. This is intended to prevent the `playwright install chromium` postinstall pulled in transitively by the `data-liberation` agent dep from downloading ~150 MB of Chromium during CI builds. End-user installs (`npm install -g wp-studio`) still pay the cost on first install — DLA's runtime bootstraps Chromium lazily when a Wix/Squarespace adapter actually needs it. Note: empirical testing shows `playwright install chromium` (the CLI command DLA invokes in its postinstall) does NOT currently honor PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD — only Playwright's own historical auto-postinstall path checks it, and modern playwright/playwright-core no longer ship a postinstall script. The env-var change is defensive (zero-cost, future-proof if Playwright re-introduces auto-postinstall); a separate follow-up will be needed to fully eliminate the download — see issues/rsm-3143-dla-pi-research/verification/t9-summary.txt for details. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Related issues
@anthropic-ai/claude-agent-sdk to @mariozechner/pi-coding-agent) studio code #3277 (closed)
How AI was used in this PR
This PR was orchestrated end-to-end via the /orchestrator skill across two phases. The full agent cascade:
Research phase (RSM-3143):
Spec-to-code phase:
studio liberate (originally studio migrate), T9 Playwright env in CI)
tsx/dist/cli.mjs → tsx/cli resolution bug surfaced by code-review 65ce8848)
Reviewers should especially scrutinise: the tools/dla/ bridge contract (250 LOC, type-safe but uses one documented inputSchema as unknown as TSchema cast, see wave-1-mcp-bridge-feasibility.md §2 for why this is safe); the two-layer permission policy in tools/dla/policy.ts; the STUDIO_DLA_ENABLED feature-flag gating in apps/cli/ai/runtimes/pi/index.ts; and the /liberate skill body at apps/cli/ai/skills/liberate/SKILL.md.
Draft PR: not for immediate merge. Per owner direction, both the research artifacts and the implementation land in the same PR. The PR is opened as a draft pending human review on a real Wix/Squarespace test site with STUDIO_DLA_ENABLED=1 and pending product decisions on the feature-flag default for v1 (currently off).
Proposed Changes
Research artifacts (RSM-3143)
- issues/rsm-3143-dla-pi-research/research-report.md: synthesis report. Recommends MCP-stdio bridge as the canonical /liberate path against pi-coding-agent, with Subprocess as a separate studio liberate <url> standalone CLI command and Vendor-as-AgentTools as a documented fallback. (Research artifact still references the original /migrate name; the as-shipped command is /liberate.)
- issues/rsm-3143-dla-pi-research/research-plan.md: research plan with wave-1 findings log.
- issues/rsm-3143-dla-pi-research/wave-1-findings/wave-1-*.md: five wave-1 researcher findings (pi extensibility surface, MCP bridge feasibility, vendor-as-AgentTools, subprocess revisit, upstream + bundling).
- issues/rsm-3143-dla-pi-research/prior-art/: preserved prior-art bundle (RSM-1639 + RSM-3139 specs/plans/notes).
- issues/rsm-3143-dla-pi-research/plan.md: 11-task implementation plan derived from the research.
- issues/rsm-3143-dla-pi-research/review-1.md + review-2.md: code-review verdicts.
- issues/rsm-3143-dla-pi-research/doc-review-1.md + this PR-description.md.
Code deliverables (T1–T9)
- tools/dla/ workspace package scaffold (commit a3b2be96): new @studio/dla workspace package as a sibling of tools/common/. Adds tsconfig.json, package.json, alias wiring in apps/cli/tsconfig.json and apps/cli/vite.config.base.ts.
- 2df39446): pins "data-liberation": "github:Automattic/data-liberation-agent#17219c42b0420267302b138bf402930508006e0e" and "tsx": "^4.19.0" in apps/cli/package.json dependencies. tsx lives in runtime deps so it survives --omit=dev.
- 22d5144a): tools/dla/bridge.ts, agent-tool-adapter.ts, content-adapter.ts, policy.ts, index.ts, plus four vitest files (45 tests). Spawns DLA's MCP server as a child process via process.execPath + tsx, connects an MCP Client over stdio, lists tools, and adapts each into a pi ToolDefinition. Defaults to degraded: true on spawn or listTools failure rather than crashing session startup.
- 286a4c50): apps/cli/ai/runtimes/pi/index.ts constructs DefaultResourceLoader with extensionFactories: [ createDlaPolicyFactory(defaultPolicyBuckets) ] when STUDIO_DLA_ENABLED=1. Inline extensionFactories load even with noExtensions: true, so no other loader flags flip.
- /liberate skill + vite prod skills-copy fix (commit 0de39aa0): apps/cli/ai/skills/liberate/SKILL.md (frontmatter name + description only; body uses bare DLA tool names and steers callers toward delegate: true). apps/cli/vite.config.prod.ts learns the same ai/skills static-copy target that dev.ts and npm.ts already had; previously, the prod-bundled CLI silently shipped without skills.
- aedb5e7b): maybeStartDlaBridge runs before createStudioAgentSession; the bridge handle is threaded through buildAgentTools and its tools are spliced into the local-site tool list (not the remote-site branch, to avoid recursive migrations back into Studio). Teardown lives in the existing finally block alongside session.dispose().
- /liberate slash command (commit b1bebaaa): registers { name: 'liberate', description: __(...) } in tools/common/ai/slash-commands.ts's AI_SKILL_COMMANDS. The existing skill-dispatcher in apps/cli/commands/ai/index.ts routes through runAgentTurn(buildSkillInvocationPrompt('liberate')).
- studio liberate <url> standalone command (commit b42a8286): new yargs command at apps/cli/commands/liberate/. Thin wrapper that spawns DLA's CLI via process.execPath + tsx, inherits stdio, forwards SIGINT/SIGTERM, and propagates the exit code. No agent in the loop; DLA's own Ink UI streams to the terminal.
- 43a7d920): sets PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in .buildkite/pipeline.yml, .buildkite/release-build-and-distribute.yml, .buildkite/release-pipelines/code-freeze.yml, .github/workflows/publish-npm-package.yml, and apps/cli/package.json's install:bundle script. See the Playwright env-var caveat below.
- 65ce8848): tools/dla/bridge.ts resolves tsx as tsx/cli (the public exports key) rather than tsx/dist/cli.mjs, which throws ERR_PACKAGE_PATH_NOT_EXPORTED against tsx@4.21.0. Adds three regression tests against the production defaultTransportProvider resolution path.
Doc deliverables (T10, T11)
- apps/cli/README.md (commit 7c083282): new "Migrate from a closed platform" section between "Studio Code" and "Import and export". Covers the user-facing surface only: both invocation modes (/liberate inside studio code with STUDIO_DLA_ENABLED=1, and studio liberate <url> standalone), platform credential env vars (LIBERATION_TOKEN, SHOPIFY_ADMIN_TOKEN), and the Playwright Chromium cost.
- docs/design-docs/cli.md (commit 0cd93cab): new "Data Liberation Agent integration" section. Documents the as-built architecture: dep pin model, tools/dla/ layout, bridge spawn pipeline, tool wrapping, two-layer permission policy, feature-flag gating, bare-name tool surface, delegate: true handoff contract, both user surfaces, the orphan-work caveat (DLA does not honor notifications/cancelled), the Playwright env-var caveat, and the update cadence.
Scope
All code changes live in apps/cli/ and the new tools/dla/ workspace package. No apps/studio/ changes. No Electron-side touchpoints.
Testing Instructions
Build + unit tests
    npm install
    npm run cli:build
    npm test
Expected: 1721+ tests pass across all workspaces (45 in tools/dla/ alone). npm run typecheck passes for all workspaces including @studio/dla. npx eslint tools/dla apps/cli/ai/runtimes/pi/index.ts apps/cli/commands/liberate returns 0 errors.
Exercise /liberate end-to-end (agent path)
The agent integration is gated behind STUDIO_DLA_ENABLED=1 for v1. Without the flag, studio code behaves identically to pre-PR.
    STUDIO_DLA_ENABLED=1 node apps/cli/dist/cli/main.mjs code
    # inside the session:
    /liberate https://your-test-wix-or-squarespace-site.example
Expected: the agent introduces the skill, calls liberate_inspect, narrates results, uses AskUserQuestion to confirm, runs liberate_extract → liberate_verify → liberate_setup (with delegate: true), creates a Studio site via site_create with an inline importWxr blueprint step, then calls liberate_import with delegate: true and handles the returned manifest (media copy, redirect map, authors, optional Shopify products via wp_cli).
Smoke-check the bridge is active:
    STUDIO_DLA_ENABLED=1 node apps/cli/dist/cli/main.mjs code --json "list all tools available"
Expected: the response mentions liberate_inspect, liberate_extract, etc. alongside Studio's own tools. If the bridge fails to spawn, the runtime logs [studio code] DLA bridge degraded; continuing without DLA tools (...) and proceeds without DLA: the agent still answers, but liberate_* tools are absent.
For Webflow / Shopify test sites, also set LIBERATION_TOKEN=... / SHOPIFY_ADMIN_TOKEN=... before launching.
Exercise studio liberate <url> (standalone path)
Expected: DLA's Ink UI streams directly to the terminal.
--output and --non-interactive are forwarded; any additional DLA flags work too (yargs is non-strict for this command). Exit code propagates DLA's exit code; SIGINT/SIGTERM are forwarded to the child.
Permission policy
The destructive liberate_import bucket blocks calls without delegate: true. The block fires in two layers: the adapter-layer shouldBlock() check inside the tool's execute() wrapper, and the runtime-layer pi.on('tool_call', ...) extension hook. Either layer alone is sufficient; the second is defence-in-depth.
To smoke-test policy, watch the agent transcript for any liberate_import call without delegate: true; it should fail with the Studio policy error rather than hitting DLA. (The skill body explicitly instructs the agent never to call liberate_import without delegate: true.)
Verify CI Playwright env
PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 is now set in:
- .buildkite/pipeline.yml
- .buildkite/release-build-and-distribute.yml
- .buildkite/release-pipelines/code-freeze.yml
- .github/workflows/publish-npm-package.yml
- apps/cli/package.json → install:bundle script
See the Playwright env-var caveat below for what this actually does today.
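As a reading aid for the Permission policy section above, here is a minimal sketch of how a two-layer block of this kind could be structured. It is not the contents of tools/dla/policy.ts: the Bucket type, the exampleBuckets table, the Decision shape, wrapExecute, and onToolCall are all hypothetical names, and the classification of liberate_inspect is an assumption. Only the two rules themselves come from the PR text: liberate_import is refused without delegate: true, and an unknown DLA tool is denied.

```typescript
// Hypothetical sketch; names and shapes are illustrative, not the real policy.ts.
type Bucket = 'allow' | 'destructive';

// Assumed bucket table. Only the liberate_import entry is taken from the PR text.
const exampleBuckets: Record<string, Bucket> = {
  liberate_inspect: 'allow', // assumption: read-only inspection is unrestricted
  liberate_import: 'destructive',
};

interface Decision {
  block: boolean;
  reason?: string;
}

// Shared check used by both layers.
function shouldBlock(
  toolName: string,
  args: Record<string, unknown>,
  buckets: Record<string, Bucket> = exampleBuckets,
): Decision {
  const known = Object.prototype.hasOwnProperty.call(buckets, toolName);
  if (!known) {
    // Defensive default: a tool missing from the table is denied.
    return { block: true, reason: `unknown DLA tool: ${toolName}` };
  }
  if (buckets[toolName] === 'destructive' && args.delegate !== true) {
    return { block: true, reason: `${toolName} requires delegate: true` };
  }
  return { block: false };
}

// Layer 1 (adapter): guard inside the tool's execute() wrapper.
function wrapExecute<T>(
  toolName: string,
  execute: (args: Record<string, unknown>) => T,
): (args: Record<string, unknown>) => T {
  return (args) => {
    const decision = shouldBlock(toolName, args);
    if (decision.block) {
      throw new Error(`Studio policy: ${decision.reason}`);
    }
    return execute(args);
  };
}

// Layer 2 (runtime): the same check, shaped like a tool_call hook handler.
function onToolCall(event: { toolName: string; input: Record<string, unknown> }): Decision {
  return shouldBlock(event.toolName, event.input);
}
```

Routing both layers through one shouldBlock function keeps the rules in a single place, so the runtime hook cannot drift from the adapter check.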
Pre-merge Checklist
Five gates require an explicit human decision before this PR is mergeable. Each gate explains the state of the world and why the decision can't be deferred past merge.
Owner's planned path: the PR will first be tested in its current form (with DLA as a github: SHA pin) via Gate 1's end-to-end run. If the test succeeds, DLA will be published to npm before this PR merges, and the dep declaration in apps/cli/package.json will be flipped from the github: SHA pin to a standard npm semver range. That sequencing turns Gate 4 from "manual SHA decision at every release" into a normal npm-dep workflow (Dependabot / Renovate bumps, standard lockfile semantics). Gates 1 → 3 → (publish DLA) → 4-flip → 5-issue-filed → merge.
Gate 1: Real-site /liberate lifecycle verified
STUDIO_DLA_ENABLED=1 studio code against at least one live Wix or Squarespace test site, walks through /liberate <url> end-to-end, and confirms the new Studio site contains the expected pages, posts, and media.
What's happening: All Studio CLI implementation work landed and passes the 1724-test unit/integration suite. The bridge spawns DLA, tools register, the policy gates work, the wrapper-skill body parses cleanly. But neither the implementer agents nor the code-reviewer drove an actual /liberate flow against a real source site. Doing so requires DLA's adapters to perform real network I/O against a closed platform, drive headless Chromium against Wix/Squarespace, produce a real WXR, and import it into a fresh Studio site.
Why it's a merge-time gate: Unit tests exercise the bridge, the adapter, the policy, and the runtime wiring; they don't exercise DLA's adapter code paths or the agent's reasoning over real DLA outputs. There may be runtime issues that only surface end-to-end: a wrong tool-argument shape the wrapper skill emits, the content adapter mishandling some MCP response variant DLA produces against a real site, the importWxr blueprint failing on an unexpected WXR shape, the delegate: true manifest containing an unexpected field. The orchestrator's agents had no way to perform this test (no disposable test site, no credentials, no human in the loop to evaluate "did the migration actually work"). A human reviewer should do it once before merging.
Gate 2: Feature-flag default decided for v1
STUDIO_DLA_ENABLED ships off by default (current state) or on by default for v1. If on, README and design-doc references to the flag must be updated.
What's happening: Both the bridge spawn (in runAgentSessionTurn) and the policy extension factory (in DefaultResourceLoader.extensionFactories) check STUDIO_DLA_ENABLED === '1' and early-return when it's unset or '0'. With the flag off (the v1 default we shipped), the runtime is byte-for-byte identical to before this PR: no bridge child process, no DLA tools in customTools, no policy factory in the resource loader. Users running studio code see no /liberate, no DLA tools, nothing different.
Why it's a merge-time gate: This is a product decision, not a code one. Shipping with the flag off means /liberate is invisible to users until someone manually sets the env var. That is fine for a staged rollout, but it defeats the discovery story (no one will find /liberate by accident; the slash menu won't autocomplete it). Shipping with the flag on means every user gets /liberate by default: better discoverability, but it takes on the bridge child-process lifecycle, the ~150 MB Chromium download, and the cancellation-orphan caveat (Gate 5) for the entire user base on first install. The reviewer should pick which posture matches the project's rollout plan. The default could also be conditional (on for opt-in beta tracks, off for stable). Either choice is defensible; the orchestrator does not have the project context to choose.
Gate 3: Playwright Chromium download story decided
patch-package, or pre-populate PLAYWRIGHT_BROWSERS_PATH from a CI cache.
What's happening: DLA's package.json has "postinstall": "playwright install chromium". When any npm install pulls DLA in (CI, dev machines, end-user npm install -g wp-studio), DLA's postinstall fires and downloads ~150 MB of headless Chromium binaries. Wix and Squarespace adapters use that Chromium to scrape JavaScript-rendered pages. T9 set PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1 in .buildkite/, .github/workflows/, and the install:bundle script, which is the documented Playwright way to skip the download.
Why it's a merge-time gate: The T9 implementer discovered empirically that the env var is currently inert. Modern Playwright (the one DLA pins) no longer has a postinstall hook of its own, and that hook was the only place PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD got consulted. DLA's postinstall runs playwright install chromium, which reaches Playwright's installBrowsers() function directly and doesn't check the env var. So today, setting it has no effect: Chromium still downloads on every CI build. The env var landed anyway as zero-cost forward-compat (if Playwright re-adds postinstall behavior, or we patch DLA, it's already wired everywhere), but the ~150 MB CI cost is currently unmitigated. The reviewer should pick a mitigation strategy (or explicitly accept the cost) before merge so the gap is closed in the same release cycle, not deferred indefinitely.
Gate 4: DLA dep flipped from github: SHA pin to npm semver range
"data-liberation": "github:Automattic/data-liberation-agent#<sha>" to a normal npm semver range (e.g. "^0.1.0") before merge.
npm install --omit=dev resolves DLA from npm; node apps/cli/dist/cli/main.mjs liberate --help still works). Verify tools/dla/policy.ts defaultPolicyBuckets covers any tools DLA added between the SHA-audit point (2026-05-07) and the published version.
Automattic/data-liberation-agent HEAD instead, and the gate stays a manual decision at every future Studio release.
What's happening: DLA isn't published to npm and has no git tags as of this PR. To get a reproducible install, we pinned it to a specific commit SHA: "data-liberation": "github:Automattic/data-liberation-agent#17219c42b0420267302b138bf402930508006e0e" in apps/cli/package.json. That SHA was the DLA HEAD on 2026-05-07 (when wave-1 audited DLA's source). The github: pin works for testing this PR end-to-end but creates ongoing maintenance burden: DLA has no semver discipline, no automated bump PRs (Dependabot / Renovate don't track github: deps reliably), and every Studio release needs a manual SHA review.
Why it's a merge-time gate: the owner's plan is to test the PR with the current SHA pin and, on success, publish DLA to npm before merging this PR. After the publish, the dep declaration flips to a standard npm semver range and the gate effectively dissolves into normal npm-dep semantics: Dependabot opens auto-bump PRs, CI exercises the bridge integration on each bump, package-lock.json captures the resolved version, and a future Studio engineer doesn't need to manually compare SHAs at release time. The flip is a one-line edit; it needs to land in this same PR (not as a follow-up) because the version of DLA the bridge resolves at runtime determines whether defaultPolicyBuckets is complete, and that's part of what reviewers will be checking when they approve the merge.
Specifically, before merge: if DLA's published version exposes a 14th MCP tool that wasn't in the wave-1 inventory of 13, the defensive "unknown DLA tool → deny" path in tools/dla/policy.ts will hide the new tool from /liberate until the bucket table is extended. A git diff of DLA between the wave-1 SHA and the published version will surface this; if a new tool is present, add it to defaultPolicyBuckets in the same merge commit.
Gate 5: Upstream-DLA issue for notifications/cancelled filed
Automattic/data-liberation-agent requesting that DLA's MCP server wire notifications/cancelled (and ideally progressToken) into its tool handlers.
What's happening: MCP has a standard protocol message called notifications/cancelled that a client sends to the server when it wants to abort an in-flight tool call. Well-behaved MCP servers receive it and stop the work. Studio's bridge wires this correctly: when the agent's AbortSignal fires (Ctrl+C, model decides to bail), the bridge forwards the abort to DLA via notifications/cancelled. But DLA's MCP server doesn't honor it; DLA receives the message and keeps working.
Why it's a merge-time gate: Concretely, if the agent kicks off liberate_extract against a Wix site and 30 seconds in the user cancels, Studio sees the tool call as cancelled and the agent moves on. From DLA's side, the extraction continues silently to completion, writing partial output to disk. This is bounded: DLA's resume-safe protocol (extraction-log.jsonl, session.json, media-stubs.json) detects and reuses those partial outputs on the next liberate_extract run, so it's not data loss. But it is wasted CPU/network/disk in the background and mildly surprising semantics (the user thinks they cancelled, but source-platform requests keep going for a while). The orchestrator left filing the upstream issue to a human because it's cross-team coordination, not an in-repo implementation change. Filing it before merge surfaces the gap, lets DLA's maintainers plan a fix, and tracks the eventual update to Studio's docs.
Standard checklist
npm run typecheck clean across all workspaces; lint clean on touched files).
apps/studio/ changes? Yes (git diff --stat 46d83870..HEAD -- 'apps/studio/' is empty).
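For reviewers filing the Gate 5 upstream issue, the client-side half of the cancellation contract can be sketched as follows. This is an illustration under stated assumptions, not the code in tools/dla/bridge.ts: buildCancelled, forwardAbort, and the send callback are hypothetical names, and the reason string is made up. The message shape (a JSON-RPC 2.0 notification with method notifications/cancelled and a requestId parameter) follows the MCP specification. Per the PR text, DLA currently receives this message and ignores it.

```typescript
// Hypothetical sketch of forwarding an AbortSignal as an MCP cancellation.
interface CancelledNotification {
  jsonrpc: '2.0';
  method: 'notifications/cancelled';
  params: { requestId: string | number; reason?: string };
}

// Build the notification. JSON-RPC notifications carry no "id" field.
function buildCancelled(requestId: string | number, reason?: string): CancelledNotification {
  return {
    jsonrpc: '2.0',
    method: 'notifications/cancelled',
    params: reason === undefined ? { requestId } : { requestId, reason },
  };
}

// `send` stands in for whatever writes a message to the child's stdio transport.
// Returns a disposer that detaches the listener once the tool call settles.
function forwardAbort(
  signal: AbortSignal,
  requestId: string | number,
  send: (message: CancelledNotification) => void,
): () => void {
  const onAbort = (): void => send(buildCancelled(requestId, 'aborted by client'));
  if (signal.aborted) {
    // Already cancelled before the call started: notify immediately.
    onAbort();
    return () => {};
  }
  signal.addEventListener('abort', onAbort, { once: true });
  return () => signal.removeEventListener('abort', onAbort);
}
```

A server that honors the notification would look up the in-flight handler for requestId and stop it; that server-side half is what the upstream issue asks DLA to add.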