Add catalog/IPC baselines and harden documented catalog parity by osamu2001 · Pull Request #23 · deverman/FocusRelayMCP

osamu2001 · 2026-03-21T06:16:29Z

Summary

- Add catalog and IPC benchmark commands plus gate coverage for list-projects, list-tags, and bridge health.
- Align project/tag parity logic with documented Omni Automation surfaces and fail unsupported fields explicitly instead of guessing.
- Fix the osascript fallback pipe drain order, preserve default listProjects payloads, and restore cooldown handling for degraded benchmark failures.
- Refresh smoke/realistic benchmark summaries and the catalog progress documentation.

Validation

swift test
benchmark-gate-check --tool list-projects
benchmark-gate-check --tool list-tags
benchmark-gate-check --tool all

intent(benchmarks): capture "before" values for list_projects, list_tags, and the IPC hot path so later optimizations have a comparison baseline
decision(benchmarks): extend the baseline with benchmark-list-projects, benchmark-list-tags, and benchmark-bridge-health without changing the existing suite
decision(catalog-cache): use cacheTTL=0 during benchmarks so measurements reflect the query path instead of catalog cache hits
rejected(list-projects-parity): skipped the JXA parity baseline because it currently fails to decode, and landed this as a plugin-only contract baseline
constraint(cli-contract): kept the normal MCP read path and existing benchmark output contract unchanged by making the new commands and gates additive
learned(ipc-baseline): separating bridge timing from end-to-end latency makes transport overhead tails much easier to see

intent(benchmark-gate): list-projects and list-tags smoke gates should fail on payload drift rather than stable bad responses
decision(benchmark-gate): compare bridge catalog pages against JXA parity baselines and scope jxa_probe to all-only
rejected(benchmark-gate): bridge-vs-bridge contract checks because consistently wrong payloads still pass
constraint(bridge-validation): plugin install timed out against the sandbox plugin directory so local live gate acceptance remains environment-blocked
learned(omni-automation): project and tag parity baselines needed full page metadata plus stalled and count field mapping before they were trustworthy

intent(jxa-parity): benchmark-gate acceptance should pass under Codex instead of stalling on local JXA authorization and type-conversion failures
decision(jxa-parity): retry -1743 failures through osascript and run list-projects/list-tags parity scripts inside OmniFocus via evaluateJavascript
rejected(jxa-parity): direct JXA document queries for catalog parity because task.taskStatus cannot be converted reliably in that context
constraint(benchmark-gate): keep public CLI and MCP contracts unchanged while restoring install-plugin, restart, swift test, and benchmark-gate acceptance
learned(omni-automation): the same catalog parity logic is stable once it runs in Omni Automation, but direct JXA and OSAKit have different failure modes in this environment

intent(project-stalled): list-projects parity should not silently mark non-empty projects as stalled when Omni Automation cannot provide stalled metadata
decision(project-stalled): fail the JXA stalled scenario explicitly when nextTask or containsSingletonActions are unavailable instead of coercing them through safe boolean fallbacks
rejected(project-stalled): defaulting unsupported stalled inputs to true or false because either guess corrupts gate results and benchmark baselines
constraint(benchmark-gate): keep the active_counts_stalled scenario while making unsupported field access surface as a parity failure
learned(omni-automation): stalled parity needs a supported-fields probe separate from ordinary nullable nextTask values
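
The distinction between "field not supported" and "field legitimately empty" can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the repository: the `RawProject` shape, the `probeStalled` name, and the stalled rule itself (no next task and not a single-actions list) are assumptions made only to show a probe that reports an unsupported field instead of guessing a boolean.

```typescript
// Hypothetical record shape: a field the runtime does not expose is
// `undefined`, while `nextTask: null` is a real value ("no next action").
interface RawProject {
  nextTask?: { id: string } | null;
  containsSingletonActions?: boolean;
}

type StalledResult =
  | { kind: "ok"; stalled: boolean }
  | { kind: "unsupported"; field: string };

// Reports which field is missing rather than coercing it to true or false.
function probeStalled(p: RawProject): StalledResult {
  if (p.nextTask === undefined) {
    return { kind: "unsupported", field: "nextTask" };
  }
  if (p.containsSingletonActions === undefined) {
    return { kind: "unsupported", field: "containsSingletonActions" };
  }
  // Assumed rule, for illustration only.
  return { kind: "ok", stalled: p.nextTask === null && !p.containsSingletonActions };
}
```

A caller would treat `kind: "unsupported"` as a parity failure, so an unavailable field can never be mistaken for a stalled or non-stalled project.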

intent(tag-status): list-tags parity should not silently mix on-hold or dropped tags into the active scope when tag status is unavailable
decision(tag-status): treat missing or unrecognized Omni Automation tag status as an explicit unsupported failure instead of defaulting to active
rejected(tag-status): coercing unknown status to active because it hides runtime limitations and corrupts the active filter baseline
constraint(benchmark-gate): preserve the existing active tag scenarios while making unsupported status access surface clearly
learned(omni-automation): tag status parity is only trustworthy when the runtime yields a concrete status value or enum mapping
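
As a rough illustration of the tag-status decision, the sketch below maps a raw status to a known value and stops the active filter as soon as one tag has a missing or unrecognized status. The raw strings `"Active"`, `"OnHold"`, and `"Dropped"`, and the names `mapTagStatus` and `activeTagNames`, are assumptions for this example, not the repository's actual representation.

```typescript
type TagStatus = "active" | "onHold" | "dropped";

type StatusResult =
  | { kind: "ok"; status: TagStatus }
  | { kind: "unsupported"; raw: string };

// Maps a raw runtime value to a known status; anything else is unsupported.
function mapTagStatus(raw: unknown): StatusResult {
  switch (raw) {
    case "Active": return { kind: "ok", status: "active" };
    case "OnHold": return { kind: "ok", status: "onHold" };
    case "Dropped": return { kind: "ok", status: "dropped" };
    default: return { kind: "unsupported", raw: String(raw) };
  }
}

type ActiveResult =
  | { kind: "ok"; names: string[] }
  | { kind: "unsupported"; tag: string; raw: string };

// Builds the active scope, refusing to default unknown statuses to active.
function activeTagNames(tags: { name: string; status?: unknown }[]): ActiveResult {
  const names: string[] = [];
  for (const tag of tags) {
    const mapped = mapTagStatus(tag.status);
    if (mapped.kind === "unsupported") {
      return { kind: "unsupported", tag: tag.name, raw: mapped.raw };
    }
    if (mapped.status === "active") names.push(tag.name);
  }
  return { kind: "ok", names };
}
```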

intent(tag-counts): list-tags parity should not report zero tasks when Omni Automation cannot provide tag count collections
decision(tag-counts): surface missing availableTasks, remainingTasks, or tasks as explicit unsupported failures instead of converting null collections into empty arrays
rejected(tag-counts): zero-filling unsupported count inputs because it creates false parity failures and misleading benchmark baselines
constraint(benchmark-gate): keep the active_with_counts scenario while making unsupported tag count access obvious
learned(omni-automation): tag count parity is only meaningful after every convenience collection resolves to a concrete collection
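
The difference between an empty collection and an unavailable one can be sketched as follows. The `RawTag` shape and the `countTag` helper are hypothetical; the point is only that `null` or `undefined` is reported as unsupported, while an empty array is a valid count of zero.

```typescript
// Hypothetical tag record: each collection may be missing at runtime.
interface RawTag {
  name: string;
  availableTasks?: unknown[] | null;
  remainingTasks?: unknown[] | null;
  tasks?: unknown[] | null;
}

type CountResult =
  | { kind: "ok"; available: number; remaining: number; total: number }
  | { kind: "unsupported"; field: string };

// Returns the size of a concrete collection, or null when it is not one.
function collectionSize(value: unknown): number | null {
  return Array.isArray(value) ? value.length : null;
}

// Never zero-fills: the first non-collection field is reported by name.
function countTag(tag: RawTag): CountResult {
  const available = collectionSize(tag.availableTasks);
  if (available === null) return { kind: "unsupported", field: "availableTasks" };
  const remaining = collectionSize(tag.remainingTasks);
  if (remaining === null) return { kind: "unsupported", field: "remainingTasks" };
  const total = collectionSize(tag.tasks);
  if (total === null) return { kind: "unsupported", field: "tasks" };
  return { kind: "ok", available, remaining, total };
}
```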

intent(benchmarks): replace the stale smoke baselines added on this branch with measurements from the current bridge and gate state
decision(artifacts): keep only summary.md tracked for benchmark evidence and drop raw.jsonl from the commit scope
constraint(validation): smoke artifacts were regenerated only after swift test, plugin reinstall, OmniFocus restart, and list-projects/list-tags/all semantic gates passed

intent(benchmarks): capture merge-confidence baselines for catalog queries and bridge IPC in addition to the refreshed smoke artifacts
decision(artifacts): record realistic measurements in separate dated directories so smoke and realistic evidence stay comparable without overwriting each other
constraint(benchmarks): these runs use the documented realistic profile settings with summary.md tracked and raw.jsonl excluded

intent(progress-doc): replace the outdated catalog baseline writeup with one that matches the latest smoke and realistic measurements on this branch
decision(reporting): move the report to a new 2026-03-20 file so the document date matches the captured artifacts and branch state
constraint(references): remove all references to the superseded 2026-03-16 artifact paths and keep transport claims out of this report

intent(catalog-contract): make list_tags work using only the documented surface
decision(tag-counts): derive counts by aggregating taskStatus rather than reading the convenience pools
constraint(tag-pagination): include nested tags in totalCount even when flattenedTags is absent
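
One way to derive counts from per-task status instead of convenience collections is sketched below. The status names follow the `Task.Status` cases as I understand them from Omni Automation, but the bucketing rule (which statuses count as available or remaining) is an assumption for illustration and may differ from what the bridge actually does.

```typescript
type TaskStatus =
  | "Available" | "Blocked" | "Completed" | "Dropped"
  | "DueSoon" | "Next" | "Overdue";

interface TagCounts {
  available: number;
  remaining: number;
  total: number;
}

// Aggregates one pass over task statuses into the three counts.
function countsFor(statuses: TaskStatus[]): TagCounts {
  const counts: TagCounts = { available: 0, remaining: 0, total: 0 };
  for (const s of statuses) {
    counts.total += 1;
    switch (s) {
      case "Available":
      case "DueSoon":
      case "Next":
      case "Overdue":
        // Assumed: actionable statuses are both available and remaining.
        counts.available += 1;
        counts.remaining += 1;
        break;
      case "Blocked":
        // Assumed: blocked tasks remain but are not available.
        counts.remaining += 1;
        break;
      case "Completed":
      case "Dropped":
        break;
    }
  }
  return counts;
}
```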

intent(catalog-parity): make the bridge and JXA tag aggregation models match
decision(tag-enumeration): use a full traversal of root tags and their children as the fallback
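
The fallback enumeration can be pictured as a depth-first walk from the root tags. This is a minimal sketch under assumed shapes: `TagNode`, `TagSource`, and `enumerateTags` are hypothetical names, and the real scripts operate on OmniFocus objects rather than plain records.

```typescript
interface TagNode {
  name: string;
  children?: TagNode[];
}

// Hypothetical source: `flattenedTags` may be absent, root `tags` is present.
interface TagSource {
  flattenedTags?: TagNode[];
  tags: TagNode[];
}

// Pre-order walk that appends every tag, including nested ones.
function visit(nodes: TagNode[], out: TagNode[]): void {
  for (const node of nodes) {
    out.push(node);
    visit(node.children ?? [], out);
  }
}

// Prefers the flattened collection and falls back to a full traversal,
// so nested tags still contribute to the total count.
function enumerateTags(source: TagSource): TagNode[] {
  if (Array.isArray(source.flattenedTags)) return source.flattenedTags;
  const out: TagNode[] = [];
  visit(source.tags, out);
  return out;
}
```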

intent(project-health): do not let unsupported fields look like parity successes
constraint(jxa-projects): undefined means unsupported; a null nextTask is a legitimate stalled value

intent(gates): decouple benchmark-gate-check from whether unsupported surfaces are present
rejected(project-health): using nextTask and containsSingletonActions as the gate's ground truth

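intent(contract-docs): resolve the mismatch between the implementation and the review checklist
learned(catalog-contract): unless gates are restricted to the documented surface, they produce both false positives and false negatives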

intent(tag-fallback): make fallback enumeration from root tags work even when flattenedTags is unsupported
constraint(tag-fallback): keep the local tagItems separate from OmniFocus's global tags to avoid the temporal dead zone (TDZ)
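
The TDZ hazard mentioned here arises when a local `const` or `let` reuses the name of a global it is initialized from. The sketch below uses a stand-in global `tags` array (hypothetical data, not the OmniFocus object) and shows the failing pattern only as a comment, since it would throw at runtime.

```typescript
// Stand-in for the root tag collection a host environment would provide.
const tags: { name: string }[] = [{ name: "Home" }, { name: "Work" }];

function listTagNames(): string[] {
  // Problematic pattern, shown as a comment:
  //   const tags = tags.map((t) => t.name);
  // The inner `tags` shadows the global for the whole block, so the
  // right-hand side reads the inner binding before it is initialized
  // and throws a ReferenceError (temporal dead zone).

  // Using a distinct local name leaves the global readable.
  const tagItems = tags.map((t) => t.name);
  return tagItems;
}
```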

intent(osascript-fallback): keep the Apple Event recovery path usable for large list_tasks and list_projects responses
decision(pipe-drain): start stdout and stderr readers before waitUntilExit so osascript cannot block on a full pipe buffer
constraint(automation-errors): preserve the existing AutomationError surface while making the drain order testable

intent(list-projects): calls that omit fields must keep the legacy default payload instead of probing unsupported project extras
decision(project-fields): gate nextTask, containsSingletonActions, hasChildren, and isStalled behind explicit field requests while preserving includeTaskCounts
constraint(project-query): keep the completedBefore exclusive filter fix in the same script path while restoring omitted-fields compatibility
learned(script-capture): the JXA source assertions need to tolerate escaped inner script strings when validating generated automation code
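
Gating the extra project fields behind explicit requests could look roughly like this. The request shape and the `gatedFieldsToProbe` helper are hypothetical; only the four gated field names and `includeTaskCounts` come from the description above.

```typescript
type GatedField =
  | "nextTask" | "containsSingletonActions" | "hasChildren" | "isStalled";

const GATED_FIELDS: GatedField[] = [
  "nextTask", "containsSingletonActions", "hasChildren", "isStalled",
];

// Hypothetical request shape; `fields` is omitted by legacy callers.
interface ListProjectsRequest {
  fields?: string[];
  includeTaskCounts?: boolean;
}

// Only fields named explicitly are probed; omitted `fields` probes nothing,
// and `includeTaskCounts` is handled independently of this gate.
function gatedFieldsToProbe(req: ListProjectsRequest): GatedField[] {
  const requested = req.fields ?? [];
  return GATED_FIELDS.filter((f) => requested.indexOf(f) !== -1);
}
```

With this shape, a call that omits `fields` never touches the extras, so its payload stays on the legacy default path.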

intent(benchmark): failed benchmark calls must honor cooldown even when the bridge fails without a timeout
decision(cooldown): make the shared cooldown helper unconditional for failure paths because its callers already sit on error branches
decision(bridge-health): treat ok:false health responses as degraded failures so unhealthy plugin probes back off like thrown bridge errors
learned(benchmark-tests): a small async cooldown assertion plus source guards catches both helper regressions and the bridge-health ok:false path
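
The cooldown behavior described here can be reduced to two small pure functions. This is a sketch with hypothetical names (`CallOutcome`, `classifyHealth`, `cooldownMsAfter`); the actual helper is asynchronous and lives in the Swift benchmark code, which this example does not reproduce.

```typescript
type CallOutcome =
  | { kind: "success" }
  | { kind: "timeout" }
  | { kind: "bridgeError"; message: string };

interface HealthResponse {
  ok: boolean;
}

// An ok:false health response is treated like a thrown bridge error.
function classifyHealth(response: HealthResponse): CallOutcome {
  return response.ok
    ? { kind: "success" }
    : { kind: "bridgeError", message: "health probe returned ok:false" };
}

// Every failure kind backs off, not only timeouts.
function cooldownMsAfter(outcome: CallOutcome, cooldownMs: number): number {
  return outcome.kind === "success" ? 0 : cooldownMs;
}
```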

osamu2001 added 18 commits March 19, 2026 23:44

fix(bridge-tags): align tag counts with documented catalog model

0dcabab

intent(catalog-parity): make the bridge and JXA tag aggregation models match
decision(tag-enumeration): use a full traversal of root tags and their children as the fallback

fix(jxa-projects): reject absent unsupported health fields

9818595

intent(project-health): do not let unsupported fields look like parity successes
constraint(jxa-projects): undefined means unsupported; a null nextTask is a legitimate stalled value

fix(benchmark-gates): restrict catalog parity to contract-backed fields

a669fd6

intent(gates): decouple benchmark-gate-check from whether unsupported surfaces are present
rejected(project-health): using nextTask and containsSingletonActions as the gate's ground truth

docs(contract): align catalog docs with implemented query model

585e635

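intent(contract-docs): resolve the mismatch between the implementation and the review checklist
learned(catalog-contract): unless gates are restricted to the documented surface, they produce both false positives and false negatives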

fix(bridge-tags): avoid shadowing root tag collection

7d6361a

intent(tag-fallback): make fallback enumeration from root tags work even when flattenedTags is unsupported
constraint(tag-fallback): keep the local tagItems separate from OmniFocus's global tags to avoid the temporal dead zone (TDZ)

osamu2001 closed this Mar 26, 2026

osamu2001 wants to merge 18 commits into deverman:master from osamu2001:chore/catalog-ipc-baselines
