Fix/integration awareness oauth guard #61

Open
gotham64 wants to merge 112 commits into main from
fix/integration-awareness-oauth-guard

Conversation

@gotham64 (Member) commented Mar 8, 2026

This pull request introduces several enhancements to the dashboard and chat experience, with a focus on expanding built-in templates, improving agent group chat handling, and clarifying canvas component usage for custom visuals. It also adds new UI elements for telemetry and refines descriptions and behaviors for canvas tools. Below are the most important changes grouped by theme:

Dashboard and Templates Expansion

  • Added three new built-in templates: creative-report (for custom branded visuals and narrative cards), day-planner (daily schedule and task checklist), and service-status (multi-service health board). This expands the available dashboard templates and increases the seed count in tests. [1] [2]
  • Updated the canvas domain summary to clarify live data fetching, auto-refreshing dashboards, interactive forms, and design options for custom visuals using the embed component.

Canvas Tool Improvements

  • Enhanced the canvas_push function and its parameter descriptions to emphasize using the embed type for any custom visual, branded, or creative dashboard, and clarified the data structure for each component type.
  • Improved the canvas_update function description to highlight its role in live-data updates and auto-refreshing dashboards.

Agent Group Chat Handling

  • Implemented a group chat fan-out mechanism: after the primary agent responds, remaining group agents sequentially contribute their perspectives, each with context-enriched prompts and UI attribution. [1] [2]
  • When finalizing a streaming response, the message now includes agentId and agentName for better attribution in chat history.

Streaming and Event Handling

  • Ensured that the full assembled response from Rust (event.text) is passed through and preferred over accumulated deltas, preventing content truncation if IPC events are dropped. [1] [2]

UI and Styling Enhancements

  • Added styles for tool step indicators and thinking indicators in mini chat, and introduced a new telemetry card layout with stats, charts, and model breakdown columns. Also refined the background and removed hover effects for the brain canvas area. [1] [2] [3]

Copilot Agent and others added 30 commits March 5, 2026 23:23
…fix skill ID inconsistency

- google.rs: 26 tests covering definitions, parameters, dispatch consistency,
  response parsing (extract_body_text, base64url), URL safety
- n8n.rs: 15 tests for integration-to-skill mapping, service_to_skill_id,
  oauth_service_to_n8n_node, and cross-function consistency
- oauth.rs: 15 tests for Google Workspace unified routing, scope coverage,
  n8n type consistency, and tier routing completeness
- integrations.rs: engine_integration_preflight command with 5 checks
  (OAuth services, Google scopes, Google tools, aliases, n8n engine)
- health.rs: n8n minimum version guard (>= 1.76.0 for MCP), parse_version()
  utility with 4 tests
- Fix: service_to_skill_id now returns google_workspace (underscore) matching
  the skill tools dispatch table, not google-workspace (dash)
- Fix: oauth_service_to_n8n_node made pub(crate) for cross-module testing
- Registered engine_integration_preflight in Tauri invoke_handler
- Test count: 723 → 784 (all passing)
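The minimum-version guard above needs a tolerant version parser. A minimal sketch of the contract (the real `parse_version()` lives in Rust in `health.rs`; this TypeScript rendering is illustrative only):

```typescript
// Parse "1.76.0" (optionally with a leading "v"; patch may be omitted)
// into a numeric triple; returns null when the string is not a version.
function parseVersion(s: string): [number, number, number] | null {
  const m = s.trim().match(/^v?(\d+)\.(\d+)(?:\.(\d+))?/);
  if (!m) return null;
  return [Number(m[1]), Number(m[2]), Number(m[3] ?? "0")];
}

// Component-wise compare: true when `actual` >= `required`,
// e.g. the ">= 1.76.0 for MCP" gate.
function versionAtLeast(actual: string, required: string): boolean {
  const a = parseVersion(actual);
  const r = parseVersion(required);
  if (!a || !r) return false;
  for (let i = 0; i < 3; i++) {
    if (a[i] !== r[i]) return a[i] > r[i];
  }
  return true;
}
```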
…test

Bug fix (critical):
- ToolCallResult.is_error was missing #[serde(rename = "isError")]
  The MCP spec uses camelCase "isError" but Rust expected "is_error",
  causing every MCP server error response to be silently treated as success

Contract tests (~60 new, 836 total):
- n8n.rs: response-contract tests for workflow/package/credential parsing,
  error classification, N8nTestResult serialization, MCP token validation
- types.rs: MCP protocol contract tests (InitializeResult, ToolsListResult,
  ToolCallResult, JsonRpcRequest/Response, StreamableHttp alias, 200-tool cap)
- client.rs: extract_text_content edge cases, protocol version/timeout constants
- health.rs: MCP detection logic, JWT validation, API key parsing,
  login field names, settings request shape, URL construction

Docker smoke test (scripts/integration-smoke-test.sh):
- Spins up real n8n in Docker, tests 22 API contracts end-to-end
- Tests: healthz, owner setup, login, API key generation, workflow CRUD,
  community packages, MCP detection/enable/token, Streamable HTTP
  initialize + tools/list, credentials.json (416 types), nodes.json
- Result: 22 passed, 0 failed, 1 skipped (optional version header)
…uide fixes

Bug fixes:
1. rest_api collision: 12 services all shared skill_id 'rest_api', so
   connecting a second service overwrote the first. Each service now
   gets its own skill vault namespace (notion, linear, stripe, etc.).

2. AI blindness: The AI had zero information about which service was
   connected. Added build_integration_awareness() that injects a
   'Connected REST API Services' section into the system prompt listing
   every connected service with name, base URL, hints, and exact
   rest_api_call syntax.

3. Notion wrong skill: Notion has a dedicated builtin skill with
   detailed API docs, but map_integration_to_skill returned 'rest_api'
   instead of 'notion'. Now correctly routes to the builtin.

4. N8N_OAUTH_SERVICE_IDS: Removed API-key services (stripe, todoist,
   clickup, airtable, trello, etc.) that were incorrectly classified
   as OAuth, causing wrong 'Connect via n8n' button.

5. OAuth setup guides: buildGuide() always showed 'Navigate to
   Settings -> API' even for OAuth services. Now shows auth-type-aware
   instructions.

Tests: 846 passed (was 836), 10 new tests added.
When OPENPAWZ_*_CLIENT_ID env vars aren't set at build time, the OAuth
button now shows a helpful 'OAuth not configured' banner and falls
back to the manual API key setup flow instead of letting the user
click Connect and hit a cryptic error.

Changes:
- oauth.rs: Added 'configured' field to OAuthServiceInfo
- molecules.ts: fetchOAuthAvailability() caches real client IDs
- Detail panel falls back to API key setup for unconfigured services
- _integrations.css: styled the oauth-unavailable info banner
The clean script deleted ~/Library/WebKit/openpawz but the actual
WebView data dir is ~/Library/WebKit/com.openpawz.openpawz (matching
the Tauri identifier). This left stale localStorage (paw-lock-mode)
after a clean, causing the lock screen to show the unlock form with
no visible buttons (vault was wiped but localStorage still pointed
to system auth mode).

Also added ~/Library/Caches/com.openpawz.openpawz to the wipe list.
…nt TDZ crash

OAUTH_SERVICE_IDS and N8N_OAUTH_SERVICE_IDS were declared at line 5799
but referenced at line 39 inside buildGuide(), which is called during
module initialization at line 786. This caused a ReferenceError (temporal
dead zone) that crashed the entire app on startup, leaving the user
stuck on the lock screen with no buttons rendered.

Moved both Set declarations to the top of the file, before any code
that references them.
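A minimal reproduction of the crash pattern (not the actual file — names are illustrative): a hoisted function reads a `const` Set during module initialization, before the declaration at the bottom of the file has been evaluated.

```typescript
// Function declarations are hoisted, so this compiles and can be called
// early — but reading OAUTH_IDS before its `const` runs throws a
// ReferenceError (temporal dead zone).
function buildGuide(serviceId: string): string {
  return OAUTH_IDS.has(serviceId) ? "oauth guide" : "api-key guide";
}

let crashed = false;
try {
  buildGuide("github"); // simulates the module-init call
} catch (e) {
  crashed = e instanceof ReferenceError;
}

// Declaration formerly at the bottom of the file. The fix is simply to
// hoist it above any code that runs during module initialization.
const OAUTH_IDS = new Set(["github", "google-workspace"]);
```

Once the declaration has run, later calls succeed — which is why the crash only manifested during startup.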
…th connect

Root cause: Google Workspace had credentialFields: undefined and
nodeType: 'n8n-nodes-base.gmail'. The svc() function fell through
to CREDENTIAL_OVERRIDES['n8n-nodes-base.gmail'] which contained
Service Account fields (Region, Service Account Email, Private Key).

When OAuth wasn't configured (no GOOGLE_CLIENT_ID at build time),
clicking Setup showed this irrelevant Service Account form.

Fixes:
- Set credentialFields to [] for Google Workspace (OAuth-only, no
  manual credential fields needed)
- Setup guide now shows an informative message when a service has
  no credential fields instead of an empty form with Test & Save
…rations

Rebase merged the old bottom-of-file declarations back in, creating
duplicate const errors that crash esbuild/Vite on startup.
…, YouTube, Vector Search

Added 10 new Google API scopes to the PKCE OAuth config:
- Google Chat (messages + spaces)
- Google Tasks (read/write)
- Google Contacts / People API (read-only)
- Google Keep (read/write)
- Google Forms (read-only)
- YouTube Data (read-only)
- Vertex AI Vector Search (cloud-platform)

Updated docs/oauth-app-registration.md to match.
The .env file is gitignored so the Client ID never reached users who
cloned the repo. PKCE Client IDs are public (RFC 8252 §8.1) so it's
safe to ship in source. The env var override still works for forks.
…counts

Google Keep API rejects consumer Gmail accounts with invalid_scope.
Removed from both default_scopes and write_scopes.
OAuth stores tokens under 'oauth:google-workspace' (the service_id)
but the Google tools, calendar, and mail commands were looking for
'oauth:google'. Added fallback chain: try google-workspace first,
then legacy google key.
google_calendar_create:
- Added recurrence param (RRULE array)
- Added timezone param (IANA tz for start/end)
- Added all-day event detection (date-only format)

google_api:
- Handle body sent as JSON string (LLM sometimes stringifies)
- Parse string body back to JSON object before sending
- Preserves existing object/array passthrough
Implement the execute_plan pseudo-tool that lets the model submit a
complete multi-step plan in a single inference call. The engine validates
the DAG, parallelizes independent nodes via tokio::spawn, retries
transient failures with exponential backoff, and degrades gracefully
when dependencies fail.
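The DAG validation and parallelization described above can be sketched as phase layering: group nodes so that each phase depends only on earlier phases, and detect cycles when no node is ready. A language-agnostic sketch in TypeScript (the real executor is Rust in plan/executor.rs, with tokio::spawn, retries, and timeouts on top of this core):

```typescript
type Plan = Record<string, string[]>; // node id -> ids it depends on

// Group nodes into phases: every node in a phase depends only on nodes
// in earlier phases, so nodes within a phase can run in parallel.
// Throws when no progress is possible, i.e. the "DAG" has a cycle.
function computePhases(plan: Plan): string[][] {
  const remaining = new Set(Object.keys(plan));
  const done = new Set<string>();
  const phases: string[][] = [];
  while (remaining.size > 0) {
    const ready = [...remaining].filter((id) =>
      plan[id].every((dep) => done.has(dep)),
    );
    if (ready.length === 0) throw new Error("cycle detected in plan DAG");
    for (const id of ready) {
      remaining.delete(id);
      done.add(id);
    }
    phases.push(ready);
  }
  return phases;
}
```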

Rust (src-tauri/src/engine/plan/):
- atoms.rs: Pure types, constants, validation (cycle detection, secret
  scanning, depth computation), 11 unit tests
- molecules.rs: Plan parser, tool-registry validator, result context
  builder, plan description formatter, 5 unit tests
- executor.rs: Parallel phase execution with per-node timeout,
  retry with backoff, dependency-aware skip, plan timeout,
  panic isolation, 2 unit tests
- mod.rs: Barrel exports

Wiring:
- engine/mod.rs: Register plan module
- tools/mod.rs: execute_plan tool definition + builtins() registration
- agent_loop/mod.rs: Intercept execute_plan, route to DAG executor
- chat.rs: Plan awareness in system prompt (standard + budgeted)
- atoms/types.rs: PlanStart, PlanNodeStart, PlanComplete events
- prompts/plan.md: Model instructions for execute_plan usage

TypeScript (src/features/action-dag/):
- atoms.ts: Frontend types and pure display helpers
- molecules.ts: In-memory plan tracker, event handlers
- index.ts: Barrel exports

Tests: 20 new (866 lib + 66 integration = 932 total, all passing)
Clippy clean, cargo fmt clean, tsc clean.
Implement per-provider constrained decoding to eliminate tool call
parse failures. Each provider now receives optimal constraints:

- OpenAI: strict:true on function definitions + additionalProperties:false
  (Structured Outputs for gpt-4o, o1, o3, o4-mini, gpt-4.1*)
- Ollama: format:"json" on request body for grammar-level enforcement
- Anthropic: explicit tool_choice:{type:"auto"} for structured calling
- Google Gemini: tool_config with function_calling_config:{mode:"AUTO"}
- OpenRouter: strict mode forwarded for OpenAI-routed models
- DeepSeek/Grok/Mistral/Moonshot: structured (no strict), no changes
- Custom: no constraints, fallback to parse+retry
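The OpenAI strict-mode transform is the most involved of these: Structured Outputs requires every object schema to carry additionalProperties:false and list all its properties as required. A sketch of that recursive walk (the real code is Rust in constrained/atoms.rs; this TypeScript version is illustrative):

```typescript
type Schema = { [k: string]: any };

// Recursively pin down a JSON Schema for OpenAI strict mode:
// every object gets additionalProperties:false and all of its
// properties marked required; array item schemas are walked too.
function enforceStrict(schema: Schema): Schema {
  const out: Schema = { ...schema };
  if (out.type === "object") {
    out.additionalProperties = false;
    if (out.properties) {
      out.required = Object.keys(out.properties);
      out.properties = Object.fromEntries(
        Object.entries(out.properties).map(([k, v]) => [
          k,
          enforceStrict(v as Schema),
        ]),
      );
    }
  }
  if (out.type === "array" && out.items) out.items = enforceStrict(out.items);
  return out;
}
```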

Agent loop updated: skip malformed-JSON retry when constrained decoding
is active (parse failure indicates a deeper issue, not a format mistake).

Atomic structure (constrained/):
  atoms.rs     — ConstraintLevel, ConstraintConfig, detect_constraints(),
                 supports_openai_strict(), enforce_additional_properties_false()
  molecules.rs — apply_openai_strict(), apply_ollama_json_format(),
                 apply_anthropic_tool_choice(), apply_google_tool_config()
  mod.rs       — barrel exports

Other changes:
- ProviderKind: derive Copy+Eq (fieldless enum, zero-cost)
- OpenAiProvider: store ProviderKind for constraint detection
- OpenAiProvider::kind() returns actual variant (was hardcoded OpenAI)

19 new tests (13 atoms + 6 molecules), 951 total, all passing.
cargo fmt ✓ | cargo clippy ✓ | cargo test ✓
Persistent tool embedding storage with four-tier search failover.
Extends existing Tool RAG (tool_index.rs) with SQLite persistence,
incremental indexing, hierarchical search, and domain centroids.

New module: engine/tool_registry/
  atoms.rs     (455 lines) — Pure types (ToolSource, ToolEmbeddingRecord,
                SearchTier), BM25 scoring, domain keyword classifier,
                cosine similarity, f32↔bytes conversion. 17 tests.
  molecules.rs (874 lines) — PersistentToolRegistry: SQLite CRUD,
                incremental indexing with circuit breaker, four-tier
                failover search (Vector→BM25→DomainKeyword), hierarchical
                domain centroids, stale pruning. 12 tests.
  mod.rs       (19 lines)  — Barrel re-exports.

Schema: tool_embeddings table (tool_name PK, description, embedding BLOB,
  domain, source, updated_at) with domain and source indices.

Wiring: pub mod tool_registry in engine/mod.rs, pub(crate) schema,
  pub fn tool_domain in tool_index.rs.

Four-tier search failover:
  Tier 1: Local Ollama embeddings (~50ms, zero cost, full privacy)
  Tier 2: Cloud embedding API (OpenAI/Google — tiny cost)
  Tier 3: BM25 keyword search (no embeddings needed)
  Tier 4: Domain keyword matching (absolute fallback, always works)
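The failover shape itself is simple: try each tier in order, fall through on error or empty results, so search always returns something. A sketch of that loop plus the cosine similarity at the heart of the vector tiers (the real code is Rust in tool_registry/; tier functions here are stand-ins):

```typescript
// A tier returns hits, or null/throws when unavailable
// (e.g. no embedding API configured).
type Tier = (query: string) => string[] | null;

function searchWithFailover(
  query: string,
  tiers: Tier[],
): { tier: number; hits: string[] } {
  for (let i = 0; i < tiers.length; i++) {
    try {
      const hits = tiers[i](query);
      if (hits && hits.length > 0) return { tier: i + 1, hits };
    } catch {
      // tier unavailable — fall through to the next one
    }
  }
  return { tier: tiers.length, hits: [] };
}

// Cosine similarity between two embedding vectors (Tiers 1 and 2).
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] ** 2;
    nb += b[i] ** 2;
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}
```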

982 tests (916 lib + 66 integration), 0 failures.
cargo fmt ✓  cargo clippy ✓  cargo test ✓
MessagePack binary serialization infrastructure, delta event batching,
structured agent envelopes, and compact plan result accumulation.

New module: engine/binary_ipc/
  atoms.rs     (682 lines) — WireFormat enum, BinaryEnvelope, MessagePack
                encode/decode (rmp-serde), CompactDelta + DeltaBatch for
                token batching, AgentEnvelope + TypedPayload for structured
                inter-agent messaging (Direct/Broadcast/Handoff/StatusUpdate/
                DataExchange), CompactNodeResult for binary plan results,
                WireStats + measure_wire_format() for benchmarking JSON vs
                MessagePack, should_use_binary() threshold check. 18 tests.
  molecules.rs (886 lines) — EventBatcher: accumulates streaming deltas,
                flushes at MAX_BATCH_SIZE (32) or BATCH_FLUSH_INTERVAL_MS
                (50ms), reduces IPC overhead up to 32×. ResultAccumulator:
                binary buffer for plan DAG results, to_msgpack() + context
                string generation. AgentMessageCodec: encode/decode typed
                envelopes to/from MessagePack, legacy format conversion
                (content + metadata pairs), wire size measurement.
                recommended_format() for per-event format negotiation.
                log_session_stats() for session performance summary. 20 tests.
  mod.rs       (27 lines)  — Barrel re-exports.

Dependency: rmp-serde = "1" (MessagePack, already a transitive dep
  via matrix-sdk and tauri-plugin-sql — now direct).

Key capabilities:
  - MessagePack 60-80% smaller than JSON for IPC payloads
  - Delta batching: 50-100 tokens/sec → 2-3 IPC calls/sec (vs 50-100)
  - Typed agent envelopes: ToolResult, PlanFragment, DataTable, Raw
  - Binary plan result accumulation: N msgpack appends + 1 decode
  - Format negotiation: hot paths (delta, thinking) → binary; cold → JSON
  - Wire benchmarking: measure_wire_format() for data-driven optimization
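The batching logic reduces to one rule: flush when the batch is full or the interval has elapsed. A sketch assuming the commit's constants (MAX_BATCH_SIZE = 32, BATCH_FLUSH_INTERVAL_MS = 50); the real EventBatcher is Rust in binary_ipc/molecules.rs:

```typescript
// Accumulates streaming token deltas and emits them as one joined IPC
// payload when the batch is full or the flush interval has elapsed.
class EventBatcher {
  private buf: string[] = [];
  private lastFlush = Date.now();

  constructor(
    private emit: (joined: string) => void,
    private maxSize = 32, // MAX_BATCH_SIZE
    private intervalMs = 50, // BATCH_FLUSH_INTERVAL_MS
  ) {}

  // `now` is injectable for testing; defaults to wall-clock time.
  push(delta: string, now = Date.now()): void {
    this.buf.push(delta);
    if (this.buf.length >= this.maxSize || now - this.lastFlush >= this.intervalMs) {
      this.flush(now);
    }
  }

  flush(now = Date.now()): void {
    if (this.buf.length === 0) return;
    this.emit(this.buf.join(""));
    this.buf = [];
    this.lastFlush = now;
  }
}
```

With fast models emitting 50-100 deltas per second, the 32x size cap and 50 ms interval are what turn per-token IPC calls into the 2-3 calls/sec figure above.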

1020 tests (954 lib + 66 integration), 0 failures.
cargo fmt ✓  cargo clippy ✓  cargo test ✓
… for agents)

Implements the final phase of the Agent Execution Roadmap: speculative
tool execution based on historical transition patterns, analogous to
CPU branch prediction.

New module: engine/speculative/
  atoms.rs     (952 lines) — Pure types, constants, and deterministic functions
  molecules.rs (885 lines) — SQLite persistence, in-memory cache, connection warming
  mod.rs       (36 lines)  — Barrel re-exports

Core components:

1. Transition probability matrix (TransitionStore)
   - SQLite-backed tool_sequences table: (from_tool, to_tool, count, last_seen)
   - Upsert on each tool completion, building a frequency-based matrix
   - Top-N successor queries with normalized probabilities
   - Stale entry pruning for long-running instances

2. Tool mutability classification (classify_tool_mutability)
   - Safety invariant: ONLY read-only tools are ever speculatively executed
   - Explicit classification for 30+ built-in tools (read_file, memory_search, etc.)
   - Pattern-based detection for prefixed tools (google_*, discord_*, mcp_*, etc.)
   - Conservative default: unknown tools → Write (never speculate)

3. Prediction engine (predict_next_tools, predict_and_record)
   - Configurable threshold (default: P ≥ 0.5)
   - Minimum observation count before trusting a transition (default: 3)
   - Top-K candidate selection with read-only filtering
   - Records transitions while predicting (combined operation)

4. Speculative cache (SpeculativeCache)
   - In-memory LRU cache keyed by (tool_name, args_hash)
   - TTL-based expiration (default: 30s)
   - Bounded capacity (default: 64 entries) with oldest-first eviction
   - Hit/miss/eviction statistics tracking

5. Cancellation sessions (SpeculationSession)
   - Arc<AtomicBool> flag for lock-free cross-thread cancellation
   - When model calls a different tool than predicted, cancel in-flight speculation

6. Connection pre-warming (warm_connection, warm_connections_batch)
   - TCP connect + drop to prime DNS cache and verify connectivity
   - Domain-to-host mapping for 9 API domains (Google, Discord, Slack, etc.)
   - Saves 100-300ms per subsequent real API call

7. Outcome resolution and observability
   - Hit/Miss/Cancelled/Expired outcome tracking
   - Per-session statistics with hit rate computation
   - Human-readable summary formatting
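The prediction path (components 1 and 3) can be sketched as a frequency matrix with a probability threshold and a minimum observation count. An in-memory TypeScript rendering of what the SQLite-backed Rust TransitionStore does, using the commit's defaults (P >= 0.5, minimum 3 observations):

```typescript
// Frequency matrix of observed tool transitions, mirroring the
// tool_sequences table: counts of (fromTool -> toTool).
class TransitionStore {
  private counts = new Map<string, Map<string, number>>();

  // Upsert on each tool completion.
  record(from: string, to: string): void {
    const row = this.counts.get(from) ?? new Map<string, number>();
    row.set(to, (row.get(to) ?? 0) + 1);
    this.counts.set(from, row);
  }

  // Successors with normalized probability >= threshold, requiring
  // minObs observations before a transition is trusted.
  predict(from: string, threshold = 0.5, minObs = 3): string[] {
    const row = this.counts.get(from);
    if (!row) return [];
    const total = [...row.values()].reduce((a, b) => a + b, 0);
    return [...row.entries()]
      .filter(([, n]) => n >= minObs && n / total >= threshold)
      .sort((a, b) => b[1] - a[1])
      .map(([to]) => to);
  }
}
```

The real engine additionally filters predictions through classify_tool_mutability so only read-only tools are ever speculatively executed.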

Schema changes:
  - New tool_sequences table with (from_tool, to_tool) primary key
  - Indexed on from_tool for fast successor lookups

No new dependencies — uses existing rusqlite, serde, std::net, std::sync.

54 new tests (29 atoms + 25 molecules), 1,074 total (1,008 lib + 66 integration).
All tests pass. cargo fmt clean. cargo clippy clean (0 speculative warnings).
- constrained/molecules.rs: remove unused import ConstraintLevel from tests
- plan/executor.rs: use variables for literal .min() to avoid 'never greater
  than' clippy warning; use local bindings for constant assertion

0 warnings in all 5 roadmap modules. 1,074 tests still passing.
Bug 1 — HTTP 401 'Wrong username or password':
When the OS vault is cleared but the n8n database persists, owner_password()
generates a NEW random password while n8n still has the owner registered with
the OLD password.  Subsequent session logins (POST /rest/login) fail with 401,
blocking enable_mcp_access() and retrieve_mcp_token().

Fix: Both functions now accept an api_key parameter. They first attempt
session-based login (existing behaviour). On 401 or network error they
fall back to X-N8N-API-KEY header authentication, which uses the
separately-managed N8N_API_KEY env var that IS still valid.

Bug 2 — Duplicate log lines:
get_or_retrieve_mcp_token() in commands/n8n.rs redundantly called
setup_owner_if_needed() + enable_mcp_access() before retrieve_mcp_token().
These same functions are ALREADY called by the provisioning / reconnect paths
in process.rs, docker.rs, and mod.rs — so every MCP failure message appeared
twice per engine_n8n_ensure_ready() invocation.

Fix: Removed the redundant calls in get_or_retrieve_mcp_token(). Added a
comment explaining why they are intentionally omitted.

Changed files:
  - health.rs: enable_mcp_access(url) → enable_mcp_access(url, api_key)
               retrieve_mcp_token(url) → retrieve_mcp_token(url, api_key)
               Both try session auth first, fall back to API key on failure
  - process.rs: pass &api_key to enable_mcp_access()
  - docker.rs:  pass &api_key / &config.api_key at both call sites
  - mod.rs:     pass &config.api_key at both reconnect paths
  - n8n.rs:     remove redundant setup_owner + enable_mcp calls,
               pass &config.api_key to retrieve_mcp_token()
Root cause: n8n MCP endpoints (/rest/mcp/settings, /rest/mcp/api-key)
only accept session cookies — NOT the X-N8N-API-KEY header. The previous
commit's API key fallback was ineffective; both auth methods returned 401.

The real problem: owner_password() generated a random password stored in
the OS keychain. If the keychain was cleared but n8n's database persisted
(or vice versa), the passwords went out of sync with no recovery path.

Changes:

1. Deterministic owner password (eliminates root cause):
   owner_password() now derives the password from the n8n encryption key
   via HMAC-SHA256(encryption_key, 'paw-n8n-owner-v1'). The encryption
   key is already carefully managed (vault + config sync), so the password
   is always reproducible from it. No separate vault entry needed.
   Old random passwords in the vault are auto-cleaned on first call.

2. Automatic 401 recovery (handles existing mismatched installs):
   enable_mcp_access(), retrieve_mcp_token(), and session_client() now
   detect 401 on login and attempt recovery:
     a. Delete the stale agent@paw.local row from n8n's SQLite database
        using rusqlite (already a dependency) — workflows are preserved
     b. Wait 1s for n8n to notice the DB change
     c. Recreate the owner via setup_owner_if_needed() with the derived pw
     d. Retry login — should now succeed

3. Removed broken API key fallback:
   The _api_key parameter is kept in the signature (call sites unchanged)
   but no longer used for MCP endpoints since n8n rejects it there.
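Change 1's derivation is a one-liner. A sketch in TypeScript using Node's built-in crypto (the real code is Rust; the hex output encoding here is an assumption — only the HMAC-SHA256(encryption_key, 'paw-n8n-owner-v1') construction is from the commit):

```typescript
import { createHmac } from "node:crypto";

// Derive the n8n owner password deterministically from the encryption
// key: same key in -> same password out, so vault/DB drift cannot occur.
function deriveOwnerPassword(encryptionKey: string): string {
  return createHmac("sha256", encryptionKey)
    .update("paw-n8n-owner-v1")
    .digest("hex"); // output encoding assumed for illustration
}
```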
The n8n database lives at different paths depending on the engine mode:
  - Process mode (npx): ~/.openpawz/n8n-data/.n8n/database.sqlite
    (n8n creates a .n8n subfolder inside N8N_USER_FOLDER)
  - Docker mode: ~/.openpawz/n8n-data/database.sqlite
    (bind mount maps directly to /home/node/.n8n)

The previous commit only checked the Docker path, so the 401 recovery
could never find the database on macOS process-mode installs.

Now tries both candidate paths and uses whichever exists.
The HMAC-derived password change (commit 2190a4c) broke existing installs:
it produced a different password than what n8n had stored, AND deleted
the old vault entry — making the mismatch unrecoverable.

This commit reverts owner_password() to the original vault-backed random
approach: generate once, store in vault, reuse forever. Simple, stable.

The 401 recovery (DB reset) is kept but improved:
  - Searches both process mode (~/.openpawz/n8n-data/.n8n/database.sqlite)
    and Docker mode (~/.openpawz/n8n-data/database.sqlite) paths
  - Sets busy_timeout(5s) in case n8n has the DB locked
  - Disables FK constraints during delete to handle shared_workflow,
    shared_credentials, and other tables that reference the user
  - Cleans up referencing rows to avoid orphans

Recovery flow on first run after this fix:
  1. owner_password() generates NEW random password (old one was deleted
     by the HMAC commit) and stores it in vault
  2. Login with new password → 401 (n8n still has old hash)
  3. DB reset: finds database, deletes stale owner + FK refs
  4. setup_owner_if_needed() recreates owner with new vault password
  5. Retry login → success → MCP enabled
  6. From now on, vault and n8n stay in sync
Restores the HMAC-SHA256 derivation from the encryption key. The previous
revert was unnecessary — the HMAC approach is correct and eliminates the
vault/DB drift problem entirely. The actual bug was the DB reset looking
at the wrong path (fixed in 2ce58b9).

Key difference from the broken 2190a4c commit:
  - Does NOT delete the old PURPOSE_N8N_OWNER vault entry — the old
    random password simply becomes irrelevant since the HMAC path takes
    priority whenever the encryption key exists in the vault.

Flow on existing installs:
  1. owner_password() derives HMAC password from encryption key (stable)
  2. Login → 401 (n8n has hash of old random password)
  3. DB reset finds database at correct path, deletes stale owner + FK refs
  4. setup_owner_if_needed() recreates owner with HMAC-derived password
  5. Retry login → success
  6. From now on: password is always the same HMAC output — no drift possible
Phase 2 — Persistent Tool Registry:
- request_tools.rs now uses PersistentToolRegistry with four-tier
  search failover (Vector → BM25 → Domain keyword) instead of
  requiring embeddings. Tools work even without embedding API.
- SessionStore.conn upgraded to Arc<Mutex<Connection>> for shared
  access by tool_registry and speculative subsystems.
- EngineState gains persistent_tool_registry field.

Phase 3 — Binary IPC:
- EventBatcher wired into agent loop streaming — batches text deltas
  before emitting IPC events, reducing per-token overhead for fast models.
- ResultAccumulator wired into plan executor — accumulates DAG results
  in compact MessagePack format alongside Vec<NodeResult>.
- log_session_stats() called at turn end for observability.

Phase 4 — Speculative Tool Execution:
- predict_and_record() called after every tool execution in the agent
  loop, recording A→B tool transitions in SQLite tool_sequences table.
- Connection pre-warming for predicted next tool's API domain.
- SpeculativeCache + SpeculationConfig added to EngineState.
- log_session_speculation_stats() at turn end.

Build perf: Added .cargo/config.toml with mold linker + reduced debug
info — incremental cargo test drops from ~15min to ~9s.

All 1,008 tests pass. Clean clippy.
Copilot Agent and others added 30 commits March 8, 2026 05:53
- Embed the full force-directed knowledge graph (from Memory Palace) into
  the Engram card on the Today dashboard
- Added destroyPalaceGraph() and renderPalaceGraphInto(container) exports
  to graph.ts so the graph can be hosted in any container without ID conflicts
- renderPalaceGraphInto: creates its own <canvas>, calls memoryList(200) +
  memoryEdges(500), builds the same animated force-graph with particle edges,
  category nebulae, hover tooltips, pan/zoom, double-click to recall
- Removes the custom engram-brain.ts canvas (replaced by the real graph)
- + Memory button and modal remain for quick memory storage
- engram-brain-wrap updated to palace-graph-container style (radial glow bg)
- Move _miniMode to module-level state block (avoids TDZ, cleaner structure)
- Set _miniMode = false in destroyPalaceGraph and renderPalaceGraph so
  full Memory Palace view is never affected
  bleeds outside the Today card bounds
- Suppress tooltip DOM creation in mini mode (no pointer events in card)
- Double-rAF defer in renderPalaceGraphInto so getBoundingClientRect
  returns real dimensions after layout flush
…leton

When Today card called _buildGraph first, _eventsBound was set to true.
Memory Palace _bindEvents then returned early, so hover/drag/zoom/click
never attached to the palace-graph-render canvas.

Fix: reset _eventsBound = false in destroyPalaceGraph and at the start
of renderPalaceGraph so every renderPalaceGraph call re-binds to its canvas.
- Ambient nebulae behind engine/input/output clusters (pulsing radial gradients)
- Per-node independent pulse phases (lazy Map, same pattern as graph.ts)
- Three-ring breathing animation on engine node
- Outer radial glow halos on all non-engine nodes, pulse-animated
- Pulsing node fill and border opacity
- Inner top-left gleam on active nodes (3D highlight)
- Two-layer pulse particles: outer diffuse halo + inner glow + white-hot core
- Perpendicular bezier bow replacing simple upward-bias curve
- Soft glow underlay on active edges
- Off-centre radialGradient fakes lit-sphere 3D illusion (same technique as graph.ts)
- Specular highlight arc top-left on every node
- Sphere radii scale with canvas height
- Labels below spheres for input/output; inside for engine
- Remove rr() and unused colour constants
- Remove signal-flow canvas/import entirely from Today dashboard
- Add RECALL card: AI-generated recap of recent sessions + memories
- Button streams typewriter response from LLM using run_id matching
- Persists last recap text + timestamp to localStorage
- Restore last recap on hydration with relative 'Xm ago' timestamp
- Remove fetchSignalFlow from index.ts auto-refresh loop
- Add _timeAgo() helper, recall-cursor blink animation CSS
Bugs fixed:
- The fixed session ID 'paw-recall-ephemeral' was reused across every click,
  so the engine injected all prior recap exchanges as conversation history,
  making the LLM give inconsistent follow-up responses. Now each invocation
  uses crypto.randomUUID() for a clean session with no prior context.
- All three listeners (delta/complete/error) now bundled into _recallUnsub
  so a subsequent click atomically cancels all listeners from the previous
  run, preventing stale complete/error handlers from interfering.
- Filter by session_id (known before chatSend) rather than run_id (known
  only after the await) to eliminate the early-event race condition.
…e cutoff

When the complete event fires, calling cleanup() immediately unsubscribes
the delta listener while delta events queued in the same macrotask batch
may not have been processed yet, truncating the streamed text mid-sentence.
Wrapping the complete handler body in setTimeout(..., 0) lets any in-flight
delta events drain before the listener is removed and the final text rendered.
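The shape of the fix (illustrative, not the real handler): deferring teardown by one macrotask lets any delta callbacks already queued in the event loop run first.

```typescript
// Defer listener teardown by one macrotask so delta events queued in the
// same batch drain before the delta listener is removed.
function onComplete(
  finalText: string,
  cleanup: () => void,
  render: (t: string) => void,
): void {
  setTimeout(() => {
    cleanup(); // unsubscribe delta/complete/error listeners
    render(finalText); // render after in-flight deltas have been processed
  }, 0);
}
```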
The Rust backend emits Complete { text: final_text } containing the full
assembled response. Using accumulated deltas client-side was unreliable —
dropped or reordered Tauri IPC events caused truncated half-sentences.
Now ev.text from the complete event is used as the definitive final text,
with accumulated as a fallback only if ev.text is empty.
…n chat

The Rust backend emits Complete { text: final_text } with the full assembled
response, but bridge.ts was dropping event.text when translating to the
lifecycle:end agent event. event_bus.ts was consequently resolving the stream
promise with stream_s.content (accumulated deltas) which could be truncated
if any IPC delta events were dropped or arrived out of order.

Fix:
- bridge.ts: pass text: event.text in the lifecycle:end data payload
- event_bus.ts: prefer data.text over stream_s.content when resolving the
  stream promise, falling back to accumulated content only if text is empty

This is the same root cause as the RECALL card half-sentence bug — the
complete event carries the truth, deltas are just for live animation.
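The resolution rule is tiny; a sketch (names illustrative):

```typescript
// Prefer the authoritative assembled text from the Complete event;
// fall back to accumulated deltas only when it is missing or empty.
function resolveFinalText(
  eventText: string | undefined,
  accumulated: string,
): string {
  return eventText && eventText.length > 0 ? eventText : accumulated;
}
```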
The bugs fixed here mirror the main chat fixes applied in recent sessions:

1. Fix truncated messages (same root cause as main chat):
   - Use ev.text from the complete event as the authoritative final text
     (full assembled response from Rust) instead of accumulated deltas
   - Accumulated deltas are still used for the live typewriter animation

2. Fix run_id race condition:
   - Listeners previously filtered by mc.runId which was null during
     early delta events (set only after chatSend resolves)
   - Now: pre-compute sessionId = mc.sessionId || crypto.randomUUID()
     before chatSend, filter all listeners by ev.session_id === sessionId
   - Listeners registered PER-SEND (not once-at-open) so each message
     gets a fresh, correctly-scoped set of handlers

3. Atomic listener cleanup:
   - Replace 3 separate unlistenDelta/Complete/Error fields with a single
     _unlistenAll that cancels all 6 listeners atomically on each new send
     and on close, preventing stale handlers from interfering

4. Add thinking_delta support:
   - Reasoning models (Claude, DeepSeek) now show a live thinking indicator
     while they reason before generating the response

5. Add tool step indicator:
   - tool_request events now show '▶ toolName…' in the streaming bubble
   - Cleared on tool_result so only the active tool is shown

6. Add thinkingContent field to MiniChatWindow interface
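Fixes 2 and 3 combine into one wiring pattern: filter every listener by a session id known before the send, and collect all unsubscribers so the next send (or close) cancels them atomically. A sketch with illustrative names (the real code lives in the mini chat module):

```typescript
type Unsub = () => void;
type ChatEvent = { session_id: string };
type Subscribe = (cb: (ev: ChatEvent) => void) => Unsub;

class MiniChatListeners {
  private unsubs: Unsub[] = []; // the single _unlistenAll equivalent

  // Registered per-send, never once-at-open.
  register(
    subs: Subscribe[],
    sessionId: string, // known BEFORE chatSend, unlike run_id
    onEvent: (ev: ChatEvent) => void,
  ): void {
    this.cancelAll(); // atomically drop handlers from the previous run
    for (const sub of subs) {
      this.unsubs.push(
        sub((ev) => {
          if (ev.session_id !== sessionId) return; // scope to this send
          onEvent(ev);
        }),
      );
    }
  }

  cancelAll(): void {
    for (const u of this.unsubs) u();
    this.unsubs = [];
  }
}
```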
After Agent 1 (the session primary) finalizes its response, _runGroupTurns()
fires each remaining group member one-by-one in the same session:

- Shows a new streaming bubble attributed to each subsequent agent
- Builds an augmented system prompt with the full discussion context so
  each agent sees what was already said before responding
- Re-uses the same session_id so Rust thread history is preserved correctly
- Inline finalization bypasses finalizeStreaming()'s current-agent guard
  (which would silently drop group agents that differ from currentAgent)
- Stamps agentId + agentName on every assistant message (both Agent 1
  via finalizeStreaming and Agents 2+ via _runGroupTurns)
- Falls through gracefully if an agent profile is missing or errors out

Also fixes: finalizeStreaming() now stamps agentId/agentName on Agent 1's
message so the renderer can show attribution prefixes in group sessions.
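The fan-out loop can be sketched like this; the function and parameter names are hypothetical, but the shape matches the description (skip the primary, enrich each follower's context with the discussion so far, fall through on errors):

```typescript
interface Agent { id: string; name: string }

// Sequentially run each non-primary group member, appending attributed
// replies to a shared transcript. runTurn stands in for the real
// send-and-stream path; a throwing turn is skipped gracefully.
async function runGroupTurns(
  agents: Agent[],
  primaryId: string,
  transcript: string[],
  runTurn: (agent: Agent, context: string) => Promise<string>,
): Promise<void> {
  for (const agent of agents) {
    if (agent.id === primaryId) continue; // primary already answered
    const context = transcript.join("\n"); // full discussion so far
    try {
      const reply = await runTurn(agent, context);
      transcript.push(`${agent.name}: ${reply}`);
    } catch {
      // missing profile or turn error: fall through to the next agent
    }
  }
}
```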
canvas/index.ts — Forms and actions now reach the agent:
- Wire canvas:form-submit event: when user submits a canvas form, the field
  values are injected into the chat input and sent as a message. The agent
  receives the structured data and can call rest_api_call / service_api_call
  / write_file etc. in response.
- Wire canvas:action event: card action button clicks are injected as chat
  messages in the same way.
- _injectChatMessage(): shared helper that populates chat-input and clicks the
  send button so the full streaming pipeline is used.
- Both listeners are registered globally once (not per-render) to avoid
  duplicate handlers.

canvas/index.ts — Auto-refresh timer:
- parseIntervalMs(): parse '5m', '1h', '30s', '1d' → milliseconds. Floor is
  10s to prevent runaway refresh.
- startRefreshTimer(interval, prompt): creates a setInterval that injects the
  refresh_prompt as a chat message, triggering the agent to re-fetch data and
  call canvas_update on each affected tile. Skips firing while agent is already
  streaming.
- stopRefreshTimer(): clears the interval.
- loadDashboard() and loadCanvas() call stopRefreshTimer() on entry. If the
  loaded dashboard has refresh_interval set, startRefreshTimer() is called
  automatically — the live dot in the tab bar now actually does something.
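The parser described above can be sketched as follows; the unit set and 10s floor follow the description, though the real parseIntervalMs may differ in details such as invalid-input handling:

```typescript
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};
const FLOOR_MS = 10_000; // floor prevents runaway refresh loops

// '5m' → 300000, '1h' → 3600000; anything under 10s is clamped up.
function parseIntervalMs(interval: string): number {
  const m = /^(\d+)([smhd])$/.exec(interval.trim());
  if (!m) return FLOOR_MS; // assumed fallback for unparseable input
  return Math.max(FLOOR_MS, Number(m[1]) * UNIT_MS[m[2]]);
}
```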

canvas/molecules.ts — Form enhancements:
- renderForm() now reads on_submit_message and submit_label from component
  data. on_submit_message becomes a data-submit-message attribute on the
  <form> element, allowing the agent to control exactly what prompt is injected
  when the user submits.
- Form submit event dispatch now sends { values, submitMessage } instead of
  bare values, so canvas/index.ts can pick up the message template.
- Toast changed to 'Sent to agent…' to communicate what actually happens.

canvas.rs:
- canvas_update description now explains the fetch→update live data primitive:
  '(1) fetch fresh data via rest_api_call/service_api_call, (2) call
  canvas_update to push it into the tile instantly'. Also mentions using
  create_task + cron_schedule for scheduled auto-refresh.
- form type bullet updated to explain on_submit_message.
- data parameter description: added form JSON shape with fields, on_submit_message,
  and submit_label examples.
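An illustrative data payload for a form component, using the keys named above (the inner field shape and example strings are assumptions, not the real schema):

```
{
  "fields": [
    { "name": "city", "label": "City", "type": "text" }
  ],
  "on_submit_message": "Look up the current weather for the submitted city and update this tile",
  "submit_label": "Check weather"
}
```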

tool_index.rs:
- Canvas domain description restructured into three clear sections:
  LIVE DATA, INTERACTIVE FORMS, and DESIGN — so the agent always knows
  the full capability at-a-glance.
Adds seed_builtin_canvas_skills() called from EngineState::new() so the
model intrinsically knows — via Engram procedural memory — which CDN
library to reach for and how to use it, without the user needing to specify one.

Skills seeded (deterministic IDs, idempotent INSERT OR REPLACE):
  builtin-canvas-threejs        → Three.js WebGL 3D scenes, particles, globes
  builtin-canvas-d3             → D3.js SVG charts, force graphs, geo maps, funnels
  builtin-canvas-gsap           → GSAP animation timeline, stagger, counter tweens
  builtin-canvas-chartjs        → Chart.js declarative line/bar/pie charts
  builtin-canvas-pixijs         → Pixi.js 2D WebGL sprites and particles
  builtin-canvas-matterjs       → Matter.js rigid-body physics simulation
  builtin-canvas-p5js           → p5.js generative art and Perlin noise sketches
  builtin-canvas-threejs-d3     → Three.js + D3 combo for data-driven 3D globes
  builtin-canvas-chartjs-gsap   → Chart.js + GSAP animated business KPI dashboards
  builtin-canvas-leaflet        → Leaflet interactive maps (no API key required)
  builtin-canvas-api            → Native Canvas 2D API for glow/hex/scan-line fx

Each skill includes:
  Step 1: exact CDN URL for the libraries[] array
  Step 2: boilerplate pattern and key API surface
  Step 3: height/styling guidance and ideal use cases

Also updated canvas_push type description to explicitly list CDN URLs
and instruct the model to auto-select libraries without user prompting.
Root cause of intermittent half-sentence streaming bug:
- delta_batcher.flush() was called AFTER EngineEvent::Complete
- On the frontend, lifecycle:end resolved the stream promise immediately
  and finalizeStreaming ran — the streaming bubble still showed
  ss.content (accumulated deltas without the last batch) briefly
- The last batch then arrived as a late Delta but ss was already gone

Fix: move flush() to happen BEFORE Complete is emitted.
Sequence is now:
  1. delta_batcher.flush() → final Delta IPC event
  2. EngineEvent::Complete → lifecycle:end → finalizeStreaming

The streaming bubble now always shows the complete text before the
permanent message bubble replaces it. No more visible truncation flash.
- Redirect Enter/Send to steerWithMessage() during streaming instead of
  silently dropping user messages (tunnel-vision / hyper-focus fix)
- Update input placeholder during streaming so users know Enter steers

- Fix new-chat conversation never saved: finalize guard used initial
  empty streamKey, missing the session key shift applied by handleSendResult
- Save partial responses on Stop: teardownStream resolves with
  stream.content so partial content is committed, not discarded with sentinel
- Move loadSessions() to finally block so session list refreshes after
  stop/errors, not just on happy path
- Fix catch block using wrong session key for new chats

- Default daily_budget_usd to 0.0 (disabled) in Rust (was 10.0, which
  silently blocked agents after $10 of cumulative spend)
- Remove disconnected localStorage budget alert from token_meter that
  showed fake cost estimates unrelated to the real Rust DailyTokenTracker
- Budget error message now points to Settings → Advanced → Daily Budget
- Add Settings button to budget alert banner that navigates there
- Clean up unused SettingsModule import and getBudgetLimit parameter
Rust backend:
- Add sse_events broadcast::Sender<String> to EngineState so non-webview
  clients receive real-time engine events alongside the Tauri frontend
- Introduce fire() helper in agent_loop — all 12 engine-event emit calls
  now broadcast to both the webview AND SSE subscribers; no desktop changes
- POST /chat/stream SSE endpoint in webhook.rs — subscribes to broadcast,
  streams every EngineEvent (delta, tool_request, tool_result, complete,
  error, thinking_delta) as Server-Sent Events; 15s keepalive, 5min cap
- allow_dangerous_tools=true in channels/agent.rs now gives the full
  built-in tool set (read_file, write_file, exec, etc.) instead of the
  restricted remote-channel whitelist — safe because the setting is
  only enabled by trusted local clients (webhook config, not Discord/Slack)

VS Code extension (vscode-extension/):
- package.json: contributes @Pawz chat participant + pawz.showDiff command
- src/pawz-client.ts: typed SSE streaming client for /chat/stream
- src/tool-renderer.ts: maps PawzEvents to VS Code ChatResponseStream —
  live markdown streaming, tool progress spinners, write_file diff preview
  with anchor + 'Show diff' button, error blocks
- src/extension.ts: chat participant handler, workspace context injection
  (active file + selection + workspace root), not-configured UX flow
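A client like pawz-client.ts needs to split the SSE byte stream into typed events. This sketch parses standard SSE framing (blank-line-separated frames with `event:`/`data:` fields); the PawzEvent shape and event names follow the description but are not verified against the extension's code:

```typescript
interface PawzEvent { event: string; data: string }

// Split a received chunk into SSE frames and extract event name + data.
// Frames are separated by a blank line; multi-line data fields are joined.
function parseSseChunk(chunk: string): PawzEvent[] {
  const events: PawzEvent[] = [];
  for (const frame of chunk.split("\n\n")) {
    let event = "message"; // SSE default event name
    const data: string[] = [];
    for (const line of frame.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).trim());
    }
    if (data.length) events.push({ event, data: data.join("\n") });
  }
  return events;
}
```

Note that a production client must also buffer partial frames across network chunks; this sketch assumes each chunk ends on a frame boundary.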
The webhook start button was always returning 'webhook disabled — enable
in settings' because the config's enabled flag was never exposed in the
UI. Added an Enabled checkbox to the form and wired it into the save
handler so users can actually turn it on before clicking Start Server.