
perf: fix renderer heap exhaustion on long-running sessions#114

Open
Psypeal wants to merge 1 commit into matt1398:main from Psypeal:feat/memory-optimization-and-notifications

Conversation


@Psypeal Psypeal commented Mar 11, 2026

Summary

The renderer process crashes with exitCode 133 (V8 heap exhaustion) after ~24 hours of uptime on Linux. This PR addresses the root causes and adds task notification rendering.

Memory optimization

  • Strip raw session data after transformation (detail.chunks/processes → empty arrays)
  • Incremental conversation update — reuse unchanged ChatItem objects, only re-transform the tail (O(delta) instead of O(N) per refresh)
  • Fingerprint-based skip — compare chunk count + last chunk size to avoid transformation when nothing changed
  • Adaptive debounce — scale refresh interval with session size (60s for >1000 AI groups down to 150ms default)
  • O(N²) → O(N) injection accumulation in context/claudeMd trackers
  • Phase 1/Phase 2 split — render conversation immediately, compute context stats async
  • batchAsync utility for concurrency-limited IPC calls
  • Precomputed accumulatedCounts on ContextStats (replaces per-render array filtering)
  • Export re-fetches on demand instead of storing full detail
  • 8GB renderer heap limit + enriched crash logs with memory stats
  • Renderer error capture (uncaught errors + unhandled rejections)
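The adaptive debounce tiers above reduce to a small pure function. This is an illustrative sketch; `pickRefreshDebounceMs` and `DEFAULT_DEBOUNCE_MS` are hypothetical names, not the PR's actual identifiers:

```typescript
// Illustrative sketch of the adaptive debounce tiers listed above.
// pickRefreshDebounceMs and DEFAULT_DEBOUNCE_MS are hypothetical names,
// not the PR's actual identifiers.
const DEFAULT_DEBOUNCE_MS = 150;

function pickRefreshDebounceMs(aiGroupCount: number): number {
  if (aiGroupCount > 1000) return 60_000; // very large sessions: refresh once a minute
  if (aiGroupCount > 500) return 30_000;
  if (aiGroupCount > 200) return 10_000;
  if (aiGroupCount > 100) return 3_000;
  return DEFAULT_DEBOUNCE_MS; // small sessions stay responsive
}
```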

Task notification rendering

  • Parse <task-notification> XML blocks from user messages
  • Strip raw XML from display text
  • Render as compact styled cards (status icon, command name, exit code, output file)
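A minimal sketch of the parse-and-strip flow. The real `<task-notification>` wire format is not shown in this PR, so the child tags assumed here (`<task-id>`, `<status>`, `<summary>`, `<output-file>`) and the function names are illustrative:

```typescript
// Minimal sketch of the parse-and-strip flow. The real <task-notification>
// format is not shown in this PR, so the child tags assumed here
// (<task-id>, <status>, <summary>, <output-file>) are illustrative.
interface TaskNotification {
  taskId?: string;
  status?: string;
  summary?: string;
  outputFile?: string;
}

const BLOCK_RE = /<task-notification>([\s\S]*?)<\/task-notification>/g;

function field(body: string, tag: string): string | undefined {
  const m = body.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return m ? m[1].trim() : undefined;
}

function parseTaskNotificationsSketch(text: string): TaskNotification[] {
  const notifications: TaskNotification[] = [];
  for (const m of text.matchAll(BLOCK_RE)) {
    const body = m[1];
    notifications.push({
      taskId: field(body, 'task-id'),
      status: field(body, 'status'),
      summary: field(body, 'summary'),
      outputFile: field(body, 'output-file'),
    });
  }
  return notifications;
}

// Display text drops the raw XML entirely, mirroring sanitizeDisplayContent.
function stripTaskNotifications(text: string): string {
  return text.replace(BLOCK_RE, '').trim();
}
```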

Test plan

  • pnpm typecheck — no type errors
  • pnpm lint:fix — no lint errors
  • All 653 tests pass
  • Manual: run a long session (1h+), verify no crash and adaptive debounce kicks in
  • Manual: open a session with background tasks — task notifications render as cards instead of raw XML

Summary by CodeRabbit

  • New Features

    • Added task notification cards in chat history displaying command status, exit codes, and output file references.
    • Enhanced crash logging and auto-recovery system for improved application stability.
  • Bug Fixes

    • Improved global error handling for uncaught exceptions and unhandled promise rejections.
    • Optimized renderer process monitoring and recovery.
  • Performance

    • Improved chat history rendering efficiency with adaptive virtualization thresholds.
    • Enhanced token display performance and session refresh handling for large conversations.
    • Streamlined export workflow with improved loading feedback.

…task notifications

The renderer process crashes with exitCode 133 (V8 heap exhaustion)
after ~24 hours of uptime. Root causes: full conversation re-creation
on every refresh, O(N²) injection accumulation, and raw session data
stored permanently in memory.

Memory optimization:
- Strip detail.chunks/processes after transformation (slimDetail)
- Incremental conversation update — reuse unchanged ChatItem objects,
  only re-transform the tail (O(delta) instead of O(N) per refresh)
- Fingerprint-based skip — compare chunk count + last chunk size to
  avoid transformation when nothing changed
- Adaptive debounce — 60s for >1000 AI groups, 30s for >500, 10s
  for >200, 3s for >100, 150ms default
- batchAsync utility for concurrency-limited IPC calls
- Phase 1/Phase 2 split in fetchSessionDetail — render conversation
  immediately, compute context stats async
- O(N²) → O(N) injection accumulation in context/claudeMd trackers
- Precomputed accumulatedCounts on ContextStats (replaces per-render
  array filtering)
- Export re-fetches full detail on demand (MoreMenu)
- Raise renderer heap to 8GB
- Enriched crash logs with memory stats
- Renderer error capture (uncaught errors + unhandled rejections)

Task notification rendering:
- Parse <task-notification> XML blocks from user messages
- Strip raw XML from display text via contentSanitizer
- Render as compact styled cards showing command name, status icon,
  exit code, and output filename

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
coderabbitai bot added the bug (Something isn't working) and feature request (New feature or request) labels on Mar 11, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the application's stability and performance, particularly for long-running sessions that previously suffered from renderer heap exhaustion. It introduces a suite of memory optimizations, improves error handling and crash logging across both main and renderer processes, and adds a new user-facing feature to render task notifications as interactive cards within the chat history. These changes collectively aim to provide a more robust and responsive user experience.

Highlights

  • Memory Optimization & Performance: Implemented several strategies to reduce renderer process memory consumption and improve performance, including stripping raw session data, incremental conversation updates, fingerprint-based refresh skips, adaptive debounce for session refreshes, and optimizing context tracking calculations from O(N²) to O(N).
  • Enhanced Error Handling & Crash Logging: Increased the renderer V8 heap limit to 8GB and introduced comprehensive crash logging for main and renderer processes, including unhandled rejections, uncaught exceptions, renderer load failures, and process crashes, with auto-reloading on renderer process gone.
  • Task Notification Rendering: Added a new feature to parse and display task notifications from user messages as styled, compact cards in the chat history, improving readability and user experience for background tasks.
  • Export Workflow Improvement: Modified the export functionality to fetch full session details on demand, rather than storing them in memory, reducing memory footprint for long-running sessions.
  • UI Responsiveness: Adjusted chat history virtualization threshold and optimized token usage display by utilizing precomputed counts.
Changelog
  • src/main/index.ts
    • Added writeCrashLog utility to append timestamped entries to a crash log file.
    • Increased V8 heap limit to 8GB for the renderer process to prevent OOM crashes.
    • Integrated crash logging for unhandled rejections, uncaught exceptions, renderer load failures, and renderer process gone events.
    • Implemented auto-reload for the main window upon renderer process crashes.
  • src/renderer/components/chat/ChatHistory.tsx
    • Decreased VIRTUALIZATION_THRESHOLD from 120 to 30 to improve rendering performance for large chat histories.
  • src/renderer/components/chat/UserChatGroup.tsx
    • Imported new icons (CheckCircle, Circle, FileText, XCircle) for task notifications.
    • Added logic to parse task notifications from user message content.
    • Implemented rendering of task notifications as styled cards within user messages, displaying status, command name, exit code, and output file.
  • src/renderer/components/common/TokenUsageDisplay.tsx
    • Updated to use precomputed accumulatedCounts from ContextStats to optimize token display calculations, avoiding iteration over the full injection array.
  • src/renderer/components/layout/MoreMenu.tsx
    • Refactored export functionality to fetch full session details on demand using api.getSessionDetail, reducing memory usage by not storing full details in the store.
    • Added a loading state (exportLoading) to the export menu items to provide user feedback.
    • Changed activeTabSessionDetail prop to activeTabHasSession to reflect the new on-demand export strategy.
  • src/renderer/components/layout/TabBar.tsx
    • Updated MoreMenu component prop from activeTabSessionDetail to activeTabHasSession to align with the new on-demand export strategy.
  • src/renderer/main.tsx
    • Implemented global event listeners for error and unhandledrejection in the renderer process.
    • Configured renderer errors and unhandled rejections to be logged to the main process crash log via console.error.
  • src/renderer/store/index.ts
    • Introduced adaptive debounce logic for session refreshes, scaling the refresh interval based on the number of AI groups in the conversation to prevent memory churn in large sessions.
  • src/renderer/store/slices/sessionDetailSlice.ts
    • Implemented a two-phase update for session details: Phase 1 immediately renders the conversation, and Phase 2 asynchronously computes context statistics.
    • Stripped raw chunks and processes data from sessionDetail after conversation transformation to save memory.
    • Added a fingerprint-based check (sessionChunkFingerprint) to skip expensive conversation transformation when content is unchanged during refreshes.
    • Integrated incrementalUpdateConversation for more efficient conversation updates, reusing unchanged ChatItem objects.
    • Refactored context tracking logic to use batchAsync for concurrency-limited IPC calls when fetching file token data.
  • src/renderer/store/utils/batchAsync.ts
    • Added a new utility function batchAsync to limit concurrency for asynchronous operations, preventing unbounded parallel IPC calls that can exhaust the renderer heap.
  • src/renderer/types/contextInjection.ts
    • Modified ContextStats interface to include accumulatedCounts for precomputed injection counts by category.
    • Clarified that accumulatedInjections are only fully populated for the last AI group in each context phase to save memory.
  • src/renderer/utils/claudeMdTracker.ts
    • Refactored computeClaudeMdStats to pass previousPaths and mutate previousInjections for O(N) performance, avoiding O(N²) array spreads.
    • Updated processSessionClaudeMd to correctly thread previousPaths and snapshot accumulatedInjections for stored stats.
  • src/renderer/utils/contextTracker.ts
    • Optimized computeContextStats and processSessionContextWithPhases to use previousPaths and mutable previousInjections for O(N) efficiency.
    • Introduced accumulatedCounts in ContextStats for direct access to injection totals, replacing per-render array filtering.
    • Modified accumulatedInjections in ContextStats to only store full arrays for the last group of each phase, significantly reducing memory footprint.
    • Removed redundant linkedTools and displayItems re-computation within processSessionContextWithPhases.
  • src/renderer/utils/groupTransformer.ts
    • Added incrementalUpdateConversation function to efficiently update chat conversations by reusing unchanged ChatItem objects and only re-transforming the tail, reducing allocation pressure during live-session refreshes.
  • src/shared/utils/contentSanitizer.ts
    • Added parseTaskNotifications to extract structured data (taskId, status, summary, outputFile) from <task-notification> XML blocks.
    • Updated sanitizeDisplayContent to remove <task-notification> XML blocks and associated "Read the output file to retrieve the result: /path" instructions from display text.
Activity
  • Author Psypeal performed pnpm typecheck and pnpm lint:fix, both completing without errors.
  • All 653 automated tests passed successfully.
  • Manual testing is planned to verify application stability and adaptive debounce behavior during long sessions (1h+).
  • Manual testing is also planned to confirm correct rendering of task notifications in sessions with background tasks.


coderabbitai bot commented Mar 11, 2026

📝 Walkthrough


This pull request adds comprehensive error handling and crash logging to the main process, optimizes session state management through incremental updates and fingerprinting, introduces task notification parsing and rendering in chat, refactors the menu system to pass boolean session flags instead of full session details, and extends content sanitization to handle task notifications.

Changes

Cohort / File(s) Summary
Main Process Crash Logging
src/main/index.ts
Adds crash-logging utility with persistent writes to ~/.claude/claude-devtools-crash.log. Implements error handlers for unhandled rejections and uncaught exceptions; tracks renderer failures (load failure, process gone, console errors, unresponsiveness) with details and auto-recovery; configures larger V8 heap limit for renderer.
Renderer Global Error Handling
src/renderer/main.tsx
Registers window event listeners for uncaught errors and unhandled promise rejections, forwarding them to console with structured messages for main-process capture and crash logging.
Chat UI Updates
src/renderer/components/chat/ChatHistory.tsx, src/renderer/components/chat/UserChatGroup.tsx
Reduces virtualization threshold from 120 to 30 items; adds parsing and rendering of task notifications from message content with status icons, exit codes, and output file references.
Menu System Refactoring
src/renderer/components/layout/MoreMenu.tsx, src/renderer/components/layout/TabBar.tsx, src/renderer/components/common/TokenUsageDisplay.tsx
Replaces activeTabSessionDetail prop with activeTabHasSession boolean flag; refactors export flow to fetch session details via API; optimizes token usage display to use precomputed counts instead of filtering injections array.
Session Store & Management
src/renderer/store/index.ts, src/renderer/store/slices/sessionDetailSlice.ts, src/renderer/utils/groupTransformer.ts
Implements adaptive debounce scaling based on AI group count; adds per-session fingerprinting to skip redundant transformations; introduces incremental conversation updates for efficient live-session refreshes; implements batched async processing for file/directory reads with request guards.
Context & Path Tracking
src/renderer/utils/claudeMdTracker.ts, src/renderer/utils/contextTracker.ts
Adds path-tracking state across AI groups to avoid reprocessing; refactors injection accumulation with in-place mutations; extends signatures to return both stats and previousPaths; adds phase-boundary backfilling for accumulatedInjections snapshots; tracks accumulated counts separately from injections.
Utility & Type Additions
src/renderer/store/utils/batchAsync.ts, src/renderer/types/contextInjection.ts, src/shared/utils/contentSanitizer.ts
Introduces batchAsync utility for concurrency-limited async operations; extends ContextStats interface with accumulatedCounts field; adds task notification parsing and content sanitization (recognizes and strips <task-notification> blocks).


Suggested labels

bug, feature request




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (5)
src/renderer/store/utils/batchAsync.ts (1)

5-23: Guard the worker count against invalid concurrency values.

If a caller passes 0 or a negative value here, Array.from({ length: Math.min(concurrency, items.length) }) creates no workers, so the function returns a sparse results array without ever calling fn. Clamping to >= 1 or throwing early would make the helper safer to reuse.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/store/utils/batchAsync.ts` around lines 5 - 23, The batchAsync
helper currently allows concurrency <= 0 which yields zero workers and a sparse
results array; update batchAsync to validate or clamp the concurrency value
before spawning workers (reference symbols: batchAsync, concurrency, worker,
results). Fix by computing a workerCount = Math.max(1, Math.min(concurrency,
items.length)) (or alternatively throw if concurrency < 1) and use that
workerCount when creating the Array.from workers so at least one worker runs and
results is fully populated.
src/renderer/components/layout/MoreMenu.tsx (1)

74-90: Handle export re-entry and failures inside handleExport().

exportLoading currently only flips local state; if this callback is re-entered, it will still queue another full-session fetch, and any rejected getSessionDetail() promise will bubble to the global rejection handler. An early if (exportLoading) return plus local failure handling would make this path much safer for large exports.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/components/layout/MoreMenu.tsx` around lines 74 - 90, Add a
re-entry guard and local failure handling inside handleExport: at the top of
handleExport check the export loading state (e.g. exportLoading) and return
early if true to avoid concurrent fetches; keep setting setExportLoading(true)
before calling api.getSessionDetail but add a .catch handler for
api.getSessionDetail(projectId, sessionId) to handle failures locally (log/show
an error toast) instead of letting the rejection bubble, and ensure
setExportLoading(false) is always called (you can keep the existing .finally but
also ensure any early-return path respects the guard). Reference: handleExport,
exportLoading, setExportLoading, api.getSessionDetail, triggerDownload.
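A sketch of what the guarded export flow could look like. The component source is not shown on this page, so the `api` and `triggerDownload` shapes below are illustrative stand-ins:

```typescript
// Sketch of the guarded export flow the review suggests. The api and
// triggerDownload shapes are illustrative stand-ins, not the PR's code.
type Api = { getSessionDetail: (projectId: string, sessionId: string) => Promise<unknown> };

function makeHandleExport(
  api: Api,
  triggerDownload: (detail: unknown) => void,
  onError: (err: unknown) => void
) {
  let exportLoading = false;
  return async function handleExport(projectId: string, sessionId: string): Promise<void> {
    if (exportLoading) return; // re-entry guard: ignore clicks while a fetch is in flight
    exportLoading = true;
    try {
      const detail = await api.getSessionDetail(projectId, sessionId);
      triggerDownload(detail);
    } catch (err) {
      onError(err); // handle locally instead of hitting the global rejection handler
    } finally {
      exportLoading = false; // always cleared, on success or failure
    }
  };
}
```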
src/renderer/main.tsx (1)

8-11: Include the stack in uncaught renderer error logs.

This currently flattens uncaught exceptions to message at file:line, which drops the actual JS stack right before the message is forwarded to the main-process crash log. Including event.error?.stack here will make the post-mortem logs much more actionable.

Suggested change
 window.addEventListener('error', (event) => {
+  const stack =
+    event.error instanceof Error ? `\n${event.error.stack ?? event.error.message}` : '';
   console.error(
-    `[RENDERER_UNCAUGHT_ERROR] ${event.message} at ${event.filename}:${event.lineno}`
+    `[RENDERER_UNCAUGHT_ERROR] ${event.message} at ${event.filename}:${event.lineno}${stack}`
   );
 });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/main.tsx` around lines 8 - 11, The uncaught renderer error
handler (window.addEventListener('error', ...)) currently logs only
message/filename/lineno; update the handler to also include the JS stack by
appending event.error?.stack (using optional chaining to avoid throwing) into
the console.error output so logs contain the full stack trace alongside the
existing `[RENDERER_UNCAUGHT_ERROR] ${event.message} at
${event.filename}:${event.lineno}` information.
src/renderer/components/chat/UserChatGroup.tsx (1)

483-490: Consider handling additional status values.

The status icon logic handles completed, failed, and error, with all other statuses falling back to Circle with muted color. Since parseTaskNotifications accepts any string as status (no validation), consider adding explicit handling for expected statuses like running or pending if they exist in the protocol.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/components/chat/UserChatGroup.tsx` around lines 483 - 490, The
current status display in UserChatGroup (variables notif.status, isCompleted,
isFailed, StatusIcon, statusColor) only treats 'completed', 'failed'/'error' and
everything else as a generic muted Circle; update the logic to explicitly handle
common in-progress statuses (e.g., 'running', 'pending') by mapping them to an
in-progress icon (spinner/clock) and an appropriate color (e.g.,
accent/in-progress color) and keep a default fallback for unknown values; ensure
you also review parseTaskNotifications to confirm what status strings are
emitted and match those expected values in the StatusIcon/statusColor selection
so new statuses are rendered distinctly.
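One way to extend the mapping as suggested. The `running`/`pending` statuses and the `Loader` icon are assumptions of this sketch, not values confirmed by the PR:

```typescript
// Illustrative status -> icon/color mapping. CheckCircle/XCircle/Circle mirror
// the icons the PR imports; 'running'/'pending' and 'Loader' are assumptions.
type StatusVisual = { icon: string; color: string };

function statusVisual(status: string | undefined): StatusVisual {
  switch (status) {
    case 'completed':
      return { icon: 'CheckCircle', color: 'success' };
    case 'failed':
    case 'error':
      return { icon: 'XCircle', color: 'danger' };
    case 'running':
    case 'pending':
      return { icon: 'Loader', color: 'accent' }; // assumed in-progress statuses
    default:
      return { icon: 'Circle', color: 'muted' }; // unknown statuses stay neutral
  }
}
```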
src/renderer/store/slices/sessionDetailSlice.ts (1)

303-310: Phase 2 IPC calls continue after requestGeneration invalidation.

The batchAsync utility (lines 338, 431) doesn't support AbortSignal, so in-flight IPC calls (api.readDirectoryClaudeMd, api.readMentionedFile) will complete even after requestGeneration becomes stale. The results are correctly discarded via generation checks, but the IPC work is wasted.

This is acceptable for the current use case since:

  1. IPC calls are lightweight reads
  2. Generation checks prevent stale updates

Consider adding cancellation support to batchAsync in a future iteration if IPC volume becomes problematic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/store/slices/sessionDetailSlice.ts` around lines 303 - 310, The
issue is that batchAsync does not support cancellation so IPC calls like
api.readDirectoryClaudeMd and api.readMentionedFile continue after
requestGeneration is invalidated; update batchAsync to accept an AbortSignal
parameter and propagate it into the worker calls, then when launching the phase
2 async block create an AbortController tied to requestGeneration (or cancel on
generation change) and pass controller.signal into batchAsync and into
api.readDirectoryClaudeMd/api.readMentionedFile (or guard the call with
signal.throwIfAborted/check signal.aborted before invoking) so in-flight IPC
reads are aborted when requestGeneration becomes stale and wasted work is
avoided.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/main/index.ts`:
- Around line 580-598: The render-process-gone handler currently unconditionally
reloads the renderer; change it to only trigger a reload when the crash reason
is one of the recoverable crash reasons and not during app shutdown: inside the
mainWindow.webContents.on('render-process-gone', ...) callback check
app.isQuitting (or app.isQuiting if that field is used) and return early if
true, and only perform the reload logic (calling mainWindow.loadURL or loadFile)
when details.reason is exactly 'crashed', 'oom', or 'memory-eviction'; always
continue to log and call writeCrashLog with the existing payload (writeCrashLog,
getRendererIndexPath, DEV_SERVER_PORT remain unchanged).
- Around line 82-88: The uncaughtException handler currently logs and writes a
crash log but leaves the main process running; after calling logger.error and
writeCrashLog('UNCAUGHT_EXCEPTION (main)', ...), invoke Electron's
app.relaunch() and then app.exit() (or app.quit() if you prefer a clean
shutdown) so the application restarts/terminates instead of continuing in an
undefined state; update the handler around the process.on('uncaughtException',
(error: Error) => { ... }) block to call app.relaunch() followed by
app.exit()/app.quit() after logging.

In `@src/renderer/store/index.ts`:
- Around line 82-97: The adaptive debounce currently reads
useStore.getState().conversation to count AI groups; change it to use the target
session's cached totalAIGroups (the session object passed to
scheduleSessionRefresh or the session/tab state used to trigger refresh) when
available to decide debounceMs (fall back to the global conversation count only
if session.totalAIGroups is undefined). Update the aiGroupCount calculation and
keep the same thresholds (100, 200, 500, 1000) and SESSION_REFRESH_DEBOUNCE_MS
fallback so long-running sessions in other panes no longer force the active tab
to use the 150ms path.

In `@src/renderer/utils/groupTransformer.ts`:
- Around line 245-257: The code clears isOngoing on a reused AI group but sets
status to the wrong terminal literal ('completed'), causing mismatch with full
transforms; update the status assignment in the items loop that touches AIGroup
(the block that checks (group as AIGroup & { isOngoing?: boolean }).isOngoing)
to use the same terminal value as determineAIGroupStatus, e.g. call
determineAIGroupStatus(group) or use the 'complete' literal returned by that
function so incremental refreshes emit the identical terminal status as full
transforms.
- Around line 293-325: The post-pass mutates shared group objects on the reused
items array causing snapshot/state issues; for each mutation site (the
compact-item loop where you set compactItem.group.startingPhaseNumber and
compactItem.group.tokenDelta, and the AI post-pass where you set isOngoing and
status on item.group) clone the outer item or at least clone the group object
before writing fields so you replace items[i] with a new object containing a
shallow-copied group (preserving all other properties) and then set
startingPhaseNumber/tokenDelta or isOngoing/status on that cloned group; use the
same checks around
findLastAiBefore/getLastAssistantTotalTokens/getFirstAssistantTotalTokens and
preserve types CompactGroup and AIGroup while assigning the new object back into
items to ensure identity changes and avoid mutating prevConversation's shared
objects.



📥 Commits

Reviewing files that changed from the base of the PR and between f69493c and d4e4203.

📒 Files selected for processing (15)
  • src/main/index.ts
  • src/renderer/components/chat/ChatHistory.tsx
  • src/renderer/components/chat/UserChatGroup.tsx
  • src/renderer/components/common/TokenUsageDisplay.tsx
  • src/renderer/components/layout/MoreMenu.tsx
  • src/renderer/components/layout/TabBar.tsx
  • src/renderer/main.tsx
  • src/renderer/store/index.ts
  • src/renderer/store/slices/sessionDetailSlice.ts
  • src/renderer/store/utils/batchAsync.ts
  • src/renderer/types/contextInjection.ts
  • src/renderer/utils/claudeMdTracker.ts
  • src/renderer/utils/contextTracker.ts
  • src/renderer/utils/groupTransformer.ts
  • src/shared/utils/contentSanitizer.ts

Comment on lines +82 to 88
process.on('uncaughtException', (error: Error) => {
  logger.error('Uncaught exception in main process:', error);
  writeCrashLog('UNCAUGHT_EXCEPTION (main)', {
    message: error.message,
    stack: error.stack ?? '',
  });
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Node.js/Electron, what happens after adding a process.on('uncaughtException') handler? Does the process continue running, and what recovery strategy is recommended for desktop apps?

💡 Result:

What changes after you add process.on('uncaughtException', ...)?

  • Without a handler: Node prints the stack trace and exits (exit code 1).
  • With a handler: you override that default behavior; the process will typically keep running (and may even exit with code 0 unless you set process.exitCode). Node’s own example shows code scheduled on the event loop still running after the exception is caught. [1]

Should you keep running?

Node’s guidance is no: an uncaught exception means the app is in an undefined / potentially corrupted state, and “it is not safe to resume normal operation after 'uncaughtException'”. The intended use is: do synchronous cleanup, then shut down. [1]

Recommended recovery strategy for Electron desktop apps

Treat an uncaught exception in the main process as fatal:

  1. Log/report (crash reporting / telemetry).
  2. Optionally show a “Something went wrong, restarting…” dialog.
  3. Restart the app rather than trying to limp on:
    • In-process restart: call app.relaunch() and then app.exit() / app.quit() (Electron explicitly notes you must quit/exit after relaunch). [2]
    • More robust: use a separate “launcher/monitor” process that starts Electron and restarts it if it dies (this matches Node’s recommendation to use an external monitor in a separate process for reliable recovery). [1]

For renderer failures, handle Electron’s app.on('render-process-gone', ...) and recreate/reload the affected window(s), since Electron emits this when a renderer “crashed or [was] killed.” [2]

Sources:
[1] Node.js process docs — 'uncaughtException' behavior + warnings/restart guidance. (nodejs.org)
[2] Electron app docs — app.relaunch() usage; render-process-gone event. (electronjs.org)


Fail fast after a main-process uncaughtException by restarting the app.

Adding this listener prevents Node.js from exiting on an uncaught exception. The process continues running in an undefined state, allowing IPC handlers and file watchers to process requests. After logging the error, call app.relaunch() followed by app.exit() to restart the app, or call app.quit() to terminate cleanly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/index.ts` around lines 82 - 88, The uncaughtException handler
currently logs and writes a crash log but leaves the main process running; after
calling logger.error and writeCrashLog('UNCAUGHT_EXCEPTION (main)', ...), invoke
Electron's app.relaunch() and then app.exit() (or app.quit() if you prefer a
clean shutdown) so the application restarts/terminates instead of continuing in
an undefined state; update the handler around the
process.on('uncaughtException', (error: Error) => { ... }) block to call
app.relaunch() followed by app.exit()/app.quit() after logging.

Comment on lines 580 to +598
mainWindow.webContents.on('render-process-gone', (_event, details) => {
  const memUsage = process.memoryUsage();
  logger.error('Renderer process gone:', details.reason, details.exitCode);
  // Could show an error dialog or attempt to reload the window
  writeCrashLog('RENDERER_PROCESS_GONE', {
    reason: details.reason,
    exitCode: details.exitCode,
    mainProcessRssMB: Math.round(memUsage.rss / 1024 / 1024),
    mainProcessHeapUsedMB: Math.round(memUsage.heapUsed / 1024 / 1024),
    mainProcessHeapTotalMB: Math.round(memUsage.heapTotal / 1024 / 1024),
    uptime: `${Math.round(process.uptime())}s`,
  });
  if (mainWindow && !mainWindow.isDestroyed()) {
    if (process.env.NODE_ENV === 'development') {
      void mainWindow.loadURL(`http://localhost:${DEV_SERVER_PORT}`);
    } else {
      void mainWindow.loadFile(getRendererIndexPath());
    }
  }
});

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For Electron BrowserWindow/WebContents render-process-gone, which details.reason values represent crash conditions versus normal exits like clean-exit, and when is it safe to reload the window automatically?

💡 Result:

details.reason values (Electron render-process-gone)

Electron defines these reason values in RenderProcessGoneDetails: clean-exit, abnormal-exit, killed, crashed, oom, launch-failed, integrity-failure, memory-eviction. [1]

Normal / expected exit

  • clean-exit: renderer exited with code 0 (typically “normal shutdown”). [1]

Abnormal termination (not necessarily a “crash”, but not a clean shutdown)

  • abnormal-exit: exited with a non‑zero exit code (could be a deliberate process.exit(1) or an error path). [1]
  • killed: got SIGTERM or was killed externally (could be OS/user/task-manager, or your app intentionally terminating it). [1]

Crash / failure conditions

  • crashed: renderer crash. [1]
  • oom: out of memory. [1]
  • memory-eviction: proactively terminated to avoid a future OOM. [1]
  • launch-failed: renderer never successfully launched. [1]
  • integrity-failure: Windows code integrity checks failed. [1]

The event itself is described as “unexpectedly disappears… normally because it was crashed or killed.” [2]


When it’s safe to auto-reload

A practical policy:

Safe to auto-reload (usually)

  • crashed, oom, memory-eviction → auto-reload is typically reasonable (often transient). [1]

Do not auto-reload (likely to loop / won’t help)

  • integrity-failure, launch-failed → show an error UI and require user action (update/repair/disable security software, etc.). [1]

Only auto-reload if you’re sure you’re not shutting down

  • killed, abnormal-exit → reload can be OK, but only if you’ve ruled out “app is quitting / window is closing / you intentionally terminated the renderer”.

Guardrails (recommended)

  • Don’t reload if win.isDestroyed() / webContents.isDestroyed() or you’re in your app’s quit path.
  • Add a retry limit + backoff (to avoid crash loops), then fall back to a “Reload” button.

(If you intentionally terminate a stuck renderer via forcefullyCrashRenderer(), Electron notes that calling reload() immediately will reload in a new process.) [2]

Sources: Electron docs for RenderProcessGoneDetails and webContents event behavior. [1] [2]


Gate auto-reload to crash reasons only; skip while app is quitting.

render-process-gone emits for both crashes and normal exits. Unconditionally reloading will fight clean shutdown flows and escalate non-fatal exits.

Only reload on crash/OOM conditions (crashed, oom, memory-eviction). Skip reload on normal exits (clean-exit), unrecoverable failures (launch-failed, integrity-failure), and while app.isQuitting is true.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/main/index.ts` around lines 580 - 598, The render-process-gone handler
currently unconditionally reloads the renderer; change it to only trigger a
reload when the crash reason is one of the recoverable crash reasons and not
during app shutdown: inside the mainWindow.webContents.on('render-process-gone',
...) callback check app.isQuitting (or app.isQuiting if that field is used) and
return early if true, and only perform the reload logic (calling
mainWindow.loadURL or loadFile) when details.reason is exactly 'crashed', 'oom',
or 'memory-eviction'; always continue to log and call writeCrashLog with the
existing payload (writeCrashLog, getRendererIndexPath, DEV_SERVER_PORT remain
unchanged).
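The gating policy above can be factored into a small pure helper so it is easy to unit-test. `shouldReloadRenderer` is a hypothetical name, not an existing symbol in src/main/index.ts:

```typescript
// Reason values come from Electron's RenderProcessGoneDetails.
type RenderProcessGoneReason =
  | 'clean-exit' | 'abnormal-exit' | 'killed' | 'crashed'
  | 'oom' | 'launch-failed' | 'integrity-failure' | 'memory-eviction';

// Only these reasons are typically transient and worth an automatic reload.
const RELOADABLE: ReadonlySet<RenderProcessGoneReason> = new Set([
  'crashed', 'oom', 'memory-eviction',
] as const);

// Sketch: reload only on recoverable crash reasons, never during shutdown.
function shouldReloadRenderer(
  reason: RenderProcessGoneReason,
  isQuitting: boolean
): boolean {
  return !isQuitting && RELOADABLE.has(reason);
}
```

The handler would then call `shouldReloadRenderer(details.reason, …)` before the `loadURL`/`loadFile` branch; a retry counter with backoff on top of this check guards against crash loops.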

Comment on lines +82 to +97
// Adaptive debounce: large sessions refresh less frequently to reduce memory churn.
// For a long-running session (hundreds of AI groups), a 150ms refresh cycle allocates
// GBs of JS objects per minute — GC cannot keep pace. Scale the interval with size.
const aiGroupCount = (useStore.getState().conversation?.items ?? []).filter(
  (i) => i.type === 'ai'
).length;
const debounceMs =
  aiGroupCount > 1000
    ? 60000 // ~60s for very long sessions (24h+)
    : aiGroupCount > 500
      ? 30000 // ~30s for long sessions
      : aiGroupCount > 200
        ? 10000 // ~10s for medium sessions
        : aiGroupCount > 100
          ? 3000 // ~3s for moderate sessions
          : SESSION_REFRESH_DEBOUNCE_MS; // 150ms default

⚠️ Potential issue | 🟠 Major

Base the adaptive delay on the session being refreshed.

scheduleSessionRefresh() is used for any visible session, but this code reads the global conversation and re-counts AI rows from there. If a long-running session is open in another pane while a short session is active, the background session still falls back to the 150ms path and keeps the high-churn behavior this PR is trying to avoid. Prefer the target session/tab's cached totalAIGroups here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/store/index.ts` around lines 82 - 97, The adaptive debounce
currently reads useStore.getState().conversation to count AI groups; change it
to use the target session's cached totalAIGroups (the session object passed to
scheduleSessionRefresh or the session/tab state used to trigger refresh) when
available to decide debounceMs (fall back to the global conversation count only
if session.totalAIGroups is undefined). Update the aiGroupCount calculation and
keep the same thresholds (100, 200, 500, 1000) and SESSION_REFRESH_DEBOUNCE_MS
fallback so long-running sessions in other panes no longer force the active tab
to use the 150ms path.
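The threshold table can be lifted into a pure helper that takes whichever count the caller has (e.g. the target session's cached totalAIGroups rather than the global conversation). A sketch; `adaptiveDebounceMs` is a hypothetical extraction, not an existing symbol:

```typescript
// Default refresh debounce from the PR; thresholds mirror the table above.
const SESSION_REFRESH_DEBOUNCE_MS = 150;

// Sketch: map an AI-group count onto a refresh interval. Callers pass the
// count of the session being refreshed, so a long session in another pane
// no longer forces the active tab onto the slow path (or vice versa).
function adaptiveDebounceMs(aiGroupCount: number): number {
  if (aiGroupCount > 1000) return 60000; // very long sessions (24h+)
  if (aiGroupCount > 500) return 30000;  // long sessions
  if (aiGroupCount > 200) return 10000;  // medium sessions
  if (aiGroupCount > 100) return 3000;   // moderate sessions
  return SESSION_REFRESH_DEBOUNCE_MS;    // 150ms default
}
```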

Comment on lines +245 to +257
// Clear isOngoing from the last reused AI group (it may have been set
// in the previous transform). Clone to avoid mutating React-referenced objects.
for (let i = items.length - 1; i >= 0; i--) {
  if (items[i].type === 'ai') {
    const group = items[i].group as AIGroup;
    if ((group as AIGroup & { isOngoing?: boolean }).isOngoing) {
      items[i] = {
        type: 'ai',
        group: {
          ...group,
          isOngoing: false,
          status: group.status === 'in_progress' ? 'completed' : group.status,
        },

⚠️ Potential issue | 🟠 Major

Use the existing terminal status literal here.

Line 256 writes 'completed', but determineAIGroupStatus() returns 'complete'. Incremental refreshes will therefore emit a different terminal status than full transforms, which can split downstream rendering logic by accident.

Suggested fix
-            status: group.status === 'in_progress' ? 'completed' : group.status,
+            status: group.status === 'in_progress' ? 'complete' : group.status,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/utils/groupTransformer.ts` around lines 245 - 257, The code
clears isOngoing on a reused AI group but sets status to the wrong terminal
literal ('completed'), causing mismatch with full transforms; update the status
assignment in the items loop that touches AIGroup (the block that checks (group
as AIGroup & { isOngoing?: boolean }).isOngoing) to use the same terminal value
as determineAIGroupStatus, e.g. call determineAIGroupStatus(group) or use the
'complete' literal returned by that function so incremental refreshes emit the
identical terminal status as full transforms.

Comment on lines +293 to +325
// Post-pass: CompactGroup token deltas (re-run on full array — cheap iteration)
let phaseCounter = 1;
for (let i = 0; i < items.length; i++) {
  if (items[i].type === 'compact') {
    phaseCounter++;
    const compactItem = items[i] as { type: 'compact'; group: CompactGroup };
    compactItem.group.startingPhaseNumber = phaseCounter;

    const preAi = findLastAiBefore(items, i);
    const postAi = findFirstAiAfter(items, i);
    if (preAi && postAi) {
      const pre = getLastAssistantTotalTokens(preAi);
      const post = getFirstAssistantTotalTokens(postAi);
      if (pre !== undefined && post !== undefined) {
        compactItem.group.tokenDelta = {
          preCompactionTokens: pre,
          postCompactionTokens: post,
          delta: post - pre,
        };
      }
    }
  }
}

// Post-pass: isOngoing flag on last AI group
if (isOngoing && aiCount > 0) {
  for (let i = items.length - 1; i >= 0; i--) {
    const item = items[i];
    if (item.type === 'ai') {
      const currentStatus = item.group.status;
      if (currentStatus !== 'interrupted') {
        (item.group as AIGroup & { isOngoing?: boolean }).isOngoing = true;
        (item.group as AIGroup & { status?: AIGroupStatus }).status = 'in_progress';

⚠️ Potential issue | 🟠 Major

Clone touched groups before the post-pass mutates them.

items intentionally reuses the unchanged prefix from prevConversation, but this block still mutates startingPhaseNumber, tokenDelta, isOngoing, and status on shared group objects. That mutates the previous store snapshot and can suppress rerenders for memoized compact/AI rows because the object identity does not change.

Suggested immutable update pattern
     if (items[i].type === 'compact') {
       phaseCounter++;
       const compactItem = items[i] as { type: 'compact'; group: CompactGroup };
-      compactItem.group.startingPhaseNumber = phaseCounter;
+      const nextGroup: CompactGroup = {
+        ...compactItem.group,
+        startingPhaseNumber: phaseCounter,
+      };

       const preAi = findLastAiBefore(items, i);
       const postAi = findFirstAiAfter(items, i);
       if (preAi && postAi) {
         const pre = getLastAssistantTotalTokens(preAi);
         const post = getFirstAssistantTotalTokens(postAi);
         if (pre !== undefined && post !== undefined) {
-          compactItem.group.tokenDelta = {
+          nextGroup.tokenDelta = {
             preCompactionTokens: pre,
             postCompactionTokens: post,
             delta: post - pre,
           };
         }
       }
+      items[i] = { type: 'compact', group: nextGroup };
     }
       if (item.type === 'ai') {
         const currentStatus = item.group.status;
         if (currentStatus !== 'interrupted') {
-          (item.group as AIGroup & { isOngoing?: boolean }).isOngoing = true;
-          (item.group as AIGroup & { status?: AIGroupStatus }).status = 'in_progress';
+          items[i] = {
+            type: 'ai',
+            group: {
+              ...item.group,
+              isOngoing: true,
+              status: 'in_progress',
+            },
+          };
         }
         break;
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/renderer/utils/groupTransformer.ts` around lines 293 - 325, The post-pass
mutates shared group objects on the reused items array causing snapshot/state
issues; for each mutation site (the compact-item loop where you set
compactItem.group.startingPhaseNumber and compactItem.group.tokenDelta, and the
AI post-pass where you set isOngoing and status on item.group) clone the outer
item or at least clone the group object before writing fields so you replace
items[i] with a new object containing a shallow-copied group (preserving all
other properties) and then set startingPhaseNumber/tokenDelta or
isOngoing/status on that cloned group; use the same checks around
findLastAiBefore/getLastAssistantTotalTokens/getFirstAssistantTotalTokens and
preserve types CompactGroup and AIGroup while assigning the new object back into
items to ensure identity changes and avoid mutating prevConversation's shared
objects.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is an excellent pull request that introduces significant performance and memory optimizations to address renderer heap exhaustion in long-running sessions. The changes, including incremental conversation updates, fingerprint-based skipping, adaptive debouncing, and splitting rendering into phases, are well-implemented and should drastically improve the application's stability and responsiveness. The addition of enhanced crash logging and error handling is also a valuable improvement for diagnostics. The code is well-structured and the optimizations are thoughtful. I have a couple of minor suggestions to improve the maintainability of the new task notification feature by centralizing some parsing logic.

Note: Security Review did not run due to the size of the PR.

Comment on lines +171 to +176
export interface TaskNotification {
  taskId: string;
  status: string;
  summary: string;
  outputFile: string;
}


medium

To support parsing the command name and exit code from the summary string within parseTaskNotifications, I recommend extending this interface. This will make the data structure more explicit and easier to use in consuming components like UserChatGroup.

export interface TaskNotification {
  taskId: string;
  status: string;
  summary: string;
  outputFile: string;
  commandName: string;
  exitCode?: string;
}

Comment on lines +189 to +194
notifications.push({
  taskId: /<task-id>([^<]*)<\/task-id>/.exec(block)?.[1] ?? '',
  status: /<status>([^<]*)<\/status>/.exec(block)?.[1] ?? '',
  summary: /<summary>([\s\S]*?)<\/summary>/.exec(block)?.[1]?.trim() ?? '',
  outputFile: /<output-file>([^<]*)<\/output-file>/.exec(block)?.[1] ?? '',
});


medium

To improve maintainability and make the UserChatGroup component more robust, it would be better to parse the commandName and exitCode from the summary string here, rather than in the UI component. This centralizes the parsing logic.

    const summary = /<summary>([\s\S]*?)<\/summary>/.exec(block)?.[1]?.trim() ?? '';
    const cmdMatch = /"([^"]+)"/.exec(summary);
    const commandName = cmdMatch?.[1] ?? summary;
    const exitMatch = /\(exit code (\d+)\)/.exec(summary);

    notifications.push({
      taskId: /<task-id>([^<]*)<\/task-id>/.exec(block)?.[1] ?? '',
      status: /<status>([^<]*)<\/status>/.exec(block)?.[1] ?? '',
      summary,
      outputFile: /<output-file>([^<]*)<\/output-file>/.exec(block)?.[1] ?? '',
      commandName,
      exitCode: exitMatch?.[1],
    });
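Putting the two suggestions together, here is a self-contained sketch of the extended parser. The child tag names follow the snippets above; the exact shape in the codebase may differ, and the `commandName`/`exitCode` extraction assumes the summary format quoted in this review:

```typescript
interface TaskNotification {
  taskId: string;
  status: string;
  summary: string;
  outputFile: string;
  commandName: string;
  exitCode?: string;
}

// Sketch: extract each <task-notification> block and parse its child tags,
// deriving commandName (first quoted token) and exitCode from the summary.
function parseTaskNotifications(text: string): TaskNotification[] {
  const notifications: TaskNotification[] = [];
  const blockRe = /<task-notification>([\s\S]*?)<\/task-notification>/g;
  for (const match of text.matchAll(blockRe)) {
    const block = match[1] ?? '';
    const summary = /<summary>([\s\S]*?)<\/summary>/.exec(block)?.[1]?.trim() ?? '';
    const commandName = /"([^"]+)"/.exec(summary)?.[1] ?? summary;
    const exitCode = /\(exit code (\d+)\)/.exec(summary)?.[1];
    notifications.push({
      taskId: /<task-id>([^<]*)<\/task-id>/.exec(block)?.[1] ?? '',
      status: /<status>([^<]*)<\/status>/.exec(block)?.[1] ?? '',
      summary,
      outputFile: /<output-file>([^<]*)<\/output-file>/.exec(block)?.[1] ?? '',
      commandName,
      exitCode,
    });
  }
  return notifications;
}
```

Centralizing the parsing here keeps UserChatGroup a pure renderer of the structured fields.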


Labels

bug (Something isn't working) · feature request (New feature or request)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant