Add proactive context window management to prevent AI API errors #708

2witstudios wants to merge 3 commits into master from
Conversation
Two-pronged fix for AI_APICallError when conversations exceed model context windows (e.g. OpenRouter's 400k token limit):

**Prevention (proactive truncation)**

- Move modelMessages conversion to after the system prompt is built so we have accurate token budgets before calling streamText
- Use determineMessagesToInclude() from ai-context-calculator to trim the oldest messages first, reserving 75% of the context window for input and leaving 25% headroom for output tokens and estimator inaccuracies
- Log a warning when truncation occurs, with before/after message counts
- Add OpenRouter provider context windows to getContextWindowSize so common models (DeepSeek, Qwen, Llama, Mistral, Gemini, Claude) get accurate limits instead of the 200k fallback

**Better errors (when context still exceeds the window after truncation)**

- Detect context-length errors in the route error handler and return HTTP 413 with error key 'context_length_exceeded' instead of a generic 500
- Add an isContextLengthError() helper to error-messages.ts that matches provider-specific phrasing (OpenRouter, Anthropic, OpenAI, etc.)
- getAIErrorMessage() now returns a clear, actionable message for context errors: "The conversation is too long for this model's context window. Older messages have been trimmed to fit — try sending your message again."
- Replace hardcoded error-message logic in SidebarChatTab with getAIErrorMessage() so all three chat surfaces handle errors consistently

https://claude.ai/code/session_011dBcfJNRsawEMzZLb67Z7m
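As a rough illustration of the budgeting described above, the following TypeScript sketch shows the 75%-budget math and oldest-first trimming. The names mirror the PR (determineMessagesToInclude, estimateTokens), but the signatures and the 4-chars-per-token heuristic here are assumptions, not the actual implementation:

```typescript
// Illustrative sketch only; the real helpers live in ai-context-calculator.
interface ChatMessage { role: 'user' | 'assistant'; content: string; }

// Rough heuristic: ~4 characters per token for Latin text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Reserve 75% of the context window for input; the remaining 25% is
// headroom for output tokens and estimator inaccuracy.
function computeInputBudget(contextWindow: number): number {
  return Math.floor(contextWindow * 0.75);
}

// Drop the oldest messages first until what remains fits the budget.
function determineMessagesToInclude(
  messages: ChatMessage[],
  inputBudget: number,
  systemPromptTokens: number,
  toolTokens: number
): { includedMessages: ChatMessage[]; wasTruncated: boolean } {
  const available = inputBudget - systemPromptTokens - toolTokens;
  const included: ChatMessage[] = [];
  let used = 0;
  // Walk from newest to oldest so the most recent turns survive truncation.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = estimateTokens(messages[i].content);
    if (used + cost > available) break;
    included.unshift(messages[i]);
    used += cost;
  }
  return { includedMessages: included, wasTruncated: included.length < messages.length };
}
```

Walking from newest to oldest guarantees the latest user turn is the last thing to be dropped.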
📝 Walkthrough

Adds proactive context-length management to the AI chat route: computes model context window and token estimates, truncates messages to fit a 75% input budget, logs truncation metadata, and returns 413 context_length_exceeded for context-limit errors. Also extends provider-specific context-window mappings and centralizes UI error messaging.
Sequence Diagram

sequenceDiagram
participant Client
participant ChatAPI as "AI Chat API Route"
participant Calculator as "Context Calculator"
participant Estimator as "Token Estimator"
participant AIModel as "AI Model"
participant ErrorHandler as "Error Handler"
Client->>ChatAPI: POST /api/ai/chat (messages, systemPrompt, model, provider)
ChatAPI->>Calculator: getContextWindowSize(provider, model)
Calculator-->>ChatAPI: contextWindow
ChatAPI->>Estimator: estimateSystemPromptTokens(fullSystemPrompt)
Estimator-->>ChatAPI: systemPromptTokens
ChatAPI->>Estimator: estimateToolDefinitionTokens(tools)
Estimator-->>ChatAPI: toolTokens
ChatAPI->>ChatAPI: compute inputBudget (75% of contextWindow)
ChatAPI->>Estimator: determineMessagesToInclude(messages, inputBudget)
Estimator-->>ChatAPI: includedMessages (truncated if needed)
alt Truncation occurred
ChatAPI->>ChatAPI: log truncation metadata (counts, tokens, budgets)
end
ChatAPI->>AIModel: stream with modelMessages (includedMessages + system/tool defs)
alt Context length error from model
AIModel-->>ErrorHandler: 413 context-length error
ErrorHandler-->>Client: 413 response { key: "context_length_exceeded", message }
else Successful stream
AIModel-->>Client: streamed response
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed checks
Actionable comments posted: 2
🧹 Nitpick comments (2)
packages/lib/src/monitoring/ai-context-calculator.ts (1)
220-243: OpenRouter GPT matching is narrower than the direct-provider GPT block.

The OpenRouter section only matches gpt-4o and gpt-4-turbo (line 228), while the direct OpenAI block (lines 141-165) also covers GPT-5.x variants. If a user routes a GPT-5 model through OpenRouter, it will fall to the 200k default instead of the actual 272k–400k window. Not a bug per se — 200k is a safe underestimate — but worth noting for future model additions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/lib/src/monitoring/ai-context-calculator.ts` around lines 220 - 243, Update the OpenRouter model-matching branch in ai-context-calculator.ts to include GPT-5 variants so routed GPT-5 models get the same token window as the direct-provider GPT logic: add checks against modelLower for strings like "gpt-5", "gpt-5o", "gpt-5-turbo" (and any project-specific GPT-5 naming you supported in the OpenAI block) and return the same token limit used for GPT-5 in the direct OpenAI branch (use the same numeric constant there, e.g., 272_000 or the exact constant used in the OpenAI block) instead of falling through to the 200_000 default; modify the if chain near the other GPT checks (the block referencing providerLower === 'openrouter' and modelLower) to include these new condition(s).

apps/web/src/lib/ai/shared/error-messages.ts (1)
47-62: isRateLimitError overlaps with context-length phrases but is currently unused — consider adding defensive logic for future safety.

While isRateLimitError checks for the "limit" substring (line 71), which overlaps with "token limit" from isContextLengthError, this function is not currently called anywhere in the codebase. The code is safe today because getAIErrorMessage doesn't use isRateLimitError and checks context length first. However, if isRateLimitError is ever used independently in the future, it would misclassify context-length errors. Adding a guard to exclude context-length errors would be defensive refactoring.

♻️ Suggested improvement

```diff
 export function isRateLimitError(errorMessage: string | undefined): boolean {
   if (!errorMessage) return false;
+  // Exclude context-length errors that also contain "limit"
+  if (isContextLengthError(errorMessage)) return false;
   return (
     errorMessage.toLowerCase().includes('rate') ||
     errorMessage.toLowerCase().includes('limit') ||
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/web/src/lib/ai/shared/error-messages.ts` around lines 47 - 62, isRateLimitError can misclassify context-length issues because it looks for generic "limit" substrings; update isRateLimitError to early-return false when isContextLengthError(errorMessage) is true (or otherwise explicitly exclude context-length phrases) so rate-limit detection never fires for context/window/token-limit errors — locate the isRateLimitError and isContextLengthError functions and add the defensive guard in isRateLimitError (or add an explicit negative-condition for tokens/context/window phrases) to ensure the two checks don't overlap.
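To make the check ordering concrete, here is a self-contained sketch of the guarded classification. The phrase lists are illustrative only; the real helpers in error-messages.ts match more provider-specific wording:

```typescript
// Sketch: check context-length phrasing before generic rate-limit phrasing,
// so "token limit" errors are never misclassified as rate limits.
function isContextLengthError(errorMessage: string | undefined): boolean {
  if (!errorMessage) return false;
  const msg = errorMessage.toLowerCase();
  return (
    msg.includes('context length') ||
    msg.includes('context window') ||
    msg.includes('token limit') ||
    msg.includes('maximum context')
  );
}

function isRateLimitError(errorMessage: string | undefined): boolean {
  if (!errorMessage) return false;
  // Guard: a context-length error also contains "limit", so exclude it first.
  if (isContextLengthError(errorMessage)) return false;
  const msg = errorMessage.toLowerCase();
  return msg.includes('rate') || msg.includes('limit') || msg.includes('429');
}
```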
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/src/app/api/ai/chat/route.ts`:
- Around line 829-835: The token estimator (estimateTokens in
ai-context-calculator.ts) uses a fixed 4-chars-per-token heuristic which
undercounts CJK text; update estimateTokens to detect non-ASCII/CJK content
(e.g., check for Unicode ranges or presence of multibyte chars) and use a more
conservative ratio (e.g., 2 chars/token) or make the ratio configurable, then
recalculate calls that rely on it (estimateSystemPromptTokens,
estimateToolDefinitionTokens, and any callers in route.ts) so truncation
reserves enough budget for CJK; keep the existing fallback error handler but
prefer preventing rejections by selecting the conservative ratio when
non-ASCII/CJK is detected.
- Around line 1241-1250: The code currently returns raw provider error text via
errorMsg when isContextLengthError(...) triggers; replace that leak by returning
a sanitized constant or summarized detail (e.g., details: 'context length
exceeded') instead of the raw errorMsg, or call a small sanitizer function
(e.g., sanitizeProviderMessage(errorMsg)) before including it; update the
NextResponse.json call in the isContextLengthError branch to use the
sanitized/constant detail and ensure the user-facing message string "Older
messages have been trimmed to fit — try sending your message again" (from the
proactive truncation logic around the proactive truncation block) is not used if
truncation already ran and likely won’t help—either remove that retry suggestion
or make it conditional based on whether truncation actually occurred.
---
Nitpick comments:
In `@apps/web/src/lib/ai/shared/error-messages.ts`:
- Around line 47-62: isRateLimitError can misclassify context-length issues
because it looks for generic "limit" substrings; update isRateLimitError to
early-return false when isContextLengthError(errorMessage) is true (or otherwise
explicitly exclude context-length phrases) so rate-limit detection never fires
for context/window/token-limit errors — locate the isRateLimitError and
isContextLengthError functions and add the defensive guard in isRateLimitError
(or add an explicit negative-condition for tokens/context/window phrases) to
ensure the two checks don't overlap.
In `@packages/lib/src/monitoring/ai-context-calculator.ts`:
- Around line 220-243: Update the OpenRouter model-matching branch in
ai-context-calculator.ts to include GPT-5 variants so routed GPT-5 models get
the same token window as the direct-provider GPT logic: add checks against
modelLower for strings like "gpt-5", "gpt-5o", "gpt-5-turbo" (and any
project-specific GPT-5 naming you supported in the OpenAI block) and return the
same token limit used for GPT-5 in the direct OpenAI branch (use the same
numeric constant there, e.g., 272_000 or the exact constant used in the OpenAI
block) instead of falling through to the 200_000 default; modify the if chain
near the other GPT checks (the block referencing providerLower === 'openrouter'
and modelLower) to include these new condition(s).
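The CJK-aware estimator requested in the first inline comment above could be sketched as follows. The 20% threshold and the 2- vs 4-chars-per-token ratios follow the commit notes in this thread, but the exact implementation may differ:

```typescript
// Sketch: count non-ASCII code points without a regex, and switch to
// ~2 chars/token when more than 20% of the text is non-ASCII (CJK, emoji).
function estimateTokens(text: string): number {
  if (text.length === 0) return 0;
  let nonAsciiCount = 0;
  for (const char of text) { // for...of iterates by code point, handling surrogate pairs
    if ((char.codePointAt(0) ?? 0) > 0x7f) nonAsciiCount++;
  }
  const charsPerToken = nonAsciiCount / text.length > 0.2 ? 2 : 4;
  return Math.ceil(text.length / charsPerToken);
}
```

The conservative 2-chars/token ratio makes truncation reserve more budget for CJK-heavy conversations, which the fixed 4-chars/token heuristic undercounted.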
- Fix TypeScript build error: cast includedMessages to UIMessage[] for convertToModelMessages compatibility
- Improve CJK token estimation: use 2 chars/token when >20% non-ASCII content detected (vs 4 chars/token for Latin text)
- Sanitize 413 error response: replace raw provider error with constant, make retry suggestion conditional on whether truncation already ran
- Add isContextLengthError guard to isRateLimitError to prevent misclassification of context-length errors as rate-limit errors
- Add GPT-5 variant matching to OpenRouter section of getContextWindowSize

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Addressed all CodeRabbit review feedback in 8a5e5d5

Actionable comments (fixed):
Nitpick comments (fixed):
Build fix:
The wasTruncated variable was declared inside the try block but referenced in the outer catch block for conditional error messaging. Hoist it to function-level scope so the 413 error response can correctly determine whether truncation already occurred. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
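A minimal sketch of that scoping fix, with the handler reduced to its essentials (the helper shape and messages here are illustrative, not the actual route code):

```typescript
// Sketch: wasTruncated is declared before the try block so the catch-side
// error builder can read it and pick the right 413 message.
function runWithTruncationFlag(
  work: (markTruncated: () => void) => void
): { status: number; message: string } {
  let wasTruncated = false; // hoisted to function scope, not inside try
  try {
    work(() => { wasTruncated = true; });
    return { status: 200, message: 'ok' };
  } catch {
    return {
      status: 413,
      message: wasTruncated
        ? 'Still exceeds the context window even after trimming.'
        : 'Too long for the context window. Please start a new conversation.',
    };
  }
}
```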
Actionable comments posted: 4
🧹 Nitpick comments (1)
packages/lib/src/monitoring/ai-context-calculator.ts (1)
55-66: Consider consolidating token estimation logic into one shared implementation.
estimateTokens here now differs from packages/lib/src/monitoring/ai-monitoring.ts:404-407 (still fixed 4 chars/token). Keeping two heuristics will drift truncation vs. monitoring metrics over time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/lib/src/monitoring/ai-context-calculator.ts` around lines 55 - 66, The repository now has two different token-estimation heuristics (estimateTokens in ai-context-calculator.ts and the fixed 4-chars/token logic in ai-monitoring.ts) which can cause drift; replace the duplicated logic by extracting a single shared utility (e.g., export a consolidateEstimateTokens function) and import it from both places so both monitoring and context truncation use the same algorithm (include the non-ASCII detection and charsPerToken decision currently in estimateTokens). Update references to call the shared function (estimateTokens / consolidateEstimateTokens) and remove the old inline heuristic in ai-monitoring.ts to ensure consistent token counting across the codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/web/src/app/api/ai/chat/route.ts`:
- Around line 1249-1252: The error message for the non-truncated branch
incorrectly implies older messages were trimmed; update the logic that builds
the error response (the object using wasTruncated, message, and details in
route.ts) so when wasTruncated is false the message does not say messages were
trimmed—instead return a clear statement like "The conversation is too long for
this model's context window. Please start a new conversation or shorten the
conversation" while preserving details: 'context_length_exceeded'.
- Around line 837-844: After calling determineMessagesToInclude, add a guard
that ensures the current/latest user turn was not dropped: find the last user
message in sanitizedMessages (e.g., lastUserIndex or by role === 'user') and
verify includedMessages is non-empty and contains that message (compare id or
index); if it doesn't, return a clear client error (400) or an explicit error
response indicating the user input exceeds the token budget instead of
proceeding. Update the logic around determineMessagesToInclude /
includedMessages / wasTruncated in route.ts to perform this check and
short-circuit with a helpful error when the latest user turn would be removed.
In `@packages/lib/src/monitoring/ai-context-calculator.ts`:
- Around line 225-255: The OpenRouter-specific model limit block is being
bypassed because earlier generic model-name branches match first; update those
earlier checks to exclude OpenRouter by adding a guard like providerLower !==
'openrouter' to the generic model-name conditionals (the branches that inspect
modelLower for 'gpt', 'claude', 'gemini', etc.) or alternatively move the
OpenRouter block earlier so it runs before the generic branches; locate the
providerLower and modelLower checks in ai-context-calculator.ts and adjust the
conditionals around the generic model checks or reposition the OpenRouter if
(providerLower === 'openrouter') block so OpenRouter-specific mappings are
applied correctly.
- Line 60: The regex /[^\x00-\x7F]/g used to compute nonAsciiCount violates the
linter; replace that regex-based approach with explicit code-point checking:
iterate over the string (e.g., using for...of or Array.from) and count
characters whose codePointAt(0) > 0x7F, then store that count in nonAsciiCount.
Update the expression that computes nonAsciiCount accordingly so it handles full
Unicode code points (surrogate pairs) and removes the problematic regex.
---
Nitpick comments:
In `@packages/lib/src/monitoring/ai-context-calculator.ts`:
- Around line 55-66: The repository now has two different token-estimation
heuristics (estimateTokens in ai-context-calculator.ts and the fixed
4-chars/token logic in ai-monitoring.ts) which can cause drift; replace the
duplicated logic by extracting a single shared utility (e.g., export a
consolidateEstimateTokens function) and import it from both places so both
monitoring and context truncation use the same algorithm (include the non-ASCII
detection and charsPerToken decision currently in estimateTokens). Update
references to call the shared function (estimateTokens /
consolidateEstimateTokens) and remove the old inline heuristic in
ai-monitoring.ts to ensure consistent token counting across the codebase.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- apps/web/src/app/api/ai/chat/route.ts
- apps/web/src/lib/ai/shared/error-messages.ts
- packages/lib/src/monitoring/ai-context-calculator.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/web/src/lib/ai/shared/error-messages.ts
```ts
const truncationResult = determineMessagesToInclude(
  sanitizedMessages,
  inputBudget,
  systemPromptTokens,
  toolTokens
);
const { includedMessages } = truncationResult;
wasTruncated = truncationResult.wasTruncated;
```
Guard against truncation removing the current user turn entirely.
If the latest message alone exceeds budget, includedMessages can be empty. The call then proceeds with no user prompt, producing an unrelated assistant response instead of a clear error.
✅ Suggested fix

```diff
 const { includedMessages } = truncationResult;
 wasTruncated = truncationResult.wasTruncated;
+
+// If even the newest user turn cannot fit, fail fast with context-length error
+if (includedMessages.length === 0) {
+  return NextResponse.json(
+    {
+      error: 'context_length_exceeded',
+      message: 'Your latest message is too large for this model context window. Please shorten it or start a new conversation.',
+      details: 'context_length_exceeded',
+    },
+    { status: 413 }
+  );
+}
```

🤖 Prompt for AI Agents
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/src/app/api/ai/chat/route.ts` around lines 837 - 844, After calling
determineMessagesToInclude, add a guard that ensures the current/latest user
turn was not dropped: find the last user message in sanitizedMessages (e.g.,
lastUserIndex or by role === 'user') and verify includedMessages is non-empty
and contains that message (compare id or index); if it doesn't, return a clear
client error (400) or an explicit error response indicating the user input
exceeds the token budget instead of proceeding. Update the logic around
determineMessagesToInclude / includedMessages / wasTruncated in route.ts to
perform this check and short-circuit with a helpful error when the latest user
turn would be removed.
```ts
message: wasTruncated
  ? 'The conversation still exceeds this model\'s context window even after trimming. Please start a new conversation.'
  : 'The conversation is too long for this model\'s context window. Older messages have been trimmed — try sending your message again.',
details: 'context_length_exceeded',
```
The non-truncated error message is misleading.
When wasTruncated is false, the response currently says older messages “have been trimmed,” but no trimming occurred in that branch.
✅ Suggested wording update

```diff
- : 'The conversation is too long for this model\'s context window. Older messages have been trimmed — try sending your message again.',
+ : 'The conversation is too long for this model\'s context window. Please try again or start a new conversation.',
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/web/src/app/api/ai/chat/route.ts` around lines 1249 - 1252, The error
message for the non-truncated branch incorrectly implies older messages were
trimmed; update the logic that builds the error response (the object using
wasTruncated, message, and details in route.ts) so when wasTruncated is false
the message does not say messages were trimmed—instead return a clear statement
like "The conversation is too long for this model's context window. Please start
a new conversation or shorten the conversation" while preserving details:
'context_length_exceeded'.
```ts
return Math.ceil(text.length / 4);
// Detect non-ASCII heavy content (CJK, emoji, etc.)
// CJK Unified Ideographs, Hiragana, Katakana, Hangul, etc.
const nonAsciiCount = (text.match(/[^\x00-\x7F]/g) || []).length;
```
🧩 Analysis chain
🏁 Script executed:

```shell
# Check if the file exists and read the content around line 60
fd "ai-context-calculator.ts" --type f
```

🏁 Script executed:

```shell
# Once we confirm the file exists, read it to verify the exact content at line 60
cat -n packages/lib/src/monitoring/ai-context-calculator.ts | sed -n '55,65p'
```

🏁 Script executed:

```shell
# Check for linter configuration in the packages/lib directory
find packages/lib -maxdepth 3 -name "*.json" -o -name "*.mjs" -o -name "biome.json" -o -name ".eslintrc*" | head -20
```

🏁 Script executed:

```shell
# Check if Biome is configured at the root or in packages
find . -maxdepth 2 -name "biome.json" -o -name "biome.jsonc"
```

🏁 Script executed:

```shell
# Read the biome.json configuration
cat biome.json
```
🏁 Script executed:

```shell
# Check the suggested fix in the review comment - verify if the logic is equivalent
python3 << 'EOF'
# Test if the suggested fix produces the same results as the original
# Original: text.match(/[^\x00-\x7F]/g) matches all non-ASCII characters
# Suggested: reduce with codePointAt check for > 0x7f
test_strings = [
    "hello",   # All ASCII
    "hello🌍", # ASCII + emoji
    "你好",    # Chinese characters
    "café",    # ASCII + accented char
    "مرحبا",   # Arabic
]
for test_str in test_strings:
    # Simulate original regex behavior
    original_count = len([c for c in test_str if ord(c) > 0x7f])
    # Simulate suggested fix behavior
    suggested_count = 0
    for char in test_str:
        code_point = ord(char)
        if code_point > 0x7f:
            suggested_count += 1
    print(f"String: '{test_str}' | Original: {original_count} | Suggested: {suggested_count} | Match: {original_count == suggested_count}")
EOF
```
🏁 Script executed:

```shell
# Check if noControlCharactersInRegex is part of Biome's recommended rules
# and verify if it would be active with the current config
web_search "Biome noControlCharactersInRegex rule"
```

🏁 Script executed:

```shell
# Search for other similar patterns that might have the same issue
rg -n '\\x00|\\x[0-9A-Fa-f]{2}' --type=ts --type=tsx packages/
```
🏁 Script executed:

```shell
# Verify the suggested fix doesn't have any edge cases with surrogate pairs or multi-byte chars
python3 << 'EOF'
# Test edge cases with the suggested fix
test_cases = [
    ("hello", 0),        # All ASCII
    ("café", 1),         # Single accented character
    ("🌍", 1),           # Single emoji (surrogate pair in JS)
    ("你好世界", 4),      # Chinese characters
    ("hello🌍world", 1), # Mixed ASCII and emoji
    ("\x00\x01\x7f", 0), # Control characters at edge of range
    ("", 0),             # Empty string
]
for test_str, expected in test_cases:
    # Suggested fix logic
    count = 0
    for char in test_str:
        code_point = ord(char)
        if code_point > 0x7f:
            count += 1
    status = "✓" if count == expected else "✗"
    print(f"{status} String: {repr(test_str)} | Expected: {expected} | Got: {count}")
EOF
```

🏁 Script executed:

```shell
# Search for control character patterns in the codebase using ts file type only
rg -n '\\x00|\\x[0-9A-Fa-f]{2}' --type=ts packages/
```
🏁 Script executed:

```shell
# Verify Biome rule documentation is available and check if recommended includes the suspicious rule
python3 << 'EOF'
# Check Biome's default rule settings for noControlCharactersInRegex
# Based on the biome.json, the "suspicious" category doesn't explicitly disable the rule,
# which means it uses the recommended default
biome_config = {
    "suspicious": {
        "noArrayIndexKey": "off",
        "noExplicitAny": "off",
        "noImplicitAnyLet": "off",
        "noDoubleEquals": "off",
        "noGlobalIsNan": "off",
        "noAssignInExpressions": "off"
        # noControlCharactersInRegex is NOT listed, so it uses the default (recommended = true for suspicious)
    }
}
print("Analysis of biome.json configuration:")
print("- Linter is enabled: true")
print("- 'recommended' is enabled globally for linter rules")
print("- 'suspicious' category does NOT explicitly disable 'noControlCharactersInRegex'")
print("- Therefore, the rule IS active by default")
print("\nThe regex pattern /[^\\x00-\\x7F]/ contains control character escapes,")
print("which would trigger the 'noControlCharactersInRegex' rule in Biome.")
EOF
```
Replace the regex pattern to resolve the noControlCharactersInRegex Biome linter violation.
The pattern /[^\x00-\x7F]/g at line 60 triggers Biome's suspicious/noControlCharactersInRegex rule, which is enabled by default. Use explicit code point checking instead:
Suggested fix
```diff
- const nonAsciiCount = (text.match(/[^\x00-\x7F]/g) || []).length;
+ const nonAsciiCount = [...text].reduce((count, char) => {
+   const codePoint = char.codePointAt(0) ?? 0;
+   return count + (codePoint > 0x7f ? 1 : 0);
+ }, 0);
```
🧰 Tools
🪛 Biome (2.4.4)
[error] 60-60: Unexpected control character in a regular expression.
(lint/suspicious/noControlCharactersInRegex)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/lib/src/monitoring/ai-context-calculator.ts` at line 60, The regex
/[^\x00-\x7F]/g used to compute nonAsciiCount violates the linter; replace that
regex-based approach with explicit code-point checking: iterate over the string
(e.g., using for...of or Array.from) and count characters whose codePointAt(0) >
0x7F, then store that count in nonAsciiCount. Update the expression that
computes nonAsciiCount accordingly so it handles full Unicode code points
(surrogate pairs) and removes the problematic regex.
```ts
// OpenRouter - use model-specific limits where known, else 200k conservative default
if (providerLower === 'openrouter') {
  // Claude models via OpenRouter
  if (modelLower.includes('claude')) return 200_000;
  // Gemini models via OpenRouter
  if (modelLower.includes('gemini-2.5')) return 1_000_000;
  if (modelLower.includes('gemini-2.0') || modelLower.includes('gemini-1.5')) return 1_000_000;
  // GPT models via OpenRouter
  if (modelLower.includes('gpt-5.2')) {
    return modelLower.includes('mini') || modelLower.includes('nano') ? 256_000 : 400_000;
  }
  if (modelLower.includes('gpt-5.1')) return 400_000;
  if (modelLower.includes('gpt-5')) {
    return modelLower.includes('mini') || modelLower.includes('nano') ? 128_000 : 272_000;
  }
  if (modelLower.includes('gpt-4o') || modelLower.includes('gpt-4-turbo')) return 128_000;
  // DeepSeek models - commonly 64k or 128k
  if (modelLower.includes('deepseek-r1') || modelLower.includes('deepseek-v3')) return 128_000;
  if (modelLower.includes('deepseek')) return 64_000;
  // Qwen models
  if (modelLower.includes('qwen-2.5') || modelLower.includes('qwq')) return 128_000;
  if (modelLower.includes('qwen')) return 32_000;
  // Llama models
  if (modelLower.includes('llama-3') || modelLower.includes('llama3')) return 128_000;
  if (modelLower.includes('llama')) return 32_000;
  // Mistral models
  if (modelLower.includes('mistral-large') || modelLower.includes('mistral-nemo')) return 128_000;
  if (modelLower.includes('mistral')) return 32_000;
  // OpenRouter platform hard cap is 400k for many endpoints - use 200k as safe default
  return 200_000;
}
```
OpenRouter-specific limits are partially shadowed by earlier branches.
For OpenRouter models, checks at Line 146 / Line 173 / Line 184 can match first via model name (gpt, claude, gemini), so this block won’t run for many of the models it intends to handle. That defeats the provider-specific conservative mapping.
✅ Suggested fix direction

```diff
 export function getContextWindowSize(model: string, provider?: string): number {
   const providerLower = provider?.toLowerCase() || '';
   const modelLower = model.toLowerCase();
+  // OpenRouter first to ensure provider-specific limits are applied
+  if (providerLower === 'openrouter') {
+    // ...existing OpenRouter mapping...
+    return 200_000;
+  }
+
   // OpenAI models
-  if (providerLower === 'openai' || modelLower.includes('gpt')) {
+  if (providerLower === 'openai' || (!providerLower && modelLower.includes('gpt'))) {
     ...
   }
```
Verify each finding against the current code and only fix it if needed.
In `@packages/lib/src/monitoring/ai-context-calculator.ts` around lines 225 - 255,
The OpenRouter-specific model limit block is being bypassed because earlier
generic model-name branches match first; update those earlier checks to exclude
OpenRouter by adding a guard like providerLower !== 'openrouter' to the generic
model-name conditionals (the branches that inspect modelLower for 'gpt',
'claude', 'gemini', etc.) or alternatively move the OpenRouter block earlier so
it runs before the generic branches; locate the providerLower and modelLower
checks in ai-context-calculator.ts and adjust the conditionals around the
generic model checks or reposition the OpenRouter if (providerLower ===
'openrouter') block so OpenRouter-specific mappings are applied correctly.
Reviews proactive context window management for AI chat, covering token estimation accuracy, OpenRouter model registry, error handling, type safety, and OWASP compliance. Recommends adding tests and verifying error propagation before merge.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix wasTruncated message inversion: non-truncated error no longer
claims messages were trimmed
- Guard empty messages: return 413 when truncation leaves zero messages
instead of sending empty context to streamText
- Fix estimateMessageTokens: match actual Vercel AI SDK tool-{name}
part format (input/output/state) instead of legacy tool-call/tool-result
- Reconcile context windows: extract MODEL_CONTEXT_WINDOWS into shared
model-context-windows.ts as single source of truth (fixes glm-5
200000→202752, minimax m2.5 via OpenRouter 1M→204800 divergence)
- Fix isContextLengthError: replace broad '413' string match with
specific HTTP status pattern to prevent false positives
- Fix estimateTokens: replace regex array allocation with efficient
charCode loop for CJK detection
- Add 42 unit tests for isContextLengthError, estimateTokens,
estimateMessageTokens, getContextWindowSize, determineMessagesToInclude
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix inverted wasTruncated message logic in catch block 413 response
- Guard against empty includedMessages after truncation (return 413 early)
- Handle tool-{toolName} part format in estimateMessageTokens (PageSpace DB format)
- Extract MODEL_CONTEXT_WINDOWS to shared model-context-windows.ts (single source of truth)
- Narrow isContextLengthError '413' check to require context/token keywords
- Replace regex with for-loop counter in estimateTokens for memory efficiency
- Add unit tests for isContextLengthError and CJK-aware estimateTokens
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
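The narrowed '413' check from this commit can be sketched as follows; the keyword list is an assumption based on the commit message, not the exact code:

```typescript
// Sketch: a bare "413" substring can appear in unrelated text (request IDs,
// byte counts), so only treat it as a context-length signal when paired with
// a context/token keyword.
function isContextLengthError(errorMessage: string | undefined): boolean {
  if (!errorMessage) return false;
  const msg = errorMessage.toLowerCase();
  if (msg.includes('context length') || msg.includes('context window') || msg.includes('token limit')) {
    return true;
  }
  return msg.includes('413') && (msg.includes('context') || msg.includes('token'));
}
```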
Summary
This PR implements proactive context window management for AI chat conversations to prevent token limit errors from AI providers. Instead of letting conversations fail when they exceed a model's context window, the system now intelligently truncates older messages before sending requests to the API.
Key Changes
Context-aware message truncation: Added logic to POST /api/ai/chat that calculates available token budget based on the model's context window, system prompt size, and tool definitions, then includes only the most recent messages that fit within that budget.

Enhanced context window detection: Expanded getContextWindowSize() in ai-context-calculator.ts with model-specific context limits for OpenRouter providers (Claude, Gemini, GPT, DeepSeek, Qwen, Llama, Mistral models), replacing the previous generic 200k fallback.

Context length error handling:
- Added an isContextLengthError() utility function to detect context/token limit errors from various AI providers
- The route returns the context_length_exceeded error code when detected
- Updated getAIErrorMessage() to provide user-friendly messaging for context length errors

Improved error messaging in UI: Refactored SidebarChatTab.tsx to use the centralized getAIErrorMessage() function instead of inline error detection logic, ensuring consistent error messaging across the application.

Implementation Details
https://claude.ai/code/session_011dBcfJNRsawEMzZLb67Z7m
Summary by CodeRabbit
New Features
Bug Fixes
Refactor