fix: deduplicate streaming JSONL entries to prevent ~2x cost overcounting (#77)
Conversation
Summary of Changes
Hello @Psypeal, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request resolves an issue where token usage costs were inaccurately inflated by the way streaming API responses from Claude Code were processed. By implementing a deduplication strategy that keeps only the final message for each unique request ID, the system now produces cost calculations that align with actual API usage, preventing erroneous overcounting.
Highlights
Changelog
Activity
Code Review
This pull request effectively addresses a critical bug causing cost over-counting for streaming API responses. The approach of deduplicating entries by requestId is sound, and the changes are implemented correctly in both the main and renderer processes. The addition of comprehensive test cases is excellent and ensures the reliability of the fix. I have one suggestion to improve the efficiency of the deduplication logic in the renderer process, but overall, this is a solid contribution.
```ts
const lastIndexByRequestId = new Map<string, number>();
for (let i = 0; i < messages.length; i++) {
  const rid = messages[i].requestId;
  if (rid) lastIndexByRequestId.set(rid, i);
}
for (let i = 0; i < messages.length; i++) {
  const rid = messages[i].requestId;
  if (rid && lastIndexByRequestId.get(rid) !== i) {
    duplicateRequestIndices.add(i);
  }
}
```
This logic for finding duplicate request indices can be made more efficient. The current implementation iterates over the messages array twice to first find the last index of each requestId and then to populate the set of duplicates. You can achieve the same result more cleanly and efficiently with a single pass by iterating backwards through the array.
```diff
-const lastIndexByRequestId = new Map<string, number>();
-for (let i = 0; i < messages.length; i++) {
-  const rid = messages[i].requestId;
-  if (rid) lastIndexByRequestId.set(rid, i);
-}
-for (let i = 0; i < messages.length; i++) {
-  const rid = messages[i].requestId;
-  if (rid && lastIndexByRequestId.get(rid) !== i) {
-    duplicateRequestIndices.add(i);
-  }
-}
+const seenRequestIds = new Set<string>();
+for (let i = messages.length - 1; i >= 0; i--) {
+  const rid = messages[i].requestId;
+  if (rid) {
+    if (seenRequestIds.has(rid)) {
+      duplicateRequestIndices.add(i);
+    } else {
+      seenRequestIds.add(rid);
+    }
+  }
+}
```
📝 Walkthrough
Adds an optional `requestId` field to parsed messages and deduplicates streaming JSONL entries by request ID (last entry wins) before token and cost aggregation.
Changes
Possibly related issues
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/renderer/utils/sessionAnalyzer.ts (1)
616-634: ⚠️ Potential issue | 🟠 Major

`startupTokens` and `assistantMsgData` not guarded by `duplicateRequestIndices`. The dedup guard added at line 409 protects cost and token stats, but two other per-message token accumulators are not protected:

- `startupTokens` (lines 618–623): every streaming duplicate entry before the first work tool inflates `startupPctOfTotal` and the startup overhead assessment.
- `assistantMsgData` (lines 628–633): duplicate entries push additional `[timestamp, tokenCount]` pairs, skewing the quartile-based token density timeline.

🐛 Proposed fix

```diff
-      if (msgType === 'assistant' && !firstWorkToolSeen) {
+      if (msgType === 'assistant' && !firstWorkToolSeen && !duplicateRequestIndices.has(i)) {
         startupMessages++;
         if (m.usage) {
```

```diff
-      if (msgType === 'assistant' && msgTs && m.usage) {
+      if (msgType === 'assistant' && msgTs && m.usage && !duplicateRequestIndices.has(i)) {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/utils/sessionAnalyzer.ts` around lines 616 - 634, The code adds per-message token counts to startupTokens and pushes to assistantMsgData without checking duplicateRequestIndices, causing duplicates to skew startupPctOfTotal and the token-density timeline; update the two blocks that modify startupTokens/startupMessages and assistantMsgData to first skip processing if duplicateRequestIndices indicates this message index is a duplicate (the same guard used around cost/token stats at the dedup guard near duplicateRequestIndices), i.e., only accumulate into startupTokens/startupMessages and push into assistantMsgData when the message is not a duplicate (check the same index/key used by the existing dedup logic), while preserving the existing conditions (msgType === 'assistant', !firstWorkToolSeen, msgTs, m.usage).
🧹 Nitpick comments (3)
src/renderer/utils/sessionAnalyzer.ts (1)
350-368: Dedup logic is duplicated from `deduplicateByRequestId` in `jsonl.ts`. The inline scoped block reimplements the same two-pass algorithm already in `src/main/utils/jsonl.ts`. If `deduplicateByRequestId` were promoted to `@shared/utils`, it could be consumed directly here, eliminating the duplication and ensuring both paths evolve together.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/utils/sessionAnalyzer.ts` around lines 350 - 368, The dedup logic here duplicates the two-pass algorithm from deduplicateByRequestId; replace the inline block that builds duplicateRequestIndices by importing and calling deduplicateByRequestId from a shared module (e.g., move deduplicateByRequestId in src/main/utils/jsonl.ts to `@shared/utils/jsonl` and then call it from sessionAnalyzer.ts), passing the messages array and using its result instead of reimplementing the two-pass loop that computes lastIndexByRequestId and duplicateRequestIndices.

test/main/utils/costCalculation.test.ts (1)
740-789: `deduplicateByRequestId` direct tests are missing a mixed-content case. Both existing direct tests use either all messages without `requestId` or all messages with one. There's no test exercising the case where some messages have a `requestId` and some don't, i.e. verifying that non-`requestId` messages are preserved unchanged when dedup does occur. This case is covered indirectly via `calculateMetrics` (lines 657–717), but a direct assertion here would close the gap.

🧪 Suggested additional test

```diff
+    it('should preserve messages without requestId when dedup occurs', () => {
+      const messages: ParsedMessage[] = [
+        {
+          type: 'assistant',
+          uuid: 'msg-1a',
+          timestamp: new Date(),
+          content: [],
+          requestId: 'req-1',
+          usage: { input_tokens: 100, output_tokens: 50 },
+          toolCalls: [],
+          toolResults: [],
+          isSidechain: false,
+        },
+        {
+          type: 'user',
+          uuid: 'msg-user',
+          timestamp: new Date(),
+          content: 'hello',
+          toolCalls: [],
+          toolResults: [],
+          isSidechain: false,
+          isMeta: false,
+        },
+        {
+          type: 'assistant',
+          uuid: 'msg-1b',
+          timestamp: new Date(),
+          content: [],
+          requestId: 'req-1',
+          usage: { input_tokens: 100, output_tokens: 200 },
+          toolCalls: [],
+          toolResults: [],
+          isSidechain: false,
+        },
+      ];
+
+      const result = deduplicateByRequestId(messages);
+      expect(result).toHaveLength(2);
+      expect(result[0].uuid).toBe('msg-user'); // non-requestId message preserved
+      expect(result[1].uuid).toBe('msg-1b'); // last requestId entry kept
+    });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/main/utils/costCalculation.test.ts` around lines 740 - 789, Add a direct unit test for deduplicateByRequestId that mixes messages with and without requestId: create an array containing (a) one or more messages without requestId, (b) multiple messages sharing the same requestId (earlier and later), and (c) another non-requestId message; call deduplicateByRequestId and assert that duplicates for the shared requestId are reduced to the last entry (check uuid and usage.output_tokens), that non-requestId messages remain present and in their original order, and optionally assert reference equality for an untouched non-requestId message to ensure it was not recreated. Use the same ParsedMessage shape as other tests and name the test descriptively (e.g., "should preserve non-requestId messages while deduplicating requestId entries").

src/main/utils/jsonl.ts (1)
316-324: `messageCount` includes streaming duplicates while all token fields do not.

`messageCount: messages.length` counts every JSONL entry, including the ones discarded by dedup. If callers use this field to infer "number of distinct API calls", it will be inflated for streaming sessions. Consider returning `dedupedMessages.length` (or exposing both), or at minimum documenting that this reflects the raw JSONL entry count.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/utils/jsonl.ts` around lines 316 - 324, The returned object currently sets messageCount: messages.length which counts raw JSONL entries (including duplicates removed by dedup), so update the return to report distinct messages instead: replace messageCount: messages.length with messageCount: dedupedMessages.length (or add a new field dedupedMessageCount: dedupedMessages.length and keep messageCount for raw entries) and ensure any callers that expect "number of API calls" use the deduped value; locate this in the function that builds the return object (the object containing durationMs, totalTokens, inputTokens, etc.) and adjust accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/renderer/utils/sessionAnalyzer.ts`:
- Around line 616-634: The code adds per-message token counts to startupTokens
and pushes to assistantMsgData without checking duplicateRequestIndices, causing
duplicates to skew startupPctOfTotal and the token-density timeline; update the
two blocks that modify startupTokens/startupMessages and assistantMsgData to
first skip processing if duplicateRequestIndices indicates this message index is
a duplicate (the same guard used around cost/token stats at the dedup guard near
duplicateRequestIndices), i.e., only accumulate into
startupTokens/startupMessages and push into assistantMsgData when the message is
not a duplicate (check the same index/key used by the existing dedup logic),
while preserving the existing conditions (msgType === 'assistant',
!firstWorkToolSeen, msgTs, m.usage).
---
Nitpick comments:
In `@src/main/utils/jsonl.ts`:
- Around line 316-324: The returned object currently sets messageCount:
messages.length which counts raw JSONL entries (including duplicates removed by
dedup), so update the return to report distinct messages instead: replace
messageCount: messages.length with messageCount: dedupedMessages.length (or add
a new field dedupedMessageCount: dedupedMessages.length and keep messageCount
for raw entries) and ensure any callers that expect "number of API calls" use
the deduped value; locate this in the function that builds the return object
(the object containing durationMs, totalTokens, inputTokens, etc.) and adjust
accordingly.
In `@src/renderer/utils/sessionAnalyzer.ts`:
- Around line 350-368: The dedup logic here duplicates the two-pass algorithm
from deduplicateByRequestId; replace the inline block that builds
duplicateRequestIndices by importing and calling deduplicateByRequestId from a
shared module (e.g., move deduplicateByRequestId in src/main/utils/jsonl.ts to
`@shared/utils/jsonl` and then call it from sessionAnalyzer.ts), passing the
messages array and using its result instead of reimplementing the two-pass loop
that computes lastIndexByRequestId and duplicateRequestIndices.
In `@test/main/utils/costCalculation.test.ts`:
- Around line 740-789: Add a direct unit test for deduplicateByRequestId that
mixes messages with and without requestId: create an array containing (a) one or
more messages without requestId, (b) multiple messages sharing the same
requestId (earlier and later), and (c) another non-requestId message; call
deduplicateByRequestId and assert that duplicates for the shared requestId are
reduced to the last entry (check uuid and usage.output_tokens), that
non-requestId messages remain present and in their original order, and
optionally assert reference equality for an untouched non-requestId message to
ensure it was not recreated. Use the same ParsedMessage shape as other tests and
name the test descriptively (e.g., "should preserve non-requestId messages while
deduplicating requestId entries").
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- src/main/types/messages.ts
- src/main/utils/jsonl.ts
- src/renderer/utils/sessionAnalyzer.ts
- test/main/utils/costCalculation.test.ts
Nice work on this! We independently landed on the same streaming dedup fix in #79 (now closed in favor of this one). Our PR went a step further by making … Thanks again, the numbers look much more reasonable now.
PR #87 reverted PRs #60, #65, and #73, which removes sessionAnalyzer.ts. The jsonl.ts dedup fix is still valid and wanted, but the sessionAnalyzer.ts changes in this PR will conflict.
Force-pushed d24942b to 3fd862a (Compare)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/main/utils/jsonl.ts`:
- Around line 287-288: Change the mutable declaration "let costUsd = 0;" to an
immutable "const costUsd = 0;" since costUsd is never reassigned, and update the
surrounding comment to avoid implying a per-message calculation that doesn't
exist—either remove the misleading "Calculate cost per-message, then sum" line
or replace it with a short TODO (e.g., "TODO: implement per-message cost
calculation") if you plan to implement it later; update the symbol referenced
("costUsd") and the nearby comment in jsonl.ts accordingly.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/main/types/messages.ts
- src/main/utils/jsonl.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/main/types/messages.ts
src/main/utils/jsonl.ts
Outdated
```ts
// Calculate cost per-message, then sum (tiered pricing applies per-API-call, not to aggregated totals)
let costUsd = 0;
```
Fix let to const — pipeline failure.
costUsd is never reassigned, so it must be declared with const. Additionally, the comment mentions "Calculate cost per-message, then sum" but no cost calculation is implemented—the variable stays 0. Consider removing the misleading comment or adding a TODO if cost calculation is planned for a follow-up.
🔧 Proposed fix
```diff
-  // Calculate cost per-message, then sum (tiered pricing applies per-API-call, not to aggregated totals)
-  let costUsd = 0;
+  // Cost calculation deferred; costUsd stays 0 for now
+  const costUsd = 0;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
// Cost calculation deferred; costUsd stays 0 for now
const costUsd = 0;
```
🧰 Tools
🪛 ESLint
[error] 288-288: 'costUsd' is never reassigned. Use 'const' instead.
(prefer-const)
🪛 GitHub Actions: CI
[error] 288-288: 'costUsd' is never reassigned. Use 'const' instead (prefer-const)
🪛 GitHub Check: validate
[failure] 288-288:
'costUsd' is never reassigned. Use 'const' instead
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/main/utils/jsonl.ts` around lines 287 - 288, Change the mutable
declaration "let costUsd = 0;" to an immutable "const costUsd = 0;" since
costUsd is never reassigned, and update the surrounding comment to avoid
implying a per-message calculation that doesn't exist—either remove the
misleading "Calculate cost per-message, then sum" line or replace it with a
short TODO (e.g., "TODO: implement per-message cost calculation") if you plan to
implement it later; update the symbol referenced ("costUsd") and the nearby
comment in jsonl.ts accordingly.
…ting

Claude Code writes multiple JSONL entries per API response during streaming, each with the same requestId but incrementally increasing output_tokens. Our parser summed every entry independently, inflating costs by ~2x vs /cost.

Fix: keep only the last entry per requestId (the final, complete token counts).

- Add requestId to ParsedMessage type and populate from AssistantEntry
- Add deduplicateByRequestId() helper (last-entry-wins strategy)
- Apply dedup in calculateMetrics() before summing tokens/cost
- Apply same dedup in sessionAnalyzer cost accounting loop
- Add 6 test cases covering streaming dedup scenarios

Closes matt1398#74
Force-pushed 3fd862a to ef2e086 (Compare)
🧹 Nitpick comments (1)
src/main/utils/jsonl.ts (1)
235-250: Constrain dedup to assistant messages for future safety.

The function docs say "streaming assistant entries," but the implementation currently dedups any message with a `requestId`. Narrowing the scope prevents accidental filtering if other message types later carry that field.

♻️ Proposed refactor

```diff
 export function deduplicateByRequestId(messages: ParsedMessage[]): ParsedMessage[] {
   // Map from requestId -> index of last occurrence
   const lastIndexByRequestId = new Map<string, number>();
   for (let i = 0; i < messages.length; i++) {
-    const rid = messages[i].requestId;
+    const msg = messages[i];
+    const rid = msg.type === 'assistant' ? msg.requestId : undefined;
     if (rid) {
       lastIndexByRequestId.set(rid, i);
     }
   }
   // If no requestIds found, no dedup needed
   if (lastIndexByRequestId.size === 0) {
     return messages;
   }
   return messages.filter((msg, i) => {
-    if (!msg.requestId) return true;
+    if (msg.type !== 'assistant' || !msg.requestId) return true;
     return lastIndexByRequestId.get(msg.requestId) === i;
   });
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/main/utils/jsonl.ts` around lines 235 - 250, The dedup currently removes any message with a requestId; change it so only assistant messages are considered: when building lastIndexByRequestId only record indices for messages where msg.requestId is set AND the message is an assistant (e.g., check messages[i].role === 'assistant' or the project's assistant-type field), and in the final filter allow non-assistant messages to pass through unmodified (i.e., only apply the lastIndexByRequestId check for assistant messages). Use the existing identifiers (messages, requestId, lastIndexByRequestId) and update both the initial loop and the filter predicate accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/main/utils/jsonl.ts`:
- Around line 235-250: The dedup currently removes any message with a requestId;
change it so only assistant messages are considered: when building
lastIndexByRequestId only record indices for messages where msg.requestId is set
AND the message is an assistant (e.g., check messages[i].role === 'assistant' or
the project's assistant-type field), and in the final filter allow non-assistant
messages to pass through unmodified (i.e., only apply the lastIndexByRequestId
check for assistant messages). Use the existing identifiers (messages,
requestId, lastIndexByRequestId) and update both the initial loop and the filter
predicate accordingly.
Hi matt, I have fixed the conflicts here; please check it.
Summary
- Claude Code writes multiple JSONL entries per API response during streaming, each with the same `requestId` but incrementally increasing `output_tokens`
- Summing every entry independently inflated costs by ~2x vs the `/cost` command
- Fix: keep only the last entry per `requestId` (the final, complete token counts), same approach as ryoppippi/ccusage#835

Changes
- `src/main/types/messages.ts`: add `requestId?: string` to `ParsedMessage`
- `src/main/utils/jsonl.ts`: populate `requestId` from `AssistantEntry`; add `deduplicateByRequestId()`; apply dedup in `calculateMetrics()`
- `src/renderer/utils/sessionAnalyzer.ts`: apply the same dedup in the cost accounting loop
- `test/main/utils/costCalculation.test.ts`: 6 new test cases covering streaming dedup scenarios

How it works
- `deduplicateByRequestId()` scans messages to find the last index per `requestId`, then filters out earlier duplicates
- Messages without a `requestId` (user, system) pass through unchanged
- Applied in both `calculateMetrics()` (main process) and `sessionAnalyzer` (renderer)
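The last-entry-wins strategy described above can be sketched as a standalone snippet. This is a simplified illustration, not the project's actual code: the `Msg` shape and the sample token counts are hypothetical stand-ins for `ParsedMessage` and real streaming entries.

```typescript
// Simplified message shape: only the fields the dedup cares about.
type Msg = { requestId?: string; outputTokens: number };

function dedupeByRequestId(messages: Msg[]): Msg[] {
  // First pass: remember the last index seen for each requestId.
  const lastIndex = new Map<string, number>();
  messages.forEach((m, i) => {
    if (m.requestId) lastIndex.set(m.requestId, i);
  });
  // Second pass: keep messages without a requestId, and only the
  // final (last-seen) entry for each requestId.
  return messages.filter((m, i) => !m.requestId || lastIndex.get(m.requestId) === i);
}

// Simulated streaming session: req-1 is written three times with
// incrementally increasing token counts; only the last is complete.
const session: Msg[] = [
  { requestId: 'req-1', outputTokens: 10 },
  { requestId: 'req-1', outputTokens: 40 },
  { requestId: 'req-1', outputTokens: 90 }, // final, complete count
  { outputTokens: 5 },                      // no requestId: passes through
];

const naiveTotal = session.reduce((s, m) => s + m.outputTokens, 0);
const dedupedTotal = dedupeByRequestId(session).reduce((s, m) => s + m.outputTokens, 0);
console.log(naiveTotal, dedupedTotal); // 145 95
```

Summing all entries counts the partial snapshots of req-1 three times (145 tokens); after dedup only the final snapshot plus the non-streaming message remain (95 tokens), which matches what the API actually billed.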
- `pnpm typecheck` passes
- `pnpm lint:fix`: 0 errors
- `pnpm test`: 833 tests pass (6 new)
- Verified against `/cost` output

Closes #74
Summary by CodeRabbit
New Features
Improvements