Fix: incorrect list truncation in fast mode when comma is not followed by space#16
Conversation
by space
Fixes an edge case in `loads(..., ALL & ~STR, )` where an unterminated
string inside a list caused early truncation if commas were used without
a space after.
The issue was from an off by one error in the upper bound of `rfind()`
which excluded the last string element unless a space followed the
comma.
Before:
`loads('{"key": ["a", "b", "c","d', ALL & ~STR)` -> `{"key": ["a",
"b"]}`
After:
`loads('{"key": ["a", "b", "c", "d', ALL & ~STR)` -> `{"key": ["a", "b",
"c", "d"]}`
This change adjusts the search range in `rfind()` to correctly detect
the last comma, even when it is not followed by a whitespace.
|
|
There was a problem hiding this comment.
Hey @nickomodee - I've reviewed your changes and they look great!
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
WalkthroughThe change updates the substring search logic for locating the last comma in a JSON string segment. Specifically, it modifies the end index parameter from exclusive to inclusive, extending the search range by one character. No other logic, error handling, or branching is affected. Changes
Poem
✨ Finishing Touches
🧪 Generate Unit Tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/partial_json_parser/core/myelin.py (1)
123-123: LGTM! The fix correctly addresses the off-by-one error, but consider line length.The logic change is correct - extending the search range by one character to include the comma at
last_string_start - 1fixes the premature list truncation when commas aren't followed by spaces. This aligns perfectly with the PR objectives.However, the line exceeds 79 characters (102 > 79) as flagged by flake8.
Consider breaking the line for better readability:
- last_comma = json_string.rfind(",", max(stack[-1][0], last_string_end) + 1, last_string_start) + last_comma = json_string.rfind( + ",", max(stack[-1][0], last_string_end) + 1, last_string_start + )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/partial_json_parser/core/myelin.py(1 hunks)
🧰 Additional context used
🪛 Flake8 (7.2.0)
src/partial_json_parser/core/myelin.py
[error] 123-123: line too long (102 > 79 characters)
(E501)
I think |
|
That was a keen observation, thank you @nickomodee! |
c301b2c to
dda8aa1
Compare
|
Oh no. Found another inconsistent case 😂: >>> fix('["a" ,"b', ~STR)
('["a"', ']')
>>> fix_fast('["a" ,"b', ~STR)
('["a" ', ']') |
Fixes an edge case in
loads(..., ALL & ~STR, )where an unterminated string inside a list caused early truncation if commas were used without a space after.The issue was from an off by one error in the upper bound of
rfind()which excluded the last string element unless a space followed the comma.Before:
loads('{"key": ["a", "b", "c","d', ALL & ~STR)->{"key": ["a", "b"]}After:
loads('{"key": ["a", "b", "c", "d', ALL & ~STR)->{"key": ["a", "b", "c", "d"]}This change adjusts the search range in
rfind()to correctly detect the last comma, even when it is not followed by a whitespace.This issue does not appear when using
use_fast_fix=False.Summary by CodeRabbit