More helpful message when passing a JSON-SEQ file without the --seq flag #3156

Open
LPardue opened this issue Aug 2, 2024 · 4 comments

@LPardue

LPardue commented Aug 2, 2024

Describe the bug
When attempting to parse a JSON-SEQ file without the --seq flag, jq prints: parse error: Invalid numeric literal at line 1, column 4

To Reproduce
The JSON-SEQ format uses an ASCII Record Separator (RS) character at the start of each entry; arguably, jq is mistakenly overlooking it. Example JSON:

�{"qlog_version":"0.3","qlog_format":"JSON-SEQ","title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","trace":{"vantage_point":{"type":"client"},"title":"quiche-client qlog","description":"quiche-client qlog id=1928a4431b958983545fe79a00fcf7a670f39d8e","configuration":{"time_offset":0.0}}}
�{
  "time": 0.0,
  "name": "transport:parameters_set",
  "data": {
    "owner": "local",
    "tls_cipher": "None",
    "disable_active_migration": true,
    "max_idle_timeout": 30000,
    "max_udp_payload_size": 1350,
    "ack_delay_exponent": 3,
    "max_ack_delay": 25,
    "active_connection_id_limit": 2,
    "initial_max_data": 10000000,
    "initial_max_stream_data_bidi_local": 1000000,
    "initial_max_stream_data_bidi_remote": 1000000,
    "initial_max_stream_data_uni": 1000000,
    "initial_max_streams_bidi": 100,
    "initial_max_streams_uni": 100
  }
}
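
For readers unfamiliar with the framing, here is a minimal Python sketch of how a JSON-SEQ stream like the one above is put together and taken apart. It is illustrative only: the helper names encode_json_seq and decode_json_seq are made up for this example, the records are shortened, and the decoder ignores the truncated-record handling that RFC 7464 describes.

```python
import json

RS = "\x1e"  # ASCII Record Separator that starts every JSON-SEQ record


def encode_json_seq(values):
    """Frame each value as RS + JSON text + LF (hypothetical helper)."""
    return "".join(RS + json.dumps(v) + "\n" for v in values)


def decode_json_seq(text):
    """Split on RS and parse each non-empty chunk as one JSON text."""
    return [json.loads(chunk) for chunk in text.split(RS) if chunk.strip()]


records = [
    {"qlog_format": "JSON-SEQ"},
    {"time": 0.0, "name": "transport:parameters_set"},
]
stream = encode_json_seq(records)
assert decode_json_seq(stream) == records

# A plain JSON parser that is not told about the framing rejects the
# leading RS, which is analogous to running jq without --seq.
try:
    json.loads(stream.split("\n")[0])
except json.JSONDecodeError as err:
    print("plain parse failed:", err)
```

The point of the sketch is that the RS byte is not valid JSON on its own, so a parser has to be told about the framing up front.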

Expected behavior
It would be nice if jq could realize the record separator character is there and print a helpful message suggesting a retry with --seq.
The RFC (RFC 7464) hints at this use case:

since RS may not appear in JSON texts in any other form, RS unambiguously delimits the start of any element in the sequence. RS is sufficient to unambiguously delimit all top-level JSON value types other than numbers.
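
As a rough illustration of the suggestion, the check could look something like the Python sketch below. This is not jq's code (jq is written in C), and the function name seq_hint, its parameters, and the message wording are all assumptions made up for this example.

```python
RS = 0x1E  # ASCII Record Separator


def seq_hint(data: bytes, seq_enabled: bool):
    """Return a hint if the input starts with RS while sequence mode is off.

    Hypothetical helper: returns None when no hint applies.
    """
    if seq_enabled:
        return None
    # Ignore leading JSON whitespace before looking for the separator.
    stripped = data.lstrip(b" \t\r\n")
    if stripped[:1] == bytes([RS]):
        return (
            "input starts with an ASCII RS (0x1e) character; "
            "it looks like a JSON text sequence, try again with --seq"
        )
    return None


print(seq_hint(b'\x1e{"time": 0.0}\n', seq_enabled=False))
```

Because RS cannot appear unescaped inside a JSON text, seeing it at the start of the input is a strong signal that the file is JSON-SEQ rather than malformed JSON.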

Environment (please complete the following information):

  • Ubuntu
  • jq-1.6
@wader
Member

wader commented Aug 2, 2024

Hi, yeah, not a great error message. I think this might be related to #501? The parser currently returns this error on lots of syntax issues.

@LPardue
Author

LPardue commented Oct 16, 2024

Thanks for the pointer, @wader. #501 was recently resolved via a PR, but I suspect it won't address this specific issue. I'll see if I can draw inspiration from it and put together another PR that implements my initial suggestion.

@wader
Member

wader commented Oct 16, 2024

@LPardue 👍

LPardue added a commit to LPardue/jq that referenced this issue Oct 17, 2024
JSON Text Sequences, aka JSON-SEQ, aka application/json-seq are defined in
https://datatracker.ietf.org/doc/html/rfc7464. Per the RFC, the format is:

   any number of JSON texts, each encoded in UTF-8 [RFC3629],
   each preceded by one ASCII RS character, and each followed by a line
   feed (LF).

jq supports this format but requires the --seq parameter to be used in order to
correctly parse it. If the option is omitted, then an ambiguous and confusing
error message is printed. The RFC is designed to avoid this ambiguity:

   Since RS is an ASCII control character, it may only
   appear in JSON strings in escaped form (see [RFC7159]), and since RS
   may not appear in JSON texts in any other form, RS unambiguously
   delimits the start of any element in the sequence.  RS is sufficient
   to unambiguously delimit all top-level JSON value types other than
   numbers.

This change adds ASCII RS character (0x1e) detection when --seq is omitted, and
prints a useful error message recommending to retry with the option.

Fixes jqlang#3156.
@LPardue
Author

LPardue commented Oct 17, 2024

See #3191
