Commit 25c11a8

Support JSON5 features
1 parent 8cf4921 commit 25c11a8

8 files changed: +596 -124 lines changed

Makefile

+1 -1
@@ -7,7 +7,7 @@
 PACKAGE_NAME = ljson

 staticlib = libljson.a
-sharedlib = libljson.so 2 0 0
+sharedlib = libljson.so 2 1 0
 testedbin = ljson
 testednum = jnum_test

README.md

+14 -7
@@ -2,7 +2,7 @@

 [中文版](./README_zh-cn.md)

-LJSON is a C implemented JSON library that is much faster than cJSON and substantially faster than RapidJSON, it is currently the fastest general-purpose JSON library.
+LJSON is a C implemented JSON library that is much faster than cJSON and substantially faster than RapidJSON, it is currently the fastest general-purpose JSON library and supports the vast majority of JSON5 features.
 LJSON supports JSON parsing, printing and editing, provides DOM and SAX APIs, and I/O supports string and file, it fully supports the test cases of nativejson-benchmark.
 By default, LJSON uses the personally developed ldouble algorithm to print double to string. Compared with the standard library, it may only be the 15th decimal place difference. It is currently the fastest double to string algorithm; users can also choose the personally optimized grisu2 algorithm or dragonbox algorithm.

@@ -12,6 +12,7 @@ By default, LJSON uses the personally developed ldouble algorithm to print doubl
 * Lighter: Provide a variety of methods to save memory, such as pool memory, file parsing while reading, file writing while printing, and SAX APIs. It can make memory usage a constant
 * Stronger: Support DOM and SAX-style APIs, provide APIs for JSON in classic mode and memory pool mode, support string and file as input and output, is extended to support long long integer and hexadecimal number
 * More friendly: C language implementation, does not depend on any other library, does not contain platform-related code, only one header file and source file, and the interface corresponding to cJSON. the code logic is clearer than any other JSON libraries
+* JSON5: Supports the vast majority of JSON5 features, such as hexadecimal digits, comments, array and object tail element comma, but does not support strings without double quotes

 ## Compile and Run

@@ -51,13 +52,19 @@ make O=<output path> CROSS_COMPILE=<tool prefix> && make O=<output path> DESTDIR

 * Set the value of the variable `JSON_ERROR_PRINT_ENABLE` in `json.c` to `1` and then re-compile

-### Error detection
+### Parse Config

-* Set the value of the variable `JSON_STRICT_PARSE_MODE` in `json.c` to `0` / `1` / `2` and then re-compile
-    * 0: Turn off not common error detection, such as trailing characters left after parsing
-    * 1: Detect more errors and allow key to be empty string
-    * 2: In addition to error detection enabled by 1, some non-standard features are also turned off, such as hexadecimal numbers, the first json object is not an array or object
-* It 100% matches the test cases of [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark) when set to 2
+* `#define JSON_PARSE_SKIP_COMMENT 1` : Whether to allow C-like single-line comments and multi-line comments(JSON5 feature)
+* `#define JSON_PARSE_LAST_COMMA 1` : Whether to allow comma in last element of array or object(JSON5 feature)
+* `#define JSON_PARSE_EMPTY_KEY 0` : Whether to allow empty key
+* `#define JSON_PARSE_SPECIAL_CHAR 1` : Whether to allow special characters such as newline in the string(JSON5 feature)
+* `#define JSON_PARSE_HEX_NUM 1` : Whether to allow HEX number(JSON5 feature)
+* `#define JSON_PARSE_SPECIAL_NUM 1` :Whether to allow special number such as starting with '.', '+', '0', for example: `+99` `.1234` `10.` `001`(JSON5 feature)
+* `#define JSON_PARSE_SPECIAL_DOUBLE 1` : Whether to allow special double such as `NaN`, `Infinity`, `-Infinity`(JSON5 feature)
+* `#define JSON_PARSE_SINGLE_VALUE 1` : Whether to allow json starting with non-array and non-object
+* `#define JSON_PARSE_FINISHED_CHAR 0` : Whether to allow characters other than spaces after finishing parsing
+
+Note: It 100% matches the test cases of [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark) when only `JSON_PARSE_EMPTY_KEY` is set to 1, all others are set to 0.

 ## Speed Test
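
Reviewer note: the `JSON_PARSE_SPECIAL_NUM`, `JSON_PARSE_HEX_NUM` and `JSON_PARSE_SPECIAL_DOUBLE` switches in the README hunk above all relax the strict JSON number grammar. The sketch below is one way to see which of the README's example inputs that grammar rejects. `is_strict_json_number` is a hypothetical helper written for this note; it is not part of LJSON and does not reflect how `json.c` implements these switches.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of LJSON: returns true only if `s` is a
 * complete number under the strict JSON grammar (RFC 8259). */
static bool is_strict_json_number(const char *s)
{
    assert(s != NULL);

    if (*s == '-')
        ++s;                              /* optional minus; '+' is not allowed */

    if (*s == '0') {
        ++s;                              /* a leading zero must stand alone */
    } else if (isdigit((unsigned char)*s)) {
        while (isdigit((unsigned char)*s))
            ++s;
    } else {
        return false;                     /* rejects "+99", ".1234", "NaN" */
    }

    if (*s == '.') {
        ++s;
        if (!isdigit((unsigned char)*s))
            return false;                 /* rejects "10." */
        while (isdigit((unsigned char)*s))
            ++s;
    }

    if (*s == 'e' || *s == 'E') {
        ++s;
        if (*s == '+' || *s == '-')
            ++s;
        if (!isdigit((unsigned char)*s))
            return false;                 /* exponent needs at least one digit */
        while (isdigit((unsigned char)*s))
            ++s;
    }

    return *s == '\0';                    /* rejects "001" and "0x1F" */
}
```

Every input the README lists as a JSON5-style number (`+99`, `.1234`, `10.`, `001`, `NaN`, `-Infinity`) fails this check, which is consistent with the note that strict nativejson-benchmark conformance needs those switches set to 0.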

README_zh-cn.md

+14 -7
@@ -2,7 +2,7 @@

 [English Edition](./README.md)

-LJSON 是一个远远快于 cJSON、大幅度快于 RapidJSON 的 C 实现的 JSON 库,他是目前最快的通用 JSON 库。
+LJSON 是一个远远快于 cJSON、大幅度快于 RapidJSON 的 C 实现的 JSON 库,他是目前最快的通用 JSON 库,也支持JSON5的大多数特性
 LJSON 支持 JSON 的解析、打印、编辑,提供 DOM 和 SAX 接口,I/O 支持字符串和文件,且完全支持 nativejson-benchmark 的测试用例。
 LJSON 默认使用个人开发的 ldouble 算法打印浮点数,和标准库对比可能只有第15位小数的区别,是目前最快的浮点数转字符串算法;也可选择个人优化过的 grisu2 算法或 dragonbox 算法。

@@ -12,6 +12,7 @@ LJSON 默认使用个人开发的 ldouble 算法打印浮点数,和标准库
 * 更省:提供多种省内存的手段,例如内存池、文件边读边解析、边打印边写文件、SAX方式的接口,可做到内存占用是个常数
 * 更强:支持DOM和SAX风格的API,提供普通模式和内存池模式JSON的接口,支持字符串和文件作为输入输出(可扩展支持其它流),扩展支持长长整形和十六进制数字
 * 更友好:C语言实现,不依赖任何库,不含平台相关代码,只有一个头文件和库文件,和cJSON对应一致的接口,代码逻辑比任何JSON库都更清晰
+* JSON5:支持绝大多数JSON5特性,例如十六进制数字、注释、数组和对象的尾元素逗号等,不支持无双引号的字符串

 ## 编译运行

@@ -51,13 +52,19 @@ make O=<编译输出目录> CROSS_COMPILE=<交叉编译器前缀> && make O=<编

 * 设置 json.c 中的变量 `JSON_ERROR_PRINT_ENABLE` 的值为 `1` 后重新编译

-### 错误检测
+### 解析配置

-* 设置 json.c 中的变量 `JSON_STRICT_PARSE_MODE` 的值为 `0` / `1` / `2` 后重新编译
-    * 0: 关闭不是常见的错误检测,例如解析完成后还剩尾后字符
-    * 1: 检测更多的错误,且允许 key 为空字符串
-    * 2: 除去 1 开启的错误检测之外,还关闭某些不是标准的特性,例如十六进制数字,第一个json对象不是array或object
-* 设置为2时 100% 符合 [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark) 的测试用例
+* `#define JSON_PARSE_SKIP_COMMENT 1` : 是否允许类似C语言的单行注释和多行注释(JSON5特性)
+* `#define JSON_PARSE_LAST_COMMA 1` : 是否允许JSON_ARRAY或JSON_OBJECT的最后一个元素的末尾有逗号(JSON5特性)
+* `#define JSON_PARSE_EMPTY_KEY 0` : 是否允许键为空字符串
+* `#define JSON_PARSE_SPECIAL_CHAR 1` : 是否允许字符串中有特殊的字符,例如换行符(JSON5特性)
+* `#define JSON_PARSE_HEX_NUM 1` : 是否允许十六进制的解析(JSON5特性)
+* `#define JSON_PARSE_SPECIAL_NUM 1` : 是否允许特殊的数字,例如前导0,加号,无整数的浮点数等,例如 `+99` `.1234` `10.` `001` 等(JSON5特性)
+* `#define JSON_PARSE_SPECIAL_DOUBLE 1` : 是否允许特殊的double值 `NaN` `Infinity` `-Infinity`(JSON5特性)
+* `#define JSON_PARSE_SINGLE_VALUE 1` : 是否允许不是JSON_ARRAY或JSON_OBJECT开头的JSON值
+* `#define JSON_PARSE_FINISHED_CHAR 0` : 是否解析完成后忽略检查字符串尾部的合法性
+
+注:如果需要100%符合 [nativejson-benchmark](https://github.com/miloyip/nativejson-benchmark),需要将 `JSON_PARSE_EMPTY_KEY` 置为1,其它值全部置为0。

 ## 性能测试

jnum.c

+71 -35
@@ -1312,11 +1312,11 @@ int jnum_dtoa(double num, char *buffer)
     switch (exponent) {
     case DP_EXPONENT_MAX:
         if (significand) {
-            memcpy(buffer, "nan", 4);
+            memcpy(buffer, "NaN", 4);
             return 3;
         } else {
-            memcpy(s, "inf", 4);
-            return signbit + 3;
+            memcpy(s, "Infinity", 9);
+            return signbit + 8;
         }
         break;
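
Reviewer note: this hunk switches the printed spellings from `nan` / `inf` to the JSON5 forms `NaN` / `Infinity`, and the returned lengths change to match (3, and sign + 8). A minimal sketch of the same output convention follows, assuming the standard `<math.h>` classification macros instead of the exponent and significand checks used in `jnum_dtoa`. `format_special_double` is a hypothetical name used only for this note.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not LJSON's jnum_dtoa): writes the JSON5 spelling of a
 * non-finite double into `buffer` (at least 10 bytes) and returns its length.
 * Returns 0 and leaves `buffer` untouched when `num` is finite. */
static int format_special_double(double num, char *buffer)
{
    char *s = buffer;

    assert(buffer != NULL);

    if (isnan(num)) {
        memcpy(buffer, "NaN", 4);         /* 3 characters + '\0', no sign */
        return 3;
    }
    if (isinf(num)) {
        if (signbit(num))
            *s++ = '-';
        memcpy(s, "Infinity", 9);         /* 8 characters + '\0' */
        return (int)(s - buffer) + 8;
    }
    return 0;
}
```

The copy sizes 4 and 9 include the terminating NUL, which is why they are one larger than the returned lengths.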

@@ -1360,14 +1360,14 @@ int jnum_dtoa(double num, char *buffer)
     return s - buffer;
 }

-int jnum_parse_hex(const char *str, jnum_type_t *type, jnum_value_t *value)
+static int jnum_parse_hex(const char *str, jnum_type_t *type, jnum_value_t *value)
 {
     const char *s = str;
     char c;
     uint64_t m = 0;

     s += 2;
-    while (1) {
+    while ((s - str) < 18) {
         switch ((c = *s)) {
         case '0': case '1': case '2': case '3': case '4':
         case '5': case '6': case '7': case '8': case '9':
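
Reviewer note: the new loop bound `(s - str) < 18` caps the scan at the 2-character `0x` prefix plus 16 hex digits, and 16 digits of 4 bits each exactly fill the `uint64_t` accumulator. The sketch below illustrates that bound in isolation. `parse_hex_u64` is a hypothetical helper; the digit handling is an assumption, since the rest of the loop body in `jnum_parse_hex` is not visible in this hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not LJSON's jnum_parse_hex): parses "0x"/"0X" followed
 * by hex digits into *out and returns the number of characters consumed. */
static int parse_hex_u64(const char *str, uint64_t *out)
{
    const char *s = str + 2;              /* skip the prefix checked below */
    uint64_t m = 0;

    assert(str[0] == '0' && (str[1] == 'x' || str[1] == 'X'));

    /* 2 prefix characters + at most 16 digits: 16 * 4 bits fill a uint64_t. */
    while ((s - str) < 18) {
        unsigned char c = (unsigned char)*s;
        unsigned d;

        if (c >= '0' && c <= '9')
            d = c - '0';
        else if (c >= 'a' && c <= 'f')
            d = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F')
            d = c - 'A' + 10;
        else
            break;                        /* first non-hex character ends the scan */

        m = (m << 4) | d;
        ++s;
    }

    *out = m;
    return (int)(s - str);
}
```

A return value of 2 means no digit followed the prefix; the new `jnum_parse` later in this diff handles that case by reporting a plain integer 0 of length 1.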
@@ -1398,7 +1398,7 @@ int jnum_parse_hex(const char *str, jnum_type_t *type, jnum_value_t *value)
     return s - str;
 }

-static int jnum_parse_num_unit(const char *str, jnum_type_t *type, jnum_value_t *value)
+static int jnum_parse_num(const char *str, jnum_type_t *type, jnum_value_t *value)
 {
     static const double div10_lut[20] = {
         1 , 1e-1 , 1e-2 , 1e-3 , 1e-4 , 1e-5 , 1e-6 , 1e-7 , 1e-8 , 1e-9 ,
@@ -1420,6 +1420,17 @@ static int jnum_parse_num_unit(const char *str, jnum_type_t *type, jnum_value_t
     default: break;
     }

+    switch (*s) {
+    case '0': case '1': case '2': case '3': case '4':
+    case '5': case '6': case '7': case '8': case '9':
+    case '.':
+        break;
+    default:
+        *type = JNUM_NULL;
+        value->vint = 0;
+        return 0;
+    }
+
     while (*s == '0')
         ++s;

@@ -1430,9 +1441,6 @@ static int jnum_parse_num_unit(const char *str, jnum_type_t *type, jnum_value_t

     if (k < 20) {
         switch (*s) {
-        case 'e': case 'E':
-            d = m;
-            goto next4;
         case '.':
             d = m;
             goto next2;
@@ -1456,8 +1464,6 @@ static int jnum_parse_num_unit(const char *str, jnum_type_t *type, jnum_value_t
         case '0': case '1': case '2': case '3': case '4':
         case '5': case '6': case '7': case '8': case '9':
             break;
-        case 'e': case 'E':
-            goto next4;
         case '.':
             goto next2;
         default:
@@ -1509,37 +1515,70 @@ static int jnum_parse_num_unit(const char *str, jnum_type_t *type, jnum_value_t
         ++s;
     }

-    switch (*s) {
-    case 'e': case 'E':
-        goto next4;
-    default:
-        break;
-    }
-
 next3:
     *type = JNUM_DOUBLE;
     value->vdbl = sign == 1 ? d : -d;
     return s - str;
-
-next4:
-    *type = JNUM_BOOL; /* Only used to mark exponential */
-    value->vdbl = sign == 1 ? d : -d;
-    return s - str;
 }

-int jnum_parse_num(const char *str, jnum_type_t *type, jnum_value_t *value)
+int jnum_parse(const char *str, jnum_type_t *type, jnum_value_t *value)
 {
-    int len = jnum_parse_num_unit(str, type, value);
-    if (*type == JNUM_BOOL) {
-        jnum_value_t e;
+    const char *s = str;
+    int len = 0, len2 = 0;
+    jnum_type_t t;
+    jnum_value_t v;
+
+    while (1) {
+        switch (*s) {
+        case '\b': case '\f': case '\n': case '\r': case '\t': case '\v': case ' ':
+            ++s;
+            break;
+        default:
+            goto next;
+        }
+    }
+
+next:
+    len = s - str;
+    if (*s == '0' && (*(s + 1) == 'x' || *(s + 1) == 'X')) {
+        len2 = jnum_parse_hex(s, type, value);
+        if (len2 == 2) {
+            *type = JNUM_INT;
+            value->vint = 0;
+            return len + 1;
+        }
+        return len + len2;
+    }
+
+    len += jnum_parse_num(s, type, value);
+    if (*type == JNUM_NULL)
+        return 0;
+
+    switch (*(str + len)) {
+    case 'e': case 'E':
+        len2 = jnum_parse_num(str + len + 1, &t, &v);
+        if (t == JNUM_NULL)
+            return len;

-        len += 1 + jnum_parse_num_unit(str + len + 1, type, &e);
         switch (*type) {
-        case JNUM_INT: value->vdbl *= pow(10, e.vint); break;
-        case JNUM_LINT: value->vdbl *= pow(10, e.vlint); break;
-        default: value->vdbl *= pow(10, e.vdbl); break;
+        case JNUM_INT: value->vdbl = value->vint; break;
+        case JNUM_LINT: value->vdbl = value->vlint; break;
+        default: break;
         }
+
         *type = JNUM_DOUBLE;
+        len += len2 + 1;
+
+        switch (t) {
+        case JNUM_INT: value->vdbl *= pow(10, v.vint); break;
+        case JNUM_LINT: value->vdbl *= pow(10, v.vlint); break;
+        case JNUM_DOUBLE: value->vdbl *= pow(10, v.vdbl); break;
+        default: break;
+        }
+        break;
+
+    default:
+        break;
     }

     return len;
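
Reviewer note: the new `jnum_parse` no longer threads an exponent marker through the inner parser. It parses the mantissa as one plain number, and on `e` / `E` parses the exponent as a second plain number and scales the mantissa by a power of ten; a dangling `e` leaves the mantissa as the result. The sketch below shows that two-pass shape under simplifying assumptions: `parse_plain`, `scale10` and `parse_number` are hypothetical names, values are always `double`, and scaling uses a loop over an integer exponent where the diff calls `pow(10, ...)`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the exponent-free parser: reads an optional sign,
 * digits and an optional fraction. Returns characters consumed, or 0 if no
 * digit was found (in which case *out is left untouched). */
static int parse_plain(const char *str, double *out)
{
    const char *s = str;
    double sign = 1, d = 0, scale = 0.1;
    int digits = 0;

    if (*s == '-') { sign = -1; ++s; }
    else if (*s == '+') { ++s; }

    for (; *s >= '0' && *s <= '9'; ++s, ++digits)
        d = d * 10 + (*s - '0');
    if (*s == '.') {
        for (++s; *s >= '0' && *s <= '9'; ++s, ++digits, scale /= 10)
            d += (*s - '0') * scale;
    }
    if (digits == 0)
        return 0;

    *out = sign * d;
    return (int)(s - str);
}

/* Multiplies d by 10^e with a simple loop (assumption: integer exponents). */
static double scale10(double d, int e)
{
    for (; e > 0; --e) d *= 10;
    for (; e < 0; ++e) d /= 10;
    return d;
}

/* Mantissa first, then the exponent as a second plain number. */
static int parse_number(const char *str, double *out)
{
    double e = 0;
    int len, len2;

    assert(str != NULL && out != NULL);

    len = parse_plain(str, out);
    if (len == 0)
        return 0;

    if (str[len] == 'e' || str[len] == 'E') {
        len2 = parse_plain(str + len + 1, &e);
        if (len2 == 0)
            return len;                   /* dangling 'e': keep the mantissa only */
        *out = scale10(*out, (int)e);     /* fractional exponents are truncated here */
        len += len2 + 1;
    }
    return len;
}
```

Returning the mantissa length on a dangling `e` mirrors the `if (t == JNUM_NULL) return len;` branch in the hunk above, so the caller can see that the `e` was not consumed.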
@@ -1551,9 +1590,6 @@ rtype fname(const char *str) \
     jnum_type_t type; \
     jnum_value_t value; \
     rtype val = 0; \
-    unsigned char c = *(unsigned char *)str; \
-    \
-    while (c <= ' ' && c) c = *(unsigned char *)++str; \
     jnum_parse(str, &type, &value); \
     switch (type) { \
     case JNUM_BOOL: val = (rtype)value.vbool;break; \
@@ -1562,7 +1598,7 @@ rtype fname(const char *str) \
     case JNUM_LINT: val = (rtype)value.vlint;break; \
     case JNUM_LHEX: val = (rtype)value.vlhex;break; \
     case JNUM_DOUBLE: val = (rtype)value.vdbl; break; \
-    default: break; \
+    default: val = 0; break; \
     } \
     return val; \
 }

jnum.h

+2 -12
@@ -13,6 +13,7 @@ extern "C" {
 #endif

 typedef enum {
+    JNUM_NULL,
     JNUM_BOOL,
     JNUM_INT,
     JNUM_HEX,
@@ -41,18 +42,7 @@ int64_t jnum_atol(const char *str);
 uint32_t jnum_atoh(const char *str);
 uint64_t jnum_atolh(const char *str);
 double jnum_atod(const char *str);
-
-int jnum_parse_hex(const char *str, jnum_type_t *type, jnum_value_t *value);
-int jnum_parse_num(const char *str, jnum_type_t *type, jnum_value_t *value);
-
-static inline int jnum_parse(const char *str, jnum_type_t *type, jnum_value_t *value)
-{
-    const char *s = str;
-
-    if (*s == '0' && (*(s+1) == 'x' || *(s+1) == 'X'))
-        return jnum_parse_hex(str, type, value);
-    return jnum_parse_num(str, type, value);
-}
+int jnum_parse(const char *str, jnum_type_t *type, jnum_value_t *value);

 #ifdef __cplusplus
 }
