New feature non-decimal integer literals (SQL:2023 T661) and underscores in numeric literals (SQL:2023 T662) #8564
Non-decimal integer literals (SQL:2023 T661)
HEX
Hexadecimal numeric constants are prefixed with the characters '0' and 'X' (or 'x'), immediately followed by a sequence of hexadecimal digits (0-9, a-f or A-F). Integer values are expressed in hexadecimal notation. Numbers consisting of 1-8 hexadecimal digits are interpreted as INTEGER, numbers consisting of 9-16 digits are interpreted as BIGINT, and numbers consisting of 17-32 digits are interpreted as INT128. Longer digit sequences raise an error. An odd number of hexadecimal digits implies a leading zero digit.
Hexadecimal numbers in the range from 0x0 to 0x7FFFFFFF are positive INTEGER numbers with values from 0 to 2147483647.
Hexadecimal numbers in the range from 0x80000000 to 0x7FFFFFFFFFFFFFFF are positive BIGINT numbers with values from 2147483648 to 9223372036854775807.
Hexadecimal numbers in the range from 0x8000000000000000 to 0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF are positive INT128 numbers with values from 9223372036854775808 to 170141183460469231731687303715884105727.
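The value ranges above can be sketched in Python as a quick sanity check. This is only an illustration, not the engine's code: the function name `classify_hex` is hypothetical, and rejecting values above the INT128 maximum is an assumption drawn from the ranges listed here.

```python
import string

# Upper bounds of the three signed integer types named in the text.
INTEGER_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1
INT128_MAX = 2**127 - 1


def classify_hex(literal):
    """Return (type_name, value) for an unsigned hexadecimal literal.

    Hypothetical helper: it classifies by value, following the ranges
    given in the description above.
    """
    if literal[:2].lower() != "0x":
        raise ValueError("expected a 0x/0X prefix")
    digits = literal[2:]
    if not digits or any(ch not in string.hexdigits for ch in digits):
        raise ValueError("expected one or more hexadecimal digits")
    if len(digits) > 32:
        raise ValueError("more than 32 hexadecimal digits")

    value = int(digits, 16)
    if value <= INTEGER_MAX:
        return "INTEGER", value
    if value <= BIGINT_MAX:
        return "BIGINT", value
    if value <= INT128_MAX:
        return "INT128", value
    # Assumption: values beyond the INT128 maximum are rejected.
    raise ValueError("value does not fit into INT128")
```

For instance, `classify_hex("0x80000000")` lands in the BIGINT range because the value no longer fits into a signed 32-bit integer.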
OCT
Octal numeric constants are prefixed with the characters '0' and 'O' (or 'o'), immediately followed by a sequence of octal digits (0-7). Integer values are expressed in octal notation. Numbers consisting of 1-11 octal digits are interpreted as INTEGER, those consisting of 11-21 digits are interpreted as BIGINT, and those consisting of 22-43 digits are interpreted as INT128.
Longer character sequences raise an error.
BIN
Binary numeric constants are prefixed with the characters '0' and 'B' (or 'b'), immediately followed by a sequence of binary digits (0 or 1). Integer values are expressed in binary notation. Numbers consisting of 1-31 binary digits are interpreted as INTEGER, those consisting of 32-63 digits are interpreted as BIGINT, and those consisting of 64-127 digits are interpreted as INT128. Longer digit sequences raise an error.
The value is the number obtained by applying the usual mathematical interpretation of positional hexadecimal notation to a string that is an unsigned hexadecimal integer. The same applies to an unsigned octal integer and an unsigned binary integer. To specify a negative number in any of these notations (hexadecimal, octal or binary), use the unary minus prefix. For example, 0xFFFFFFFF is BIGINT 4294967295 (positive), but -0xFFFFFFFF is BIGINT -4294967295 (negative). If you need to specify -1 as a hexadecimal INTEGER, use -0x1.
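As a rough illustration of this rule, the following Python sketch computes the value of a literal in any of the three notations, treating a leading minus as a separate unary operator applied to the unsigned magnitude. The helper name `literal_value` is hypothetical and the sketch ignores underscores and type limits.

```python
# Radix for each prefix described in the text (matched case-insensitively).
PREFIX_BASE = {"0x": 16, "0o": 8, "0b": 2}


def literal_value(text):
    """Return the integer value of a possibly negated non-decimal literal."""
    negative = text.startswith("-")
    body = text[1:] if negative else text

    base = PREFIX_BASE.get(body[:2].lower())
    if base is None:
        raise ValueError("expected a 0x, 0o or 0b prefix")

    digits = body[2:].lower()
    allowed = "0123456789abcdef"[:base]
    if not digits or any(ch not in allowed for ch in digits):
        raise ValueError("invalid digit for this notation")

    # Plain positional interpretation of the unsigned digit string.
    magnitude = int(digits, base)
    # The minus sign is applied afterwards, as a unary operator.
    return -magnitude if negative else magnitude
```

With this reading, `literal_value("0xFFFFFFFF")` is positive and only `literal_value("-0x1")` yields -1.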
Underscores in numeric literals (SQL:2023 T662)
For a non-decimal literal, the following is allowed: 0x_FFFF, 0xFF_FF, 0x_FF_FF;
NOT allowed for a non-decimal literal: 0x_FF__FF, 0xFFFFFF_, 0x_FF_FF_FF_;
For a decimal literal the following is allowed: 10_10, 10_10.10_10, 10.10E-10_0;
NOT allowed for a decimal literal: 1010, 100, 1010.1010, 1010.1, 10.10E_-100_;
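One way to read these examples is that an underscore may only separate two digits, with a single underscore also accepted directly after the radix prefix. The Python sketch below encodes that reading as two regular expressions. Both patterns and the helper names are assumptions inferred from the examples in this description, not the grammar used by the parser, and only the hexadecimal prefix is covered.

```python
import re

# Assumed pattern: 0x, an optional single underscore, then groups of
# hexadecimal digits separated by single underscores.
HEX_WITH_UNDERSCORES = re.compile(r"0[xX]_?[0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*")

# Assumed pattern: digit groups separated by single underscores in the
# integer part, the fraction and the exponent.
DECIMAL_WITH_UNDERSCORES = re.compile(
    r"[0-9]+(?:_[0-9]+)*"                   # integer part
    r"(?:\.[0-9]+(?:_[0-9]+)*)?"            # optional fraction
    r"(?:[eE][+-]?[0-9]+(?:_[0-9]+)*)?"     # optional exponent
)


def is_valid_hex(literal):
    """True if the whole string matches the assumed hexadecimal pattern."""
    return HEX_WITH_UNDERSCORES.fullmatch(literal) is not None


def is_valid_decimal(literal):
    """True if the whole string matches the assumed decimal pattern."""
    return DECIMAL_WITH_UNDERSCORES.fullmatch(literal) is not None
```

Under these assumptions, doubled and trailing underscores are rejected, as is an underscore next to the exponent sign.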
Regressions found
HEX constants were already supported in FB, but their behavior has changed. Under the old behavior, 0xFFFFFFFF is the INTEGER number -1, and the same applies to BIGINT and INT128. The new behavior breaks this rule.
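The difference can be sketched in Python. The old interpretation is modelled here as two's complement at a width chosen by digit count (32, 64 or 128 bits), which is an assumption consistent with 0xFFFFFFFF being INTEGER -1; the function names are hypothetical.

```python
def old_hex_value(literal):
    """Assumed old behavior: two's complement at a width set by digit count."""
    digits = literal[2:]  # strip the 0x prefix
    if len(digits) <= 8:
        width = 32
    elif len(digits) <= 16:
        width = 64
    else:
        width = 128
    value = int(digits, 16)
    # A set top bit makes the number negative in two's complement.
    if value >= 1 << (width - 1):
        value -= 1 << width
    return value


def new_hex_value(literal):
    """New behavior: the literal is always an unsigned magnitude."""
    return int(literal[2:], 16)
```

Scripts that relied on the old reading would need a literal such as -0x1 to obtain -1 under the new one.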
Errors were also found to occur when a numeric literal is written directly next to other characters with no separator, for example in automatically generated SQL scripts.
In such cases the numeric literal is glued to the following symbol. This used to be accepted, but now an error is raised.
I consider the new behavior in these cases to be correct and there are no plans to provide backward compatibility.
A few more examples: