New feature non-decimal integer literals (SQL:2023 T661) and underscores in numeric literals (SQL:2023 T662) #8564

Open
wants to merge 7 commits into
base: master

ChudaykinAlex

Non-decimal integer literals (SQL:2023 T661)

HEX

Hexadecimal numeric constants are prefixed with the digit '0' and the letter 'X' (or 'x'), immediately followed by a sequence of hexadecimal digits: 0-9, a-f or A-F. Integer values are expressed in hexadecimal notation. Numbers consisting of 1-8 hexadecimal digits are interpreted as INTEGER, numbers consisting of 9-16 digits are interpreted as BIGINT, and numbers consisting of 17-32 digits are interpreted as INT128. Longer digit sequences raise an error. An odd number of hexadecimal digits implies a leading zero.

Hexadecimal numbers in the range from 0x0 to 0x7FFFFFFF are positive INTEGER numbers with values from 0 to 2147483647.

Hexadecimal numbers in the range from 0x80000000 to 0x7FFFFFFFFFFFFFFF are positive BIGINT numbers with values from 2147483648 to 9223372036854775807.

Hexadecimal numbers in the range from 0x8000000000000000 to 0x7FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF are positive INT128 numbers with values from 9223372036854775808 to 170141183460469231731687303715884105727.
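The three value ranges above can be modelled in a few lines. This is only a sketch in Python of the rule as described here, not the Firebird implementation; the function name `literal_type` and the constant names are made up for illustration.

```python
# Largest positive values of the three signed integer types named above.
INT32_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1
INT128_MAX = 2**127 - 1


def literal_type(digits: str, base: int = 16) -> str:
    """Return the type whose positive range contains the literal's value.

    `digits` is the digit sequence without its 0x/0o/0b prefix.
    Hypothetical helper: it models the ranges listed in the text only.
    """
    value = int(digits, base)
    if value <= INT32_MAX:
        return "INTEGER"
    if value <= INT64_MAX:
        return "BIGINT"
    if value <= INT128_MAX:
        return "INT128"
    raise ValueError("literal does not fit into INT128")


if __name__ == "__main__":
    for digits in ("7FFFFFFF", "80000000", "8000000000000000"):
        print(f"0x{digits}: {literal_type(digits)}")
```

Each boundary value from the ranges falls on the expected side: 0x7FFFFFFF is the last INTEGER and 0x80000000 is the first BIGINT.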

OCT

Octal numeric constants are prefixed with the digit '0' and the letter 'O' (or 'o'), immediately followed by a sequence of octal digits 0-7. Integer values are expressed in octal notation. Numbers consisting of 1-11 octal digits are interpreted as INTEGER, those consisting of 11-21 digits as BIGINT, and those consisting of 22-43 digits as INT128. Longer digit sequences raise an error.

BIN

Binary numeric constants are prefixed with the digit '0' and the letter 'B' (or 'b'), immediately followed by a sequence of binary digits 0 or 1. Integer values are expressed in binary notation. Numbers consisting of 1-31 binary digits are interpreted as INTEGER, those consisting of 32-63 digits as BIGINT, and those consisting of 64-127 digits as INT128. Longer digit sequences raise an error.
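The upper digit counts quoted for each notation (8, 16 and 32 for hexadecimal; 11, 21 and 43 for octal; 31, 63 and 127 for binary) match the number of digits needed to write the largest positive value of INTEGER, BIGINT and INT128. The short Python check below only illustrates that arithmetic; `max_digits` is a hypothetical helper name.

```python
def max_digits(type_max: int) -> dict:
    """Number of digits needed to write `type_max` in each notation."""
    return {
        "hex": len(format(type_max, "x")),
        "oct": len(format(type_max, "o")),
        "bin": len(format(type_max, "b")),
    }


if __name__ == "__main__":
    for name, bits in (("INTEGER", 31), ("BIGINT", 63), ("INT128", 127)):
        # A signed type with N value bits has the maximum 2**N - 1.
        print(name, max_digits(2**bits - 1))
```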

The value of a literal is obtained by applying the usual mathematical interpretation of positional hexadecimal notation to a string that is an unsigned hexadecimal integer, and similarly for unsigned octal and unsigned binary integers. To specify a negative number in any of these notations (hexadecimal, octal or binary), use the unary minus prefix. For example, 0xFFFFFFFF is BIGINT 4294967295 (positive), but -0xFFFFFFFF is BIGINT -4294967295 (negative). If you need to specify -1 as a hexadecimal INTEGER, use -0x1.
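To make the "unsigned value plus unary minus" rule concrete, here is a small Python sketch. It is a model of the rule as worded in this description, not engine code; `parse_literal` is a hypothetical name and underscores are deliberately not handled.

```python
BASES = {"0x": 16, "0o": 8, "0b": 2}


def parse_literal(text: str) -> int:
    """Evaluate a prefixed literal; the sign comes only from a leading '-'."""
    negative = text.startswith("-")
    body = text[1:] if negative else text
    prefix, digits = body[:2].lower(), body[2:]
    if prefix not in BASES or not digits.isalnum():
        raise ValueError(f"not a non-decimal literal: {text!r}")
    # The digits are always read as an unsigned positional number.
    value = int(digits, BASES[prefix])
    return -value if negative else value


if __name__ == "__main__":
    print(parse_literal("0xFFFFFFFF"))   # 4294967295
    print(parse_literal("-0xFFFFFFFF"))  # -4294967295
    print(parse_literal("-0x1"))         # -1
```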

Underscores in numeric literals (SQL:2023 T662)

For a non-decimal literal, the following is allowed: 0x_FFFF, 0xFF_FF, 0x_FF_FF;
NOT allowed for a non-decimal literal: 0x_FF__FF, 0xFFFFFF_, 0x_FF_FF_FF_;
For a decimal literal the following is allowed: 10_10, 10_10.10_10, 10.10E-10_0;
NOT allowed for a decimal literal: 1010, 100, 1010.1010, 1010.1, 10.10E_-100_;
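The underscore examples can be approximated with regular expressions. The Python sketch below assumes the rule is "a single underscore may appear right after a non-decimal prefix or between two digits, never doubled or trailing". That rule is inferred from the examples here, not taken from the patch, and the names `NON_DECIMAL`, `DECIMAL` and `underscores_ok` are illustrative only.

```python
import re

# Assumed rule: one optional underscore after the prefix, then digit groups
# separated by single underscores.
NON_DECIMAL = re.compile(
    r"0(?:[xX]_?[0-9A-Fa-f]+(?:_[0-9A-Fa-f]+)*"
    r"|[oO]_?[0-7]+(?:_[0-7]+)*"
    r"|[bB]_?[01]+(?:_[01]+)*)"
)

# Assumed rule: underscores only between two decimal digits.
DIGITS = r"[0-9]+(?:_[0-9]+)*"
DECIMAL = re.compile(rf"{DIGITS}(?:\.{DIGITS})?(?:[eE][+-]?{DIGITS})?")


def underscores_ok(literal: str) -> bool:
    """True if the unsigned literal matches the assumed underscore rules."""
    return bool(NON_DECIMAL.fullmatch(literal) or DECIMAL.fullmatch(literal))


if __name__ == "__main__":
    for lit in ("0x_FF_FF", "0x_FF__FF", "10_10.10_10", "10.10E_-100_"):
        print(lit, underscores_ok(lit))
```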

Regressions found

Hexadecimal constants already existed in Firebird, but their behavior has changed. Under the old behavior, 0xFFFFFFFF is the INTEGER number -1, and similarly for BIGINT and INT128. The new behavior breaks this rule.
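The difference between the two readings can be shown with a short Python model. It assumes the old behavior chose the width from the digit count (1-8, 9-16, 17-32 hexadecimal digits) and then read the bits as a two's-complement number. That is an interpretation of the description above, and both function names are hypothetical.

```python
def old_hex_value(digits: str) -> int:
    """Assumed old reading: width from digit count, bits as two's complement."""
    n = len(digits)
    if n <= 8:
        bits = 32
    elif n <= 16:
        bits = 64
    elif n <= 32:
        bits = 128
    else:
        raise ValueError("too many hexadecimal digits")
    value = int(digits, 16)
    # A set top bit means the number is negative in two's complement.
    if value >= 2 ** (bits - 1):
        value -= 2 ** bits
    return value


def new_hex_value(digits: str) -> int:
    """New reading: plain unsigned positional value."""
    return int(digits, 16)


if __name__ == "__main__":
    print(old_hex_value("FFFFFFFF"))  # -1
    print(new_hex_value("FFFFFFFF"))  # 4294967295
```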

Errors also started to occur when a numeric literal is immediately followed by other characters with no separator between them, for example in automatically generated SQL scripts:

- select * from rdb$database where rdb$relation_id>1and 1=2;
- select substring(r.rdb$character_set_name from 1for 2) from rdb$database r;
- select 1,2,3 from rdb$database where 1=0order by 1;

As you can see, the numeric literal is glued to the keyword that follows it. This used to be accepted, but now an error is raised.
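Script generators that may emit such glued tokens could be checked with a rough scan like the Python sketch below. It is a heuristic, not a SQL lexer: the pattern and the name `glued_literals` are assumptions for illustration, and it would also flag legitimate forms such as `1e5` or `0xFF`, so hits need manual review.

```python
import re

# A run of decimal digits that does not continue an identifier and is
# directly followed by a letter or underscore.
GLUED = re.compile(r"(?<![\w$])([0-9]+)([A-Za-z_]\w*)")


def glued_literals(sql: str) -> list:
    """Return (number, following_word) pairs that have no separator."""
    return GLUED.findall(sql)


if __name__ == "__main__":
    statements = [
        "select * from rdb$database where rdb$relation_id>1and 1=2;",
        "select substring(r.rdb$character_set_name from 1for 2) from rdb$database r;",
        "select 1,2,3 from rdb$database where 1=0order by 1;",
    ]
    for stmt in statements:
        print(glued_literals(stmt))
```

Inserting a space between the two parts of each reported pair (for example `1 and`, `1 for`, `0 order`) gives a statement in which the literal is no longer glued to the keyword.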

I consider the new behavior in these cases to be correct and there are no plans to provide backward compatibility.

A few more examples:

SELECT 1234567_890 FROM rdb$database;
SELECT 1_234_567_890 FROM rdb$database;
SELECT 1_23_45.6_78_09 FROM rdb$database;
SELECT -1_2_3_4_5.6E-1_0 FROM rdb$database;

SELECT 0b1100_1010 FROM rdb$database;
SELECT 0b11_00_10_10 FROM rdb$database;
SELECT 0b_11001010 FROM rdb$database;
SELECT -0b11_00_10_10 FROM rdb$database;

SELECT 0o1234_5670 FROM rdb$database;
SELECT 0o12_34_56_70 FROM rdb$database;
SELECT 0o_12345670 FROM rdb$database;
SELECT -0o_12345670 FROM rdb$database;

SELECT 0o1_777_777_777_777_777_777_777_777_777_777_777_777_777_777 FROM rdb$database;
SELECT 0x7FFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF_FFFF FROM rdb$database;
