Skip to content

Fixed bug #8524 : ISQL will truncate lines longer than 255 when pasting #8561

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

hvlad
Copy link
Member

@hvlad hvlad commented May 10, 2025

No description provided.

@hvlad hvlad linked an issue May 10, 2025 that may be closed by this pull request
else
{
DWORD read2;
if (!ReadConsoleW(handle, pWide + 1, 1, &read2, NULL))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already read whole line into wideBuf, there is nothing to read here. WideCharToMultiByte below will take care about surrogates automatically. Whole this if is meaningless AFAIK (as well as outer loop, perhaps - you just read the line and convert it, no piecewise read is possible).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code is generic and doesn't assume that whole line is read by single call of ReadConsoleW().
I see no need to break this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, surrogates aren't supported in modern Unicode standard at all.

Second, you cannot enter them from console in a way that only a half of surrogate pair is been read exactly because of the issue: the console disallow to enter more characters than fits the requested buffer. For old code such situation was theoretically possible because it read by tiny pieces and the system console buffer was bigger than requested buffer. In the new code this call will wait for the next line requesting one character which, in turn, will again limit such continuation line by 254 characters, no?.. If you want to preserve such logic, move the incomplete character into the beginning of the buffer and read MAX_LINE_LENGTH - 1 here at least.

Copy link
Member Author

@hvlad hvlad May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tested the code with small buffer and actually able to enter surrogate pair in the way you consider impossible.
There is no documentation and no guarantee that next Windows version keeps internal buffer of 255 characters, so I prefer to keep existing working already tested code.
I see no reason to convince you here, it is senseless.

PS ask yourself - why wideBuf is declared with MAX_LINE_LENGTH +1 length

Copy link
Member

@mrotteveel mrotteveel May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, surrogates aren't supported in modern Unicode standard at all.

That is not correct, Unicode 16 (the latest version) still covers surrogates, and explicitly says they should only be used with UTF-16, and UTF-16 needs surrogates to be able to encode all characters defined by Unicode. And guess what ReadConsoleW produces: UTF-16.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I'd guess that pasting a U+10000 or higher character in a console that supports it will produce surrogate pairs.

@@ -329,6 +329,11 @@ static int win32ReadConsole(FILE* file, char* buffer, size_t bufferSize)
return -1;
}

const size_t MAX_LINE_LENGTH = 32765;
static WCHAR wideBuf[MAX_LINE_LENGTH +1];
Copy link
Member

@mrotteveel mrotteveel May 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curiosity question: should this be allocated on the heap instead of on the stack?

For example Visual Studio analysis will issue a C6262 warning for this, and its caller (readNextInputLine) also allocates a buffer of 64 KiB on the stack, so these two methods combined will allocate 128 KiB+ on the stack.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curiosity question: should this be allocated on the heap instead of on the stack?

For example Visual Studio analysis will issue a C6262 warning for this, and its caller (readNextInputLine) also allocates a buffer of 64 KiB on the stack, so these two methods combined will allocate 128 KiB+ on the stack.

This is static buffer, it is not on stack. And it is done intentionally.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I had missed that, but the similar buffer in readNextInputLine is not static. Should it be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pWide and charsRead are more obvious candidates for static duration. Or at least a comment why they are fine as local variables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I had missed that, but the similar buffer in readNextInputLine is not static. Should it be?

It is not necessary.

@hvlad hvlad self-assigned this May 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ISQL will truncate lines longer than 255 when pasting
3 participants