fix(plpgsql): handle multibyte diagnostic offsets #737

Merged: psteinroe merged 1 commit into main from fix/umalute on May 18, 2026
@psteinroe
Collaborator

Prevent PL/pgSQL diagnostics from panicking when source text contains UTF-8 multibyte characters before a reported query span.

The diagnostic mapper now keeps internal source ranges as UTF-8 byte offsets and explicitly converts plpgsql_check's one-based character positions at the boundary. Query lookup also iterates over character boundaries instead of slicing byte by byte, which avoids slicing the source in the middle of a multibyte character such as an umlaut.
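The conversion described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this PR: the function name `char_pos_to_byte_offset` and its signature are assumptions made for the example.

```rust
/// Hypothetical helper (not taken from the PR): convert a one-based
/// character position into a zero-based UTF-8 byte offset into `source`.
/// Returns `None` for position 0 or for positions past the end of the text.
fn char_pos_to_byte_offset(source: &str, one_based_char_pos: usize) -> Option<usize> {
    // One-based -> zero-based; position 0 is invalid and yields None.
    let zero_based = one_based_char_pos.checked_sub(1)?;
    source
        .char_indices()
        .map(|(byte_idx, _)| byte_idx)
        // Also accept the position just past the last character (end of text).
        .chain(std::iter::once(source.len()))
        .nth(zero_based)
}
```

With the source `"-- äö\nselect 1;"`, the query starts at one-based character position 7, which this sketch maps to byte offset 8 because `ä` and `ö` each occupy two bytes. Using the character position directly as a byte offset can land inside a character, and slicing there panics.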

Regression Coverage

Added focused tests for umlauts in comments before a query and in string literals before a reported query position.
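The two regression cases could be exercised along these lines. This is a sketch under assumptions: `find_query` is a hypothetical stand-in for the query lookup, and the sample sources are invented for illustration rather than copied from the PR's tests.

```rust
/// Illustrative input: umlauts in a comment before the query.
const UMLAUT_IN_COMMENT: &str = "-- prüfe größe\nselect 1;";
/// Illustrative input: umlauts in a string literal before the query.
const UMLAUT_IN_LITERAL: &str = "raise notice 'äöü'; select 1;";

/// Hypothetical helper (not taken from the PR): byte offset of the first
/// occurrence of `query` in `source`, probing only at character boundaries.
fn find_query(source: &str, query: &str) -> Option<usize> {
    source
        .char_indices()
        .map(|(byte_idx, _)| byte_idx)
        // Slicing at `byte_idx` cannot panic: `char_indices` only yields
        // offsets that are valid character boundaries.
        .find(|&byte_idx| source[byte_idx..].starts_with(query))
}
```

In both inputs the query sits after multibyte characters, so its byte offset is larger than its zero-based character index. A lookup that slices at every byte offset would panic on these inputs as soon as it reached the second byte of an umlaut.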

Fixes #735

@psteinroe merged commit 0fa637d into main on May 18, 2026
9 checks passed
@psteinroe deleted the fix/umalute branch on May 18, 2026 06:56
Development

Successfully merging this pull request may close these issues.

Character boundary error with UTF-8 multi-byte characters (umlauts)