ignat980 commented Nov 10, 2025

fix(compiler): robustly strip psql meta commands without breaking SQL

Implements a single-pass state machine that correctly distinguishes psql meta-commands from backslashes in SQL code, literals, and comments.

This fixes schema parsing failures when files contain psql meta-commands such as \connect, \set, and \d, which are psql client commands rather than valid SQL.

The Problem

Backslashes can appear in valid SQL:

  • Backslashes in string literals (e.g. E'\\n', escape sequences)
  • Meta-command text in comments or documentation
  • Dollar-quoted function bodies with backslash content

A naive line-based approach would incorrectly strip these, breaking valid SQL.
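
To make the failure mode concrete, here is a minimal sketch of the kind of line-based stripper described above. The name stripNaive is hypothetical and this is not sqlc's previous code; it only assumes that such an approach drops every line whose first non-space character is a backslash.

```go
package main

import "strings"

// stripNaive is a hypothetical line-based stripper (an illustration, not
// sqlc's code). It drops every line whose first non-space character is a
// backslash, without checking whether that line sits inside a literal.
func stripNaive(sql string) string {
	var kept []string
	for _, line := range strings.Split(sql, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), `\`) {
			continue // treated as a meta-command, even inside a string
		}
		kept = append(kept, line)
	}
	return strings.Join(kept, "\n")
}
```

A string literal that spans two lines and whose second line begins with a backslash loses that line, so the statement is left with an unterminated quote. A dollar-quoted function body or a block comment containing such a line is damaged in the same way.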

Changes

  • Track parsing state for single quotes, dollar quotes, and block comments
  • Only remove backslash commands at true line starts outside any literal context
  • Properly handle escaped quotes (''), nested block comments (/* /* */ */)
  • Support dollar-quoted tags with identifiers (e.g. $tag$...$tag$)
  • Add comprehensive test suite covering:
    • All documented psql meta-commands (\connect, \set, \d*, etc.); see the PostgreSQL psql docs
    • String literals with backslashes and nested quotes
    • Dollar-quoted blocks with various tag formats
    • Nested block comments containing meta-command text
    • Edge cases: empty input, whitespace-only, missing newlines
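
The state tracking listed above could look roughly like the following sketch. It is not the PR's removePsqlMetaCommands; the names stripMetaCommands and dollarTagAt are hypothetical, a meta-command is assumed to start at column 0, and the handling of -- line comments is an addition that the description does not mention. Backslash-escaped quotes inside E'...' strings and double-quoted identifiers are deliberately left out to keep it short.

```go
package main

import "strings"

// dollarTagAt returns the opening tag ("$$" or "$tag$") at the start of s,
// or "" if s does not begin with one. s is expected to start with '$'.
func dollarTagAt(s string) string {
	for j := 1; j < len(s); j++ {
		c := s[j]
		switch {
		case c == '$':
			return s[:j+1]
		case c == '_' || c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z':
			// valid anywhere in the tag
		case c >= '0' && c <= '9' && j > 1:
			// digits are valid after the first tag character
		default:
			return ""
		}
	}
	return ""
}

// stripMetaCommands is an illustrative single-pass sketch (hypothetical name).
// It removes lines that begin with a backslash only when the scanner is
// outside single quotes, dollar quotes, and (nested) block comments.
func stripMetaCommands(sql string) string {
	var out strings.Builder
	out.Grow(len(sql)) // output is never longer than the input

	var (
		inQuote   bool   // inside '...'
		dollarTag string // closing tag being waited for, "" if none
		depth     int    // block comment nesting depth
	)
	lineStart := true
	i := 0
	// emit copies n bytes to the output and advances the cursor.
	emit := func(n int) {
		out.WriteString(sql[i : i+n])
		lineStart = sql[i+n-1] == '\n'
		i += n
	}

	for i < len(sql) {
		rest := sql[i:]
		switch {
		case inQuote:
			if strings.HasPrefix(rest, "''") {
				emit(2) // escaped quote, still inside the literal
			} else {
				inQuote = rest[0] != '\''
				emit(1)
			}
		case dollarTag != "":
			if strings.HasPrefix(rest, dollarTag) {
				emit(len(dollarTag))
				dollarTag = ""
			} else {
				emit(1)
			}
		case depth > 0:
			switch {
			case strings.HasPrefix(rest, "/*"):
				depth++
				emit(2)
			case strings.HasPrefix(rest, "*/"):
				depth--
				emit(2)
			default:
				emit(1)
			}
		case lineStart && rest[0] == '\\':
			// Meta-command: skip through the end of the line.
			if nl := strings.IndexByte(rest, '\n'); nl >= 0 {
				i += nl + 1
			} else {
				i = len(sql)
			}
		case strings.HasPrefix(rest, "--"):
			// Line comment (assumption): copy it whole so an apostrophe
			// inside it does not open a string.
			if nl := strings.IndexByte(rest, '\n'); nl >= 0 {
				emit(nl + 1)
			} else {
				emit(len(rest))
			}
		case strings.HasPrefix(rest, "/*"):
			depth = 1
			emit(2)
		case rest[0] == '\'':
			inQuote = true
			emit(1)
		case rest[0] == '$' && dollarTagAt(rest) != "":
			dollarTag = dollarTagAt(rest)
			emit(len(dollarTag))
		default:
			emit(1)
		}
	}
	return out.String()
}
```

The order of the switch cases carries the design: the three "inside something" states are checked before the line-start test, so a backslash at column 0 is only treated as a meta-command when no literal or comment is open.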

Performance improvements

  • Pre-allocate output buffer with strings.Builder.Grow()
  • Single pass eliminates redundant string operations
  • Reduces allocations by avoiding intermediate line slices

Testing

  • go test ./internal/compiler
  • 100% test coverage of the new function removePsqlMetaCommands()
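
As a rough illustration of how the cases listed in this description might be organized, the sketch below puts a few of them in a table and runs them against any stripper passed in. The names stripCase, stripCases, and failingCases are hypothetical, the expected outputs are assumptions about the intended behavior rather than values taken from the PR's test file, and removePsqlMetaCommands is assumed here to have the shape func(string) string.

```go
package main

// stripCase is one row of a hypothetical table of inputs and expected outputs.
type stripCase struct {
	name, in, want string
}

// stripCases mirrors a few categories from the description. The want values
// are assumptions about the intended behavior.
var stripCases = []stripCase{
	{"empty input", "", ""},
	{"whitespace only", " \n\t\n", " \n\t\n"},
	{"meta-command at line start", "\\connect db\nSELECT 1;\n", "SELECT 1;\n"},
	{"missing trailing newline", "SELECT 1;\n\\d", "SELECT 1;\n"},
	{"backslash inside string literal", "SELECT 'a\n\\b';\n", "SELECT 'a\n\\b';\n"},
	{"backslash inside dollar quotes", "DO $fn$\n\\set x\n$fn$;\n", "DO $fn$\n\\set x\n$fn$;\n"},
	{"backslash inside nested comment", "/* a /* b */\n\\d */\n", "/* a /* b */\n\\d */\n"},
}

// failingCases runs every row against strip and returns the names that fail.
func failingCases(strip func(string) string) []string {
	var failed []string
	for _, c := range stripCases {
		if got := strip(c.in); got != c.want {
			failed = append(failed, c.name)
		}
	}
	return failed
}
```

In a real test file, a Test function would loop over the result of failingCases for the function under test and report each name with t.Errorf. Passing the function in as a parameter also lets the same table show which rows a simpler implementation gets wrong.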

Credits

Co-authored-by: Andrew Benton [email protected]

Addresses gbarr's comment in #4082 which closes #4065

dosubot bot added the size:L (This PR changes 100-499 lines, ignoring generated files.) and 🔧 golang labels Nov 10, 2025
@ignat980
Author

@andrewmbenton please review

@gbarr

gbarr commented Nov 10, 2025

Thanks @ignat980, this looks like a much more complete solution than I was expecting.

@kyleconroy
Collaborator

@ignat980 if you open this against main I can get this merged. Just make sure to include Andrew's commits.

dosubot bot added the lgtm label (This PR has been approved by a maintainer) Dec 10, 2025
ignat980 force-pushed the fix/strip-psql-meta branch from 3436dde to 60585cd December 11, 2025 07:40
andrewmbenton and others added 2 commits December 11, 2025 10:45
Replace naive line-based removal with a single-pass state machine that correctly distinguishes psql meta-commands from backslashes in SQL code, literals, and comments.

The previous implementation would incorrectly strip any line starting with a backslash, breaking valid SQL containing:
- Backslashes in string literals (E'\\n', escape sequences)
- Meta-command text in comments or documentation
- Dollar-quoted function bodies with backslash content

Changes:
- Track parsing state for single quotes, dollar quotes, and block comments
- Only remove backslash commands at true line starts outside any literal context
- Properly handle escaped quotes (''), nested block comments (/* /* */ */)
- Support dollar-quoted tags with identifiers ($tag$...$tag$)
- Add comprehensive test suite covering:
  * All documented psql meta-commands (\connect, \set, \d*, etc.)
  * String literals with backslashes and nested quotes
  * Dollar-quoted blocks with various tag formats
  * Nested block comments containing meta-command text
  * Edge cases: empty input, whitespace-only, missing newlines

Performance improvements:
- Pre-allocate output buffer with strings.Builder.Grow()
- Single pass eliminates redundant string operations
- Reduces allocations by avoiding intermediate line slice
ignat980 force-pushed the fix/strip-psql-meta branch from 60585cd to 2181f98 December 11, 2025 08:46
dosubot bot added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) and removed the size:L label (This PR changes 100-499 lines, ignoring generated files.) Dec 11, 2025
@ignat980 ignat980 changed the base branch from andrew/fix-4065 to main December 11, 2025 08:53
dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) and removed the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Dec 11, 2025
@ignat980
Author

@kyleconroy Thanks! I rebased onto the latest sqlc main and changed this PR's base branch to main. Just waiting on the test CI to finish.


Labels

  • lgtm (This PR has been approved by a maintainer)
  • size:L (This PR changes 100-499 lines, ignoring generated files.)
  • 🔧 golang

Development

Successfully merging this pull request may close these issues.

SQLC fails for psql meta-commands like \restrict

4 participants