@ilyabo ilyabo commented Oct 28, 2025

Update: we were able to compile the extension and verify the fix – it works!


We are seeing corruption in the Protobuf output produced by ST_AsMVTGeom. The corrupt PBF cannot be parsed because it contains invalid geometry commands. As a result, rendering of entire tiles fails (with both MapLibre and deck.gl).

The issue affects practically all datasets with linestrings, especially at lower zoom levels. For example, here are Natural Earth's roads:

To reproduce: app-corrupt-linestrings-repro.py

image

I asked Claude-4.5-sonnet to look at the code; here are its findings and a fix:


YES! This perfectly aligns with my findings! The diagnostic output you provided confirms the exact bug I found in the code.

Let me break down what's happening:

The Bug Analysis

Your Python output shows:

First commands: [(1, 1), (2, 3), (2, 0), (0, 0)]
                 MoveTo   LineTo   LineTo   Invalid!
                 count=1  count=3  count=0  

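The (2, 0) is a LineTo command with count 0, which is invalid: the Mapbox Vector Tile specification requires a LineTo command count greater than 0. The parser consumes no coordinates for that LineTo and then reads the following coordinate data as a command, resulting in the command_id_0 error.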
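
For reference, the (command, count) pairs in the diagnostic output come from the way the Mapbox Vector Tile format packs a command into one integer: the low 3 bits hold the command id and the remaining bits hold the count. A minimal Python sketch of that packing follows; the function names are illustrative and are not taken from the repro script.

```python
# Command ids defined by the Mapbox Vector Tile specification.
MOVE_TO, LINE_TO, CLOSE_PATH = 1, 2, 7


def decode_command(value: int) -> tuple[int, int]:
    """Split a CommandInteger into (command_id, count)."""
    return value & 0x7, value >> 3


def encode_command(command_id: int, count: int) -> int:
    """Pack (command_id, count) into a CommandInteger."""
    return (command_id & 0x7) | (count << 3)


# Raw integers that decode to the pairs shown in the diagnostic output.
print([decode_command(v) for v in (9, 26, 2, 0)])
# -> [(1, 1), (2, 3), (2, 0), (0, 0)]
```

Any small integer decodes to some (command, count) pair, so a coordinate value that is mistakenly read as a command will show up as a plausible-looking but invalid entry such as (2, 0) or (0, 0).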

Root Cause in the Code

Looking at the buggy MULTI_LINESTRING encoding I just fixed:

geometry.push_back((2 & 0x7) | ((vertex_count - 2) << 3)); // LineTo, part count

When a linestring has exactly 2 vertices (the minimum valid linestring):

  • vertex_count = 2
  • count = vertex_count - 2 = 0
  • Creates LineTo(0) command
  • But then still writes 1 coordinate pair for vertex_idx=1
  • Parser reads that coordinate as a command → gets 0 → "unknown command 0"

The Fix

My fix changes vertex_count - 2 to vertex_count - 1:

geometry.push_back((2 & 0x7) | ((vertex_count - 1) << 3)); // LineTo, part count

Now for 2 vertices:

  • count = vertex_count - 1 = 1
  • Creates LineTo(1) command
  • Writes 1 coordinate pair
  • Parser correctly reads 1 coordinate → valid!
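
The difference between the two formulas can be checked with a small Python model. This is only a sketch under stated assumptions: it assumes the encoder writes one MoveTo for the first vertex and one coordinate pair for every following vertex, as the bullets above describe. `encode_part`, `read_commands` and the `lineto_offset` parameter are hypothetical names, not the extension's code.

```python
MOVE_TO, LINE_TO = 1, 2


def zigzag(n: int) -> int:
    """ZigZag-encode a signed 32-bit value, as MVT parameters require."""
    return (n << 1) ^ (n >> 31)


def encode_part(vertices, lineto_offset):
    """Encode one linestring part, starting from cursor (0, 0).

    lineto_offset=2 models the buggy count, lineto_offset=1 the fixed one.
    """
    x0, y0 = vertices[0]
    out = [(MOVE_TO & 0x7) | (1 << 3), zigzag(x0), zigzag(y0)]
    out.append((LINE_TO & 0x7) | ((len(vertices) - lineto_offset) << 3))
    px, py = x0, y0
    for x, y in vertices[1:]:  # one delta pair per remaining vertex
        out += [zigzag(x - px), zigzag(y - py)]
        px, py = x, y
    return out


def read_commands(stream):
    """Walk a linestring geometry stream and return its (command, count) pairs.

    Raises ValueError on an unknown command or a zero count.
    """
    i, commands = 0, []
    while i < len(stream):
        cmd, count = stream[i] & 0x7, stream[i] >> 3
        if cmd not in (MOVE_TO, LINE_TO) or count == 0:
            raise ValueError(f"invalid command {cmd} with count {count} at index {i}")
        commands.append((cmd, count))
        i += 1 + 2 * count  # skip the command and its coordinate pairs
    return commands


segment = [(5, 5), (10, 7)]  # a 2-vertex part
print(read_commands(encode_part(segment, lineto_offset=1)))  # -> [(1, 1), (2, 1)]
try:
    read_commands(encode_part(segment, lineto_offset=2))
except ValueError as err:
    print("buggy encoding rejected:", err)
```

Under the same assumption, `vertex_count - 2` is one short for any vertex count, so longer parts also leave one trailing coordinate pair that the parser misreads as a command; the 2-vertex case is simply the one that produces an explicit LineTo(0).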

Why This Happens with Global Datasets

For global-scale datasets at low zoom levels (like Natural Earth roads), ST_AsMVTGeom's clipping operation frequently:

  1. Takes a LINESTRING that crosses tile boundaries
  2. Clips it, producing a MULTI_LINESTRING
  3. Some segments end up with only 2 vertices (start and end points)
  4. The buggy encoding creates LineTo(0) → invalid PBF
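
To illustrate steps 1 to 3, here is a small Python sketch that clips a linestring against a tile box and collects the surviving pieces. It uses a plain Liang-Barsky segment clip as a stand-in. It is not the clipping routine ST_AsMVTGeom uses, and the function names and coordinates are made up for the example.

```python
def clip_segment(a, b, box):
    """Liang-Barsky clip of segment a-b to box=(xmin, ymin, xmax, ymax).

    Returns (start, end), or None if the segment lies outside the box.
    """
    (x0, y0), (x1, y1) = a, b
    xmin, ymin, xmax, ymax = box
    dx, dy = x1 - x0, y1 - y0
    t0, t1 = 0.0, 1.0
    for p, q in ((-dx, x0 - xmin), (dx, xmax - x0), (-dy, y0 - ymin), (dy, ymax - y0)):
        if p == 0:
            if q < 0:
                return None  # parallel to this edge and outside it
            continue
        t = q / p
        if p < 0:
            t0 = max(t0, t)  # segment enters the box here
        else:
            t1 = min(t1, t)  # segment leaves the box here
        if t0 > t1:
            return None
    start = a if t0 == 0.0 else (x0 + t0 * dx, y0 + t0 * dy)
    end = b if t1 == 1.0 else (x0 + t1 * dx, y0 + t1 * dy)
    return start, end


def clip_linestring(vertices, box):
    """Clip a linestring to box and return its parts (a multi-linestring)."""
    parts = []
    for a, b in zip(vertices, vertices[1:]):
        clipped = clip_segment(a, b, box)
        if clipped is None or clipped[0] == clipped[1]:
            continue
        start, end = clipped
        if parts and parts[-1][-1] == start:
            parts[-1].append(end)  # continues the current part
        else:
            parts.append([start, end])  # re-entered the box: new part
    return parts


# A coarse "road" that crosses the tile, loops outside it, and crosses back.
road = [(-5, 2), (15, 2), (15, 8), (-5, 8)]
parts = clip_linestring(road, box=(0, 0, 10, 10))
print([len(part) for part in parts])  # -> [2, 2]
```

With long, sparsely sampled segments like these, which is typical when a global dataset is viewed at a low zoom level, each crossing of the tile yields a part made of just its entry and exit points.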

@ilyabo ilyabo marked this pull request as draft October 28, 2025 22:11
@ilyabo ilyabo changed the title from "fix: Linestring corrupt PBF issue" to "fix: ST_AsMVTGeom corrupts Protobuf for Linestrings" Oct 29, 2025
@ilyabo ilyabo marked this pull request as ready for review October 29, 2025 05:23
ilyabo (Author) commented Oct 29, 2025

After fix:

image

@Maxxen Maxxen changed the base branch from main to v1.4-andium October 29, 2025 10:06
Maxxen (Member) commented Oct 29, 2025

The CI tries to build the DuckDB main branch, which is currently incompatible with spatial. I retargeted the PR to the v1.4 branch, but it seems the runner still thinks it needs to check out main. I'll try to fix and merge this tomorrow.

ilyabo (Author) commented Oct 29, 2025

> I'll try to fix and merge this tomorrow.

Thank you!

ilyabo (Author) commented Oct 29, 2025

@Maxxen is it correct that extension publishing needs to be in sync with DuckDB, so we'll need to either wait for the next DuckDB release or use our custom build?
