fix(data export): properly handle 128-bit item id #7769
Conversation
Pull request overview
This PR aims to fix GDPR export pagination for trace items when `item_id` is 128-bit, by adjusting how pagination compares and serializes the last-seen `item_id`.

Changes:
- Adds a regression test intended to validate pagination progress with 128-bit `item_id`s.
- Updates `ExportTraceItemsPageToken` to store/serialize `last_seen_item_id` as a string rather than converting to an int.
- Changes the pagination "seek" tuple comparison to use `column("item_id")` instead of a `reinterpretAsUInt128(reverse(unhex(...)))` expression.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `snuba/web/rpc/v1/endpoint_export_trace_items.py` | Alters page-token handling and the pagination tuple comparison for `item_id`. |
| `tests/web/rpc/v1/test_endpoint_export_trace_items.py` | Adds a new pagination test targeting 128-bit `item_id` scenarios. |
Comments suppressed due to low confidence (1)
snuba/web/rpc/v1/endpoint_export_trace_items.py:244
- The pagination filter compares `column("item_id")` (stored as `UInt128` in `eap_items`) against `literal(page_token.last_seen_item_id)`, which is a hex string coming from the selected `sentry.item_id` value. Because this comparison is inside a tuple, the `HexIntColumnProcessor` condition matcher will not translate the hex literal into a UInt128-compatible value, which can cause the tuple comparison to cast unexpectedly and pagination to stop making progress or skip/duplicate rows. Convert/cast the page-token item id to the same type as the `item_id` column (e.g., parse hex -> decimal and cast to UInt128) before putting it into the tuple, or store the numeric value in the page token and serialize it consistently.
```python
f.tuple(
    column("project_id"),
    column("item_type"),
    column("timestamp"),
    column("trace_id"),
    column("item_id"),
),
f.tuple(
    literal(page_token.last_seen_project_id),
    literal(page_token.last_seen_item_type),
    literal(page_token.last_seen_timestamp),
    literal(page_token.last_seen_trace_id),
    literal(page_token.last_seen_item_id),
),
```
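The conversion this comment suggests could look like the sketch below; `hex_token_to_decimal` is a hypothetical name for illustration, not code from this PR:

```python
def hex_token_to_decimal(token_hex: str) -> str:
    """Hypothetical helper: turn a hex page-token id into the decimal
    string form that ClickHouse can cast to a UInt128 literal."""
    return str(int(token_hex, 16))
```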
```diff
         name="last_seen_item_id", type=AttributeKey.Type.TYPE_STRING
     ),
     op=ComparisonFilter.OP_EQUALS,
-    value=AttributeValue(val_str=hex(self.last_seen_item_id)),
+    value=AttributeValue(val_str=self.last_seen_item_id),
 )
```
Changing the page token serialization for last_seen_item_id from hex(...) to the raw value alters the on-the-wire format (previous tokens likely included a 0x prefix and were not zero-padded). This can break clients retrying/resuming pagination with a previously issued token. Consider keeping the old format, or making from_protobuf accept both formats (e.g., strip an optional 0x prefix / normalize length) while writing the new normalized format.
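One way to stay backward compatible, sketched with a hypothetical helper name (the PR itself does not include this code): have `from_protobuf` normalize both the old `hex(...)` format and the new raw form before use.

```python
def normalize_item_id_token(val: str) -> str:
    """Hypothetical normalizer: accept old tokens ('0x' prefix, not
    zero-padded) as well as new ones (raw 32-char lowercase hex)."""
    s = val.lower()
    if s.startswith("0x"):
        s = s[2:]
    return s.zfill(32)  # left-pad to the fixed 32-char width
```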
```python
items_data = [
    gen_item_message(
        start_timestamp=BASE_TIME,
        trace_id=uuid.uuid4().hex,
        item_id=uuid.uuid4().int.to_bytes(16, byteorder="little"),
        project_id=1,
    )
    for i in range(num_items)
]
```
This test uses a different trace_id for every generated item, but the reported pagination bug occurs when pagination advances within the same project_id, item_type, timestamp and trace_id (leaving item_id as the tie-breaker). With randomized trace_ids, pagination can still make progress even if the item_id tie-breaker is broken. To cover the regression, use a constant trace_id for all generated items (and keep timestamp/item_type constant) so that ordering/progress depends on item_id.
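The suggested change might look like the following sketch; `gen_item_message`, `BASE_TIME`, and `num_items` stand in for the PR's test helpers and are stubbed here so the fragment is self-contained:

```python
import uuid

BASE_TIME = 1772038800.0  # stand-in for the test module's constant
num_items = 5

def gen_item_message(**kwargs):  # stub of the PR's helper, illustration only
    return kwargs

fixed_trace_id = uuid.uuid4().hex  # one trace_id shared by every item

items_data = [
    gen_item_message(
        start_timestamp=BASE_TIME,       # constant timestamp
        trace_id=fixed_trace_id,         # constant trace_id
        item_id=uuid.uuid4().int.to_bytes(16, byteorder="little"),
        project_id=1,                    # constant project
    )
    for _ in range(num_items)
]
# With everything else pinned, ordering and pagination progress can
# only depend on the item_id tie-breaker.
```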
Cursor Bugbot has reviewed your changes and found 1 potential issue.
```diff
     column("timestamp"),
     column("trace_id"),
-    f.reinterpretAsUInt128(f.reverse(f.unhex(column("item_id")))),
+    column("item_id"),
```
Bug: A tuple comparison in a ClickHouse query compares a UInt128 column (item_id) with a string literal, creating a type mismatch that relies on implicit database coercion.
Severity: MEDIUM
Suggested Fix
Ensure type consistency in the tuple comparison. Convert the item_id column to a hex string within the query to explicitly match the type of the literal value. This can be done using ClickHouse functions like lower(hex(item_id)) to match the string literal's type and avoid relying on implicit coercion.
Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.

Location: `snuba/web/rpc/v1/endpoint_export_trace_items.py#L236`

Potential issue: The code change introduces a type mismatch in a ClickHouse tuple comparison used for pagination. The query at `endpoint_export_trace_items.py:236` now compares the `item_id` column, which is a `UInt128` type, with a string literal representing the last seen item ID. ClickHouse tuple comparisons require matching types. While tests indicate this may work due to implicit type coercion by the database, relying on this behavior is fragile. This could lead to query errors or incorrect sorting if ClickHouse's type coercion rules change or if it handles certain hex values in a way that differs from a numeric comparison, breaking pagination.
there is no typing mismatch:

```sql
greater(
    (project_id, item_type, timestamp, replaceAll(toString(trace_id), '-', ''),
     lower(leftPad(hex(item_id), if(greater(length(hex(item_id)), 16), 32, 16), '0'))),
    (1, 1, 1772038800.0, 'c0b108ace16742a3a74a6f3cfe0b3fb6', '00bb7052a98b41c9a4088a9355be9c0c')
)
```
```python
)
items: list[TraceItem] = []
seen_item_ids: set[bytes] = set()
while True:
```
let's find another way to do this loop so if somebody breaks response.page_token.end_pagination in the response it doesn't require a CI timeout to cause this to fail. for _i in range(1,100) would probably always iterate through enough pages
for this one specifically, we will iterate 120 times because limit = 1, but yeah good point, implemented this change in both tests
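A bounded version of the loop, in the spirit of this suggestion, could be sketched like this (the `fetch_page` callback and page shapes are illustrative, not the test's real API):

```python
MAX_PAGES = 200  # comfortably above the ~120 pages seen with limit == 1

def paginate(fetch_page):
    """fetch_page(token) -> (items, next_token); an empty next_token
    signals end of pagination. Raises instead of spinning until a CI
    timeout if the end-of-pagination signal is ever broken."""
    collected, token = [], ""
    for _ in range(MAX_PAGES):
        page, token = fetch_page(token)
        collected.extend(page)
        if not token:
            return collected
    raise AssertionError("pagination did not terminate within MAX_PAGES")
```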
onewland left a comment
seems fine, though I'm curious if you've verified that test_pagination_with_128_bit_item_id fails without your change in snuba/web/rpc/v1/ (e.g. do we know we're solving the right problem?)
@onewland yep!
EAP-427
GDPR export breaks when we pass in a 128-bit item id and the pagination advances within the same project id, item type, timestamp, and trace id.
Root cause

The `HexIntColumnProcessor` pipeline

`item_id` is stored as `UInt128` in ClickHouse. `HexIntColumnProcessor` (configured with `size=32`) has two jobs:
- `_process_expressions` — replaces any `Column("item_id")` reference with `lower(leftPad(hex(item_id), 32, '0'))`, converting the raw UInt128 to a 32-char zero-padded lowercase hex string for display/comparison.
- `_translate_literal` — converts a hex string literal to a ClickHouse-safe decimal string (e.g. `'295990755636730491527085960987654321'`), because ClickHouse cannot correctly parse bare 128-bit integer literals in SQL.
The pagination cursor

The query uses a tuple comparison for the cursor. `BaseTypeConverter.__process_optimizable_condition` only pattern-matches simple `op(col, literal)` forms. A `greater(tuple(...), tuple(...))` expression does not match, so `_translate_literal` is never called on the cursor's literals. The only transformation that fires is `_process_expressions` walking the tree and replacing any `Column("item_id")` it finds.
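A toy model of that pattern match (these miniature classes are illustrative, not Snuba's real AST types) shows why the tuple form slips through:

```python
from dataclasses import dataclass

@dataclass
class Column:
    name: str

@dataclass
class Literal:
    value: object

@dataclass
class Func:
    name: str
    args: tuple

def is_optimizable(expr) -> bool:
    """Toy matcher: only the shape op(Column, Literal) qualifies."""
    return (
        isinstance(expr, Func)
        and len(expr.args) == 2
        and isinstance(expr.args[0], Column)
        and isinstance(expr.args[1], Literal)
    )

# Simple comparison: matched, so literal translation would run.
simple = Func("greater", (Column("item_id"), Literal("00bb...")))

# Tuple-vs-tuple cursor: both sides are function calls, so no match,
# and the cursor's literals are never translated.
cursor = Func(
    "greater",
    (
        Func("tuple", (Column("item_id"),)),
        Func("tuple", (Literal("00bb..."),)),
    ),
)
```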
Before the fix

`__init__`:

```python
self.last_seen_item_id = int(last_seen_item_id, 16)  # Python int, up to 128 bits
```

`to_protobuf`:

```python
value=AttributeValue(val_str=hex(self.last_seen_item_id))  # e.g. "0xdeadbeef..."
```

`_build_query` — LHS:

```python
f.reinterpretAsUInt128(f.reverse(f.unhex(column("item_id"))))
```

`_build_query` — RHS:

```python
literal(page_token.last_seen_item_id)  # Literal(value=295990755636730491527085960987654321)
```

After `_process_expressions` replaces `Column("item_id")` → `lower(leftPad(hex(item_id), 32, '0'))`, the final SQL is:

```sql
reinterpretAsUInt128(reverse(unhex(lower(leftPad(hex(item_id), 32, '0')))))
    > 295990755636730491527085960987654321
```

The LHS is a round-trip identity (hex → bytes → reverse → reinterpret = original `item_id` as UInt128). The RHS is a bare decimal integer, serialized via `str(exp.value)`.
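The round-trip identity can be checked in Python: big-endian hex bytes, reversed, read little-endian, mirroring ClickHouse's `unhex`/`reverse`/`reinterpretAsUInt128` chain.

```python
def round_trip(item_id: int) -> int:
    h = format(item_id, "032x")               # lower(leftPad(hex(x), 32, '0'))
    b = bytes.fromhex(h)                      # unhex: big-endian byte string
    return int.from_bytes(b[::-1], "little")  # reverse + reinterpretAsUInt128
```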
The bug: ClickHouse cannot correctly parse bare 128-bit integer literals. They get interpreted as `Float64`, which has only 52 bits of mantissa — the lower-order bits are lost. If `Float64(item_id_X)` rounds down from the true value:

`item_id_X > Float64(item_id_X)` → `True`

So when the cursor is set to `item_id_X`, the next query's condition is still satisfied by `item_id_X` itself. The same item is returned, the cursor never advances, and pagination loops forever.
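The truncation is easy to reproduce in Python, whose `float` is the same IEEE-754 double as ClickHouse's `Float64`:

```python
x = (1 << 100) + 1  # a >64-bit id whose low-order bit is set
f = float(x)        # nearest representable double is exactly 2**100

# The low-order bit does not fit in the 52-bit mantissa, so it is lost.
# Python compares int and float exactly, so `x > f` mirrors the SQL
# predicate that keeps matching the cursor row itself.
```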
After the fix

`__init__`:

```python
self.last_seen_item_id = last_seen_item_id  # hex string as-is from ClickHouse
```

`to_protobuf`:

```python
value=AttributeValue(val_str=self.last_seen_item_id)  # e.g. "deadbeef0000...cafebabe"
```

`_build_query` — LHS:

```python
column("item_id")  # _process_expressions converts this to lower(leftPad(hex(item_id), 32, '0'))
```

`_build_query` — RHS:

```python
literal(page_token.last_seen_item_id)  # Literal(value="deadbeef0000...cafebabe")
```

The final SQL is:

```sql
lower(leftPad(hex(item_id), 32, '0')) > 'deadbeef0000...cafebabe'
```

Both sides are 32-char zero-padded lowercase hex strings. No `_translate_literal` needed, no 128-bit literal parsing, no float truncation. Lexicographic ordering of fixed-width zero-padded hex strings is equivalent to numeric ordering of the underlying integers, so the comparison is consistent with the `ORDER BY item_id` (raw UInt128). Pagination advances correctly.
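The equivalence of lexicographic and numeric ordering for fixed-width hex strings can be spot-checked directly:

```python
import random

random.seed(0)  # deterministic sample of 128-bit ids
ids = [random.getrandbits(128) for _ in range(1000)]
hexed = [format(i, "032x") for i in ids]  # 32-char zero-padded lowercase hex

# Sorting the hex strings lexicographically yields the same order as
# sorting the underlying integers numerically.
```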