feat(eap): Add wildcard search support for array string attributes#7770

Open
phacops wants to merge 4 commits into master from pierre/eap-array-wildcard-search

@phacops phacops commented Feb 25, 2026

Summary

  • Add TYPE_ARRAY handling in attribute_key_to_expression using DangerousRawSQL to extract string values from the JSON attributes_array column
  • Extend OP_LIKE/OP_NOT_LIKE filter translation to support TYPE_ARRAY keys via arrayExists with a lambda
  • LIKE on arrays: arrayExists(x -> like(x, pattern), array) — true if any element matches
  • NOT_LIKE on arrays: NOT(arrayExists(x -> like(x, pattern), array)) — true if no element matches
  • Both support the ignore_case flag (switches like → ilike)
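
The any-element semantics in the bullets above can be illustrated outside ClickHouse. The following is a minimal Python sketch, not code from this PR: `like_to_regex`, `array_like`, and `array_not_like` are hypothetical names, and the pattern translation covers only the `%` and `_` wildcards (backslash escapes in LIKE patterns are ignored for simplicity).

```python
import re


def like_to_regex(pattern: str, ignore_case: bool = False) -> "re.Pattern[str]":
    """Translate a SQL LIKE pattern into a compiled regex (hypothetical helper).

    Only % (any run of characters) and _ (exactly one character) are handled.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.compile("".join(parts), flags)


def array_like(values: list, pattern: str, ignore_case: bool = False) -> bool:
    """Mirror arrayExists(x -> like(x, pattern), array): any element matches."""
    rx = like_to_regex(pattern, ignore_case)
    return any(rx.fullmatch(v) is not None for v in values)


def array_not_like(values: list, pattern: str, ignore_case: bool = False) -> bool:
    """Mirror NOT(arrayExists(...)): no element matches."""
    return not array_like(values, pattern, ignore_case)
```

One consequence worth noting: under these semantics an empty array never satisfies LIKE and always satisfies NOT_LIKE, because there is no element that could match.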

Test plan

  • Unit tests for attribute_key_to_expression with TYPE_ARRAY (expression shape, alias, backtick escaping)
  • Unit tests for LIKE/NOT_LIKE on array keys (case-sensitive, case-insensitive, NOT variants)
  • Unit tests verifying LIKE/NOT_LIKE on non-string/non-array types still raises errors
  • All pre-commit hooks pass (ruff format, ruff lint, mypy)

🤖 Generated with Claude Code

Agent transcript: https://claudescope.sentry.dev/share/CSlg1g-5P5OFoOJQo-B83K2SA1OibDF7tSh1ralLTgs

Support LIKE/NOT_LIKE operations on TYPE_ARRAY attribute keys in the EAP
RPC filter layer. This enables searching for rows where any element in an
array attribute matches a wildcard pattern (e.g. `%error%`).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/MPMg6cd5C44sTcO2w7G21dFgr2ROlgzLl2IsE6tXT_M
@phacops phacops requested review from a team as code owners February 25, 2026 19:51
alias,
)

if attr_key.type == AttributeKey.Type.TYPE_ARRAY:
Member

There should be no need for this

Contributor Author

Good call — moved the array expression building out of attribute_key_to_expression and into a local helper _build_array_attr_expression in the filter code. The k_expression for TYPE_ARRAY is now resolved directly at the filter site without touching the shared proto function.

Member

@volokluev volokluev left a comment

There should be a test of the API actually doing stuff, not just the query generation

…ession

Address review feedback: instead of adding TYPE_ARRAY handling to the
shared attribute_key_to_expression function, build the array extraction
expression directly in the filter code where it's needed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/pjQ2FphvcZ1pKVc4nPolPe74-FfTxXiVaeKc5q4XyME
Handle TYPE_ARRAY as an early return in comparison filter processing
so unsupported operations (EQUALS, IN, etc.) raise immediately instead
of producing invalid SQL. Add test for this error path.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/hhni4eW0BB_sX1hx1k3ocFkKDVAR5tFkoNu10Z-uERw
Add check that the comparison value is a string before using val_str
for TYPE_ARRAY LIKE/NOT_LIKE operations. Without this, a non-string
value silently defaults to an empty string pattern.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent transcript: https://claudescope.sentry.dev/share/H-XBt-koklI88xXMMmB8rnyAuP-g8fRDyDXx9KD7YI4
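
The two guard commits above (early rejection of unsupported operations on TYPE_ARRAY keys, and the string check on the comparison value) can be sketched together. This is an illustrative sketch only: `Op`, `FilterError`, and `array_filter_plan` are hypothetical stand-ins, not the names or types used in the Snuba filter code.

```python
from enum import Enum


class Op(Enum):
    """Hypothetical stand-in for the comparison operators on a filter."""
    EQUALS = "equals"
    IN = "in"
    LIKE = "like"
    NOT_LIKE = "not_like"


class FilterError(Exception):
    """Hypothetical stand-in for the request-validation error."""


def array_filter_plan(op: Op, value: object, ignore_case: bool = False) -> tuple:
    """Validate an array-key comparison and describe the expression to build.

    Returns (match_function, pattern, negate).
    """
    # Guard 1: reject anything other than LIKE / NOT_LIKE up front,
    # instead of falling through and producing invalid SQL.
    if op not in (Op.LIKE, Op.NOT_LIKE):
        raise FilterError(f"{op.name} is not supported on array attributes")
    # Guard 2: require a string pattern, so a non-string value cannot
    # silently turn into an empty-string pattern.
    if not isinstance(value, str):
        raise FilterError("LIKE/NOT_LIKE on array attributes needs a string value")
    match_function = "ilike" if ignore_case else "like"
    return (match_function, value, op is Op.NOT_LIKE)
```

Checking both conditions before any expression is built keeps the failure mode an explicit error rather than a query that runs with unintended semantics.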

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

def _build_array_attr_expression(attr_name: str) -> DangerousRawSQL:
    """Build a DangerousRawSQL expression that extracts string values from
    the attributes_array JSON column for a given attribute name."""
    safe_name = attr_name.replace("`", "``")

Incorrect identifier escaping in raw SQL construction

High Severity

`_build_array_attr_expression` escapes backticks by doubling them (``replace("`", "``")``), but the codebase's own `escape_identifier` in `snuba/clickhouse/escaping.py` uses backslash-prefix escaping for both backticks and backslashes (`` ` `` → ``\` ``, `\` → `\\`). Because of this inconsistency, backslashes in `attr_name` are not escaped at all. An attribute name containing a backslash followed by content could cause ClickHouse to misparse the backtick-quoted identifier boundary, since ``\` `` would be interpreted as an escaped backtick rather than the closing delimiter. Because this flows through `DangerousRawSQL`, which bypasses all safety checks, this could lead to SQL parsing errors or injection.
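
The difference between the two escaping strategies described in this comment can be seen with a small, self-contained Python sketch. The function names are hypothetical; `escape_with_backslash` follows the comment's description of backslash-prefix escaping and is not a copy of `escape_identifier` from `snuba/clickhouse/escaping.py`.

```python
def escape_by_doubling(name: str) -> str:
    """The approach in the reviewed line: double each backtick, nothing else."""
    return name.replace("`", "``")


def escape_with_backslash(name: str) -> str:
    """Backslash-prefix escaping as described in the comment (assumed behavior).

    Backslashes are escaped first so that the backslashes added in front of
    backticks are not escaped a second time.
    """
    return name.replace("\\", "\\\\").replace("`", "\\`")


def quote(escaped: str) -> str:
    """Wrap an already-escaped name in backticks."""
    return f"`{escaped}`"


# An attribute name that ends with a single backslash.
name = "attr\\"

# Doubling leaves the backslash untouched, so the quoted identifier ends in
# a backslash followed by the closing backtick.
doubled = quote(escape_by_doubling(name))

# Backslash-prefix escaping turns the backslash into two backslashes, so the
# closing backtick is no longer preceded by a lone backslash.
prefixed = quote(escape_with_backslash(name))
```

With doubling, the quoted identifier ends in the two-character sequence backslash plus backtick, which is the sequence the comment says would be read as an escaped backtick instead of the closing delimiter.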
