ref(lw-deletions): Add partition_column to DeletionSettings#7765
Add an optional `partition_column` field to DeletionSettings and the JSON schema. Configure it for eap_items (timestamp) and search_issues (receive_timestamp). This is a no-op change: the field is not yet used by any logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
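As a rough illustration of what an optional field like this looks like, here is a minimal sketch. It is not the actual `DeletionSettings` definition: the fields other than `partition_column` and the table names are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class DeletionSettings:
    # Hypothetical subset of fields, for illustration only.
    is_enabled: int
    tables: Sequence[str]
    # New optional field. Defaulting to None keeps existing configs valid
    # and makes the change a no-op until some logic reads it.
    partition_column: Optional[str] = None


# Hypothetical table names; only the partition_column values come from the PR.
eap_items = DeletionSettings(
    is_enabled=1, tables=["eap_items_local"], partition_column="timestamp"
)
unconfigured = DeletionSettings(is_enabled=1, tables=["other_local"])
```

Storages that do not set the field keep `partition_column` as `None`, so nothing changes for them.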
…rages eap_items and all its downsample tables were configured with partition_format: [date] but the actual ClickHouse partitioning is (retention_days, toMonday(timestamp)). Fix to [retention_days, date] so decode_part_str works correctly for cleanup and optimize. Also add partition_format: [retention_days, date] to search_issues which was missing it entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
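To make the `(retention_days, toMonday(timestamp))` shape concrete, the sketch below parses a two-part partition string. It does not reproduce `decode_part_str`; the string layout `(90,'2024-01-01')` and the helper names are assumptions for illustration.

```python
import re
from datetime import date, timedelta
from typing import Tuple

# Assumed textual form of a (retention_days, date) partition: (90,'2024-01-01')
PARTITION_RE = re.compile(r"^\((\d+),'(\d{4}-\d{2}-\d{2})'\)$")


def to_monday(d: date) -> date:
    """Round a date down to the Monday of its week, like ClickHouse toMonday."""
    return d - timedelta(days=d.weekday())


def parse_partition(part: str) -> Tuple[int, date]:
    """Split an assumed '(retention_days,'YYYY-MM-DD')' string into its parts."""
    match = PARTITION_RE.match(part)
    if match is None:
        raise ValueError(f"unexpected partition format: {part!r}")
    return int(match.group(1)), date.fromisoformat(match.group(2))
```

A parser that expects only `[date]` has no way to account for the leading `retention_days` component, which is why the format needs both parts.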
Cursor Bugbot has reviewed your changes and found 1 potential issue.
snuba/datasets/configuration/issues/storages/search_issues.yaml
The actual ClickHouse partition key is toMonday(client_timestamp), not receive_timestamp. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MeredithAnya
left a comment
I think I might be a little confused, but are you going to be using PARTITION IN partition_expr, or just adding the timestamp to the WHERE condition? It seems like the latter from looking at your other PR, but I wanted to clarify.
I was just planning on doing the latter. Do you think PARTITION IN is a better solution? I guess there's less manual parsing/manipulation.
MeredithAnya
left a comment
I think this is probably the easier change to start with, so you can go ahead and do that and we can iterate on it.
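For comparison, the two options discussed above might produce statements shaped roughly like the ones below. This is a sketch, not the code from either PR: the helper names are hypothetical, the `IN PARTITION` clause is written as it appears in the ClickHouse `DELETE` syntax, and plain string formatting is used only for readability (real code should build queries safely).

```python
from datetime import date


def delete_with_where(table: str, conditions: str, partition_column: str, monday: date) -> str:
    # Option chosen in this thread: narrow the delete with an extra WHERE predicate.
    return (
        f"DELETE FROM {table} WHERE {conditions} "
        f"AND toMonday({partition_column}) = toDate('{monday.isoformat()}')"
    )


def delete_in_partition(table: str, conditions: str, partition_expr: str) -> str:
    # Alternative raised in review: let ClickHouse target the partition directly.
    return f"DELETE FROM {table} IN PARTITION {partition_expr} WHERE {conditions}"


print(delete_with_where("eap_items_local", "project_id = 1", "timestamp", date(2024, 1, 1)))
print(delete_in_partition("eap_items_local", "project_id = 1", "(90, '2024-01-01')"))
```

The WHERE form only needs the partition column name, while the `IN PARTITION` form needs the full partition expression, including `retention_days` for tables partitioned on both.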
Split lightweight deletes into one mutation per Monday partition date,
to reduce load spikes
Changes:
- Adds Redis-based tracking (SET per conditions hash) to prevent
duplicate mutations on consumer restart
- Feature is **off by default**, controlled by runtime config
`lw_deletes_split_by_partition_{storage_name}`
- Falls back to un-split DELETE if no partitions found in `system.parts`
Depends on #7765.
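The flow described in the list above can be sketched as follows. This is a simplified illustration, not the PR's implementation: an in-memory class stands in for Redis (only `sadd`/`sismember` semantics are mimicked), and the key prefix, function names, and hashing scheme are assumptions.

```python
import hashlib
import json
from datetime import date
from typing import Callable, Dict, List, Optional, Sequence, Set


class InMemorySetStore:
    """Stand-in for Redis SET operations so the sketch runs without a server."""

    def __init__(self) -> None:
        self._sets: Dict[str, Set[str]] = {}

    def sismember(self, key: str, member: str) -> bool:
        return member in self._sets.get(key, set())

    def sadd(self, key: str, member: str) -> None:
        self._sets.setdefault(key, set()).add(member)


def conditions_hash(conditions: Dict[str, List[int]]) -> str:
    # Stable hash of the delete conditions; used as part of the tracking key.
    payload = json.dumps(conditions, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


def run_delete(
    store: InMemorySetStore,
    conditions: Dict[str, List[int]],
    partition_dates: Sequence[date],
    split_enabled: bool,
    execute: Callable[[Optional[date]], None],
) -> int:
    """Issue one delete per Monday partition; return how many were issued."""
    if not split_enabled or not partition_dates:
        execute(None)  # fallback: a single un-split DELETE
        return 1
    key = f"lw_delete_done:{conditions_hash(conditions)}"  # hypothetical key name
    issued = 0
    for monday in sorted(set(partition_dates)):
        member = monday.isoformat()
        if store.sismember(key, member):
            continue  # already issued before a restart; skip the duplicate
        execute(monday)
        store.sadd(key, member)
        issued += 1
    return issued
```

Calling `run_delete` twice with the same conditions and partitions issues the per-partition deletes only once, which is the behavior the tracking set is meant to give after a consumer restart.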
### New runtime config
| Key | Default | Purpose |
|-----|---------|---------|
| `lw_deletes_split_by_partition_eap_items` | 0 | Enable partition splitting for eap_items |
| `lw_deletes_split_by_partition_search_issues` | 0 | Enable partition splitting for search_issues |
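A per-storage flag like this could be read along the following lines. The sketch uses a plain mapping in place of the runtime config store, and `split_enabled` is a hypothetical helper name; only the key pattern and the default of 0 come from the table above.

```python
from typing import Mapping


def split_enabled(storage_name: str, runtime_config: Mapping[str, int]) -> bool:
    # A missing key behaves like the documented default of 0 (feature off).
    key = f"lw_deletes_split_by_partition_{storage_name}"
    return bool(runtime_config.get(key, 0))


config = {"lw_deletes_split_by_partition_eap_items": 1}
print(split_enabled("eap_items", config))      # enabled explicitly
print(split_enabled("search_issues", config))  # key absent, so off
```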
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add `partition_column` field to `DeletionSettings` dataclass and JSON schema
- Configure `partition_column: timestamp` for `eap_items` and `partition_column: receive_timestamp` for `search_issues`

This is part 1 of a stacked PR series. Part 2 (#7766) uses this field to split lightweight deletes by partition.