Problem Statement
Over time, the nodestore_node table gets bigger and bigger, and currently there is no procedure to clean it up.
A forum comment by @untitaker explaining what nodestore_node is:
To answer the question about purpose of nodestore:
nodestore stores the raw, unindexed parts of the data of an event. For example all event breadcrumbs and contexts are stored there. Clickhouse only stores the data you can search and aggregate by. It only has indices, so to speak.
nodestore has multiple backends. On sentry.io it is not postgres, but google bigtable. We use the postgres backend for local development and for onpremise, but for large-scale installations it is suboptimal.
We don't have a solution right now for the storage problem in any case, sorry. One thing you should ensure actually runs is sentry cleanup, but it's definitely possible that nodestore takes this amount of disk space.
https://forum.sentry.io/t/postgres-nodestore-node-table-124gb/12753/3
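Before cleaning anything up, it is worth confirming that nodestore_node is actually what is consuming the disk. A minimal check, assuming the default Postgres nodestore backend used by on-premise installs:

```sql
-- Total on-disk size of nodestore_node, including indexes and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('public.nodestore_node'));
```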
Solution Brainstorm
There was an idea suggested on the forum which worked for me, but I lost all event details.
Something like this would work:

```sql
-- Purge raw event payloads older than the retention window
DELETE FROM public.nodestore_node WHERE "timestamp" < NOW() - INTERVAL '1 day';
-- Reclaim the freed disk space; note that VACUUM FULL rewrites the
-- whole table and holds an ACCESS EXCLUSIVE lock while it runs
VACUUM FULL public.nodestore_node;
```

Change the '1 day' interval according to your needs.
Maybe put this in a cron container which runs every night. We should think about its performance, though: the delete took a long time to execute on our instance, maybe because it had never been run before, but I'm not sure. Deleting in batches (see the sketch below) would keep each transaction short.
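A minimal sketch of such a batched cleanup, assuming the stock nodestore_node schema where id is the primary key (an assumption; verify against your schema first). A nightly job could run this repeatedly until no rows are deleted:

```sql
-- Hypothetical batched cleanup: delete old rows in small chunks so
-- each transaction stays short. Assumes "id" is the primary key.
DELETE FROM public.nodestore_node
WHERE id IN (
    SELECT id
    FROM public.nodestore_node
    WHERE "timestamp" < NOW() - INTERVAL '1 day'
    LIMIT 10000
);
-- Plain VACUUM marks the freed space as reusable without taking the
-- exclusive lock that VACUUM FULL requires.
VACUUM public.nodestore_node;
```

For a recurring job, plain VACUUM should be sufficient; VACUUM FULL is mainly worth running as a one-off compaction after the first large purge.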