You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I self hosted trigger for a long time. I have t many runs stored in my db so I recently decided to clean the older ones. But my script seems to took a very long time to only delete 1500 runs. When I tried to describe the query plan to delete one row it appears to me that two relations where missing some key indexes.
Find slow foreign keys
postgres=# begin;
explain (analyze,buffers,timing)
deletefrom"TaskRun"where id='cmawag7nafpp8o0281yebzmdz';
rollback;
BEGIN
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------Deleteon"TaskRun" (cost=0.42..8.44 rows=0 width=0) (actual time=3.989..3.990 rows=0 loops=1)
Buffers: shared hit=7 read=5 dirtied=1-> Index Scan using "TaskRun_pkey"on"TaskRun" (cost=0.42..8.44 rows=1 width=6) (actual time=1.936..1.938 rows=1 loops=1)
Index Cond: (id ='cmawag7nafpp8o0281yebzmdz'::text)
Buffers: shared hit=2 read=2
Planning Time: 0.136 ms
Trigger for constraint TaskRunAttempt_taskRunId_fkey on TaskRun: time=2.418 calls=1
Trigger for constraint TaskRunDependency_taskRunId_fkey on TaskRun: time=2.381 calls=1
Trigger for constraint BatchTaskRunItem_taskRunId_fkey on TaskRun: time=2.630 calls=1
Trigger for constraint Checkpoint_runId_fkey on TaskRun: time=0.861 calls=1
Trigger for constraint CheckpointRestoreEvent_runId_fkey on TaskRun: time=0.742 calls=1
Trigger for constraint BulkActionItem_sourceRunId_fkey on TaskRun: time=0.032 calls=1
Trigger for constraint BulkActionItem_destinationRunId_fkey on TaskRun: time=0.015 calls=1
Trigger for constraint _TaskRunToTaskRunTag_A_fkey on TaskRun: time=2.354 calls=1
Trigger for constraint ProjectAlert_taskRunId_fkey on TaskRun: time=0.015 calls=1
Trigger for constraint TaskRun_rootTaskRunId_fkey on TaskRun: time=1.790 calls=1
Trigger for constraint TaskRun_parentTaskRunId_fkey on TaskRun: time=1.418 calls=1
Trigger for constraint TaskRunDependency_dependentAttemptId_fkey on TaskRunAttempt: time=0.022 calls=1
Trigger for constraint BatchTaskRun_dependentTaskAttemptId_fkey on TaskRunAttempt: time=0.638 calls=1
Trigger for constraint Checkpoint_attemptId_fkey on TaskRunAttempt: time=0.814 calls=1
Trigger for constraint BatchTaskRunItem_taskRunAttemptId_fkey on TaskRunAttempt: time=1.489 calls=1
Trigger for constraint CheckpointRestoreEvent_attemptId_fkey on TaskRunAttempt: time=40.039 calls=1
Trigger for constraint ProjectAlert_taskRunAttemptId_fkey on TaskRunAttempt: time=0.033 calls=1
Trigger for constraint TaskRun_parentTaskRunAttemptId_fkey on TaskRunAttempt: time=336.831 calls=1
Execution Time: 398.553 ms
(25 rows)
ROLLBACK
Fixing the indexes
postgres=# -- does not block writers while it buildsCREATEINDEXCONCURRENTLY IF NOT EXISTS
"TaskRun_parentTaskRunAttemptId_idx"ON"TaskRun" ("parentTaskRunAttemptId");
CREATEINDEXCONCURRENTLY IF NOT EXISTS
"CheckpointRestoreEvent_attemptId_idx"ON"CheckpointRestoreEvent" ("attemptId");
CREATE INDEX
CREATE INDEX
postgres=# begin;
explain (analyze,buffers,timing)
deletefrom"TaskRun"where id='cmawag7nafpp8o0281yebzmdz';
rollback;
BEGIN
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------Deleteon"TaskRun" (cost=0.42..8.44 rows=0 width=0) (actual time=0.132..0.133 rows=0 loops=1)
Buffers: shared hit=11 dirtied=1-> Index Scan using "TaskRun_pkey"on"TaskRun" (cost=0.42..8.44 rows=1 width=6) (actual time=0.046..0.047 rows=1 loops=1)
Index Cond: (id ='cmawag7nafpp8o0281yebzmdz'::text)
Buffers: shared hit=4
Planning:
Buffers: shared hit=17 read=1
Planning Time: 0.301 ms
Trigger for constraint TaskRunAttempt_taskRunId_fkey on TaskRun: time=0.134 calls=1
Trigger for constraint TaskRunDependency_taskRunId_fkey on TaskRun: time=0.092 calls=1
Trigger for constraint BatchTaskRunItem_taskRunId_fkey on TaskRun: time=0.087 calls=1
Trigger for constraint Checkpoint_runId_fkey on TaskRun: time=0.138 calls=1
Trigger for constraint CheckpointRestoreEvent_runId_fkey on TaskRun: time=0.379 calls=1
Trigger for constraint BulkActionItem_sourceRunId_fkey on TaskRun: time=0.109 calls=1
Trigger for constraint BulkActionItem_destinationRunId_fkey on TaskRun: time=0.061 calls=1
Trigger for constraint _TaskRunToTaskRunTag_A_fkey on TaskRun: time=0.079 calls=1
Trigger for constraint ProjectAlert_taskRunId_fkey on TaskRun: time=0.050 calls=1
Trigger for constraint TaskRun_rootTaskRunId_fkey on TaskRun: time=0.189 calls=1
Trigger for constraint TaskRun_parentTaskRunId_fkey on TaskRun: time=0.245 calls=1
Trigger for constraint TaskRunDependency_dependentAttemptId_fkey on TaskRunAttempt: time=0.067 calls=1
Trigger for constraint BatchTaskRun_dependentTaskAttemptId_fkey on TaskRunAttempt: time=0.081 calls=1
Trigger for constraint Checkpoint_attemptId_fkey on TaskRunAttempt: time=0.059 calls=1
Trigger for constraint BatchTaskRunItem_taskRunAttemptId_fkey on TaskRunAttempt: time=0.063 calls=1
Trigger for constraint CheckpointRestoreEvent_attemptId_fkey on TaskRunAttempt: time=0.135 calls=1
Trigger for constraint ProjectAlert_taskRunAttemptId_fkey on TaskRunAttempt: time=0.043 calls=1
Trigger for constraint TaskRun_parentTaskRunAttemptId_fkey on TaskRunAttempt: time=0.154 calls=1
Execution Time: 2.353 ms
(27 rows)
ROLLBACK
It could be usefull to add those by default.
If you are okay with that I could maybe create a pr for this.
@rharkor unfortunately we won’t be able to merge this into main. Each index slows insert performance and we already have too many on this table (we’ll be removing some soon when we make Clickhouse the default way we show run lists and metrics).
The real solution for this is to get rid of foreign key constraints completely for TaskRun then there won’t be any cascades on delete. That requires some changes to the application logic to deal with nulls but would be better.
Provide environment information
System:
OS: Linux 6.8 Ubuntu 24.04.2 LTS 24.04.2 LTS (Noble Numbat)
CPU: (4) x64 AMD EPYC-Milan Processor
Memory: 10.10 GB / 15.24 GB
Container: Yes
Shell: 5.2.21 - /bin/bash
Binaries:
Node: 18.19.1 - /usr/bin/node
npm: 9.2.0 - /usr/bin/npm
Describe the bug
I self hosted trigger for a long time. I have t many runs stored in my db so I recently decided to clean the older ones. But my script seems to took a very long time to only delete 1500 runs. When I tried to describe the query plan to delete one row it appears to me that two relations where missing some key indexes.
Find slow foreign keys
Fixing the indexes
It could be usefull to add those by default.
If you are okay with that I could maybe create a pr for this.
Thanks
Reproduction repo
https://trigger.dev/docs/open-source-self-hosting#self-hosting
To reproduce
Just selfhost trigger and try having more than >500k task run in your db.
Additional information
No response
The text was updated successfully, but these errors were encountered: