Commit 3d683c2

Update on-call log with incidents for 2025-11-17
Added on-call log entries for the week of 2025-11-17, detailing incidents and actions taken.
1 parent 85f9239 commit 3d683c2

1 file changed: +17 -0 lines changed

engineering/on-call-log.mdx

Lines changed: 17 additions & 0 deletions
@@ -10,6 +10,23 @@ When everything is running smoothly, the on-call engineer can handle maintenance
The goal of this rolling log is to ease handover between on-call for unresolved issues, and keep a log of what's been handled recently.

## Week of 2025-11-17

On-call: Pieter Beulque

### Incidents

#### 2025-11-22

We had a jump scare on Saturday from the undelivered webhooks monitor, but it was a false alarm: the queue grew to 17 undelivered webhooks, then gradually drained again.
I'm not sure whether there was an actual root cause or just a traffic spike. We can reconsider the monitor's threshold, because it was arbitrarily set to 10 back in the day.

#### 2025-11-17

Our worker ran out of memory because of a huge spike in subscription renewals. We have a merchant that uses daily subscriptions for their use case. This in turn triggered a lot of invoices being generated concurrently, and the unique index on the invoice number caused the database to lock.

As an immediate mitigation, we moved the merchant to the new customer-based invoice indexing. To reduce the risk going forward, we also made it the default for new organizations. The same issue may still happen for existing organizations that see sudden spikes in orders, but the same fix can be applied to them as well.
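
To illustrate why scoping invoice numbers per customer relieves the contention, here is a minimal Python sketch. It assumes "customer-based invoice indexing" means each customer gets its own invoice number sequence instead of sharing one per organization. The names `counter_key` and `max_contention` and the 500-renewal spike are hypothetical and not taken from our codebase.

```python
from collections import Counter


def counter_key(org_id: str, customer_id: str, per_customer: bool) -> tuple[str, ...]:
    """Return the scope that owns the invoice number sequence (hypothetical model)."""
    if per_customer:
        return (org_id, customer_id)
    return (org_id,)


def max_contention(renewals: list[tuple[str, str]], per_customer: bool) -> int:
    """Largest number of invoices that need the next number from the same sequence."""
    per_sequence = Counter(
        counter_key(org_id, customer_id, per_customer)
        for org_id, customer_id in renewals
    )
    return max(per_sequence.values())


# Hypothetical spike: one merchant, 500 different customers renewing at once.
renewals = [("org_1", f"cus_{i}") for i in range(500)]

print(max_contention(renewals, per_customer=False))  # 500
print(max_contention(renewals, per_customer=True))   # 1
```

With one shared sequence, every concurrent invoice competes for the same next number. With per-customer sequences, a renewal spike across many customers spreads out, although a single customer with many simultaneous orders would still contend on its own sequence.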

## Week of 2025-11-10

On-call: Jesper Bränn
