Open
Labels
bug: Something isn't working
Description
Problem description
We're running a standard deployment of eoapi-k8s for IFRC Montandon with these configuration values:
ingress:
  host: "montandon-eoapi-stage.ifrc.org"
  tls:
    enabled: false
pgstacBootstrap:
  settings:
    envVars:
      LOAD_FIXTURES: "0"
      RUN_FOREVER: "1"
postgrescluster:
  instances:
  - name: eoapi
    replicas: 1
    dataVolumeClaimSpec:
      accessModes:
      - "ReadWriteOnce"
      resources:
        requests:
          storage: "500Gi"
          cpu: "1024m"
          memory: "3048Mi"
We've hit a snag where the database service eventually runs out of disk space after ingesting data for a while. When this happens, it fails its health checks and becomes unresponsive, which brings down the raster service with it.
Looking into our staging instance, we found that the PostgreSQL WAL (Write-Ahead Log) directory is the culprit - it's eating up almost all of our storage. Here's what we're seeing:
bash-4.4$ df -h /pgdata/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdd        493G  493G     0 100% /pgdata
bash-4.4$ du -sh /pgdata/*
16K     /pgdata/lost+found
116M    /pgdata/pg16
492G    /pgdata/pg16_wal
12K     /pgdata/pgbackrest
At 492 GB, the WAL directory is more than 4,000 times the size of the 116 MB data directory.
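To put a number on that imbalance, here is a minimal Python sketch that parses the `du -sh` output shown above and computes the WAL-to-data ratio. The helper names (`parse_size`, `parse_du`) are made up for illustration, and the sketch assumes the suffixes are binary units (powers of 1024), which is how `du -h` reports sizes.

```python
import re

# Binary multipliers, matching how `du -h` prints sizes (powers of 1024).
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> int:
    """Convert a human-readable size such as '492G' or '12K' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMGT]?)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])


def parse_du(output: str) -> dict[str, int]:
    """Map each path in `du -sh` output to its size in bytes."""
    sizes = {}
    for line in output.strip().splitlines():
        size, path = line.split(maxsplit=1)
        sizes[path] = parse_size(size)
    return sizes


# The `du -sh /pgdata/*` output copied from this report.
DU_OUTPUT = """\
16K /pgdata/lost+found
116M /pgdata/pg16
492G /pgdata/pg16_wal
12K /pgdata/pgbackrest
"""

sizes = parse_du(DU_OUTPUT)
ratio = sizes["/pgdata/pg16_wal"] / sizes["/pgdata/pg16"]
print(f"WAL is about {ratio:,.0f}x the size of the data directory")
```

With the numbers from this report, the ratio comes out at roughly 4,300, which suggests that WAL segments are being retained rather than recycled.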
Expected Output
Ideally, the WAL directory should stay under a defined usage limit and should not cause downtime.
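As a rough illustration of what such a usage limit could look like, the sketch below measures a WAL directory and reports whether it exceeds a given fraction of its filesystem. It is an assumption-laden example, not part of eoapi-k8s or the Crunchy operator: the function names and the 0.5 threshold are hypothetical, and the path is the one from the `du` output in this report.

```python
import os
import shutil


def dir_size_bytes(path: str) -> int:
    """Sum the apparent sizes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                if not os.path.islink(file_path):
                    total += os.path.getsize(file_path)
            except OSError:
                # A WAL segment may be removed or recycled while we scan.
                continue
    return total


def wal_over_limit(wal_dir: str, limit_fraction: float) -> bool:
    """Return True if `wal_dir` uses more than `limit_fraction` of its filesystem."""
    capacity = shutil.disk_usage(wal_dir).total
    return dir_size_bytes(wal_dir) / capacity > limit_fraction


if __name__ == "__main__":
    WAL_DIR = "/pgdata/pg16_wal"  # path from the `du` output in this report
    LIMIT = 0.5                   # hypothetical threshold, not a recommended value
    if os.path.isdir(WAL_DIR):
        print("WAL over limit:", wal_over_limit(WAL_DIR, LIMIT))
```

A check like this only detects the problem. Keeping WAL bounded would still depend on PostgreSQL being able to recycle segments, which it does not do while they are waiting to be archived or are still needed by a replication slot.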
Environment Information
Crunchy Postgres Operator: v5.5.2
eoapi-k8s: v0.5.0
jankovicgd