Description
The pyroscope_compaction_size_bytes histogram has bucket boundaries that are far too small to capture actual compacted block sizes, making the metric effectively useless.
Current bucket definition
https://github.com/grafana/pyroscope/blob/main/pkg/compactor/bucket_compactor.go#L290-L294
m.Size = prometheus.NewHistogramVec(prometheus.HistogramOpts{
Name: "pyroscope_compaction_size_bytes",
Help: "Final block size after compaction by level",
Buckets: prometheus.ExponentialBuckets(32, 1.5, 12),
}, []string{"level"})
This generates buckets: 32, 48, 72, 108, 162, 243, 364, 546, 820, 1230, 1845, 2768 — maxing out at ~2.7 KB.
Problem
Compacted blocks are typically in the MB to GB range, so every observation lands in the +Inf bucket. When a quantile falls in the highest bucket, histogram_quantile() returns the upper bound of the last finite bucket, so the query below reports a flat ~2.7 KB regardless of actual block sizes.
# Always returns ~2.7 KB regardless of actual compaction sizes
histogram_quantile(0.95, sum(rate(pyroscope_compaction_size_bytes_bucket{cluster="$cluster"}[15m])) by (le))
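The flat result follows from how histogram_quantile() handles the highest bucket. The Go sketch below is a simplified, stdlib-only reimplementation of that rule for illustration. The `bucket` type and the `quantile` and `allInOverflow` helpers are hypothetical names, not Prometheus or Pyroscope APIs, and the observation count of 100 is made up.

```go
package main

import (
	"fmt"
	"math"
)

// bucket is one cumulative histogram bucket, as exposed by a _bucket series.
type bucket struct {
	upperBound float64 // value of the "le" label; +Inf for the last bucket
	count      float64 // cumulative count of observations <= upperBound
}

// quantile is a simplified sketch of the histogram_quantile() rule:
// interpolate linearly inside the bucket that contains the rank, but if that
// bucket is +Inf, return the upper bound of the last finite bucket.
func quantile(q float64, buckets []bucket) float64 {
	total := buckets[len(buckets)-1].count
	rank := q * total
	for i, b := range buckets {
		if b.count < rank {
			continue
		}
		if math.IsInf(b.upperBound, 1) {
			if i == 0 {
				return math.NaN() // no finite bucket to fall back on
			}
			return buckets[i-1].upperBound
		}
		lower, prevCount := 0.0, 0.0
		if i > 0 {
			lower, prevCount = buckets[i-1].upperBound, buckets[i-1].count
		}
		return lower + (b.upperBound-lower)*(rank-prevCount)/(b.count-prevCount)
	}
	return math.NaN()
}

// allInOverflow builds a histogram in which all n observations exceeded
// every finite bound, so only the +Inf bucket has a non-zero count.
func allInOverflow(bounds []float64, n float64) []bucket {
	buckets := make([]bucket, 0, len(bounds)+1)
	for _, b := range bounds {
		buckets = append(buckets, bucket{upperBound: b, count: 0})
	}
	return append(buckets, bucket{upperBound: math.Inf(1), count: n})
}

func main() {
	// Current layout: 32 * 1.5^i for i = 0..11.
	bounds := make([]float64, 12)
	v := 32.0
	for i := range bounds {
		bounds[i] = v
		v *= 1.5
	}
	// 100 hypothetical compactions, all larger than the last finite bound.
	p95 := quantile(0.95, allInOverflow(bounds, 100))
	fmt.Printf("p95 = %.0f bytes\n", p95)
}
```

With the current layout the reported p95 is the last finite bound, about 2768 bytes, whatever the real block sizes were.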
Suggested fix
Use buckets that cover the realistic range of compacted block sizes (MB to GB):
Buckets: prometheus.ExponentialBuckets(1<<20, 2, 15),
// 1MB, 2MB, 4MB, 8MB, 16MB, 32MB, 64MB, 128MB, 256MB, 512MB, 1GB, 2GB, 4GB, 8GB, 16GB
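As a rough check of the two layouts, the following Go sketch computes both sets of boundaries and reports which finite bucket, if any, would hold a few sample block sizes. `exponentialBuckets` is a stdlib-only stand-in for prometheus.ExponentialBuckets, `finiteBucketFor` is a hypothetical helper, and the sample sizes are illustrative assumptions rather than measurements.

```go
package main

import "fmt"

// exponentialBuckets returns count upper bounds: start, start*factor, ...
// It stands in for prometheus.ExponentialBuckets so the sketch needs no
// external dependencies.
func exponentialBuckets(start, factor float64, count int) []float64 {
	buckets := make([]float64, count)
	for i := range buckets {
		buckets[i] = start
		start *= factor
	}
	return buckets
}

// finiteBucketFor returns the smallest upper bound that holds size.
// ok is false when the observation would only be counted in +Inf.
func finiteBucketFor(buckets []float64, size float64) (bound float64, ok bool) {
	for _, b := range buckets {
		if size <= b {
			return b, true
		}
	}
	return 0, false
}

func main() {
	current := exponentialBuckets(32, 1.5, 12)
	proposed := exponentialBuckets(1<<20, 2, 15)

	// Illustrative block sizes: 5 MiB, 300 MiB, 6 GiB.
	sizes := []float64{5 << 20, 300 << 20, 6 << 30}
	for _, s := range sizes {
		_, inCurrent := finiteBucketFor(current, s)
		bound, inProposed := finiteBucketFor(proposed, s)
		fmt.Printf("size=%.0f current=%v proposed=%v (le=%.0f)\n",
			s, inCurrent, inProposed, bound)
	}
}
```

Under these assumptions, none of the sample sizes fits a finite bucket in the current layout, while all of them do in the proposed one.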
Alternatively, this could be made configurable so operators can tune the bucket boundaries to match their workload characteristics.