persistenceMaxQPS=0 breaks queue reader host-level rate limiter, causing unthrottled error log spam #9599

@mykaul

Description

I'm trying to squeeze as much performance as I can out of my poor laptop running Omes against Temporal with ScyllaDB / Cassandra, and I'm unsure where the bottleneck is. Among other things, I noticed this issue (well, the AI and I noticed it). Here's the AI's description, which I think is reasonable:

Setting history.persistenceMaxQPS: 0 in dynamic config (intended to mean "unlimited") causes all queue reader host-level rate limiters to be created with rate=0, burst=0. This results in every loadAndSubmitTasks call failing its Wait() and logging an unthrottled error with full stacktrace.

Root cause:
NewHostRateLimiterRateFn in service/history/queue_factory_base.go:224-233 falls back to persistenceMaxRPS() * ratio when MaxPollHostRPS=0. If persistenceMaxQPS is also 0, the effective rate becomes 0 * 0.3 = 0, creating a rate limiter with burst=0. Go's rate.Limiter.ReserveN(now, 1) returns OK()=false when burst < tokens, triggering the error path at service/history/queues/reader.go:433.
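
The arithmetic behind this fallback can be sketched in isolation. This is a minimal model, not the Temporal code: hostRate and allows are hypothetical names, the 0.3 ratio is taken from the description above, and deriving the burst by truncating the rate is an assumption based on the reported rate=0, burst=0.

```go
package main

import "fmt"

// hostRate models the fallback described in the report: an explicit
// host-level poll rate wins, otherwise the rate is derived from the
// persistence limit. Hypothetical stand-in, not the Temporal signature.
func hostRate(maxPollHostRPS, persistenceMaxRPS int, ratio float64) float64 {
	if maxPollHostRPS > 0 {
		return float64(maxPollHostRPS)
	}
	return float64(persistenceMaxRPS) * ratio
}

// allows models only the burst check: a request for n tokens can never
// be granted when n exceeds the burst size.
func allows(burst, n int) bool {
	return n <= burst
}

func main() {
	rate := hostRate(0, 0, 0.3) // both settings are 0
	burst := int(rate)          // assumption: burst is derived from the rate
	fmt.Println(rate, burst, allows(burst, 1))
}
```

With both settings at 0, the derived rate and burst are 0, so a request for a single token can never be granted and every wait fails.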

Impact:

  • 317K error log lines in a 5-minute run (99.2% of all server log output)
  • Each log line includes a full JSON-serialized stacktrace
  • Affects all queue processors: transfer (106K), timer (105K), archival (104K), visibility (22K), outbound (2K)
  • All 128 shards affected (~2500 errors per shard)
  • Significant CPU and I/O overhead from log serialization

Secondary issue:
The error log at reader.go:433 has no rate limiting despite being in a hot loop. Even when triggered legitimately, it should use a throttled logger.

Suggested fixes:

  1. In NewHostRateLimiterRateFn, handle persistenceMaxRPS() <= 0 by using the default value (9000) or returning math.MaxFloat64
  2. Add rate limiting to the error log at reader.go:433

Reproduction:
Set history.persistenceMaxQPS: 0 in dynamic config, start the server with 128 shards, and observe the log output.

I can of course work on fixing this / these.
