feat(kafka): expose more kafka_franz parameters #4028
dyurchanka wants to merge 2 commits into redpanda-data:main
3e8e30e to e9b5df1
Replaced the commit with the suggested semantic one. Removed the redpanda plugin changes from this PR and edited the PR description.
Ah, some other changes were pushed. Give me a sec, I will fix that.
e9b5df1 to c915715
…max_in_flight_requests, record_retries, record_delivery_timeout
c915715 to f977a0b
It should be OK for review now. Sorry.
So I'm confused. It is approved, but do I need to push changes based on your last comment?
@dyurchanka all good from my end, I'd like to get others to review that before merging |
Hi, I would like to introduce several changes to the kafka_franz output (shared with the redpanda output) to expose more Go client parameters in redpanda-connect. My main use case is fixing low throughput of the Kafka output (I tried them all: sarama, franz, redpanda). I settled on kafka_franz for now (I know it is deprecated, but redpanda actually uses franz under the hood). The main issue is the default limit of 10k buffered records in kafka_franz, which caps throughput over high-latency networks. There is also a minor concurrency bug in the redpanda output. The default values for all new parameters are the same as before (equal to the kafka_franz defaults).
1: Add configurable acks field to redpanda/kafka_franz output

File: internal/impl/kafka/franz_writer.go

Problem: The redpanda and kafka_franz outputs hardcode acks=all (all ISR replicas) with no way to configure it. When idempotent_write: false, users may want acks=leader (acks=1) for higher throughput at the cost of durability, or acks=none (acks=0) for fire-and-forget scenarios. Previously, the franz-go default of acks=all was always used regardless of the idempotent_write setting.

Fix: Add a new acks field to FranzProducerFields() with three options:
- all
- leader (requires idempotent_write: false)
- none (requires idempotent_write: false)

Validation: If idempotent_write: true and acks is set to anything other than all, the config fails with an error at both lint time and runtime. This matches the Kafka protocol requirement that idempotent producers must use acks=all.
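The validation rule and the mapping onto franz-go ack options can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than the PR's actual code: `resolveAcks` and `ackMode` are hypothetical names, and the strings only name the `kgo` options (`kgo.AllISRAcks()`, `kgo.LeaderAck()`, `kgo.NoAck()`) that a real implementation would pass to the client.

```go
package main

import "fmt"

// ackMode names the franz-go option a given acks value would select.
// Strings stand in for the real kgo options so the sketch needs no dependencies.
type ackMode string

const (
	ackAll    ackMode = "kgo.AllISRAcks()"
	ackLeader ackMode = "kgo.LeaderAck()"
	ackNone   ackMode = "kgo.NoAck()"
)

// resolveAcks (hypothetical helper) applies the validation rule from the
// description and maps the config value to an ack mode.
func resolveAcks(idempotentWrite bool, acks string) (ackMode, error) {
	if acks == "" {
		acks = "all" // default keeps the previous implicit behavior
	}
	// Idempotent producers must use acks=all.
	if idempotentWrite && acks != "all" {
		return "", fmt.Errorf("acks must be \"all\" when idempotent_write is true, got %q", acks)
	}
	switch acks {
	case "all":
		return ackAll, nil
	case "leader":
		return ackLeader, nil
	case "none":
		return ackNone, nil
	default:
		return "", fmt.Errorf("unknown acks value %q", acks)
	}
}

func main() {
	cases := []struct {
		idempotent bool
		acks       string
	}{
		{true, "all"},
		{false, "leader"},
		{true, "none"}, // rejected: idempotent writes require acks=all
	}
	for _, c := range cases {
		mode, err := resolveAcks(c.idempotent, c.acks)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("idempotent_write=%v acks=%s -> %s\n", c.idempotent, c.acks, mode)
	}
}
```

Failing early on the invalid combination keeps the error at config time instead of surfacing later when the client is constructed.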
Changes:
- kfwFieldAcks constant and StringAnnotatedEnumField with all, leader, none options (default: all)
- FranzProducerOptsFromConfig: error if idempotent_write: true && acks != "all"
- Acks mapped to kgo.AllISRAcks(), kgo.LeaderAck(), kgo.NoAck()
- Lint rule this.idempotent_write == true && this.acks.or("all") != "all" for early config validation

Backward compatible: Default is all, matching the previous implicit behavior.

2: Expose producer buffer and in-flight tuning fields
File: internal/impl/kafka/franz_writer.go

Problem: The franz-go producer has several critical tuning parameters hardcoded to defaults and not exposed in the redpanda/kafka_franz output config:
- MaxBufferedRecords: Produce() blocks when the limit is hit. In high-throughput pipelines, records arrive faster than they drain, causing the caller goroutine to stall.
- MaxBufferedBytes
- MaxProduceRequestsInflightPerBroker
- RecordRetries: WriteBatch hangs because callbacks never fire, and the process becomes a zombie: alive but non-functional.
- RecordDeliveryTimeout: corresponds to delivery.timeout.ms.

Fix: Add five new fields to FranzProducerLimitsFields(): max_buffered_records, max_buffered_bytes, max_in_flight_requests, record_retries, record_delivery_timeout.

Validation:
- max_buffered_records must be >= 1
- max_buffered_bytes accepts human-readable sizes ("256MB", "50mib"); 0 disables the limit
- max_in_flight_requests must be >= 1; with idempotent_write: true, franz-go internally caps it at 5 (Kafka v1+) or 1 (Kafka < v1)
- record_retries: 0 = unlimited (default), > 0 = fail after N retries
- record_delivery_timeout: 0s = unlimited (default), > 0 = fail after the given duration
- With idempotent_write: true, both retry/timeout options are only enforced when safe (no invalid sequence numbers)

Thanks
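For reviewers, the validation rules for the five tuning fields in section 2 can be sketched as below. This is an illustration under stated assumptions, not the PR's code: `producerLimits`, `validate`, and `plannedOptions` are hypothetical names, rejecting negative values is an assumption (the description only defines 0 and positive values), and the option names are printed as strings instead of calling `kgo.MaxBufferedRecords`, `kgo.MaxBufferedBytes`, `kgo.MaxProduceRequestsInflightPerBroker`, `kgo.RecordRetries`, and `kgo.RecordDeliveryTimeout` directly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// producerLimits (hypothetical) mirrors the five new config fields.
type producerLimits struct {
	MaxBufferedRecords    int           // max_buffered_records, must be >= 1
	MaxBufferedBytes      int64         // max_buffered_bytes, 0 disables the limit
	MaxInFlightRequests   int           // max_in_flight_requests, must be >= 1
	RecordRetries         int           // record_retries, 0 = unlimited
	RecordDeliveryTimeout time.Duration // record_delivery_timeout, 0 = unlimited
}

// validate enforces the rules listed under "Validation".
// Rejecting negative values is an assumption of this sketch.
func (l producerLimits) validate() error {
	switch {
	case l.MaxBufferedRecords < 1:
		return errors.New("max_buffered_records must be >= 1")
	case l.MaxBufferedBytes < 0:
		return errors.New("max_buffered_bytes must be >= 0")
	case l.MaxInFlightRequests < 1:
		return errors.New("max_in_flight_requests must be >= 1")
	case l.RecordRetries < 0:
		return errors.New("record_retries must be >= 0")
	case l.RecordDeliveryTimeout < 0:
		return errors.New("record_delivery_timeout must be >= 0")
	}
	return nil
}

// plannedOptions names the franz-go options that would be applied.
// Zero-valued "unlimited" or "disabled" fields add no option, so the
// client keeps its own defaults for them.
func (l producerLimits) plannedOptions() []string {
	opts := []string{
		fmt.Sprintf("kgo.MaxBufferedRecords(%d)", l.MaxBufferedRecords),
		fmt.Sprintf("kgo.MaxProduceRequestsInflightPerBroker(%d)", l.MaxInFlightRequests),
	}
	if l.MaxBufferedBytes > 0 {
		opts = append(opts, fmt.Sprintf("kgo.MaxBufferedBytes(%d)", l.MaxBufferedBytes))
	}
	if l.RecordRetries > 0 {
		opts = append(opts, fmt.Sprintf("kgo.RecordRetries(%d)", l.RecordRetries))
	}
	if l.RecordDeliveryTimeout > 0 {
		opts = append(opts, fmt.Sprintf("kgo.RecordDeliveryTimeout(%s)", l.RecordDeliveryTimeout))
	}
	return opts
}

func main() {
	// Example values are illustrative only, not recommended settings.
	limits := producerLimits{
		MaxBufferedRecords:    100000,
		MaxBufferedBytes:      256 << 20,
		MaxInFlightRequests:   5,
		RecordRetries:         10,
		RecordDeliveryTimeout: 30 * time.Second,
	}
	if err := limits.validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	for _, o := range limits.plannedOptions() {
		fmt.Println(o)
	}
}
```

Skipping the option entirely when a field is 0 is one way to keep the previous behavior for users who do not set the new fields.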