Commit e387e9b

Expand KafkaProducer docstring
1 parent 75ae411 commit e387e9b

File tree

1 file changed (+43, -2 lines)

kafka/producer/kafka.py

Lines changed: 43 additions & 2 deletions
@@ -37,8 +37,8 @@ class KafkaProducer(object):
     The producer is thread safe and sharing a single producer instance across
     threads will generally be faster than having multiple instances.
 
-    The producer consists of a pool of buffer space that holds records that
-    haven't yet been transmitted to the server as well as a background I/O
+    The producer consists of a RecordAccumulator which holds records that
+    haven't yet been transmitted to the server, and a Sender background I/O
     thread that is responsible for turning these records into requests and
     transmitting them to the cluster.
 
@@ -77,6 +77,47 @@ class KafkaProducer(object):
     The key_serializer and value_serializer instruct how to turn the key and
     value objects the user provides into bytes.
 
+    From Kafka 0.11, the KafkaProducer supports two additional modes:
+    the idempotent producer and the transactional producer.
+    The idempotent producer strengthens Kafka's delivery semantics from
+    at least once to exactly once delivery. In particular, producer retries
+    will no longer introduce duplicates. The transactional producer allows an
+    application to send messages to multiple partitions (and topics!)
+    atomically.
+
+    To enable idempotence, the `enable_idempotence` configuration must be set
+    to True. If set, the `retries` config will default to `float('inf')` and
+    the `acks` config will default to 'all'. There are no API changes for the
+    idempotent producer, so existing applications will not need to be modified
+    to take advantage of this feature.
+
+    To take advantage of the idempotent producer, it is imperative to avoid
+    application level re-sends since these cannot be de-duplicated. As such, if
+    an application enables idempotence, it is recommended to leave the
+    `retries` config unset, as it will be defaulted to `float('inf')`.
+    Additionally, if a :meth:`~kafka.KafkaProducer.send` returns an error even
+    with infinite retries (for instance if the message expires in the buffer
+    before being sent), then it is recommended to shut down the producer and
+    check the contents of the last produced message to ensure that it is not
+    duplicated. Finally, the producer can only guarantee idempotence for
+    messages sent within a single session.
+
+    To use the transactional producer and the attendant APIs, you must set the
+    `transactional_id` configuration property. If the `transactional_id` is
+    set, idempotence is automatically enabled along with the producer configs
+    which idempotence depends on. Further, topics which are included in
+    transactions should be configured for durability. In particular, the
+    `replication.factor` should be at least `3`, and the `min.insync.replicas`
+    for these topics should be set to 2. Finally, in order for transactional
+    guarantees to be realized from end-to-end, the consumers must be
+    configured to read only committed messages as well.
+
+    The purpose of the `transactional_id` is to enable transaction recovery
+    across multiple sessions of a single producer instance. It would typically
+    be derived from the shard identifier in a partitioned, stateful,
+    application. As such, it should be unique to each producer instance running
+    within a partitioned application.
+
     Keyword Arguments:
         bootstrap_servers: 'host[:port]' string (or list of 'host[:port]'
             strings) that the producer should contact to bootstrap initial
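The idempotence settings the new docstring describes can be collected into a small config helper. This is a minimal sketch, not part of the commit; the helper name and the bootstrap address are illustrative assumptions, while the kwarg names (`enable_idempotence`, `retries`, `acks`) come from the docstring:

```python
def idempotent_producer_config(bootstrap_servers):
    """Return KafkaProducer kwargs for an idempotent producer."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "enable_idempotence": True,
        # Leave `retries` unset: with idempotence it defaults to float('inf').
        # Likewise `acks` is left unset so it defaults to 'all'.
    }

# With a broker available, this would be used roughly as:
#   from kafka import KafkaProducer
#   producer = KafkaProducer(**idempotent_producer_config("localhost:9092"))
```

Leaving `retries` and `acks` out of the dict follows the docstring's advice to let them take their idempotence-friendly defaults rather than overriding them.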
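The transactional flow the docstring implies (begin, send to several partitions/topics, then commit or abort) can be sketched as below. This assumes a transaction API shaped like the Java client's `init_transactions` / `begin_transaction` / `commit_transaction` / `abort_transaction`; the function itself and its arguments are hypothetical, not from the commit:

```python
def send_atomically(producer, topic, records):
    """Send every record in `records` to `topic` in one transaction.

    Assumes `producer` was created with a `transactional_id` and that
    `producer.init_transactions()` was called once after construction.
    """
    producer.begin_transaction()
    try:
        for value in records:
            producer.send(topic, value=value)
        producer.commit_transaction()
    except Exception:
        # Abort so consumers reading only committed messages never
        # observe a partial batch.
        producer.abort_transaction()
        raise
```

The abort-on-error branch is what makes the send atomic from the point of view of a consumer configured to read only committed messages.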
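The durability recommendation in the docstring (replication factor of at least 3, `min.insync.replicas` of 2) maps to topic-level settings. A sketch using the standard kafka-topics.sh CLI, where the topic name, partition count, and broker address are illustrative assumptions:

```shell
# Create a topic suitable for transactional writes, per the docstring:
# replication factor of at least 3 and min.insync.replicas of 2.
kafka-topics.sh --create \
  --bootstrap-server localhost:9092 \
  --topic example-topic \
  --partitions 6 \
  --replication-factor 3 \
  --config min.insync.replicas=2
```

For the end-to-end guarantee the docstring mentions, consumers must additionally be configured to read only committed messages (the Java client calls this `isolation.level=read_committed`; the corresponding Python-side config name is assumed, not stated in this commit).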

0 commit comments
