@@ -37,8 +37,8 @@ class KafkaProducer(object):
  The producer is thread safe and sharing a single producer instance across
  threads will generally be faster than having multiple instances.

- The producer consists of a pool of buffer space that holds records that
- haven't yet been transmitted to the server as well as a background I/O
+ The producer consists of a RecordAccumulator which holds records that
+ haven't yet been transmitted to the server, and a Sender background I/O
  thread that is responsible for turning these records into requests and
  transmitting them to the cluster.

@@ -77,6 +77,47 @@ class KafkaProducer(object):
  The key_serializer and value_serializer instruct how to turn the key and
  value objects the user provides into bytes.

+ From Kafka 0.11, the KafkaProducer supports two additional modes:
+ the idempotent producer and the transactional producer.
+ The idempotent producer strengthens Kafka's delivery semantics from
+ at least once to exactly once delivery. In particular, producer retries
+ will no longer introduce duplicates. The transactional producer allows an
+ application to send messages to multiple partitions (and topics!)
+ atomically.
+
+ To enable idempotence, the `enable_idempotence` configuration must be set
+ to True. If set, the `retries` config will default to `float('inf')` and
+ the `acks` config will default to 'all'. There are no API changes for the
+ idempotent producer, so existing applications will not need to be modified
+ to take advantage of this feature.
+
+ To take advantage of the idempotent producer, it is imperative to avoid
+ application level re-sends since these cannot be de-duplicated. As such, if
+ an application enables idempotence, it is recommended to leave the
+ `retries` config unset, as it will be defaulted to `float('inf')`.
+ Additionally, if a :meth:`~kafka.KafkaProducer.send` returns an error even
+ with infinite retries (for instance if the message expires in the buffer
+ before being sent), then it is recommended to shut down the producer and
+ check the contents of the last produced message to ensure that it is not
+ duplicated. Finally, the producer can only guarantee idempotence for
+ messages sent within a single session.
+
+ To use the transactional producer and the attendant APIs, you must set the
+ `transactional_id` configuration property. If the `transactional_id` is
+ set, idempotence is automatically enabled along with the producer configs
+ which idempotence depends on. Further, topics which are included in
+ transactions should be configured for durability. In particular, the
+ `replication.factor` should be at least `3`, and the `min.insync.replicas`
+ for these topics should be set to 2. Finally, in order for transactional
+ guarantees to be realized from end-to-end, the consumers must be
+ configured to read only committed messages as well.
+
+ The purpose of the `transactional_id` is to enable transaction recovery
+ across multiple sessions of a single producer instance. It would typically
+ be derived from the shard identifier in a partitioned, stateful,
+ application. As such, it should be unique to each producer instance running
+ within a partitioned application.
+
  Keyword Arguments:
  bootstrap_servers: 'host[:port]' string (or list of 'host[:port]'
  strings) that the producer should contact to bootstrap initial
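
The configuration rules in the added docstring text can be sketched roughly as follows. This is an illustration only, not code from the change itself. The helper names (`transactional_id_for_shard`, `build_producer_config`, `send_atomically`) and the `"<app>-shard-<n>"` naming scheme are hypothetical. The transaction methods (`init_transactions`, `begin_transaction`, `commit_transaction`, `abort_transaction`) are assumed to mirror the Java client and are not shown in this diff excerpt, so check them against the kafka-python version in use.

```python
def transactional_id_for_shard(app_name, shard_id):
    """Derive a transactional_id that is unique per producer instance.

    The docstring suggests deriving it from the shard identifier; the
    exact "<app>-shard-<n>" format here is a hypothetical choice.
    """
    return "{}-shard-{}".format(app_name, shard_id)


def build_producer_config(bootstrap_servers, transactional_id=None):
    """Build keyword arguments for KafkaProducer.

    `retries` and `acks` are deliberately left unset, since the docstring
    says they default to float('inf') and 'all' once idempotence is on.
    """
    config = {"bootstrap_servers": bootstrap_servers}
    if transactional_id is None:
        # Idempotent-only mode: a single flag, no other API changes.
        config["enable_idempotence"] = True
    else:
        # Per the docstring, setting transactional_id enables
        # idempotence automatically.
        config["transactional_id"] = transactional_id
    return config


def send_atomically(config, records):
    """Send (topic, value_bytes) pairs in one transaction.

    Needs a reachable broker and a kafka-python release with
    transactional support. The method names below are an assumption
    (modelled on the Java client), not taken from this diff.
    """
    from kafka import KafkaProducer  # imported lazily; third-party

    producer = KafkaProducer(**config)
    try:
        producer.init_transactions()
        producer.begin_transaction()
        try:
            for topic, value in records:
                producer.send(topic, value)
            producer.commit_transaction()
        except Exception:
            # Roll back so consumers reading only committed messages
            # never see a partial batch.
            producer.abort_transaction()
            raise
    finally:
        producer.close()
```

A caller would typically build the config once per shard, for example `build_producer_config("localhost:9092", transactional_id_for_shard("orders", 7))`, and pass it to `send_atomically` together with the records for that shard. The broker address and application name are placeholders.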