[improve][pip] PIP-445: Add Builder Methods to Create Message-based TableView
1 parent 771ce41 commit 52f47ac

pip/pip-445.md

Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
1+
# PIP-445: Add Builder Methods to Create Message-based TableView
2+
3+
# Background knowledge
4+
5+
* **TableView**: In Pulsar, a `TableView` is a client-side abstraction that provides a key-value map interface over a Pulsar topic. It consumes messages from the topic (typically a compacted one) and maintains an in-memory view of the latest value for each key. This allows applications to easily query the current state of a key without managing a consumer manually.
6+
7+
* **Pulsar `Message<T>`**: A Pulsar message is not just its data payload. The `Message<T>` object is a container that includes the deserialized **payload** (`T`) as well as important **metadata**, such as a message key, user-defined properties (a key-value map), event time, publish time, and more.
8+
9+
# Motivation
10+
11+
The current `TableView` API provides a `get(String key)` method that returns only the deserialized **value** (`T`) of the latest message for a given key. Applications that also need the message's metadata cannot get it from the view.
12+
13+
For instance, a user might need to inspect the message **properties** to get a trace-id or check the **event time** to determine if the data is recent. Currently, the only way to access this metadata is to create a separate, redundant `Consumer` on the same topic, which is inefficient and undermines the convenience of using a `TableView`.
14+
15+
This proposal aims to solve this problem by providing a way to create a `TableView` that exposes the entire `Message<T>` object.
16+
17+
# Goals
18+
19+
## In Scope
20+
21+
* Add new methods, `createForMessages()` and `createForMessagesAsync()`, to the `TableViewBuilder<T>` interface.
22+
* Allow users to create a `TableView<Message<T>>` instance, which provides access to the complete `Message<T>` object for each key, including its payload, properties, and all other metadata.
23+
* Ensure the change is fully backward-compatible and does not impact the performance of existing `TableView` users.
24+
25+
## Out of Scope
26+
27+
* Modifying the behavior of the existing `create()` and `createAsync()` methods in the builder.
28+
* Changing the underlying topic compaction logic or any broker-side functionality.
29+
30+
# High Level Design
31+
32+
The proposed solution is a simple and non-breaking addition to the public client API. Instead of modifying the existing `TableView` implementation, we will introduce new methods to the `TableViewBuilder`.
33+
34+
1. New methods, `TableView<Message<T>> createForMessages()` and `CompletableFuture<TableView<Message<T>>> createForMessagesAsync()`, will be added to the `TableViewBuilder<T>` interface.
35+
2. These methods will create a new, specialized `TableView` implementation (`MessageTableViewImpl`) that stores the entire `Message<T>` object for each key.
36+
3. The existing `create()` and `createAsync()` methods will continue to create the standard `TableView` that stores only the message value (`T`).
37+
38+
This opt-in design provides the new functionality efficiently without impacting the performance or behavior of existing `TableView` use cases.
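The opt-in split described above can be sketched without the Pulsar client at all. The following is a minimal illustration, not the proposed implementation: `SketchMessage`, `SketchTableView`, and `SketchTableViewBuilder` are hypothetical stand-ins for `Message<T>`, `TableView<V>`, and `TableViewBuilder<T>`, and an in-memory list stands in for the topic.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for Pulsar's Message<T>: payload plus a little metadata.
record SketchMessage<T>(String key, T value, Map<String, String> properties, long eventTime) {}

// Hypothetical stand-in for TableView<V>: a read-only lookup by key.
interface SketchTableView<V> {
    V get(String key);
}

// Hypothetical stand-in for TableViewBuilder<T> with both terminal methods.
class SketchTableViewBuilder<T> {
    private final List<SketchMessage<T>> topic; // stands in for the topic's messages

    SketchTableViewBuilder(List<SketchMessage<T>> topic) {
        this.topic = topic;
    }

    // Existing path: keep only the payload of the latest message per key.
    SketchTableView<T> create() {
        return build(SketchMessage::value);
    }

    // Proposed path: keep the whole message, metadata included.
    SketchTableView<SketchMessage<T>> createForMessages() {
        return build(Function.identity());
    }

    private <V> SketchTableView<V> build(Function<SketchMessage<T>, V> extractor) {
        Map<String, V> latest = new HashMap<>();
        for (SketchMessage<T> msg : topic) {
            latest.put(msg.key(), extractor.apply(msg)); // later messages overwrite earlier ones
        }
        return latest::get;
    }
}
```

Callers of `create()` never see or pay for the message wrapper in this sketch; only callers that ask for `createForMessages()` get a view whose values carry metadata.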
39+
40+
# Detailed Design
41+
42+
## Design & Implementation Details
43+
44+
The changes will be confined to the Pulsar client library.
45+
46+
* **Interface `org.apache.pulsar.client.api.TableViewBuilder<T>`**:
47+
New methods will be added to this interface to create a `TableView` for messages.
48+
49+
* **Class `org.apache.pulsar.client.impl.TableViewBuilderImpl<T>`**:
50+
The new `createForMessages` methods will be implemented to instantiate a new `MessageTableViewImpl`.
51+
52+
* **New Class `org.apache.pulsar.client.impl.MessageTableViewImpl<T>`**:
53+
A new class will be created that implements `TableView<Message<T>>`. It will be based on the existing `TableViewImpl` but its internal map will store `Message<T>` objects instead of just `T` values. Its `get(key)` method will return the full `Message<T>` object.
54+
55+
* **Class `org.apache.pulsar.client.impl.TableViewImpl<T>`**:
56+
This class will remain unchanged, ensuring no impact on existing users.
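As a rough picture of what "its internal map will store `Message<T>` objects" means, here is a minimal sketch. It is not the actual `MessageTableViewImpl`: `StoredMessage` and `MessageViewSketch` are hypothetical names, and the handling of keyless messages and null payloads is an assumption made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for Pulsar's Message<T>.
record StoredMessage<T>(String key, T value, long publishTime) {}

// Sketch of the storage idea: the map's values are whole messages
// rather than bare payloads.
class MessageViewSketch<T> {
    private final Map<String, StoredMessage<T>> data = new ConcurrentHashMap<>();

    // Called once per message read from the topic, in order.
    void handleMessage(StoredMessage<T> msg) {
        if (msg.key() == null) {
            return; // assumption: messages without a key are skipped
        }
        if (msg.value() == null) {
            data.remove(msg.key()); // assumption: a null payload deletes the key
        } else {
            data.put(msg.key(), msg); // latest message per key wins
        }
    }

    // Returns the full message for the key, or null if the key is absent.
    StoredMessage<T> get(String key) {
        return data.get(key);
    }

    int size() {
        return data.size();
    }
}
```

The only structural difference from a value-only view is the map's value type; lookups still cost a single map access.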
57+
58+
## Public-facing Changes
59+
60+
### Public API
61+
62+
New methods will be added to the `org.apache.pulsar.client.api.TableViewBuilder<T>` interface.
63+
64+
* **Method Signatures**:
65+
```java
TableView<Message<T>> createForMessages() throws PulsarClientException;

CompletableFuture<TableView<Message<T>>> createForMessagesAsync();
```
70+
* **Description**: Creates a `TableView` instance where the values in the map are the full `Message<T>` objects, including payload and metadata. This allows access to message properties, event time, etc.
71+
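* **Return Value**: A `TableView<Message<T>>` instance from `createForMessages()`, or a `CompletableFuture` that completes with one from `createForMessagesAsync()`.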
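To show how a caller might use the two signatures and then read metadata from the result, here is a self-contained sketch. It does not use the Pulsar client: `MetaMessage`, `MetaTableView`, `MetaTableViewBuilder`, and `MetadataChecks` are hypothetical stand-ins, the view is built from a fixed snapshot rather than a topic, and the synchronous method omits the checked `PulsarClientException` of the real signature.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for Pulsar's Message<T>.
record MetaMessage<T>(T value, Map<String, String> properties, long eventTime) {}

// Hypothetical stand-in for TableView<V>.
interface MetaTableView<V> {
    V get(String key);
}

// Hypothetical stand-in for the builder, exposing both proposed methods.
class MetaTableViewBuilder<T> {
    private final Map<String, MetaMessage<T>> snapshot;

    MetaTableViewBuilder(Map<String, MetaMessage<T>> snapshot) {
        this.snapshot = Map.copyOf(snapshot);
    }

    // Async variant: the future completes with the message-based view.
    CompletableFuture<MetaTableView<MetaMessage<T>>> createForMessagesAsync() {
        MetaTableView<MetaMessage<T>> view = snapshot::get;
        return CompletableFuture.completedFuture(view);
    }

    // Sync variant: waits for the async result.
    MetaTableView<MetaMessage<T>> createForMessages() {
        return createForMessagesAsync().join();
    }
}

class MetadataChecks {
    // True when a message exists for the key and its event time is
    // no more than maxAgeMillis before `now`.
    static <T> boolean isFresh(MetaTableView<MetaMessage<T>> view, String key,
                               long now, long maxAgeMillis) {
        MetaMessage<T> msg = view.get(key);
        return msg != null && now - msg.eventTime() <= maxAgeMillis;
    }
}
```

With a view like this, the trace-id and freshness checks from the Motivation section become plain lookups, for example `view.get("order-1").properties().get("trace-id")`, with no second consumer on the topic.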
72+
73+
### Binary protocol
74+
75+
No changes.
76+
77+
### Configuration
78+
79+
No changes.
80+
81+
### CLI
82+
83+
No changes.
84+
85+
### Metrics
86+
87+
No changes.
88+
89+
# Monitoring
90+
91+
No new metrics are introduced by this change. Existing client-side metrics are unaffected.
92+
93+
# Security Considerations
94+
95+
This proposal has no security implications. The new methods expose only message metadata that the client is already authorized to receive by consuming the topic. They do not alter any authentication or authorization mechanisms.
96+
97+
# Backward & Forward Compatibility
98+
99+
This change is fully backward-compatible.
100+
101+
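* Adding new methods to the builder interface is non-breaking for callers: existing code that uses `create()` or `createAsync()` will continue to function as before, with no performance or behavioral changes. Any third-party class that implements `TableViewBuilder<T>` directly would need to implement the new methods, unless they are declared with default implementations.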
102+
103+
## Upgrade
104+
105+
The upgrade process is seamless. Applications can update their client dependency to a version containing this feature and start using the new builder methods without any other changes.
106+
107+
## Downgrade / Rollback
108+
109+
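A downgrade is seamless for applications that do not use the new methods; they can be rolled back without any issues. An application that calls `createForMessages()` or `createForMessagesAsync()` will fail to compile against an older client version (or fail at runtime with `NoSuchMethodError` if only the client jar is swapped), so those calls must be removed before rolling back.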
110+
111+
## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations
112+
113+
This is a client-side change and has no impact on geo-replication.
114+
115+
# Alternatives
116+
117+
## Add `getRawMessage(String key)` to `TableView`
118+
119+
An alternative considered was to add a `getRawMessage(String key)` method directly to the `TableView` interface. This would have required modifying the existing `TableViewImpl` to store the entire `Message<T>` object for all users.
120+
121+
This approach was rejected because it would be a **performance regression for existing users**. It would increase memory and CPU consumption for all `TableView` users, even those who do not need access to the raw message. The proposed builder-based approach makes this an opt-in feature and so preserves the performance characteristics of the existing `TableView`.
122+
123+
# Links
124+
125+
* Mailing List discussion thread: TBD
126+
* Mailing List voting thread: TBD
