Commit b6cb0e7

Update PIP-445 based on review feedback
1 parent a640434 commit b6cb0e7

1 file changed: +35 -30 lines changed

pip/pip-445.md

Lines changed: 35 additions & 30 deletions
@@ -1,4 +1,4 @@
1-
# PIP-445: Expose Raw Message Access in TableView API
1+
# PIP-445: Add Builder Methods to Create Message-based TableView
22

33
# Background knowledge
44

@@ -12,60 +12,63 @@ The current `TableView` API provides a `get(String key)` method that only return
1212

1313
For instance, a user might need to inspect the message **properties** to get a trace-id or check the **event time** to determine if the data is recent. Currently, the only way to access this metadata is to create a separate, redundant `Consumer` on the same topic, which is inefficient and undermines the convenience of using a `TableView`.
1414

15-
This proposal aims to solve this problem by adding a method to the `TableView` API to retrieve the entire, raw `Message<T>` object.
15+
This proposal aims to solve this problem by providing a way to create a `TableView` that exposes the entire `Message<T>` object.
1616

1717
# Goals
1818

1919
## In Scope
2020

21-
* Add a new method, `getRawMessage(String key)`, to the `TableView<T>` interface.
22-
* Allow users to access the complete `Message<T>` object, including its payload, properties, and all other metadata.
23-
* Ensure the change is fully backward-compatible with the existing API.
21+
* Add new methods, `createForMessages()` and `createForMessagesAsync()`, to the `TableViewBuilder<T>` interface.
22+
* Allow users to create a `TableView<Message<T>>` instance, which provides access to the complete `Message<T>` object for each key, including its payload, properties, and all other metadata.
23+
* Ensure the change is fully backward-compatible and does not impact the performance of existing `TableView` users.
2424

2525
## Out of Scope
2626

27-
* Modifying the behavior of the existing `get(String key)` method.
27+
* Modifying the behavior of the existing `create()` and `createAsync()` methods in the builder.
2828
* Changing the underlying topic compaction logic or any broker-side functionality.
2929

3030
# High Level Design
3131

32-
The proposed solution is a simple and non-breaking addition to the public client API.
32+
The proposed solution is a simple and non-breaking addition to the public client API. Instead of modifying the existing `TableView` implementation, we will introduce new methods to the `TableViewBuilder`.
3333

34-
1. A new method, `Message<T> getRawMessage(String key)`, will be added to the `TableView<T>` interface.
35-
2. The internal implementation (`TableViewImpl`) will be updated to store the entire `Message<T>` object for each key, instead of just the deserialized value.
36-
3. The existing `get(String key)` method will remain unchanged in its signature and behavior. Internally, it will now retrieve the stored `Message<T>` and simply return its value, ensuring 100% backward compatibility.
34+
1. New methods, `TableView<Message<T>> createForMessages()` and `CompletableFuture<TableView<Message<T>>> createForMessagesAsync()`, will be added to the `TableViewBuilder<T>` interface.
35+
2. These methods will create a new, specialized `TableView` implementation (`MessageTableViewImpl`) that stores the entire `Message<T>` object for each key.
36+
3. The existing `create()` and `createAsync()` methods will continue to create the standard `TableView` that stores only the message value (`T`).
3737

38-
This design provides the new functionality efficiently without impacting existing users.
38+
This opt-in design provides the new functionality efficiently without impacting the performance or behavior of existing `TableView` use cases.
3939

4040
# Detailed Design
4141

4242
## Design & Implementation Details
4343

4444
The changes will be confined to the Pulsar client library.
4545

46-
* **Interface `org.apache.pulsar.client.api.TableView<T>`**:
47-
A new method will be added to this interface.
46+
* **Interface `org.apache.pulsar.client.api.TableViewBuilder<T>`**:
47+
New methods will be added to this interface to create a `TableView` for messages.
48+
49+
* **Class `org.apache.pulsar.client.impl.TableViewBuilderImpl<T>`**:
50+
The new `createForMessages` methods will be implemented to instantiate a new `MessageTableViewImpl`.
51+
52+
* **New Class `org.apache.pulsar.client.impl.MessageTableViewImpl<T>`**:
53+
A new class will be created that implements `TableView<Message<T>>`. It will be based on the existing `TableViewImpl` but its internal map will store `Message<T>` objects instead of just `T` values. Its `get(key)` method will return the full `Message<T>` object.
4854

4955
* **Class `org.apache.pulsar.client.impl.TableViewImpl<T>`**:
50-
The internal data structure, a `ConcurrentNavigableMap`, will be changed to store `Message<T>` objects instead of just `T` values.
51-
* The message listener within `TableViewImpl` will now place the entire received `Message<T>` object into the map.
52-
* The `get(key)` method will be adapted to fetch the `Message<T>` from the map and return `message.getValue()`.
53-
* The new `getRawMessage(key)` method will fetch and return the `Message<T>` object directly.
56+
This class will remain unchanged, ensuring no impact on existing users.
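
The difference between the two implementations described above comes down to what the internal map stores per key. The following is a minimal sketch of that idea, not the proposed implementation: `SketchMessage`, `ValueView`, and `MessageView` are hypothetical stand-ins invented for illustration (they are not Pulsar client classes), and the sketch assumes a plain `ConcurrentHashMap` and ignores schemas, readers, and listeners.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-ins only: these are NOT the Pulsar client classes.
class TableViewSketch {

    /** Hypothetical, minimal stand-in for Message<T>: payload plus metadata. */
    static final class SketchMessage<T> {
        final String key;
        final T value;
        final Map<String, String> properties;
        final long eventTime;

        SketchMessage(String key, T value, Map<String, String> properties, long eventTime) {
            this.key = key;
            this.value = value;
            // Defensive copy so later changes to the caller's map are not visible here.
            this.properties = Collections.unmodifiableMap(new HashMap<>(properties));
            this.eventTime = eventTime;
        }
    }

    /** Models the existing behavior: only the value is kept per key. */
    static final class ValueView<T> {
        private final Map<String, T> data = new ConcurrentHashMap<>();

        void handle(SketchMessage<T> msg) {
            data.put(msg.key, msg.value); // metadata is dropped at this point
        }

        T get(String key) {
            return data.get(key);
        }
    }

    /** Models the proposed opt-in view: the whole message is kept per key. */
    static final class MessageView<T> {
        private final Map<String, SketchMessage<T>> data = new ConcurrentHashMap<>();

        void handle(SketchMessage<T> msg) {
            data.put(msg.key, msg); // payload and metadata stay reachable
        }

        SketchMessage<T> get(String key) {
            return data.get(key);
        }
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        props.put("trace-id", "abc-123");
        SketchMessage<String> msg =
                new SketchMessage<>("order-1", "shipped", props, 1700000000000L);

        ValueView<String> values = new ValueView<>();
        MessageView<String> messages = new MessageView<>();
        values.handle(msg);
        messages.handle(msg);

        // The value-only view can answer "what is the latest value?" and nothing more.
        System.out.println(values.get("order-1"));

        // The message view also exposes properties and event time for the same key.
        SketchMessage<String> latest = messages.get("order-1");
        System.out.println(latest.value + " " + latest.properties.get("trace-id")
                + " " + latest.eventTime);
    }
}
```

In this sketch the value-only view never retains the metadata, which is why existing users would pay no extra memory cost under the opt-in design; only callers that ask for the message-based view hold on to the full message.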
5457

5558
## Public-facing Changes
5659

5760
### Public API
5861

59-
A new method will be added to the `org.apache.pulsar.client.api.TableView<T>` interface.
62+
New methods will be added to the `org.apache.pulsar.client.api.TableViewBuilder<T>` interface.
6063

61-
* **Method Signature**:
64+
* **Method Signatures**:
6265
```java
63-
Message<T> getRawMessage(String key);
66+
TableView<Message<T>> createForMessages() throws PulsarClientException;
67+
68+
CompletableFuture<TableView<Message<T>>> createForMessagesAsync();
6469
```
65-
* **Description**: Retrieves the latest raw `Message<T>` object for the given key. This object contains the message payload as well as all associated metadata (properties, event time, etc.).
66-
* **Parameters**:
67-
* `key`: The `String` key to look up.
68-
* **Return Value**: The `Message<T>` object associated with the key, or `null` if the key does not exist.
70+
* **Description**: Creates a `TableView` instance where the values in the map are the full `Message<T>` objects, including payload and metadata. This allows access to message properties, event time, etc.
71+
* **Return Value**: A `TableView<Message<T>>` instance.
6972

7073
### Binary protocol
7174

@@ -95,27 +98,29 @@ This proposal has no security implications. The new method exposes message metad
9598

9699
This change is fully backward-compatible.
97100

98-
* The addition of a new method to an interface is a non-breaking change for existing applications. Code compiled with an older version of the client library will continue to run without issue.
101+
* The addition of new methods to the builder interface is a non-breaking change. Existing code that uses `create()` or `createAsync()` will continue to function as before with no performance or behavioral changes.
99102

100103
## Upgrade
101104

102-
The upgrade process is seamless. Applications can update their client dependency to a version containing this feature and start using the new method without any other changes.
105+
The upgrade process is seamless. Applications can update their client dependency to a version containing this feature and start using the new builder methods without any other changes.
103106

104107
## Downgrade / Rollback
105108

106-
A downgrade is also seamless. If an application that uses the new `getRawMessage` method is rolled back to an older client version, it will fail at compile time. Applications that do not use the new method can be rolled back without any issues.
109+
A downgrade is also seamless. If an application that uses the new `createForMessages` methods is rolled back to an older client version, it will fail at compile time. Applications that do not use the new methods can be rolled back without any issues.
107110

108111
## Pulsar Geo-Replication Upgrade & Downgrade/Rollback Considerations
109112

110113
This is a client-side change and has no impact on geo-replication.
111114

112115
# Alternatives
113116

114-
An alternative considered was to add multiple specific getter methods to the `TableView` interface, such as `getProperties(String key)` and `getEventTime(String key)`.
117+
## Add `getRawMessage(String key)` to `TableView`
118+
119+
An alternative considered was to add a `getRawMessage(String key)` method directly to the `TableView` interface. This would have required modifying the existing `TableViewImpl` to store the entire `Message<T>` object for all users.
115120

116-
This approach was rejected because it would clutter the API and be less flexible. If new metadata is added to the `Message` object in the future, the `TableView` API would need to be changed again. Returning the entire `Message<T>` object is a cleaner, more efficient, and more future-proof solution.
121+
This approach was rejected because it would be a **breaking change in terms of performance**. It would increase memory and CPU consumption for all `TableView` users, even those who do not need access to the raw message. The proposed builder-based approach is superior as it makes this an opt-in feature, preserving the performance characteristics of the existing `TableView`.
117122

118123
# Links
119124

120125
* Mailing List discussion thread: TBD
121-
* Mailing List voting thread: TBD
126+
* Mailing List voting thread: TBD
