Persistence implementations for list pagination #1555

eric-maynard · 2025-05-09T21:08:26Z

In #1528, we introduced the interface changes necessary to paginate requests to listTables, listViews, and listNamespaces. This PR adds the persistence-level logic for pagination and a new PageToken type EntityIdPageToken used to paginate requests based on entity ID.

snazy

I don't think that the approach implemented here yields correct results. The code assumes strict ordering of integer IDs, which, from general experience with relational DBs and in particular from looking at org.apache.polaris.extension.persistence.relational.jdbc.IdGenerator, is not the case.

snazy · 2025-05-12T07:50:11Z

...n/java/org/apache/polaris/extension/persistence/relational/jdbc/JdbcBasePersistenceImpl.java

@@ -414,6 +415,11 @@ public <T> Page<T> listEntities(
    // Limit can't be pushed down, due to client side filtering
    // absence of transaction.
    String query = QueryGenerator.generateSelectQuery(new ModelEntity(), params);
+
+    if (pageToken instanceof EntityIdPageToken entityIdPageToken) {
+      query += String.format(" AND id > %d ORDER BY id ASC", entityIdPageToken.getId());


How would this work with org.apache.polaris.extension.persistence.relational.jdbc.IdGenerator?

I think you're right that it won't; this logic is copied from EclipseLink where IDs are always increasing but does not work with the current way that the JDBC metastore creates IDs.

I'd propose to change the Page/PageToken contract in a way to push the parameter "as is" down to the persistence layer and let the persistence implementation deal with it.

I spoke with @singhpk234 who noted this is probably the same discussion as here on the old PR. With that context, I think we might be OK here.

IMO it's alright that the list ordering you'd get across metastores won't be the same. Other than that difference, seems like everything should work with JDBC's IdGenerator. Although the IDs aren't generated sequentially, pagination only uses the entity ID as an essentially arbitrary consistent ordering.

The key implication here is that if an entity gets created in the middle of a listing operation (e.g. between list calls 2 and 3) it may or may not show up in the next page. An alternative would be to try to filter it out so that the behavior is more obvious & consistent, but I think the simple approach that ultimately gives the user a chance to see these new entities is good.

Losing new entities that are stored after pagination start is fine from my POV. The JDBC persistence does not implement catalog-level versioning, so this is unavoidable, I guess.

Agreed that we will naturally lose some entities, the question is whether we are OK with entities stored after pagination start being lost nondeterministically rather than always. Right now, whether the new entity is lost or not depends on what entity ID it gets. If it gets a high entity ID you might see it in a later page and if it gets a low ID you might not.

My thought on this question is "yes", because it's better to show the entity if we can and it simplifies the code.

But if we feel like this is too unintuitive, we can add a secondary filter on the entity's creation time to try and get rid of these entities (on a best-effort basis, since clocks are not perfect).
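To make the nondeterminism concrete, here is a minimal in-memory sketch. It is not the Polaris code: the class, the `TreeSet` stand-in for the entities table, and the sample IDs are all assumptions chosen for illustration. An entity created between two list calls is seen only if its ID sorts after the cursor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for a metastore table keyed by entity ID.
final class ConcurrentInsertSketch {
  // Up to `size` IDs strictly greater than afterId, ascending
  // (the in-memory analogue of "id > ? ORDER BY id ASC").
  static List<Long> page(TreeSet<Long> ids, long afterId, int size) {
    List<Long> out = new ArrayList<>();
    for (Long id : ids.tailSet(afterId, false)) {
      if (out.size() == size) break;
      out.add(id);
    }
    return out;
  }

  // Lists two pages of size 2 and creates an entity with newId between the calls.
  static List<Long> listWithInsertBetweenPages(long newId) {
    TreeSet<Long> ids = new TreeSet<>(List.of(10L, 20L, 30L, 40L));
    List<Long> seen = new ArrayList<>(page(ids, Long.MIN_VALUE, 2)); // first page
    ids.add(newId); // concurrent create
    seen.addAll(page(ids, 20L, 2)); // second page, cursor = last ID of first page
    return seen;
  }
}
```

With these sample values, a new entity with ID 35 lands in the second page, while one with ID 5 sorts before the cursor and is never returned.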

I think current pagination behaviour wrt concurrent changes is fine.

Making it deterministic would be a great addition to Polaris, but that, I think, has a much broader scope. For example, if an entry is deleted after pagination starts, but a client re-submits a page request using an old token, the new response would still be inconsistent with the old response.

From my POV a complete and deterministic pagination solution implies catalog-level versioning.

eric-maynard · 2025-05-12T17:22:21Z

The code assumes strict ordering of integer IDs

On this note, I think it's not true actually. The code assumes that IDs are sortable but it doesn't rely on any kind of semantic meaning behind this comparison. So IDs can be created totally randomly and you can still paginate simply by breaking that random key space into pages of some size. There's no assumption that new entries will appear at the end, either.
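As a rough illustration of that point, the sketch below pages through IDs that were inserted in no particular order. It is an assumption-laden stand-in, not the PR's implementation: the class name and the `TreeSet` acting as the table are hypothetical, and it assumes no entity has the ID `Long.MIN_VALUE`, which is used as the starting cursor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for a table keyed by entity ID.
final class KeysetPagingSketch {
  // Up to pageSize IDs strictly greater than afterId, in ascending order.
  static List<Long> nextPage(TreeSet<Long> ids, long afterId, int pageSize) {
    List<Long> page = new ArrayList<>();
    for (Long id : ids.tailSet(afterId, false)) {
      if (page.size() == pageSize) break;
      page.add(id);
    }
    return page;
  }

  // Walks every page, using the last ID of each page as the next cursor.
  static List<Long> listAll(TreeSet<Long> ids, int pageSize) {
    List<Long> seen = new ArrayList<>();
    long cursor = Long.MIN_VALUE; // assumed to be below every real ID
    while (true) {
      List<Long> page = nextPage(ids, cursor, pageSize);
      if (page.isEmpty()) break;
      seen.addAll(page);
      cursor = page.get(page.size() - 1);
    }
    return seen;
  }
}
```

Every ID is visited exactly once regardless of the order in which the IDs were generated, because the walk only relies on the IDs being comparable.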

adnanhemani

Small comments, overall LGTM

adnanhemani · 2025-05-13T20:25:39Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/pagination/PageToken.java

+      try {
+        String[] parts = token.split("/");
+        if (parts.length < 1) {
+          throw new IllegalArgumentException("Invalid token format: " + token);
+        } else if (parts[0].equals(EntityIdPageToken.PREFIX)) {
+          int resolvedPageSize = pageSize == null ? Integer.parseInt(parts[2]) : pageSize;
+          return new EntityIdPageToken(Long.parseLong(parts[1]), resolvedPageSize);
+        } else {
+          throw new IllegalArgumentException("Unrecognized page token: " + token);
+        }
+      } catch (NumberFormatException | IndexOutOfBoundsException e) {
+        LOGGER.debug(e.getMessage());
+        throw new IllegalArgumentException("Invalid token format: " + token);
+      }


I, personally, find this fragment a bit more complex than it may need to be. Is there a reason why we cannot defensively check for the right array length right after splitting? Same for the NumberFormatException?

I wanted to structure the code in a way that obviously leaves the door open for other PageToken implementations -- those would have different array length expectations. So we check the prefix first, and then parse the token using the logic appropriate for the PageToken implementation that the prefix corresponds to. Ideally we could even push this parsing logic down into some method in the PageToken.

In the old PR, there were 2 parseable PageToken implementations. I do agree that it looks a little clunky with the single PageToken implementation we have now. If this really is too confusing I can simplify this logic and then we can re-complicate it if/when we add a new PageToken.

If this really is too confusing I can simplify this logic and then we can re-complicate it if/when we add a new PageToken.

I'd personally prefer this - but don't care enough if we do this or not, I can understand the reasoning.
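For reference, the simplified variant discussed in this thread might look roughly like the sketch below. It is not the code in the PR: the class and record names are hypothetical, and the `entity-id/<id>/<pageSize>` layout and the `entity-id` prefix are assumed from the example string mentioned elsewhere in this conversation.

```java
// Hypothetical single-token parser with an up-front length check.
final class TokenParseSketch {
  static final String PREFIX = "entity-id"; // assumed prefix value

  record ParsedToken(long entityId, int pageSize) {}

  // pageSizeOverride may be null, in which case the size stored in the token is used.
  static ParsedToken parse(String token, Integer pageSizeOverride) {
    String[] parts = token.split("/");
    // Validate the shape once, before touching any individual part.
    if (parts.length != 3 || !PREFIX.equals(parts[0])) {
      throw new IllegalArgumentException("Invalid token format: " + token);
    }
    try {
      long entityId = Long.parseLong(parts[1]);
      int pageSize =
          pageSizeOverride != null ? pageSizeOverride : Integer.parseInt(parts[2]);
      return new ParsedToken(entityId, pageSize);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("Invalid token format: " + token, e);
    }
  }
}
```

The trade-off is the one described above: this form is easier to read with one token type, but the length check would have to move behind the prefix dispatch again if a second token layout were added.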

...ce/common/src/test/java/org/apache/polaris/service/persistence/pagination/PageTokenTest.java

eric-maynard · 2025-05-20T16:30:44Z

@dimas-b, would this also necessitate the creation of a new type which covers both the opaque (string) token and the page-size? I am worried that we are introducing a lot of complexity given that currently all page token types are in fact shared.

dimas-b · 2025-05-22T19:29:11Z

I think it is preferable to delegate all handling of tokens to the Persistence implementations.

We can certainly have a generic holder type (e.g. Token) to avoid passing those parameters as String everywhere and thus add a source of confusion and mistakes.

This should also allow for layered token (should we ever have to paginate over derivatives of lower-level paginated lists).

Also, I think we should distinguish token data in requests and responses. Requests need to provide three pieces of data: A1) a flag whether pagination is requested; A2) page size hint; A3) previous page token. Responses provide B1) actual page size, B2) next page token.

Token A3 is parsed by the same code that produced token B2.

B1 may not equal A2.

Implementations may limit response sizes when A1 is false.

With that, I think the complexity will actually be reduced and specific token details only need to be considered by the places in code that have to take action based on pagination parameters. WDYT?

eric-maynard · 2025-05-22T22:18:33Z

@dimas-b per the spec, the request provides page-token and page-size. So there is no A1, but the presence of A2 or A3 implies A1.

Turning this page-token into a PageToken actually used to be done with persistence-specific logic, but I refactored that out in response to your comment here which indeed simplified the code significantly.

I'll make some changes to put some of this logic back -- but since the page token types are all shared, I'll try to keep the actual types in core for now.

dimas-b · 2025-05-23T00:00:06Z

Turning this page-token into a PageToken actually used to be done with persistence-specific logic, but I refactored that out in response to #1528 (comment) which indeed simplified the code significantly.

We can certainly keep the parsing / encoding logic in polaris-core. However, the interpretation of tokens (IMHO) should be delegated to the code that actually implements pagination (Persistence). Sorry, if my comments caused confusion 😅

Here's the pseudo-code that might work, I think:

REST API gets string token / page size params and constructs a "PageRequest" without parsing the token (A1, A2, A3).
"PageRequest" goes down to the persistence layer, where each impl. parses the token according to what it expects (B2). Persistence calls core code to do the decoding of the page token and uses a specific token class for that.
Persistence returns data using core classes to encode the next token (B2).
REST API passes the next token to client without interpretation.
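The four steps above could be sketched as follows. This is only an illustration under stated assumptions: the record shapes, the method names, the `TreeSet` standing in for the store, and the choice to encode the token as a bare decimal entity ID are all hypothetical and do not reflect the actual Polaris classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.TreeSet;

final class PageFlowSketch {
  // Raw request parameters; the token is carried unparsed.
  record PageRequest(Optional<String> token, Optional<Integer> pageSize) {}

  record PageResult(List<Long> items, Optional<String> nextToken) {}

  // REST layer: wraps the raw parameters without looking inside the token.
  static PageRequest fromRestParams(String pageToken, Integer pageSize) {
    return new PageRequest(Optional.ofNullable(pageToken), Optional.ofNullable(pageSize));
  }

  // Persistence layer: the only place that decodes and encodes the token.
  static PageResult list(TreeSet<Long> ids, PageRequest request) {
    long afterId = request.token().map(Long::parseLong).orElse(Long.MIN_VALUE);
    int limit = request.pageSize().orElse(Integer.MAX_VALUE);
    List<Long> items = new ArrayList<>();
    for (Long id : ids.tailSet(afterId, false)) {
      if (items.size() == limit) break;
      items.add(id);
    }
    // A full page may have a successor; a short page is treated as the last one.
    Optional<String> next =
        !items.isEmpty() && items.size() == limit
            ? Optional.of(Long.toString(items.get(items.size() - 1)))
            : Optional.empty();
    return new PageResult(items, next);
  }
}
```

The REST layer would hand `nextToken` back to the client unchanged, so swapping the token encoding only touches the persistence method.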

eric-maynard · 2025-05-23T02:37:58Z

np @dimas-b, happy to iterate and try to get it right. I tried to push some new changes based on your guidance above which is very helpful. I think this is still cleaner than what we had before, too. Let me know what you think!

dimas-b

Thanks for the update, @eric-maynard ! From my POV the PR is moving in the right direction :) some more comments below.

dimas-b · 2025-05-23T17:29:51Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/pagination/PageRequest.java

+ */
+public class PageRequest {
+  private final Optional<String> pageTokenStringOpt;
+  private final Optional<Integer> pageSizeOpt;


OptionalInt?

Are they functionally different? I haven't come across OptionalInt before but since pageTokenStringOpt is an Optional<String> I thought to just keep both Optionals the same rather than introducing some new type.

Integer can create fluff on the heap for larger values. The Optional side is the same.

Are we really concerned about a few bytes on the heap per each request? By "fluff" you just mean a reference right?

I mean a short-lived Integer object... Not a big deal in this case, just a bit sloppy :)
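For readers who have not used it, `java.util.OptionalInt` holds a primitive `int` directly, so no `Integer` wrapper is allocated. A small sketch of what the page-size field might look like with it follows; the class, the helper names, and the `serverMax` cap are hypothetical and not taken from the PR.

```java
import java.util.OptionalInt;

final class OptionalIntSketch {
  // Converts a nullable request parameter into an OptionalInt.
  static OptionalInt pageSizeOf(Integer pageSize) {
    return pageSize == null ? OptionalInt.empty() : OptionalInt.of(pageSize);
  }

  // Applies a hypothetical server-side cap; an absent size falls back to the cap.
  static int effectiveLimit(OptionalInt requested, int serverMax) {
    return Math.min(requested.orElse(serverMax), serverMax);
  }
}
```

Functionally the two are close; the main difference is that `OptionalInt` exposes `getAsInt()` and lacks `map`/`filter`, so call sites read slightly differently.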

polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java

dimas-b · 2025-05-23T17:35:24Z

...ris-core/src/main/java/org/apache/polaris/core/persistence/pagination/EntityIdPageToken.java

+  public static PageToken fromPageRequest(PageRequest pageRequest) {
+    if (pageRequest.getPageTokenString().isEmpty()) {
+      if (pageRequest.getPageSize().isEmpty()) {
+        return PageToken.readEverything();


It is a bit awkward for a specific conversion method (in class EntityIdPageToken) to return a less specific type (PageToken).

WDYT about using boolean PageRequest.readEverything() instead?

Yeah so this is exactly why previously this logic lived in PageToken which is the less specific type.

We actually should have even more types here, since fromLimit could be optimized to return a PageToken that limits but doesn't sort, instead of always sorting as we do now.

WDYT about using boolean PageRequest.readEverything() instead?

Can you say more about this? We need a way to be able to call buildPageToken in the persistence layer and to get back a PageToken. In the future, we may also have different methods within a given persistence implementation using different page token types, which is why previously the parsing happened up the call stack.

I mean, the code that needs to extract a specific token from PageRequest already knows the exact token class. The "read everything" case is the only exception.

I do not see a need for polymorphic tokens at this state of the code. Each Persistence uses only one token type.

If we later have polymorphic token use cases, let's handle it then and keep the code simple for now.

I mean, the code that needs to extract a specific token from PageRequest already knows the exact token class. The "read everything" case is the only exception.

It's not the only exception -- to make this more clear I've added the limit token type I mentioned above. Now in the current state of the code we have two "exceptions".

Each Persistence uses only one token type.

Each one uses 3 now, and even if you exclude the 2 simple types it's obvious that we will need another type soon for paginating non-entity requests.

Happy to go back to this just being PageToken.buildPageToken if the objection is specifically to EntityIdPageToken letting you build different token types. I think the logic is better encapsulated in PageToken anyway.

Sorry, I guess we're not understanding each other here 😅

I believe page size should not be part of PageToken at all. Instead, let's keep it in PageRequest.

From a higher level view: a client gets some page token, after that the client is able to request the next page of any size. The next page size is not restricted by the token from the previous response.

I'd like to have a similar delineation of concerns on the server side too.

Page size is in the string page-token. The reasons for this were discussed in the previous PR. PageToken is the deserialized string page token.

dimas-b · 2025-05-23T17:45:10Z

...n/java/org/apache/polaris/extension/persistence/relational/jdbc/JdbcBasePersistenceImpl.java

-              : results.stream().filter(entityFilter).map(transformer).collect(Collectors.toList());
-      return Page.fromItems(resultsOrEmpty);
+      List<T> resultsOrEmpty = results.stream().map(transformer).collect(Collectors.toList());
+      return pageToken.buildNextPage(resultsOrEmpty);


It looks like buildNextPage does not have to be a function of the previous page token. It is a function of page request + data + persistence impl. WDYT about: PageRequest.buildPage(List<T> data, Function<T, PageToken> nextToken)?

Here, the call would look like: return pageRequest.buildPage(resultsOrEmpty, EntityIdPageToken::fromLastItem)

Note: the nextToken function will be invoked only when the next page is expected (i.e. not "done" and not "everything").

Side note: if we want to avoid empty last pages (when the list ends exactly on the last element from the query) we may need to request one more entry from the database, but not return it to the caller.
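A minimal sketch of that idea, combining the `nextToken` function with the "request one more entry" trick, is shown below. The `Page` record, the static `buildPage` helper, and its signature are assumptions for illustration; they are not the proposed Polaris API.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

final class BuildPageSketch {
  record Page<T>(List<T> items, Optional<String> nextToken) {}

  // `rows` is expected to hold up to pageSize + 1 entries fetched from the store.
  // The extra entry is only a probe for "is there more?" and is never returned.
  static <T> Page<T> buildPage(List<T> rows, int pageSize, Function<T, String> nextToken) {
    if (pageSize < 1) {
      throw new IllegalArgumentException("pageSize must be positive: " + pageSize);
    }
    if (rows.size() <= pageSize) {
      // No probe row came back, so this is the last page; nextToken is not invoked.
      return new Page<>(List.copyOf(rows), Optional.empty());
    }
    List<T> items = List.copyOf(rows.subList(0, pageSize));
    return new Page<>(items, Optional.of(nextToken.apply(items.get(pageSize - 1))));
  }
}
```

With this shape, a list whose length is an exact multiple of the page size ends with a full page and no token, rather than a trailing empty page.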

It looks like buildNextPage does not have to be a function of the previous page token.

In this particular implementation, no. In other implementations, such as the previous OffsetPageToken, it is a function of the previous token. Beyond that, this seems intuitive to me because you're essentially adding data to one page to get a new page.

Empty last pages are fine.

OffsetPageToken does not exist (anymore).

That aside, what information would need to flow from the previous page token to the next directly? I suppose all of that is inside PageRequest now 🤔

That aside, what information would need to flow from the previous page token to the next directly?

In the case of OffsetPageToken, the new token's offset was data.size + currentToken.offset. You need the current token, not just the data.

In the case of EntityIdPageToken, you can use the current token's entity ID to validate the new data comes after the current token.
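Both cases can be sketched briefly. The types below are hypothetical reconstructions for illustration only (OffsetPageToken no longer exists in the code base, as noted above), and the validation rule for entity IDs is an assumption about what such a check could look like.

```java
import java.util.List;

final class TokenCarryOverSketch {
  // Offset-style token: the next offset depends on the current one.
  record OffsetToken(int offset) {
    OffsetToken next(int returnedCount) {
      return new OffsetToken(offset + returnedCount);
    }
  }

  // Entity-ID-style token: the current ID can be used to validate the new page.
  // Returns the ID to store in the next token.
  static long nextEntityId(long currentId, List<Long> pageIds) {
    long last = currentId;
    for (long id : pageIds) {
      if (id <= last) {
        throw new IllegalStateException("ID " + id + " does not come after " + last);
      }
      last = id;
    }
    return last;
  }
}
```

In the offset case the previous token is required input; in the entity-ID case it is only used for a sanity check, since the last returned ID alone determines the next token.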

I suppose all of that is inside PageRequest now

In the sense that PageRequest is entirely duplicative of PageToken, yes.

polaris-core/src/main/java/org/apache/polaris/core/persistence/pagination/Page.java

...-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java

dimas-b · 2025-05-23T18:01:39Z

...ris-core/src/main/java/org/apache/polaris/core/persistence/pagination/EntityIdPageToken.java

+  public static final long BASE_ID = MINIMUM_ID - 1;
+
+  private final long entityId;
+  private final int pageSize;


With the new code it looks like pageSize is redundant here. This information is defined by PageRequest.

PageRequest is really only used to get you to a PageToken, so I think it's okay to have both. The page size is fundamentally part of the token.

If anything, PageRequest in its entirety seems redundant since all the relevant information can immediately be represented in a PageToken.

IMHO, a page token is a pointer into the continuation of the response stream. How much is fetched from that stream (size) is not part of the token. This is why I suggested introducing PageRequest.

As I commented elsewhere, in general, the page size may be limited by other factors (sometimes outside the user's or server's control), e.g. message sizes at the protocol level. Filtering (there are some examples in this PR) affects the page size too. Also, if (in general) two paginated streams are merged, the resultant page size is not necessarily related to the lower stream page sizes.

How much is fetched from that stream (size) is not part of the token.

Why not?

the page size may be limited by other factors

Limits, i.e. running out of data, are not relevant here. The PageToken is a request from the user -- the server may ignore the requested page size just like it may ignore the requested starting point (e.g. entity ID). It does both of these things today.

As I've just commented in another thread, I think it would be nice to track request data in PageRequest and use PageToken only for dealing with "pointer to the next piece of data" aspects.

dimas-b · 2025-05-23T18:04:30Z

polaris-core/src/main/java/org/apache/polaris/core/persistence/pagination/PageRequest.java

+    this.pageSizeOpt = Optional.ofNullable(pageSize);
+  }
+
+  public static PageRequest readEverything() {


I guess ReadEverythingPageToken is no longer necessary now?

It's used in quite a few places, wdym? We could try to rely on isPaginationRequested everywhere but the code currently actually never calls that method. There could be places in the code that have a PageToken and not a PageRequest, and in that case we'd need ReadEverythingPageToken.

I tend to think that "read everything" is represented by PageRequest. Implementations then implement those requests and provide the appropriate token in the response.

In my mind a page token represents a "continuation" of data, but then there's no continuation to "everything" :)

I tend to think that "read everything" is represented by PageRequest

PageToken still describes the page of data a user wants. ReadEverythingPageToken means they want to read everything. We could model this as a lack of a page token, but this simplifies all the code so that it can just handle a PageToken.

We need a way to represent that there's no more continuation, too, which ReadEverythingPageToken lets us do.

snazy

Thanks for the effort to simplify things.
I've got a proposal to further simplify it a bit:
From a REST API point of view, the page token is an opaque response field from the server that the client can send in a follow-up request to skip already seen results.
The page size is the client's proposal for the number of items it wants from a list request.
Bundling those two fields in PageRequest is completely fine, implementations can check whether a token and/or page-size is present - and also enforce a hard limit on the page-size if needed.
However, I think that the PageToken should only contain the opaque token, as it's rather a response-only thing. It cannot "force" or "ask" a client to use a page-size.
WDYT?

eric-maynard · 2025-05-28T16:30:41Z

However, I think that the PageToken should only contain the opaque token, as it's rather a response-only thing. It cannot "force" or "ask" a client to use a page-size.

So PageToken represents the parsed opaque token -- meaning for EntityIdPageToken, we have an entityId field with value 123 which is extracted from a string like entity-id/123/456. But that string also encodes the requested page size (the reasons for this are discussed in the previous PR). So if EntityIdPageToken represents the data in that string, it follows that EntityIdPageToken will contain the requested page size.

This doesn't force the client to do anything, but it lets the client re-use the page size from the previous request if it wants to.

Following up on apache#1555
* Refactor pagination code to delineate page requests and tokens.
* Requests deal with the "previous" token, user-provided page size (optional) and the previous request's page size.
* Inner page tokens deal only with the Persistence-specific way to point into the (logically) sorted dataset to connect previous and next pages.
* Concentrate the logic of combining page size requests and previous tokens in PageTokenUtil
* Only one inner page token impl. is necessary now: EntityIdPageToken.

Following up on apache#1555
* Refactor pagination code to delineate API-level page tokens and internal "pointers to data"
* Requests deal with the "previous" token, user-provided page size (optional) and the previous request's page size.
* Concentrate the logic of combining page size requests and previous tokens in PageTokenUtil
* PageToken subclasses are no longer necessary. EntityIdPaging handles pagination over ordered result sets with static helper methods.

Following up on apache#1555
* Refactor pagination code to delineate API-level page tokens and internal "pointers to data"
* Requests deal with the "previous" token, user-provided page size (optional) and the previous request's page size.
* Concentrate the logic of combining page size requests and previous tokens in PageTokenUtil
* PageToken subclasses are no longer necessary. EntityIdPaging handles pagination over ordered result sets with static helper methods.

Co-authored-by: Eric Maynard <[email protected]>

Based on apache#1838, following up on apache#1555
* Allows multiple implementations of `Token` referencing the "next page", encapsulated in `PageToken`. No changes to `polaris-core` needed to add custom `Token` implementations.
* Extensible to (later) support (cryptographic) signatures to prevent tampered page-token
* Refactor pagination code to delineate API-level page tokens and internal "pointers to data"
* Requests deal with the "previous" token, user-provided page size (optional) and the previous request's page size.
* Concentrate the logic of combining page size requests and previous tokens in `PageTokenUtil`
* `PageToken` subclasses are no longer necessary.
* Serialization of `PageToken` uses Jackson serialization (smile format)

Since no (metastore level) implementation handling pagination existed before, no backwards compatibility is needed.

github-actions · 2025-06-29T02:13:18Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

eric-maynard added 3 commits May 9, 2025 13:10

add pagetoken impl

3375ff5

persistence impls

255e44b

stable

015e637

eric-maynard requested review from adutra, ashvina, dennishuo, dimas-b, jackye1995, jbonofre, vvcephei and collado-mike as code owners May 9, 2025 21:08

github-project-automation bot added this to Basic Kanban Board May 9, 2025

eric-maynard requested review from snazy, RussellSpitzer, takidau, MonkeyCanCode, flyrain, ebyhr, HonahX and pingtimeout as code owners May 9, 2025 21:08

github-project-automation bot moved this to PRs In Progress in Basic Kanban Board May 9, 2025

eric-maynard added 3 commits May 9, 2025 14:12

another test

434ffb1

another small test

8ffc29d

autolint

66d20aa

eric-maynard force-pushed the pagination-persistence branch from 1c383cf to 66d20aa Compare May 9, 2025 21:23

typofix

a5d62ed

snazy requested changes May 12, 2025

eric-maynard requested a review from snazy May 12, 2025 17:04

adnanhemani approved these changes May 13, 2025

eric-maynard requested a review from ajantha-bhat as a code owner May 23, 2025 02:31

eric-maynard added 3 commits May 22, 2025 19:31

attempt pagerequest refactor

c6611be

pull main

5def8fa

stable

aa90da2

dimas-b reviewed May 23, 2025

eric-maynard requested a review from dimas-b May 24, 2025 01:26

snazy reviewed May 26, 2025

eric-maynard added 2 commits May 28, 2025 09:36

stable

95f8410

rebase

80aee49

dimas-b mentioned this pull request Jun 9, 2025

Persistence implementation for pagination in some requests #1838

Open

snazy mentioned this pull request Jun 25, 2025

Extensible pagination token implementation #1938

Open

github-actions bot added the Stale label Jun 29, 2025
