[ML] Append all data to Chat Completion buffer #127658


Merged
8 commits merged into elastic:main on May 5, 2025

Conversation

prwhelan
Member

@prwhelan prwhelan commented May 2, 2025

Moved the Chat Completion buffer into StreamingUnifiedChatCompletionResults so that all Chat Completion responses can benefit from it. Chat Completions is meant to adhere to OpenAI's API as closely as possible, and OpenAI only sends one response chunk at a time. All implementations of Chat Completions will now buffer.

This fixes a bug where, if a single item contained more than two chunks, the extra chunks were dropped; now all chunks are appended to the buffer.
This fixes a bug where onComplete would omit trailing items left in the buffer.
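As a rough sketch of the behavior described above (the class and method names here are illustrative, not the actual StreamingUnifiedChatCompletionResults implementation): every element of an incoming chunk is appended to the buffer, each downstream request drains exactly one element, and completion flushes any trailing items before signaling downstream.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of the buffering described in this PR
// (names are illustrative, not the real Elasticsearch classes).
public class ChunkBuffer<T> {
    private final Deque<T> buffer = new ArrayDeque<>();

    // Fix 1: append ALL elements of the chunk, so none are dropped.
    public void onChunk(List<T> chunk) {
        buffer.addAll(chunk);
    }

    // OpenAI-style: emit one buffered element per downstream request.
    // Returns true if an element was delivered.
    public boolean request(Consumer<T> downstreamOnNext) {
        T next = buffer.poll();
        if (next != null) {
            downstreamOnNext.accept(next);
            return true;
        }
        return false;
    }

    // Fix 2: drain trailing items before completing downstream,
    // so onComplete no longer omits what is still buffered.
    public void onUpstreamComplete(Consumer<T> downstreamOnNext, Runnable downstreamOnComplete) {
        T next;
        while ((next = buffer.poll()) != null) {
            downstreamOnNext.accept(next);
        }
        downstreamOnComplete.run();
    }
}
```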

@prwhelan prwhelan added >bug :ml Machine learning Team:ML Meta label for the ML team auto-backport Automatically create backport pull requests when merged v8.19.0 v9.1.0 v8.18.2 v9.0.2 labels May 2, 2025
@elasticsearchmachine
Collaborator

Hi @prwhelan, I've created a changelog YAML for you.

@prwhelan prwhelan marked this pull request as ready for review May 2, 2025 21:22
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Contributor

@jonathan-buttner jonathan-buttner left a comment


Looks good, left some questions.

subscription.request(n);
}
} else {
downstream.onNext(new Results(DequeUtils.of(buffer.poll())));
Contributor


Is there only 1 thread accessing the buffer? Or is there a chance that we could check for isEmpty() and then some other thread picks up the item before this thread?

Member Author


There is only 1 thread calling request, but to be safe we can change to buffer.poll() and check whether the result is null
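For illustration (a generic sketch, not the PR's actual code): poll() combines the check and the removal into a single atomic step, so the isEmpty()-then-poll() check-then-act race disappears.

```java
import java.util.Deque;
import java.util.concurrent.ConcurrentLinkedDeque;

// Generic sketch (not the PR's actual code) of why poll() is safer than
// an isEmpty() check followed by a separate poll(): poll() atomically
// removes the head and returns null when the deque is empty, so another
// thread cannot take the item between the check and the removal.
public class PollVsIsEmpty {
    public static String takeNext(Deque<String> buffer) {
        String next = buffer.poll(); // single atomic check-and-remove
        return next;                 // null means the buffer was empty
    }

    public static void main(String[] args) {
        Deque<String> buffer = new ConcurrentLinkedDeque<>();
        buffer.add("chunk-1");
        System.out.println(takeNext(buffer)); // chunk-1
        System.out.println(takeNext(buffer)); // null
    }
}
```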

@Override
public void onComplete() {}
});
assertThat(counter.get(), equalTo(2));
Contributor


Just to make sure I understand, does this test that we only get 1 result even if we have multiple in a single item?

Member Author


Yes, because we'll only call onNext once per chunk, so if we send a chunk of 2 elements then counter will equal 1. Let me change to Mockito spies so that it's easier to read (I think)
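A minimal illustration of that counting behavior (a hypothetical helper, not the actual test in this PR): onNext fires once per downstream request, so a chunk containing two elements yields a counter of 1 after a single request even though both elements are buffered.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration (not the PR's real test): the whole chunk is
// buffered up front, but each request drains exactly one element, and
// onNext is counted once per element actually emitted.
public class OnNextCounter {
    public static int onNextCalls(List<String> chunk, int requests) {
        Deque<String> buffer = new ArrayDeque<>(chunk); // whole chunk buffered
        AtomicInteger counter = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            if (buffer.poll() != null) {      // one element per request
                counter.incrementAndGet();    // one onNext per emitted element
            }
        }
        return counter.get();
    }
}
```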

@prwhelan prwhelan enabled auto-merge (squash) May 5, 2025 19:04
@prwhelan prwhelan merged commit b108e39 into elastic:main May 5, 2025
16 of 17 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherry-picked due to conflicts
8.18 Commit could not be cherry-picked due to conflicts
9.0 Commit could not be cherry-picked due to conflicts

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 127658
