Adding fast refresh setting during index creation #1074

Open
wants to merge 4 commits into
base: main
from

Conversation

toepkerd

@toepkerd toepkerd commented Feb 26, 2025

Description

This change explicitly sets the refresh_interval setting of an OpenSearch index to 1 second during query results index creation. This makes query results searchable sooner, reducing query latency.
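For illustration, the settings body described above could look like the following. This is a minimal sketch, not the code from this PR: the object and constant names are hypothetical, and only the `refresh_interval` value of 1 second comes from the description.

```scala
object ResultIndexSettings {
  // Hypothetical constant name; the PR description only states that
  // refresh_interval is set to 1 second at index creation.
  val resultIndexSettings: String =
    """{
      |  "index": {
      |    "refresh_interval": "1s"
      |  }
      |}""".stripMargin
}
```

A string like this would be passed as the JSON settings body when the query results index is created.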

Related Issues

List any issues this PR will resolve, e.g. Resolves [...].

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • New added source code should include a copyright header
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@toepkerd toepkerd force-pushed the main branch 2 times, most recently from 5455fc2 to 1104d7c Compare February 27, 2025 00:06
Signed-off-by: Dennis Toepker <[email protected]>
  logInfo(s"create $osIndexName")

  using(flintClient.createClient()) { client =>
    val request = new CreateIndexRequest(osIndexName)
-   request.mapping(mapping, XContentType.JSON)
+   request.mapping(mapping, XContentType.JSON).settings(settings, XContentType.JSON)
Collaborator

What would happen if the settings passed in is None?

Author

The short answer: much the same thing that would happen if the mapping passed in were None.

This approach is closely modeled on how OS index mappings are passed in during index creation. There is only one caller of OSClient.createIndex: FlintJobExecutor.scala. As with the index mapping, a fixed index settings string is declared at the top of that file as an effective constant, and that value is then passed into the createIndex call. As such, the index settings inherit the same guarantee as the index mapping: the passed-in value will not be None.
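If a defensive check were wanted anyway, one option is to wrap the value in an `Option` before it reaches the request. The following is a minimal sketch under that assumption: `normalizeSettings` and `describeRequest` are hypothetical names that do not exist in this PR, and the request is simulated with plain strings rather than a real `CreateIndexRequest`.

```scala
object SettingsGuard {
  // Wraps a possibly-null or blank settings string so callers can skip it.
  def normalizeSettings(settings: String): Option[String] =
    Option(settings).map(_.trim).filter(_.nonEmpty)

  // Stand-in for building the request: lists the parts that would be applied.
  // The mapping is always applied; settings only when present.
  def describeRequest(mapping: String, settings: String): Seq[String] = {
    val mappingPart = Seq(s"mapping=$mapping")
    val settingsPart = normalizeSettings(settings).map(s => s"settings=$s").toSeq
    mappingPart ++ settingsPart
  }
}
```

With this shape, a missing settings value falls back to the cluster defaults instead of failing at request construction.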

Collaborator

@noCharger noCharger left a comment

Thanks for the change. Let's add

  1. UT on malformed settings
  2. IT to verify if the object is searchable within 1 second
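For the IT, one possible shape is a small polling helper that retries a search until the document shows up or a deadline passes. This is a sketch only: `becomesSearchable` is a hypothetical helper, and the `isSearchable` callback stands in for whatever search call the test suite actually uses.

```scala
import scala.annotation.tailrec
import scala.concurrent.duration._

object SearchablePolling {
  // Polls `isSearchable` until it returns true or `timeout` elapses.
  // Returns true if the condition held before the deadline, false otherwise.
  def becomesSearchable(timeout: FiniteDuration, interval: FiniteDuration)(
      isSearchable: () => Boolean): Boolean = {
    val deadline = timeout.fromNow

    @tailrec
    def loop(): Boolean =
      if (isSearchable()) true
      else if (deadline.isOverdue()) false
      else {
        Thread.sleep(interval.toMillis)
        loop()
      }

    loop()
  }
}
```

Since the refresh runs periodically rather than immediately after each write, the test would probably need a timeout slightly above 1 second to avoid flakiness.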

Collaborator

@dai-chen dai-chen left a comment

Could you add a PR description? Are there any side effects, and is it worth making this configurable via Spark conf?
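To make the configurability question concrete, here is a rough sketch of what a conf-driven value might look like. Everything here is an assumption: the key name `spark.flint.job.resultIndex.refreshInterval` is invented for illustration, and a plain `Map` stands in for the Spark configuration.

```scala
object RefreshIntervalConf {
  // Hypothetical key and default; neither is defined by this PR.
  val RefreshIntervalKey = "spark.flint.job.resultIndex.refreshInterval"
  val DefaultRefreshInterval = "1s"

  // `conf` stands in for the Spark configuration as a plain key/value map.
  // Falls back to the 1-second default when the key is not set.
  def resultIndexSettings(conf: Map[String, String]): String = {
    val interval = conf.getOrElse(RefreshIntervalKey, DefaultRefreshInterval)
    s"""{"index":{"refresh_interval":"$interval"}}"""
  }
}
```

The sketch does not validate or escape the configured value, so a real implementation would likely need to reject malformed intervals before building the settings string.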

Labels
None yet
Projects
None yet

5 participants