[FEATURE] DQL full text search offering in PPL #31

Open
@brijos

Description

Is your feature request related to a problem?

OpenSearch PPL currently lacks the intuitive full-text search that users enjoy in DQL. Users want to run simple string searches across their datasets without complex query syntax, or to see where a string occurs in their documents as a starting point for more complex analytics. The gap is most noticeable to users who are familiar with both DQL and PPL.

What solution would you like?

Implement DQL-like full-text search capabilities in PPL with the following features:

  1. Simple string search syntax similar to DQL
  2. Support for highlighted fields in search results
  3. Smart handling of different data sources with:
    • Optimized performance for OpenSearch indices
    • Configurable limitations for external data sources such as object stores
    • Clear user feedback on performance implications
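
To make features 1 and 2 concrete for the OpenSearch-index case, here is a minimal sketch of how a bare search string might be turned into an OpenSearch `_search` request body. It relies on the existing `simple_query_string` query and the `highlight` request option. The function name `build_search_body` and its parameters are hypothetical and only for illustration; how PPL would actually expose or translate the search is not decided in this issue.

```python
def build_search_body(text, fields=("*",), highlight=True):
    """Build an OpenSearch _search request body for a bare search string.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    body = {
        "query": {
            # simple_query_string tolerates free-form user input
            # and can search across several fields at once.
            "simple_query_string": {
                "query": text,
                "fields": list(fields),
            }
        }
    }
    if highlight:
        # Ask OpenSearch to return highlighted fragments for the same fields.
        body["highlight"] = {"fields": {name: {} for name in fields}}
    return body


if __name__ == "__main__":
    import json

    print(json.dumps(build_search_body("connection timeout"), indent=2))
```

With highlighting enabled, each hit in the response carries a `highlight` section next to `_source`, which is what a PPL result column for highlighted fields could be built from.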

Implementation details:

  • Add syntax support for basic text queries
  • Implement field highlighting functionality
  • Create intelligent query routing based on data source
  • Develop cost-aware query execution for external sources
  • Add user warnings/notifications for potentially expensive operations
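
The routing, cost-awareness, and warning items above could fit together roughly as follows. This is a sketch under assumptions, not a proposed design: `SourceKind`, `Plan`, `ScanLimitExceeded`, `plan_full_text_search`, the 10 GiB default limit, and the 50% warning threshold are all invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class SourceKind(Enum):
    OPENSEARCH_INDEX = "opensearch_index"
    OBJECT_STORE = "object_store"  # e.g. S3


class ScanLimitExceeded(Exception):
    """Raised when a search would scan more data than the configured limit."""


@dataclass
class Plan:
    strategy: str
    warnings: list = field(default_factory=list)


def plan_full_text_search(kind, estimated_scan_bytes, max_scan_bytes=10 * 1024**3):
    """Pick an execution strategy for a full-text search (hypothetical logic)."""
    if kind is SourceKind.OPENSEARCH_INDEX:
        # Indexed data: push the search down to the native engine.
        return Plan(strategy="native_index_search")

    # External source: no index available, so the data has to be scanned.
    if estimated_scan_bytes > max_scan_bytes:
        raise ScanLimitExceeded(
            f"search would scan {estimated_scan_bytes} bytes, "
            f"limit is {max_scan_bytes} bytes"
        )

    plan = Plan(strategy="external_scan")
    if estimated_scan_bytes > max_scan_bytes // 2:
        # Surface the cost to the user instead of failing silently or running blind.
        plan.warnings.append(
            "Full-text search on an external source scans raw data "
            "and may be slow or incur data transfer costs."
        )
    return plan
```

For example, `plan_full_text_search(SourceKind.OBJECT_STORE, 6 * 1024**3)` returns an `external_scan` plan with one warning attached, while the same call against `SourceKind.OPENSEARCH_INDEX` returns `native_index_search` with no warnings.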

What alternatives have you considered?

  1. Maintaining current PPL syntax and recommending DQL for full-text search needs
  2. Creating a hybrid syntax that combines PPL and DQL approaches
  3. Implementing a simplified subset of full-text search capabilities for external sources only

These alternatives were considered less optimal because:

  • They don't address the user need for a consistent search experience
  • They could create more confusion by introducing multiple search syntaxes
  • They wouldn't provide the seamless experience users expect

Do you have any additional context?

  • Performance considerations:
    • OpenSearch indices: Native optimization possible
    • S3: Need to consider data transfer costs and query optimization
  • User experience requirements:
    • Clear indicators for search performance expectations
    • Visual feedback for potentially costly operations
    • Documentation for best practices with different data sources
  • Implementation complexity varies by data source
  • Need for proper testing across various data volumes and sources
