Description
Is your feature request related to a problem?
OpenSearch PPL currently lacks the intuitive full-text search capabilities that users enjoy in DQL. Users want to run simple string searches across their datasets without complex query syntax, and they want to see where the matching strings occur in their documents so they can build further analytics on them. The gap is particularly noticeable for users who are familiar with both DQL and PPL.
What solution would you like?
Implement DQL-like full-text search capabilities in PPL with the following features:
- Simple string search syntax similar to DQL
- Support for highlighted fields in search results
- Smart handling of different data sources with:
  - Optimized performance for OpenSearch indices
  - Configurable limitations for external data sources such as object stores
  - Clear user feedback on performance implications
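To make the first two features concrete, here is a minimal sketch of how a bare search term could be mapped onto the existing `query_string` query and `highlight` section of the OpenSearch search API. The helper name `build_search_body` and the decision to highlight every matching field are assumptions made for illustration only; they are not part of PPL or of any proposed design.

```python
import json


def build_search_body(term: str, highlight: bool = True) -> dict:
    """Hypothetical helper: turn a bare search term into a search request body."""
    # query_string parses the term with the same Lucene-style syntax DQL users know.
    body = {"query": {"query_string": {"query": term}}}
    if highlight:
        # The "*" wildcard asks the highlighter to consider every eligible field.
        body["highlight"] = {"fields": {"*": {}}}
    return body


if __name__ == "__main__":
    print(json.dumps(build_search_body("timeout error"), indent=2))
```

How the PPL grammar would expose the search term is left open here; the sketch only shows that the request body for an OpenSearch index can stay small.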
Implementation details:
- Add syntax support for basic text queries
- Implement field highlighting functionality
- Create intelligent query routing based on data source
- Develop cost-aware query execution for external sources
- Add user warnings/notifications for potentially expensive operations
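The routing, cost-awareness, and warning items above could fit together roughly as follows. This is a sketch under stated assumptions: `DataSource`, `QueryPlan`, `plan_text_search`, the `"opensearch"`/`"external"` labels, and both byte thresholds are hypothetical names and placeholder values, not existing plugin APIs or recommended limits.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataSource:
    name: str
    kind: str  # assumed labels: "opensearch" or "external" (e.g. an object store)
    estimated_scan_bytes: Optional[int] = None  # None when no estimate is available


@dataclass
class QueryPlan:
    strategy: str
    allowed: bool = True
    warnings: List[str] = field(default_factory=list)


def plan_text_search(
    source: DataSource,
    warn_bytes: int = 1 << 30,    # placeholder: warn above 1 GiB
    max_bytes: int = 100 << 30,   # placeholder: refuse above 100 GiB
) -> QueryPlan:
    """Pick an execution strategy and collect user-facing warnings."""
    if source.kind == "opensearch":
        # Indexed data: full-text search can be pushed down to the index.
        return QueryPlan(strategy="native_index_search")

    # External data: every search is a scan, so cost depends on bytes read.
    plan = QueryPlan(strategy="external_scan")
    size = source.estimated_scan_bytes
    if size is None:
        plan.warnings.append(f"{source.name}: scan size unknown; cost cannot be estimated")
    elif size > max_bytes:
        plan.allowed = False
        plan.warnings.append(f"{source.name}: scan of {size} bytes exceeds the configured limit")
    elif size > warn_bytes:
        plan.warnings.append(f"{source.name}: scan of {size} bytes may be slow or costly")
    return plan


if __name__ == "__main__":
    print(plan_text_search(DataSource("app-logs", "opensearch")))
    print(plan_text_search(DataSource("s3-archive", "external", 5 << 30)))
```

Returning warnings alongside the plan, rather than raising, lets the UI decide how to surface them to the user.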
What alternatives have you considered?
- Maintaining current PPL syntax and recommending DQL for full-text search needs
- Creating a hybrid syntax that combines PPL and DQL approaches
- Implementing a simplified subset of full-text search capabilities for external sources only
These alternatives were considered less optimal because:
- They don't address the user need for a consistent search experience
- They could create more confusion by introducing multiple search syntaxes
- They wouldn't provide the seamless experience users expect
Do you have any additional context?
- Performance considerations:
  - OpenSearch indices: Native optimization possible
  - S3: Need to consider data transfer costs and query optimization
- User experience requirements:
  - Clear indicators for search performance expectations
  - Visual feedback for potentially costly operations
  - Documentation for best practices with different data sources
- Implementation complexity varies by data source
- Need for proper testing across various data volumes and sources