Speed up time-series aggregation #127444

Open
@dnhatn

Time-series aggregations, such as {agg}_over_time and rate, against time-series indices are currently slow for several reasons:

  1. They require two phases:
    • First, grouping by each time-series (by tsid and timebucket).
    • Then, grouping by user-specified groups.
  2. For rate aggregations, data must be provided in timestamp order per time-series.
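
To make the two phases concrete, here is a minimal sketch in Python rather than the actual ES|QL operators. The tuple layout `(tsid, group, timestamp, value)`, the function name `two_phase_rate`, and the simplified rate (last minus first over elapsed time, ignoring counter resets) are all assumptions made for illustration.

```python
from collections import defaultdict

def two_phase_rate(docs, bucket_seconds):
    """docs: iterable of (tsid, group, timestamp_seconds, counter_value).

    Hypothetical layout; each tsid is assumed to belong to exactly one group.
    """
    # Phase 1: collect samples per time series and time bucket.
    per_series = defaultdict(list)
    group_of = {}
    for tsid, group, ts, value in docs:
        bucket = ts - ts % bucket_seconds
        per_series[(tsid, bucket)].append((ts, value))
        group_of[tsid] = group

    # Phase 2: reduce each series to one rate, then combine by user group.
    per_group = defaultdict(float)
    for (tsid, bucket), samples in per_series.items():
        samples.sort()  # rate needs timestamp order within each series
        (t0, v0), (t1, v1) = samples[0], samples[-1]
        if t1 > t0:
            # Simplified rate: no counter-reset handling, no extrapolation.
            per_group[(group_of[tsid], bucket)] += (v1 - v0) / (t1 - t0)
    return dict(per_group)
```

The explicit `sort()` stands in for the ordering requirement in point 2: if the source already delivers each series in timestamp order, that step can be dropped.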

This issue proposes some ideas and tracks optimizations to improve the performance of time-series aggregations in ES|QL.

Source command

Execution

Values aggregation

Block hash

Planning

  • Use a single aggregation for the second phase.
  • Optimize for a single target index.
  • Skip backing indices with start_time and end_time outside the TRANGE filter.
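
The last item amounts to an interval-overlap check at planning time. The sketch below is illustrative only: `BackingIndex` and `indices_to_search` are hypothetical names, and it assumes both the index range and the query range are half-open intervals `[start, end)` expressed in the same time unit.

```python
from typing import NamedTuple

class BackingIndex(NamedTuple):
    name: str
    start_time: int  # assumed inclusive
    end_time: int    # assumed exclusive

def indices_to_search(indices, query_from, query_to):
    """Keep only backing indices whose time range overlaps [query_from, query_to)."""
    return [
        idx for idx in indices
        if idx.start_time < query_to and idx.end_time > query_from
    ]
```

An index that fails the check cannot contain matching documents, so it can be dropped before any shard is contacted.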

Misc

Migrated from 105397, to be considered:

  • Add support for a sparse index to navigate the documents of a time series easily (Sparse index for tsdb #95701). This is required for determining the last value of a metric and skipping to the last value of the next time series, and for other functionality such as interpolation and geofencing. Additionally, a query may be too selective and mask documents that are valid metrics of a time series; a sparse index would let us access those metrics even in that case.
  • Enhance the time-series grouping operator to also group by time series and time interval. A typical use case groups by both time series and time interval, which is what happens when the BUCKET syntax is used.
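
As a rough illustration of the sparse-index idea, the sketch below records the offset of the first document of each time series, so that a reader can jump from one series to the next without scanning every document. It assumes documents are sorted by tsid and then by descending timestamp, which makes the first document of each series its latest value. The function names and the `(tsid, timestamp, value)` layout are hypothetical and are not taken from #95701.

```python
def build_sparse_index(docs):
    """Return [(tsid, offset)] with one entry per series, at its first document."""
    index = []
    for offset, (tsid, _ts, _value) in enumerate(docs):
        if not index or index[-1][0] != tsid:
            index.append((tsid, offset))
    return index

def last_values(docs, sparse_index):
    """Read the latest value of each series by jumping to the recorded offsets."""
    # Assumes docs are sorted by (tsid, timestamp descending).
    return {tsid: docs[offset][2] for tsid, offset in sparse_index}
```

With such an index, finding the last value of every series costs one lookup per series rather than one per document.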
