-
Notifications
You must be signed in to change notification settings - Fork 360
How to retrieve the start time of the next group from the event table with esProc
After sorting the event table of a certain database table by timestamp, adjacent value fields may sometimes be continuously identical.
| id | value | timestamp |
|---|---|---|
| 1 | 1 | 2023-11-10 13:00:00 |
| 2 | 2 | 2023-11-11 13:00:00 |
| 3 | 2 | 2023-11-12 13:00:00 |
| 4 | 1 | 2023-11-13 13:00:00 |
| 5 | 1 | 2023-11-14 13:00:00 |
| 6 | 1 | 2023-11-15 13:00:00 |
| 7 | 2 | 2023-11-16 13:00:00 |
| 8 | 2 | 2023-11-17 13:00:00 |
| 9 | 1 | 2023-11-18 13:00:00 |
Now we need to group adjacent records with the same value, extract the start time of this group and the start time of the next group as the start and end time of this group, and form a new two-dimensional table. The starting time for the next group of the last group is agreed to be "9999-12-31 00:00:00".
| id | value | effective_from | effective_to |
|---|---|---|---|
| 1 | 1 | 2023-11-10 13:00:00 | 2023-11-11 13:00:00 |
| 2 | 2 | 2023-11-11 13:00:00 | 2023-11-13 13:00:00 |
| 4 | 1 | 2023-11-13 13:00:00 | 2023-11-16 13:00:00 |
| 7 | 2 | 2023-11-16 13:00:00 | 2023-11-18 13:00:00 |
| 9 | 1 | 2023-11-18 13:00:00 | 9999-12-31 00:00:00 |
SQL does not directly support grouping same adjacent values into one group, and is difficult to retain the groups for further calculation, and the indirectly implemented code is very complex. SPL supports grouping adjacent data, and can retain the grouped subsets for further calculation:
https://try.esproc.com/splx?3st
| A | |
|---|---|
| 1 | $select * from equipments_staging.csv order by timestamp |
| 2 | =A1.group@o(value) |
| 3 | =A2.new(id,value, timestamp:effective_from, ifn(~[1].timestamp,datetime("9999-12-31 00:00:00")):effective_to) |
A1: Load data and sort by timestamp.
A2: Divide adjacent records with the same value into one group, with each group being a set. The function group is used for grouping, but not aggregate. By default, it compares the values of the entire column, that is, equivalence grouping. @o represents comparing adjacent values and belongs to ordered grouping. The first three groups are shown in the figure:

The complete code for effective_to is ~1.timestamp, abbreviated as ~[1].timestamp, where ~ represents the current group and can be omitted when used alone, but cannot be omitted when expressing relative position and other meanings; [1] indicate the next group in relative position; (1) represents the first member and can be omitted.
The function ifn takes the first non-null member from the parameter.
SPL Resource: SPL Official Website | SPL Blog | Download esProc SPL | SPL Source Code
