Enable developers to define dataset columns that represent transformations of one or more other dataset columns.
The actual aql might look like the following:
{% set aql %}
using customer_stream
select all activity_1 (
customer_id as customer_id,
activity_at as activity_1_at
)
append first after activity_2 (
activity_at as activity_2_at
)
derive (
datediff('d', ${activity_1_at}, ${activity_2_at}) as time_to_activity_2_days
)
{% endset %}
The resulting dataset schema should be:
customer_id (str)
activity_1_at (ts)
activity_2_at (ts)
time_to_activity_2_days (float)
Open questions:
How to identify the data type of the derived column? The types of first-level dataset columns can be inferred because the data type of the attribute and any aggregation function applied to it are both known, but derived columns can (and should) be defined with arbitrary SQL, whose result type is not known up front.
How to identify multiple derived columns? Columns are currently parsed on the assumption that a comma appears only at the end of a column alias, but arbitrary SQL (which can include commas, e.g. between function arguments) breaks that assumption.
How to apply aggregations to derived columns?
Not supported for now; the base dataset aggregation workflow semantics need to be figured out first.
How necessary are these features in aql, if the goal is interfacing in a BI layer?
Very. A code-centric interface is needed to enable automated maintenance and upkeep of dataset columns as they are canonized.
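One possible direction for the question about identifying multiple derived columns is to split the derive body only on commas that sit outside parentheses and string literals. The sketch below is an illustration under assumptions, not the project's parser: the function names `split_derived_columns` and `parse_derived_column` are hypothetical, the `last_seen_at` column is made up for the example, and it assumes single-quoted SQL strings and a lowercase ` as ` before each alias.

```python
from typing import List, Tuple


def split_derived_columns(body: str) -> List[str]:
    """Split the body of a derive (...) block on top-level commas only.

    Commas nested inside parentheses or inside single-quoted string
    literals are treated as part of the SQL expression.
    """
    parts: List[str] = []
    current: List[str] = []
    depth = 0          # current parenthesis nesting depth
    in_quote = False   # True while inside a single-quoted literal

    for ch in body:
        if ch == "'":
            # An escaped quote ('') toggles twice, so the state is preserved.
            in_quote = not in_quote
        elif not in_quote:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "," and depth == 0:
                # Top-level comma: close the current column definition.
                parts.append("".join(current).strip())
                current = []
                continue
        current.append(ch)

    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_derived_column(definition: str) -> Tuple[str, str]:
    """Return (expression, alias), splitting on the last ' as '."""
    expression, alias = definition.rsplit(" as ", 1)
    return expression.strip(), alias.strip()


# Hypothetical derive body: the first column comes from the example above,
# the second is invented to show a second definition.
body = """
    datediff('d', ${activity_1_at}, ${activity_2_at}) as time_to_activity_2_days,
    coalesce(${activity_2_at}, ${activity_1_at}) as last_seen_at
"""

columns = [parse_derived_column(d) for d in split_derived_columns(body)]
for expression, alias in columns:
    print(alias, "<-", expression)
```

This does not address the data type question: the expression is still opaque SQL, so its type would have to come from somewhere else, such as an explicit annotation or the warehouse itself.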