#3016 New expand PPL command #3305

Draft
wants to merge 81 commits into
base: main
Commits
f462e2a
Add flatten command to ANTLR lexer and parser.
currantw Jan 17, 2025
69f0b1a
Skeleton implementation, tests, and documents with lots of TODOs.
currantw Jan 20, 2025
c1ac737
Initial implementation
currantw Jan 20, 2025
366e162
Fix typo
currantw Jan 24, 2025
0cbd8d4
Initial implementation
currantw Jan 27, 2025
26e9443
Update/fix tests.
currantw Jan 27, 2025
237b69e
Update integration tests to align with doc tests.
currantw Jan 31, 2025
3981c38
Minor cleanup.
currantw Jan 28, 2025
2ca7194
Add `ExplainIT` tests for flatten
currantw Jan 28, 2025
9ddfc4a
Revert recursive flattening, add documentation, more test updates
currantw Jan 28, 2025
c54c1f5
One more doctest fix
currantw Jan 28, 2025
8993e11
Fix `ExplainIT` error
currantw Jan 28, 2025
288add2
Add additional test case to `flatten.rst`
currantw Jan 28, 2025
eca3154
Fix `FlattenCommandIT`, add additional test case.
currantw Jan 28, 2025
c89a302
Fix `PhysicalPlanNodeVisitor` test coverage.
currantw Jan 28, 2025
9b2e9ce
Review: use `StringUtils.format` instead of `String.format`.
currantw Jan 29, 2025
82c8ccb
Fix `LogicalFlattenTest`.
currantw Jan 29, 2025
b7d8794
Simplify algorithm for `Analyzer`.
currantw Jan 29, 2025
ca013ef
Update to support flattening nested structs.
currantw Jan 30, 2025
7920bd8
Fix unrelated bug in `IPFUnctionsTest`.
currantw Jan 30, 2025
9d6459f
Update `IPFUnctionsTest` to anchor at start.
currantw Jan 30, 2025
6d040eb
Minor cleanup.
currantw Jan 30, 2025
43c0902
Fix doctest formatting.
currantw Jan 30, 2025
40362bf
Address minor review comments.
currantw Jan 30, 2025
b0a6710
Fix doc tests.
currantw Jan 31, 2025
be26660
Update integratation tests to align with doc tests.
currantw Jan 31, 2025
b3e4401
Review - minor documentation updates.
currantw Jan 31, 2025
4099f10
Remove double periods
currantw Feb 1, 2025
b96cefa
Add comment on `Map.equals`.
currantw Feb 1, 2025
72d98ed
Remove unnecessary error checks.
currantw Feb 1, 2025
4632c03
Update to maintain existing field.
currantw Feb 3, 2025
d755208
Update for test coverage
currantw Feb 3, 2025
09563ab
Simplify `Analyzer` implementation
currantw Feb 3, 2025
1d391ce
Rename `cities` dataset to `flatten`
currantw Feb 5, 2025
ef750f4
SpotlessApply
currantw Feb 5, 2025
14e005e
Minor doc cleanup.
currantw Feb 5, 2025
73885a7
Fix failing IT
currantw Feb 5, 2025
4fbd320
Update incorrect documentation in `Analyzer.visitFlatten`.
currantw Feb 5, 2025
337fb01
Update integ and doc tests to add another example of original field b…
currantw Feb 6, 2025
abe5c6c
Review comment - move example to `Analyzer.visitFlatten` Javadoc.
currantw Feb 6, 2025
a0022f4
Review comment - update `Analyzer.visitFlatten` Javadoc to specify th…
currantw Feb 6, 2025
df99d37
Review comment - remove unnecessary @Getter
currantw Feb 6, 2025
6883214
Review comments - add `testStructNestedDeep` test case
currantw Feb 6, 2025
94a4c8a
Review comments - add `testStructNestedDeep` test case
currantw Feb 6, 2025
26563c9
Woops! Fix failing test.
currantw Feb 6, 2025
bfb51a5
Review comments - extract `PathUtils` constants
currantw Feb 6, 2025
22eccaf
Review comments - update `Analyzer` to not use `Optional`.
currantw Feb 7, 2025
dcd241a
Bunch of additional review comments.
currantw Feb 7, 2025
befe55b
Spotless
currantw Feb 7, 2025
eb93cb1
Spotless
currantw Feb 7, 2025
1f05e85
Additional review comments, including move constants to `ExprValueUti…
currantw Feb 7, 2025
db96c51
Review comments - update tests for exception msg
currantw Feb 7, 2025
c1666ee
Review comments - simplify `FlattenOperator.flattenExprValueAtPath`.
currantw Feb 7, 2025
6e176a3
Change braces in documentation.
currantw Feb 10, 2025
ab5a2fe
Initial implementation of skeleton classes and methods.
currantw Feb 3, 2025
e7e5a5a
Implement some of the `expand` logic
currantw Feb 4, 2025
a8f6855
Add `PathUtils` and unit tests.
currantw Feb 4, 2025
cf357ea
Update `ExpandOperator` and unit tests.
currantw Feb 5, 2025
c260f6b
Implement integration tests, update `Expand` logic, rename data set.
currantw Feb 5, 2025
0d3c33c
Implement integration tests, update `Expand` logic, rename data set.
currantw Feb 5, 2025
209326d
Add `expand.rst` documentation and further updates to tests/implement…
currantw Feb 5, 2025
dd9a024
Unrelated typo fix
currantw Feb 5, 2025
e31df37
Cleanup, modify to return `null` for an empty array.
currantw Feb 6, 2025
274719d
Fix `test_docs.py` typo, order alphabetically.
currantw Feb 10, 2025
459ca24
Minor cleanup, mostly alphabetizing constants.
currantw Feb 10, 2025
eef907f
Add new doc and integration tests
currantw Feb 11, 2025
77afac7
Fix missing test coverage
currantw Feb 11, 2025
49078e9
Move `PathUtils` to `ExprValueUtils` and update tests.
currantw Feb 11, 2025
ff0d5ae
Use `ExprValueUtils` to simplify `FlattenOperator`
currantw Feb 11, 2025
790104c
Simplify and make consistent flatten and expand operators.
currantw Feb 12, 2025
ec47f8f
Update `ExprValueUtils` and unit tests.
currantw Feb 12, 2025
101b653
Make constants private
currantw Feb 12, 2025
ec0f44f
Spotless
currantw Feb 12, 2025
da7738a
Add hashCode unit tests
currantw Feb 12, 2025
4284f6f
Trivial documentation cleanup.
currantw Feb 12, 2025
1e51881
Fix doctest
currantw Feb 12, 2025
c4f732e
General cleanup, combine flatten and expand datasets.
currantw Feb 12, 2025
582fdaa
Spotless
currantw Feb 12, 2025
e8494f2
Fix failing doctests
currantw Feb 12, 2025
028e074
Fix `ExplainIT` test
currantw Feb 13, 2025
01569ae
Handle empty collections.
currantw Feb 13, 2025
Initial implementation of skeleton classes and methods.
Signed-off-by: currantw <taylor.curran@improving.com>
currantw committed Feb 13, 2025
commit ab5a2fef767b8109213a6912a37f22a47be435c7
27 changes: 22 additions & 5 deletions core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
Original file line number Diff line number Diff line change
@@ -38,6 +38,7 @@
import org.apache.commons.lang3.tuple.ImmutablePair;
import org.apache.commons.lang3.tuple.Pair;
import org.opensearch.sql.DataSourceSchemaName;
import org.opensearch.sql.analysis.symbol.Namespace;
import org.opensearch.sql.analysis.symbol.Symbol;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Argument;
@@ -52,6 +53,7 @@
import org.opensearch.sql.ast.tree.CloseCursor;
import org.opensearch.sql.ast.tree.Dedupe;
import org.opensearch.sql.ast.tree.Eval;
import org.opensearch.sql.ast.tree.Expand;
import org.opensearch.sql.ast.tree.FetchCursor;
import org.opensearch.sql.ast.tree.FillNull;
import org.opensearch.sql.ast.tree.Filter;
@@ -333,9 +335,11 @@ public LogicalPlan visitAggregation(Aggregation node, AnalysisContext context) {
TypeEnvironment newEnv = context.peek();
aggregators.forEach(
aggregator ->
- newEnv.define(new Symbol(FIELD_NAME, aggregator.getName()), aggregator.type()));
+ newEnv.define(
+ new Symbol(Namespace.FIELD_NAME, aggregator.getName()), aggregator.type()));
groupBys.forEach(
- group -> newEnv.define(new Symbol(FIELD_NAME, group.getNameOrAlias()), group.type()));
+ group ->
+ newEnv.define(new Symbol(Namespace.FIELD_NAME, group.getNameOrAlias()), group.type()));
return new LogicalAggregation(child, aggregators, groupBys);
}

@@ -360,8 +364,9 @@ public LogicalPlan visitRareTopN(RareTopN node, AnalysisContext context) {
context.push();
TypeEnvironment newEnv = context.peek();
groupBys.forEach(
- group -> newEnv.define(new Symbol(FIELD_NAME, group.toString()), group.type()));
- fields.forEach(field -> newEnv.define(new Symbol(FIELD_NAME, field.toString()), field.type()));
+ group -> newEnv.define(new Symbol(Namespace.FIELD_NAME, group.toString()), group.type()));
+ fields.forEach(
+ field -> newEnv.define(new Symbol(Namespace.FIELD_NAME, field.toString()), field.type()));

List<Argument> options = node.getNoOfResults();
Integer noOfResults = (Integer) options.get(0).getValue().getValue();
@@ -427,7 +432,8 @@ public LogicalPlan visitProject(Project node, AnalysisContext context) {
context.push();
TypeEnvironment newEnv = context.peek();
namedExpressions.forEach(
- expr -> newEnv.define(new Symbol(FIELD_NAME, expr.getNameOrAlias()), expr.type()));
+ expr ->
+ newEnv.define(new Symbol(Namespace.FIELD_NAME, expr.getNameOrAlias()), expr.type()));
List<NamedExpression> namedParseExpressions = context.getNamedParseExpressions();
return new LogicalProject(child, namedExpressions, namedParseExpressions);
}
@@ -449,6 +455,17 @@ public LogicalPlan visitEval(Eval node, AnalysisContext context) {
return new LogicalEval(child, expressionsBuilder.build());
}

/**
* Builds and returns a {@link org.opensearch.sql.planner.logical.LogicalExpand} corresponding to
* the given expand node.
*/
@Override
public LogicalPlan visitExpand(Expand node, AnalysisContext context) {

// TODO #3016: Implement expand command
return null;
}

/**
* Builds and returns a {@link org.opensearch.sql.planner.logical.LogicalFlatten} corresponding to
* the given flatten node, and adds the new fields to the current type environment.
Original file line number Diff line number Diff line change
@@ -44,6 +44,7 @@
import org.opensearch.sql.ast.tree.CloseCursor;
import org.opensearch.sql.ast.tree.Dedupe;
import org.opensearch.sql.ast.tree.Eval;
import org.opensearch.sql.ast.tree.Expand;
import org.opensearch.sql.ast.tree.FetchCursor;
import org.opensearch.sql.ast.tree.FillNull;
import org.opensearch.sql.ast.tree.Filter;
@@ -104,6 +105,10 @@ public T visitRelationSubquery(RelationSubquery node, C context) {
return visitChildren(node, context);
}

public T visitExpand(Expand node, C context) {
return visitChildren(node, context);
}

public T visitTableFunction(TableFunction node, C context) {
return visitChildren(node, context);
}
5 changes: 5 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
Original file line number Diff line number Diff line change
@@ -49,6 +49,7 @@
import org.opensearch.sql.ast.tree.Aggregation;
import org.opensearch.sql.ast.tree.Dedupe;
import org.opensearch.sql.ast.tree.Eval;
import org.opensearch.sql.ast.tree.Expand;
import org.opensearch.sql.ast.tree.FillNull;
import org.opensearch.sql.ast.tree.Filter;
import org.opensearch.sql.ast.tree.Flatten;
@@ -105,6 +106,10 @@ public static Eval eval(UnresolvedPlan input, Let... projectList) {
return new Eval(Arrays.asList(projectList)).attach(input);
}

public Expand expand(UnresolvedPlan input, Field field) {
return new Expand(field).attach(input);
}

public Flatten flatten(UnresolvedPlan input, Field field) {
return new Flatten(field).attach(input);
}
40 changes: 40 additions & 0 deletions core/src/main/java/org/opensearch/sql/ast/tree/Expand.java
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.ast.tree;

import com.google.common.collect.ImmutableList;
import java.util.List;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.ToString;
import org.opensearch.sql.ast.AbstractNodeVisitor;
import org.opensearch.sql.ast.expression.Field;

/** AST node representing an {@code expand <field>} operation. */
@Getter
@ToString
@RequiredArgsConstructor
public class Expand extends UnresolvedPlan {
private UnresolvedPlan child;

private final Field field;

@Override
public Expand attach(UnresolvedPlan child) {
this.child = child;
return this;
}

@Override
public List<UnresolvedPlan> getChild() {
return ImmutableList.of(child);
}

@Override
public <T, C> T accept(AbstractNodeVisitor<T, C> nodeVisitor, C context) {
return nodeVisitor.visitExpand(this, context);
}
}
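Note on the wiring: `Expand.accept` calls `visitExpand`, and the base visitor's `visitExpand` falls back to a generic handler unless a subclass such as `Analyzer` overrides it. The following is a minimal, self-contained sketch of that double-dispatch pattern. Every name in it (`VisitorSketch`, `NodeVisitor`, `ExpandNode`, `Describer`) is hypothetical and only mirrors the shape of the project classes; it is not the PR's code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical miniature of the visitor wiring; none of these are project classes. */
final class VisitorSketch {

  /** Base visitor: the node-specific hook falls back to a generic handler. */
  static class NodeVisitor<T, C> {
    T visitNode(Node node, C context) {
      return null; // Generic default, analogous to a "do nothing" fallback.
    }

    T visitExpandNode(ExpandNode node, C context) {
      return visitNode(node, context);
    }
  }

  /** Every node knows which visitor hook to call for itself. */
  abstract static class Node {
    abstract <T, C> T accept(NodeVisitor<T, C> visitor, C context);
  }

  /** Stand-in for an expand node that carries only a field name. */
  static final class ExpandNode extends Node {
    final String field;

    ExpandNode(String field) {
      this.field = field;
    }

    @Override
    <T, C> T accept(NodeVisitor<T, C> visitor, C context) {
      return visitor.visitExpandNode(this, context);
    }
  }

  /** A visitor that overrides only the hook it cares about. */
  static final class Describer extends NodeVisitor<String, List<String>> {
    @Override
    String visitExpandNode(ExpandNode node, List<String> context) {
      context.add(node.field); // Record the field in the caller-supplied context.
      return "expand " + node.field;
    }
  }

  /** Dispatches through accept(), so the node picks the hook. */
  static String describe(Node node, List<String> seenFields) {
    return node.accept(new Describer(), seenFields);
  }
}
```

Calling `VisitorSketch.describe(new VisitorSketch.ExpandNode("tags"), seen)` goes through `accept`, reaches the overridden hook, and records `tags` in `seen`. A plain `NodeVisitor` takes the fallback path instead.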
9 changes: 9 additions & 0 deletions core/src/main/java/org/opensearch/sql/executor/Explain.java
Original file line number Diff line number Diff line change
@@ -22,6 +22,7 @@
import org.opensearch.sql.planner.physical.AggregationOperator;
import org.opensearch.sql.planner.physical.DedupeOperator;
import org.opensearch.sql.planner.physical.EvalOperator;
import org.opensearch.sql.planner.physical.ExpandOperator;
import org.opensearch.sql.planner.physical.FilterOperator;
import org.opensearch.sql.planner.physical.FlattenOperator;
import org.opensearch.sql.planner.physical.LimitOperator;
@@ -161,6 +162,14 @@ public ExplainResponseNode visitEval(EvalOperator node, Object context) {
ImmutableMap.of("expressions", convertPairListToMap(node.getExpressionList()))));
}

@Override
public ExplainResponseNode visitExpand(ExpandOperator node, Object context) {
return explain(
node,
context,
explainNode -> explainNode.setDescription(ImmutableMap.of("expandField", node.getField())));
}

@Override
public ExplainResponseNode visitFlatten(FlattenOperator node, Object context) {
return explain(
Original file line number Diff line number Diff line change
@@ -10,6 +10,7 @@
import org.opensearch.sql.planner.logical.LogicalCloseCursor;
import org.opensearch.sql.planner.logical.LogicalDedupe;
import org.opensearch.sql.planner.logical.LogicalEval;
import org.opensearch.sql.planner.logical.LogicalExpand;
import org.opensearch.sql.planner.logical.LogicalFetchCursor;
import org.opensearch.sql.planner.logical.LogicalFilter;
import org.opensearch.sql.planner.logical.LogicalFlatten;
@@ -31,6 +32,7 @@
import org.opensearch.sql.planner.physical.CursorCloseOperator;
import org.opensearch.sql.planner.physical.DedupeOperator;
import org.opensearch.sql.planner.physical.EvalOperator;
import org.opensearch.sql.planner.physical.ExpandOperator;
import org.opensearch.sql.planner.physical.FilterOperator;
import org.opensearch.sql.planner.physical.FlattenOperator;
import org.opensearch.sql.planner.physical.LimitOperator;
@@ -101,6 +103,11 @@ public PhysicalPlan visitEval(LogicalEval node, C context) {
return new EvalOperator(visitChild(node, context), node.getExpressions());
}

@Override
public PhysicalPlan visitExpand(LogicalExpand node, C context) {
return new ExpandOperator(visitChild(node, context), node.getFieldRefExp());
}

@Override
public PhysicalPlan visitFlatten(LogicalFlatten node, C context) {
return new FlattenOperator(visitChild(node, context), node.getFieldRefExp());
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.planner.logical;

import java.util.Collections;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.ToString;
import org.opensearch.sql.expression.ReferenceExpression;

/** Logical plan that represents the expand command. */
@Getter
@ToString
@EqualsAndHashCode(callSuper = true)
public class LogicalExpand extends LogicalPlan {
private final ReferenceExpression fieldRefExp;

public LogicalExpand(LogicalPlan child, ReferenceExpression fieldRefExp) {
super(Collections.singletonList(child));
this.fieldRefExp = fieldRefExp;
}

@Override
public <R, C> R accept(LogicalPlanNodeVisitor<R, C> visitor, C context) {
return visitor.visitExpand(this, context);
}
}
Original file line number Diff line number Diff line change
@@ -97,6 +97,10 @@ public static LogicalPlan eval(
return new LogicalEval(input, Arrays.asList(expressions));
}

public LogicalPlan expand(LogicalPlan input, ReferenceExpression fieldRefExp) {
return new LogicalExpand(input, fieldRefExp);
}

public LogicalPlan flatten(LogicalPlan input, ReferenceExpression fieldRefExp) {
return new LogicalFlatten(input, fieldRefExp);
}
Original file line number Diff line number Diff line change
@@ -72,6 +72,10 @@ public R visitEval(LogicalEval plan, C context) {
return visitNode(plan, context);
}

public R visitExpand(LogicalExpand plan, C context) {
return visitNode(plan, context);
}

public R visitFlatten(LogicalFlatten plan, C context) {
return visitNode(plan, context);
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
/*
* Copyright OpenSearch Contributors
* SPDX-License-Identifier: Apache-2.0
*/

package org.opensearch.sql.planner.physical;

import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;
import lombok.EqualsAndHashCode;
import lombok.Getter;
import lombok.RequiredArgsConstructor;
import lombok.ToString;
import org.opensearch.sql.data.model.ExprValue;
import org.opensearch.sql.data.model.ExprValueUtils;
import org.opensearch.sql.expression.ReferenceExpression;

/** Expands the specified field from the input and returns the result. */
@Getter
@ToString
@RequiredArgsConstructor
@EqualsAndHashCode(callSuper = false)
public class ExpandOperator extends PhysicalPlan {

private final PhysicalPlan input;
private final ReferenceExpression field;

private static final Pattern PATH_SEPARATOR_PATTERN = Pattern.compile(".", Pattern.LITERAL);

@Override
public <R, C> R accept(PhysicalPlanNodeVisitor<R, C> visitor, C context) {
return visitor.visitExpand(this, context);
}

@Override
public List<PhysicalPlan> getChild() {
return Collections.singletonList(input);
}

@Override
public boolean hasNext() {

// TODO #3016: Implement expand command
return false;
}

@Override
public ExprValue next() {

// TODO #3016: Implement expand command
return ExprValueUtils.nullValue();
}
}
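Note on the stubs: `hasNext()` and `next()` are still TODOs in this commit, so the row semantics are not visible here. Below is a rough sketch of one plausible behavior, written over plain `Map` rows rather than `ExprValue`, and in batch form rather than as an iterator. It assumes that expand emits one row per array element, that non-array values pass through unchanged, and that an empty array yields a single row with `null` (the last point is inferred from the later commit message "modify to return `null` for an empty array"). `ExpandSketch` and its method are hypothetical names, not part of the PR.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of expand semantics; not the operator from this PR. */
final class ExpandSketch {

  /** Emits one output row per element of the list stored under a top-level field. */
  static List<Map<String, Object>> expand(List<Map<String, Object>> rows, String field) {
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      Object value = row.get(field);
      if (!(value instanceof List)) {
        out.add(row); // Assumption: non-array values pass through unchanged.
        continue;
      }
      List<?> elements = (List<?>) value;
      if (elements.isEmpty()) {
        // Assumption drawn from a later commit message: an empty array becomes null.
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.put(field, null);
        out.add(copy);
        continue;
      }
      for (Object element : elements) {
        // Copy the row so the other fields repeat on every expanded row.
        Map<String, Object> copy = new LinkedHashMap<>(row);
        copy.put(field, element);
        out.add(copy);
      }
    }
    return out;
  }
}
```

Under these assumptions, a row with `id = 1` and `tags = [x, y]` becomes two rows that both keep `id = 1`, one with `tags = x` and one with `tags = y`. An iterator-based operator would produce the same rows lazily.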
Original file line number Diff line number Diff line change
@@ -60,6 +60,10 @@ public static EvalOperator eval(
return new EvalOperator(input, Arrays.asList(expressions));
}

public ExpandOperator expand(PhysicalPlan input, ReferenceExpression fieldRefExp) {
return new ExpandOperator(input, fieldRefExp);
}

public FlattenOperator flatten(PhysicalPlan input, ReferenceExpression fieldRefExp) {
return new FlattenOperator(input, fieldRefExp);
}
Original file line number Diff line number Diff line change
@@ -56,6 +56,10 @@ public R visitEval(EvalOperator node, C context) {
return visitNode(node, context);
}

public R visitExpand(ExpandOperator node, C context) {
return visitNode(node, context);
}

public R visitFlatten(FlattenOperator node, C context) {
return visitNode(node, context);
}
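Note on `PATH_SEPARATOR_PATTERN`: `ExpandOperator` above compiles `"."` with `Pattern.LITERAL` but does not use the pattern yet. Presumably it is meant for splitting dotted field paths; that is an assumption, not something this commit shows. The sketch below illustrates why the flag matters. `PathSplitSketch` and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical sketch contrasting a literal "." separator with a regex ".". */
final class PathSplitSketch {

  // With LITERAL, "." matches only an actual dot character.
  private static final Pattern LITERAL_DOT = Pattern.compile(".", Pattern.LITERAL);

  // Without LITERAL, "." is a regex wildcard that matches any character.
  private static final Pattern REGEX_DOT = Pattern.compile(".");

  /** Splits a dotted path such as "a.b.c" into its components. */
  static List<String> splitLiteral(String path) {
    return Arrays.asList(LITERAL_DOT.split(path));
  }

  /** Same call with the wildcard pattern, shown only for contrast. */
  static List<String> splitRegex(String path) {
    return Arrays.asList(REGEX_DOT.split(path));
  }
}
```

`splitLiteral("a.b.c")` yields the components `a`, `b`, `c`. `splitRegex("a.b.c")` yields an empty list, because every character counts as a separator and the trailing empty strings are dropped.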
20 changes: 20 additions & 0 deletions core/src/test/java/org/opensearch/sql/executor/ExplainTest.java
Original file line number Diff line number Diff line change
@@ -11,6 +11,7 @@
import static org.opensearch.sql.ast.tree.RareTopN.CommandType.TOP;
import static org.opensearch.sql.ast.tree.Sort.SortOption.DEFAULT_ASC;
import static org.opensearch.sql.ast.tree.Trendline.TrendlineType.SMA;
import static org.opensearch.sql.data.type.ExprCoreType.ARRAY;
import static org.opensearch.sql.data.type.ExprCoreType.DOUBLE;
import static org.opensearch.sql.data.type.ExprCoreType.INTEGER;
import static org.opensearch.sql.data.type.ExprCoreType.STRING;
@@ -56,6 +57,7 @@
import org.opensearch.sql.expression.ReferenceExpression;
import org.opensearch.sql.expression.aggregation.NamedAggregator;
import org.opensearch.sql.expression.window.WindowDefinition;
import org.opensearch.sql.planner.physical.ExpandOperator;
import org.opensearch.sql.planner.physical.FlattenOperator;
import org.opensearch.sql.planner.physical.PhysicalPlan;
import org.opensearch.sql.planner.physical.TrendlineOperator;
@@ -301,6 +303,24 @@ void can_explain_trendline() {
explain.apply(plan));
}

@Test
void can_explain_expand() {
String fieldName = "field_name";
ReferenceExpression fieldReference = ref(fieldName, ARRAY);

PhysicalPlan plan = new ExpandOperator(tableScan, fieldReference);
ExplainResponse actual = explain.apply(plan);

ExplainResponse expected =
new ExplainResponse(
new ExplainResponseNode(
"ExpandOperator",
ImmutableMap.of("expandField", fieldReference),
singletonList(tableScan.explainNode())));

assertEquals(expected, actual, "explain expand");
}

@Test
void can_explain_flatten() {
String fieldName = "field_name";