Commit cf58d3a

[Feature] implement cardinality(), element_at() (backport #22846) (#23389)
* [Feature] implement cardinality(), element_at() (backport #22846)

  Signed-off-by: Zhuhe Fang <[email protected]>

* update codes

  Signed-off-by: Zhuhe Fang <[email protected]>

---------

Signed-off-by: Zhuhe Fang <[email protected]>
1 parent 84c8bcd commit cf58d3a

11 files changed, +191 -1 lines changed

docs/TOC.md

+4
@@ -359,6 +359,8 @@
  + [array_sum](./sql-reference/sql-functions/array-functions/array_sum.md)
  + [arrays_overlap](./sql-reference/sql-functions/array-functions/arrays_overlap.md)
  + [array_to_bitmap](./sql-reference/sql-functions/array-functions/array_to_bitmap.md)
+ + [cardinality](./sql-reference/sql-functions/array-functions/cardinality.md)
+ + [element_at](./sql-reference/sql-functions/array-functions/element_at.md)
  + [reverse](./sql-reference/sql-functions/array-functions/reverse.md)
  + [unnest](./sql-reference/sql-functions/array-functions/unnest.md)
  + Bit Functions
@@ -414,6 +416,8 @@
  + [json_query](./sql-reference/sql-functions/json-functions/json-query-and-processing-functions/json_query.md)
  + [json_string](./sql-reference/sql-functions/json-functions/json-query-and-processing-functions/json_string.md)
  + Map Functions
+ + [cardinality](./sql-reference/sql-functions/map-functions/cardinality.md)
+ + [element_at](./sql-reference/sql-functions/map-functions/element_at.md)
  + [map_apply](./sql-reference/sql-functions/map-functions/map_apply.md)
  + [map_filter](./sql-reference/sql-functions/map-functions/map_filter.md)
  + [map_from_arrays](./sql-reference/sql-functions/map-functions/map_from_arrays.md)

docs/sql-reference/sql-functions/array-functions/array_length.md

+2
@@ -4,6 +4,8 @@

 Returns the number of elements in the array. The result type is INT. If the parameter is NULL, the result is also NULL.

+It has an alias [cardinality()](cardinality.md).
+
 ## Syntax

 ```Haskell

docs/sql-reference/sql-functions/array-functions/cardinality.md

@@ -0,0 +1,36 @@
+# cardinality
+
+## Description
+
+Returns the number of elements in the array. The result type is INT. If the parameter is NULL, the result is also NULL.
+
+It is an alias of [array_length()](array_length.md).
+
+## Syntax
+
+```Haskell
+cardinality(any_array)
+```
+
+## Examples
+
+```plain text
+mysql> select cardinality([1,2,3]);
++-----------------------+
+| cardinality([1,2,3])  |
++-----------------------+
+|                     3 |
++-----------------------+
+1 row in set (0.00 sec)
+mysql> select cardinality([[1,2], [3,4]]);
++-----------------------------+
+| cardinality([[1,2],[3,4]])  |
++-----------------------------+
+|                           2 |
++-----------------------------+
+1 row in set (0.01 sec)
+```
+
+## keyword
+
+CARDINALITY,ARRAY_LENGTH,ARRAY
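
The behaviour this page describes (NULL in, NULL out; only top-level elements are counted) can be modelled in a few lines of Python. This is an illustrative sketch, not StarRocks code: the name `cardinality_model` is hypothetical, and Python's `None` stands in for SQL NULL.

```python
from typing import Optional, Sequence


def cardinality_model(arr: Optional[Sequence]) -> Optional[int]:
    """Toy model of cardinality()/array_length() on an array value."""
    if arr is None:
        # A NULL parameter yields a NULL result.
        return None
    # Nested arrays count as single elements: only the top level is counted.
    return len(arr)
```

Under this model, `cardinality_model([[1, 2], [3, 4]])` counts the two inner arrays rather than the four integers, which matches the second example in the page.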

docs/sql-reference/sql-functions/array-functions/element_at.md

@@ -0,0 +1,30 @@
+# element_at
+
+## Description
+
+Returns the array element at the specified position. If any parameter is NULL, the result is also NULL.
+
+It is an alias of the `[]` operator for getting an element from an array.
+
+## Syntax
+
+```Haskell
+element_at(any_array, position)
+```
+`position` is an integer in the range [1, length of `any_array`]. Any other value returns NULL.
+
+## Examples
+
+```plain text
+mysql> select element_at([11,2,3],3);
++-----------------------+
+| element_at([11,2,3])  |
++-----------------------+
+|                     3 |
++-----------------------+
+1 row in set (0.00 sec)
+```
+
+## keyword
+
+ELEMENT_AT, ARRAY
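
The position rule in this page (1-based indexing, NULL for anything out of range or for a NULL parameter) can be sketched as follows. This is a model of the documented semantics only, not the engine's implementation; `element_at_model` is a hypothetical name and `None` stands in for SQL NULL.

```python
from typing import Any, Optional, Sequence


def element_at_model(arr: Optional[Sequence[Any]], position: Optional[int]) -> Any:
    """Toy model of element_at(any_array, position)."""
    if arr is None or position is None:
        # Any NULL parameter yields NULL.
        return None
    if 1 <= position <= len(arr):
        # SQL positions start at 1, Python indexes start at 0.
        return arr[position - 1]
    # Positions outside [1, length] yield NULL instead of an error.
    return None
```

In this model, position 0 is out of range, which is consistent with the page's statement that only values in [1, length] return an element.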

docs/sql-reference/sql-functions/map-functions/cardinality.md

@@ -0,0 +1,61 @@
+# cardinality
+
+## Description
+
+Returns the number of elements in a MAP value. It is an alias of [map_size()](map_size.md).
+
+From version 2.5, StarRocks supports querying complex data types MAP and STRUCT from data lakes. MAP is an unordered collection of key-value pairs, for example, `{"a":1, "b":2}`. One key-value pair constitutes one element, for example, `{"a":1, "b":2}` contains two elements.
+
+You can use external catalogs provided by StarRocks to query MAP and STRUCT data from Apache Hive™, Apache Hudi, and Apache Iceberg. You can only query data from ORC and Parquet files. For more information about how to use external catalogs to query external data sources, see [Overview of catalogs](../../../data_source/catalog/catalog_overview.md) and topics related to the required catalog type.
+
+## Syntax
+
+```Haskell
+INT cardinality(any_map)
+```
+
+## Parameters
+
+`any_map`: the MAP value from which you want to retrieve the number of elements.
+
+## Return value
+
+Returns a value of the INT type.
+
+If the input is NULL, NULL is returned.
+
+If a key or value in the MAP value is NULL, NULL is processed as a normal value.
+
+## Examples
+
+This example uses the Hive table `hive_map`, which contains the following data:
+
+```Plain
+select * from hive_map order by col_int;
++---------+---------------+
+| col_int | col_map       |
++---------+---------------+
+|       1 | {"a":1,"b":2} |
+|       2 | {"c":3}       |
+|       3 | {"d":4,"e":5} |
++---------+---------------+
+3 rows in set (0.05 sec)
+```
+
+After a Hive catalog is created in your database, you can use this catalog and the cardinality() function to obtain the number of elements in each row of the `col_map` column.
+
+```Plaintext
+select cardinality(col_map) from hive_map order by col_int;
++----------------------+
+| cardinality(col_map) |
++----------------------+
+|                    2 |
+|                    1 |
+|                    2 |
++----------------------+
+3 rows in set (0.05 sec)
+```
+
+## keyword
+
+CARDINALITY,MAP_LENGTH,MAP

docs/sql-reference/sql-functions/map-functions/element_at.md

@@ -0,0 +1,37 @@
+# element_at
+
+## Description
+
+Returns the map element with the specified key. If any parameter is NULL, the result is also NULL.
+
+It is an alias of the `[]` operator for getting an element from a map.
+
+## Syntax
+
+```Haskell
+element_at(any_map, any_key)
+```
+If `any_key` exists in `any_map`, the corresponding key-value pair is returned. Otherwise, NULL is returned.
+
+## Examples
+
+```plain text
+mysql> select element_at({1:3,2:3},1);
++-------------------------+
+| element_at({1:3,2:3},1) |
++-------------------------+
+| {1:3}                   |
++-------------------------+
+1 row in set (0.00 sec)
+mysql> select element_at({1:3,2:3},3);
++-------------------------+
+| element_at({1:3,2:3},3) |
++-------------------------+
+| NULL                    |
++-------------------------+
+1 row in set (0.00 sec)
+```
+
+## keyword
+
+ELEMENT_AT, MAP

docs/sql-reference/sql-functions/map-functions/map_size.md

+1-1
@@ -2,7 +2,7 @@

 ## Description

-Returns the number of elements in a MAP value.
+Returns the number of elements in a MAP value. It has an alias [cardinality()](cardinality.md).

 From version 2.5, StarRocks supports querying complex data types MAP and STRUCT from data lakes. MAP is an unordered collection of key-value pairs, for example, `{"a":1, "b":2}`. One key-value pair constitutes one element, for example, `{"a":1, "b":2}` contains two elements.

fe/fe-core/src/main/java/com/starrocks/catalog/FunctionSet.java

+2
@@ -300,6 +300,8 @@ public class FunctionSet {
     public static final String ARRAY_FILTER = "array_filter";
     public static final String ARRAY_SORTBY = "array_sortby";

+    public static final String ELEMENT_AT = "element_at";
+    public static final String CARDINALITY = "cardinality";
     // Bit functions:
     public static final String BITAND = "bitand";
     public static final String BITNOT = "bitnot";

fe/fe-core/src/main/java/com/starrocks/sql/parser/AstBuilder.java

+8
@@ -5007,6 +5007,14 @@ public ParseNode visitSimpleFunctionCall(StarRocksParser.SimpleFunctionCallConte
                     intervalLiteral.getUnitIdentifier().getDescription());
         }

+        if (functionName.equals(FunctionSet.ELEMENT_AT)) {
+            List<Expr> params = visit(context.expression(), Expr.class);
+            if (params.size() != 2) {
+                throw new ParsingException("element_at() should accept 2 arguments, but there are " + params.size());
+            }
+            return new CollectionElementExpr(params.get(0), params.get(1));
+        }
+
         if (functionName.equals(FunctionSet.ISNULL)) {
             List<Expr> params = visit(context.expression(), Expr.class);
             if (params.size() != 1) {
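
The hunk above makes `element_at(c, k)` parse into the same `CollectionElementExpr` node type that it constructs from the two arguments, after checking the argument count. The idea can be sketched in Python as a small AST rewrite. Everything here is hypothetical illustration: the `FunctionCall` and `CollectionElement` classes and `rewrite_element_at` are stand-ins, not StarRocks classes.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class FunctionCall:
    """Hypothetical AST node for a generic call such as element_at(a, 1)."""
    name: str
    args: List[Any]


@dataclass
class CollectionElement:
    """Hypothetical AST node for a subscript such as a[1]."""
    collection: Any
    subscript: Any


def rewrite_element_at(call: FunctionCall) -> Any:
    """Rewrite element_at(c, k) into a subscript node; leave other calls alone."""
    if call.name != "element_at":
        return call
    if len(call.args) != 2:
        # Mirrors the arity check in the hunk: exactly two arguments are required.
        raise ValueError(
            f"element_at() should accept 2 arguments, but there are {len(call.args)}"
        )
    return CollectionElement(call.args[0], call.args[1])
```

Rewriting at parse time means no separate `element_at` function entry is needed downstream: later stages only see the subscript node. That reading is consistent with the hunk, but it is an interpretation rather than something the commit states.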

fe/fe-core/src/test/java/com/starrocks/sql/analyzer/AnalyzeArrayTest.java

+5
@@ -61,7 +61,12 @@ public void testArray() {
                 " array_contains([true, false, true], true),\n" +
                 " array_contains([true, false, true], false)");
         analyzeSuccess("select array_length(null)");
+        analyzeSuccess("select cardinality(null)");
+        analyzeSuccess("select cardinality([])");
+        analyzeSuccess("select element_at([3,2], 0)");

+        analyzeFail("select element_at([3,2])");
+        analyzeFail("select element_at(1,[3,2])");
         analyzeFail("select array_concat([])");
     }

gensrc/script/functions.py

+5
@@ -881,6 +881,11 @@
     [170002, 'map_values', 'ANY_ARRAY', ['ANY_MAP'], 'MapFunctions::map_values'],
     [170003, 'map_from_arrays', 'ANY_MAP', ['ANY_ARRAY', 'ANY_ARRAY'], 'MapFunctions::map_from_arrays'],
     [170004, 'map_apply', 'ANY_MAP', ['FUNCTION', 'ANY_MAP'], 'nullptr'],
+
+    # map, array common functions
+    [170100, 'cardinality', 'INT', ['ANY_MAP'], 'MapFunctions::map_size'],
+    [170101, 'cardinality', 'INT', ['ANY_ARRAY'], 'ArrayFunctions::array_length'],
+
     # struct functions
     # [170500, 'row', 'ANY_STRUCT', ['ANY_ELEMENT', "..."], 'StructFunctions::row'],
 ]
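
The two new entries register `cardinality` twice, once per argument type, each pointing at an existing backend function. A toy lookup shows how one name can map to different implementations depending on the argument type. This is a sketch under assumptions: the field order (id, name, return type, argument types, backend symbol) is read off the entries above, `resolve` is a hypothetical helper, and the real resolution logic in StarRocks is not shown in this commit.

```python
from typing import List, Optional

# Entries copied from the hunk above; field meanings are an assumption:
# [id, name, return type, argument types, backend symbol]
FUNCTIONS = [
    [170100, 'cardinality', 'INT', ['ANY_MAP'], 'MapFunctions::map_size'],
    [170101, 'cardinality', 'INT', ['ANY_ARRAY'], 'ArrayFunctions::array_length'],
]


def resolve(name: str, arg_types: List[str]) -> Optional[str]:
    """Return the backend symbol whose name and argument types match exactly."""
    for _fn_id, fn_name, _ret_type, fn_args, symbol in FUNCTIONS:
        if fn_name == name and fn_args == arg_types:
            return symbol
    # No overload matches this name and argument list.
    return None
```

With this table, `resolve('cardinality', ['ANY_MAP'])` selects the `map_size` implementation and `resolve('cardinality', ['ANY_ARRAY'])` selects `array_length`, which is how the alias reuses both existing functions without new backend code.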
