[fix](ES Catalog)Only like on keyword can be applied to wildcard query (#41176) #43459

Merged 1 commit on Nov 8, 2024

Conversation

qidaye
Contributor

@qidaye qidaye commented Nov 7, 2024

Backport of #41176.

Doris maps both the `text` and `keyword` ES field types to its `string` type. When
`like_push_down` is enabled, a LIKE predicate is translated into an ES wildcard query.
On a `text` field this returns unexpected results, because the field is analyzed and
the wildcard is matched against individual terms rather than the whole value. Wildcard
queries should therefore be pushed down only for `keyword` fields.
1. Add `column2typeMap` to `EsTable` to store the mapping from column name
to ES field data type.
2. Add a new class, `EsSchemaCacheValue`, that provides both the schema and the
column-to-type map.
3. Initialize `column2typeMap` when the cache is initialized and when the query for an
ES external table is built.
4. Support the LIKE `functionCallExpr` in the Nereids planner.
5. Add end-to-end LIKE predicate test cases and unit tests.
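
The idea behind the change can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `LikePushDownSketch` and its methods are hypothetical names, and only `column2typeMap` is taken from the description above. It assumes that a LIKE on a non-`keyword` column is simply not pushed down, and it omits escape handling and JSON escaping for brevity.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Illustrative sketch only; hypothetical class, not the Doris implementation. */
final class LikePushDownSketch {
    // Column name -> ES field data type, in the spirit of column2typeMap.
    private final Map<String, String> column2typeMap;

    LikePushDownSketch(Map<String, String> column2typeMap) {
        this.column2typeMap = new HashMap<>(column2typeMap);
    }

    /**
     * Translates a SQL LIKE pattern into an ES wildcard pattern:
     * '%' becomes '*' and '_' becomes '?'. Escaped characters are
     * not handled in this sketch.
     */
    static String likeToWildcard(String likePattern) {
        StringBuilder sb = new StringBuilder(likePattern.length());
        for (char c : likePattern.toCharArray()) {
            if (c == '%') {
                sb.append('*');
            } else if (c == '_') {
                sb.append('?');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Returns a wildcard query body only when the column is a keyword field.
     * An empty result means the predicate is not pushed down to ES
     * (an assumption of this sketch).
     */
    Optional<String> toWildcardQuery(String column, String likePattern) {
        if (!"keyword".equals(column2typeMap.get(column))) {
            return Optional.empty();
        }
        String value = likeToWildcard(likePattern);
        // No JSON escaping here; real code should use a JSON library.
        return Optional.of(
                "{\"wildcard\":{\"" + column + "\":{\"value\":\"" + value + "\"}}}");
    }
}
```

With a map of `tag` to `keyword` and `body` to `text`, `toWildcardQuery("tag", "foo%")` produces a wildcard query on `foo*`, while the same call on `body` returns an empty `Optional`.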
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.
  • Release note

    None

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@qidaye
Contributor Author

qidaye commented Nov 7, 2024

run buildall

@github-actions bot added the area/planner and kind/test labels Nov 7, 2024
@doris-robot

TPC-H: Total hot run time: 49720 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 5e984bab1710559f759a23854cb5ac3684d1918b, data reload: false

------ Round 1 ----------------------------------
q1	17695	4482	4424	4424
q2	2075	158	154	154
q3	10254	1926	1929	1926
q4	10133	1288	1356	1288
q5	9052	3935	4006	3935
q6	239	125	124	124
q7	2056	1633	1585	1585
q8	9352	2773	2746	2746
q9	10771	9972	10204	9972
q10	8644	3583	3541	3541
q11	434	237	256	237
q12	470	301	308	301
q13	18362	4009	4051	4009
q14	352	328	331	328
q15	522	472	463	463
q16	558	466	479	466
q17	1153	974	948	948
q18	7362	6966	6941	6941
q19	1707	1580	1554	1554
q20	541	318	308	308
q21	4486	4137	4081	4081
q22	497	389	398	389
Total cold run time: 116715 ms
Total hot run time: 49720 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4358	4318	4315	4315
q2	326	219	216	216
q3	4179	4153	4148	4148
q4	2765	2744	2751	2744
q5	7238	7178	7193	7178
q6	244	129	120	120
q7	3247	2787	2823	2787
q8	4404	4507	4474	4474
q9	14226	13790	13794	13790
q10	4220	4307	4258	4258
q11	760	691	687	687
q12	1015	858	858	858
q13	6703	3719	3726	3719
q14	457	417	416	416
q15	494	468	461	461
q16	626	595	578	578
q17	3831	3771	3875	3771
q18	8892	8734	8777	8734
q19	1760	1671	1689	1671
q20	2377	2140	2117	2117
q21	8503	8464	8497	8464
q22	1041	941	981	941
Total cold run time: 81666 ms
Total hot run time: 76447 ms

@doris-robot

TPC-DS: Total hot run time: 212425 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 5e984bab1710559f759a23854cb5ac3684d1918b, data reload: false

query1	941	430	380	380
query2	6532	2198	2147	2147
query3	6921	201	199	199
query4	23343	21714	21475	21475
query5	19733	6538	6523	6523
query6	291	220	251	220
query7	4337	297	300	297
query8	262	268	232	232
query9	3086	2671	2598	2598
query10	462	329	302	302
query11	15478	15166	14806	14806
query12	133	79	74	74
query13	1027	453	442	442
query14	17348	13568	13684	13568
query15	361	221	229	221
query16	6494	283	265	265
query17	1742	950	912	912
query18	920	325	317	317
query19	224	148	153	148
query20	110	100	103	100
query21	190	97	96	96
query22	5078	5023	4964	4964
query23	34397	33340	33706	33340
query24	7852	6314	6351	6314
query25	513	427	424	424
query26	1277	165	164	164
query27	2411	297	296	296
query28	6104	2276	2261	2261
query29	2836	2727	2770	2727
query30	244	169	166	166
query31	929	731	729	729
query32	72	62	61	61
query33	456	267	274	267
query34	865	485	464	464
query35	1159	974	922	922
query36	1167	1118	1130	1118
query37	169	62	63	62
query38	3078	2933	2915	2915
query39	1397	1329	1317	1317
query40	313	97	94	94
query41	40	38	36	36
query42	85	91	89	89
query43	682	562	701	562
query44	1160	725	729	725
query45	247	230	230	230
query46	1224	951	944	944
query47	1747	1816	1631	1631
query48	523	410	405	405
query49	650	379	377	377
query50	860	628	587	587
query51	4747	4636	4697	4636
query52	94	83	83	83
query53	224	181	196	181
query54	2686	2482	2490	2482
query55	93	102	91	91
query56	226	213	221	213
query57	1233	1246	1157	1157
query58	223	212	214	212
query59	3501	3084	3221	3084
query60	215	208	208	208
query61	99	97	97	97
query62	857	466	457	457
query63	203	178	182	178
query64	3500	1589	1488	1488
query65	3605	3555	3550	3550
query66	775	428	402	402
query67	17795	15701	15293	15293
query68	9676	649	664	649
query69	515	294	264	264
query70	1592	1670	1590	1590
query71	407	320	321	320
query72	6929	4815	4920	4815
query73	771	320	317	317
query74	6311	5845	5789	5789
query75	4961	3658	3652	3652
query76	5096	1172	1140	1140
query77	875	257	259	257
query78	12676	12035	11558	11558
query79	8268	629	647	629
query80	1304	411	417	411
query81	497	238	235	235
query82	1413	99	98	98
query83	171	132	138	132
query84	258	73	70	70
query85	887	340	320	320
query86	326	339	288	288
query87	3243	3045	3039	3039
query88	4273	2290	2287	2287
query89	458	287	276	276
query90	1988	213	217	213
query91	162	130	126	126
query92	63	56	53	53
query93	6470	559	549	549
query94	741	214	214	214
query95	2061	2045	2050	2045
query96	660	324	318	318
query97	6486	6424	6347	6347
query98	228	229	208	208
query99	3134	917	891	891
Total cold run time: 320446 ms
Total hot run time: 212425 ms

@doris-robot

ClickBench: Total hot run time: 30.71 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 5e984bab1710559f759a23854cb5ac3684d1918b, data reload: false

query1	0.03	0.02	0.03
query2	0.07	0.02	0.02
query3	0.26	0.05	0.05
query4	1.78	0.08	0.06
query5	0.54	0.53	0.52
query6	1.24	0.62	0.61
query7	0.02	0.01	0.01
query8	0.04	0.02	0.02
query9	0.53	0.50	0.48
query10	0.54	0.53	0.54
query11	0.12	0.09	0.08
query12	0.12	0.10	0.09
query13	0.63	0.61	0.62
query14	0.79	0.78	0.77
query15	0.79	0.76	0.75
query16	0.36	0.35	0.39
query17	1.02	1.03	1.02
query18	0.23	0.25	0.24
query19	1.89	1.78	1.84
query20	0.02	0.01	0.01
query21	15.51	0.56	0.54
query22	2.17	2.54	1.61
query23	17.22	1.02	0.84
query24	6.45	2.44	1.04
query25	0.34	0.09	0.05
query26	0.78	0.14	0.14
query27	0.05	0.05	0.04
query28	5.85	0.79	0.74
query29	12.62	2.34	2.31
query30	0.62	0.55	0.54
query31	2.81	0.40	0.37
query32	3.34	0.50	0.49
query33	3.08	3.04	3.05
query34	15.28	4.82	4.82
query35	4.86	4.87	4.84
query36	1.07	1.03	1.03
query37	0.06	0.05	0.04
query38	0.03	0.02	0.02
query39	0.02	0.02	0.01
query40	0.16	0.14	0.15
query41	0.07	0.02	0.01
query42	0.03	0.01	0.01
query43	0.02	0.02	0.02
Total cold run time: 103.46 s
Total hot run time: 30.71 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 5e984bab1710559f759a23854cb5ac3684d1918b with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          58 seconds loaded 1101869774 Bytes, about 18 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       21.7 seconds inserted 10000000 Rows, about 460K ops/s

Member

@airborne12 airborne12 left a comment


LGTM

@airborne12 airborne12 merged commit e6b5108 into apache:branch-2.0 Nov 8, 2024
23 of 26 checks passed
@qidaye qidaye deleted the pick_41176_2.0 branch November 8, 2024 02:22