[CH] fallback for unsupported regex in re2 #7866
Comments
Does CH also use re2? @PHILO-HE we may reuse the preprocessing script as well as the skip list.
@FelixYBW, currently we have no script, but we do have a pre-validation function on the Gluten native side. It validates all listed regex functions by letting RE2 try to compile the pattern. If the compilation fails, we definitely need to fall back. This pre-validation can be reused by both backends.
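As a rough illustration of the pre-validation idea described above, here is a minimal Python sketch. It does not call RE2 and is not the Gluten implementation, which lets RE2 itself compile the pattern. Instead it scans for a few constructs that the RE2 syntax page documents as unsupported (lookahead, lookbehind, numbered backreferences). The function names `find_unsupported` and `should_fall_back` are hypothetical, and the construct list is illustrative rather than exhaustive.

```python
# Group prefixes that RE2 documents as unsupported (lookahead / lookbehind).
# Illustrative only: RE2 rejects other constructs as well.
_UNSUPPORTED_PREFIXES = ("(?=", "(?!", "(?<=", "(?<!")


def find_unsupported(pattern: str):
    """Return the first known-unsupported construct in `pattern`, or None.

    Simplification: character classes are tracked naively, so a literal
    ']' placed first in a class (e.g. '[]a]') is treated as closing it.
    """
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            # A backslash followed by 1-9 outside a class is a backreference.
            nxt = pattern[i + 1] if i + 1 < len(pattern) else ""
            if not in_class and nxt in "123456789" and nxt != "":
                return "\\" + nxt
            i += 2  # skip the escaped character
            continue
        if in_class:
            if ch == "]":
                in_class = False
        elif ch == "[":
            in_class = True
        elif ch == "(":
            for prefix in _UNSUPPORTED_PREFIXES:
                if pattern.startswith(prefix, i):
                    return prefix
        i += 1
    return None


def should_fall_back(pattern: str) -> bool:
    """Hypothetical helper: True when the pattern should not be offloaded."""
    return find_unsupported(pattern) is not None
```

With the pattern from the bug description, `should_fall_back("d(?!d)")` returns `True` because of the `(?!` negative lookahead. A real check should still prefer an actual RE2 compile attempt, since a hand-written scan like this can miss cases.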
I remember we had some preprocessing of the pattern before sending it to re2, which makes more patterns work with re2. Is the code still there?
@FelixYBW, I just found that Velox has such preprocessing for Presto: see code. Meituan also proposed a PR to reuse this code for Spark and make some improvements. See facebookincubator/velox#10981.
Backend
CH (ClickHouse)
Bug description
OptimizedRegularExpression: cannot compile re2: d(?!d), error: invalid perl operator: (?!. Look at https://github.com/google/re2/wiki/Syntax for reference.
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs
No response