-
Notifications
You must be signed in to change notification settings - Fork 4.7k
HIVE-28899: Make 'get_table_meta' query case-sensitive to use index #5793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
private StringBuilder appendPatternConditionCaseSensitive(StringBuilder builder, | ||
String fieldName, String elements, List<String> parameters) { | ||
elements = normalizeIdentifier(elements); | ||
return appendCondition(builder, fieldName, elements.split("\\|"), true, parameters, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we remove appendPatternConditionCaseSensitive
and rename protected appendPatternCondition
to appendIdentifierPatternCondition
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@deniskuzZ
Thank you for the review.
As requested, I removed the appendPatternConditionCaseSensitive
method and renamed appendPatternCondition
to appendIdentifierPatternCondition
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove another appendPatternCondition
in this class as well? using this new method instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think we can, as private appendPatternCondition
is used for database.name
and tableType
(we should double-check if we store tableType
in lower-case)
If that is the case, we can keep the original name appendPatternCondition
and drop the new caseSensitive
param with (?i)
cc @jiihye
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have only verified the logic for the database name and table name used in getTableMeta
. If all columns where appendCondition
is used are indeed stored in lowercase, I’ll keep the original function name appendPatternCondition
and remove the caseSensitive
parameter, along with the (?i)
flag inside appendCondition
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @jiihye, please ping me once it's available for the final review. Thanks!
…ndition to appendIdentifierPatternCondition
|
What changes were proposed in this pull request?
This PR proposes to optimize the SQL query generated by the
get_table_meta
API in Hive Metastore by removing the use of the LOWER() function.Why are the changes needed?
In production environments with a large number of tables, we observed that the SQL query generated by
get_table_meta
caused full scans on the TBLS table, resulting in high CPU usage on the Metastore's MySQL backend.https://issues.apache.org/jira/projects/HIVE/issues/HIVE-28899?filter=allopenissues
Does this PR introduce any user-facing change?
No. Since both database names and table names are already stored in lowercase in the Metastore DB, this change should not have any impact on users.
This change is only applied to get_table_meta to minimize risk. Other commands were left unchanged as we are not yet certain about potential side effects in those paths.
How was this patch tested?