Skip to content

HIVE-28899: Make 'get_table_meta' query case-sensitive to use index #5793

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

jiihye
Copy link

@jiihye jiihye commented Apr 30, 2025

What changes were proposed in this pull request?

This PR proposes to optimize the SQL query generated by the get_table_meta API in Hive Metastore by removing the use of the LOWER() function.

  • before
# JDOQL
SELECT FROM org.apache.hadoop.hive.metastore.model.MTable 
+ WHERE (database.catalogName == hive ) && (database.name.matches((?i)db_name) )
  • after
# JDOQL
SELECT FROM org.apache.hadoop.hive.metastore.model.MTable
+ WHERE (database.catalogName == hive ) && (database.name.matches(db_name) )

Why are the changes needed?

In production environments with a large number of tables, we observed that the SQL query generated by get_table_meta caused full scans on the TBLS table, resulting in high CPU usage on the Metastore's MySQL backend.
https://issues.apache.org/jira/projects/HIVE/issues/HIVE-28899?filter=allopenissues

Does this PR introduce any user-facing change?

No. Since both database names and table names are already stored in lowercase in the Metastore DB, this change should not have any impact on users.
This change is only applied to get_table_meta to minimize risk. Other commands were left unchanged as we are not yet certain about potential side effects in those paths.

How was this patch tested?

private StringBuilder appendPatternConditionCaseSensitive(StringBuilder builder,
String fieldName, String elements, List<String> parameters) {
elements = normalizeIdentifier(elements);
return appendCondition(builder, fieldName, elements.split("\\|"), true, parameters, true);
Copy link
Member

@deniskuzZ deniskuzZ Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we remove appendPatternConditionCaseSensitive and rename protected appendPatternCondition to appendIdentifierPatternCondition?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deniskuzZ
Thank you for the review.
As requested, I removed the appendPatternConditionCaseSensitive method and renamed appendPatternCondition to appendIdentifierPatternCondition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove another appendPatternCondition in this class as well? using this new method instead

Copy link
Member

@deniskuzZ deniskuzZ May 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we can, as private appendPatternCondition is used for database.name and tableType (we should double-check if we store tableType in lower-case)

If that is the case, we can keep the original name appendPatternCondition and drop the new caseSensitive param with (?i)
cc @jiihye

Copy link
Author

@jiihye jiihye May 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have only verified the logic for the database name and table name used in getTableMeta. If all columns where appendCondition is used are indeed stored in lowercase, I’ll keep the original function name appendPatternCondition and remove the caseSensitive parameter, along with the (?i) flag inside appendCondition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @jiihye, please ping me once it's available for the final review. Thanks!

Copy link

sonarqubecloud bot commented May 2, 2025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants