Skip to content

HIVE-28961: Respect partition limit in alter_table_req for partitioned tables #5823

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

wecharyu
Copy link
Contributor

What changes were proposed in this pull request?

  1. Calls get_partitions_req api to get all partitions to enable partition limit check
  2. Avoid calls get_partitions_req if not dataWasMoved

Why are the changes needed?

Get all partitions of a table is a heavy operation that could affect both HMS server and backend DB, alter_table should also check and limit the fetched partition number.

Does this PR introduce any user-facing change?

Yes, alter table cascade to large partitions may fail after this PR.

How was this patch tested?

Add a unit test:

mvn test -Dtest.groups= -Dtest=org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore#testAlterTableCascadeExceedsPartitionLimits -pl :hive-standalone-metastore-server

@Aggarwal-Raghav
Copy link
Contributor

@wecharyu, thanks for the PR. I have 1 question, a bit out of context of this PR. Can we make this

run in batch as done in case of rename table
Batchable.runBatched(partitionBatchSize, parts, new Batchable<Partition, Void>() {

@Aggarwal-Raghav
Copy link
Contributor

Changes LGTM +1

Copy link

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves the alter table operation for partitioned tables by adding a partition limit check and by conditionally calling the partitions API only when necessary.

  • Introduces a new unit test to verify that alter table cascade operations exceeding partition limits throw an exception.
  • Refactors the alter table logic to use the get_partitions_req API with a PartitionsRequest object.
  • Modifies exception handling in the alter table flow to streamline error reporting.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
standalone-metastore/metastore-server/src/test/java/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java Adds a new unit test (testAlterTableCascadeExceedsPartitionLimits) to validate partition limit enforcement.
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java Refactors partition fetching logic in alterTable using PartitionsRequest and adjusts exception handling.
Comments suppressed due to low confidence (2)

standalone-metastore/metastore-server/src/main/java/org/apache/hive/metastore/HiveAlterHandler.java:416

  • Clarify via a comment why this cascade branch uses the original table identifiers (dbname, name) rather than the new names used in the rename block.
PartitionsRequest req = new PartitionsRequest(dbname, name);

standalone-metastore/metastore-server/src/main/java/org/apache/hive/metastore/HiveAlterHandler.java:465

  • Ensure that combining the handling of NoSuchObjectException into MetaException does not reduce the clarity of error messages for missing objects.
catch (MetaException e) {

// Do not verify stats parameters on a partitioned table.
msdb.alterTable(catName, dbname, name, newt, null);
int partitionBatchSize = MetastoreConf.getIntVar(handler.getConf(),
MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX);
String catalogName = catName;
// alterPartition is only for changing the partition location in the table rename
if (dataWasMoved) {
PartitionsRequest req = new PartitionsRequest(newDbName, newTblName);
req.setCatName(catName);
req.setMaxParts((short) -1);
Copy link
Preview

Copilot AI May 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a brief comment explaining the rationale behind using -1 to indicate fetching all partitions to aid future maintainers.

Copilot uses AI. Check for mistakes.

// Do not verify stats parameters on a partitioned table.
msdb.alterTable(catName, dbname, name, newt, null);
int partitionBatchSize = MetastoreConf.getIntVar(handler.getConf(),
MetastoreConf.ConfVars.BATCH_RETRIEVE_MAX);
String catalogName = catName;
// alterPartition is only for changing the partition location in the table rename
if (dataWasMoved) {
PartitionsRequest req = new PartitionsRequest(newDbName, newTblName);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoid calls get_partitions_req if not dataWasMoved

Please give more context about this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants