Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use renamed column name in $partition table in Iceberg #24069

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jinyangli34
Copy link
Contributor

Description

Trino use the field.name() as partition name when building Iceberg $partition table.
After a partition column rename, the $partition table returns old partition column name that no longer exists.

Fix the logic to use updated partition column name to keep it consistent with new schema.

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Nov 8, 2024
@github-actions github-actions bot added the iceberg Iceberg connector label Nov 8, 2024
@@ -278,4 +278,15 @@ public void testNestedFieldChangePartitionTransform()
assertThat(monthPartitionedFiles).hasSize(2);
assertUpdate("DROP TABLE " + tableName);
}

@Test
public void testPartitionRename()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually use BaseIcebergSystemTables for testing system tables.


assertQuery("SELECT partition.par_renamed FROM \"" + tableName + "$partitions\"", "VALUES 11");
assertUpdate("DROP TABLE " + tableName);
}
Copy link
Member

@ebyhr ebyhr Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add some tests?

  • partitioning by transforms, e.g. bucket
  • partitioning by nested field

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point.
this will cause nested field to return field name only.
Thinking how to rebuild the column.field ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the logic with a new partitionNameTransform as single place source of truth on partition column naming.
Now both nested identify and bucketing tranforms are handled as new names.

@ebyhr ebyhr changed the title Use renamed column name in $partition table Use renamed column name in $partition table in Iceberg Nov 8, 2024
@jinyangli34 jinyangli34 force-pushed the jinyang-iceberg_partition_rename branch 4 times, most recently from 81bc086 to 3199969 Compare November 10, 2024 05:19
@jinyangli34 jinyangli34 force-pushed the jinyang-iceberg_partition_rename branch from 3199969 to 9e13de3 Compare November 13, 2024 05:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed iceberg Iceberg connector
Development

Successfully merging this pull request may close these issues.

2 participants