-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Fix flush_metadata_cache failure when metastore impersonation is enabled #27059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Reviewer's GuideThis PR introduces metastore impersonation support for cache flushing by plumbing a new impersonation flag through connector factories and metastore modules, enhancing FlushMetadataCacheProcedure to detect and use an ImpersonationCachingHiveMetastoreFactory, and adding an integration test to verify flush_metadata_cache under impersonation. Sequence diagram for cache flush with metastore impersonation enabledsequenceDiagram
participant "FlushMetadataCacheProcedure"
participant "HiveMetastoreFactory"
participant "ImpersonationCachingHiveMetastoreFactory"
participant "CachingHiveMetastore"
participant "GlueCache"
actor "User"
"User"->>"FlushMetadataCacheProcedure": invoke flush_metadata_cache
"FlushMetadataCacheProcedure"->>"HiveMetastoreFactory": check if impersonation enabled
alt impersonation enabled
"FlushMetadataCacheProcedure"->>"ImpersonationCachingHiveMetastoreFactory": createMetastore(identity)
"ImpersonationCachingHiveMetastoreFactory"-->>"FlushMetadataCacheProcedure": return CachingHiveMetastore
else impersonation not enabled
"FlushMetadataCacheProcedure"-->>"FlushMetadataCacheProcedure": use default cachingHiveMetastore
end
"FlushMetadataCacheProcedure"->>"CachingHiveMetastore": flushCache/flushPartitionCache/invalidateTable
"FlushMetadataCacheProcedure"->>"GlueCache": flushCache/invalidatePartition/invalidateTable
Class diagram for updated HiveMetastoreFactory and StaticHiveMetastoreFactoryclassDiagram
class HiveMetastoreFactory {
+createMetastore(Optional<ConnectorIdentity>) HiveMetastore
+hasBuiltInCaching() boolean
+isImpersonationEnabled() boolean
+ofInstance(HiveMetastore, boolean) HiveMetastoreFactory
}
class StaticHiveMetastoreFactory {
-HiveMetastore metastore
-boolean impersonationEnabled
+StaticHiveMetastoreFactory(HiveMetastore, boolean)
+isImpersonationEnabled() boolean
+createMetastore(Optional<ConnectorIdentity>) HiveMetastore
}
HiveMetastoreFactory <|-- StaticHiveMetastoreFactory
Class diagram for updated HiveMetastoreModuleclassDiagram
class HiveMetastoreModule {
-Optional<HiveMetastore> metastore
-boolean impersonationEnabled
+HiveMetastoreModule(Optional<HiveMetastore>, boolean)
+setup(Binder)
}
Class diagram for updated FlushMetadataCacheProcedureclassDiagram
class FlushMetadataCacheProcedure {
-Optional<CachingHiveMetastore> cachingHiveMetastore
-Optional<GlueCache> glueCache
-HiveMetastoreFactory hiveMetadataFactory
+flushMetadataCache(...)
-doFlushMetadataCache(ConnectorSession, Optional<String>, Optional<String>, List<String>, List<String>)
}
class ImpersonationCachingHiveMetastoreFactory {
+createMetastore(Optional<ConnectorIdentity>) HiveMetastore
}
FlushMetadataCacheProcedure --> ImpersonationCachingHiveMetastoreFactory : uses
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `plugin/trino-hive/src/test/java/io/trino/plugin/hive/TestThriftMetastoreImpersonation.java:55-56` </location>
<code_context>
+ .build();
+ }
+
+ @Test
+ void testFlushMetadataCache()
+ {
+ Session alice = Session.builder(getSession()).setIdentity(Identity.ofUser("alice")).build();
</code_context>
<issue_to_address>
**suggestion (testing):** Missing negative test for flush_metadata_cache failure scenarios.
Add a test to cover scenarios where flush_metadata_cache fails, such as when impersonation is enabled but the cache is missing or cannot be flushed, to validate error handling.
</issue_to_address>
### Comment 2
<location> `plugin/trino-hive/src/test/java/io/trino/plugin/hive/TestThriftMetastoreImpersonation.java:56-72` </location>
<code_context>
+ }
+
+ @Test
+ void testFlushMetadataCache()
+ {
+ Session alice = Session.builder(getSession()).setIdentity(Identity.ofUser("alice")).build();
+
+ try (TestTable table = newTrinoTable("test_partition", "(id int, part int) WITH (partitioned_by = ARRAY['part'])", List.of("1, 10"))) {
+ assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
+ .isEqualTo(1L);
+
+ assertUpdate("INSERT INTO " + table.getName() + " VALUES (2, 20)", 1);
+ assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
+ .isEqualTo(1L);
+
+ assertUpdate(alice, "CALL system.flush_metadata_cache(schema_name => CURRENT_SCHEMA, table_name => '" + table.getName() + "')");
+ assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
+ .isEqualTo(2L);
+ }
+ }
+}
</code_context>
<issue_to_address>
**suggestion (testing):** Test only covers a single user scenario; consider multi-user edge cases.
Adding tests with multiple user identities will help verify that cache flushing and partition visibility work as expected for different users.
```suggestion
void testFlushMetadataCache()
{
Session alice = Session.builder(getSession()).setIdentity(Identity.ofUser("alice")).build();
Session bob = Session.builder(getSession()).setIdentity(Identity.ofUser("bob")).build();
try (TestTable table = newTrinoTable("test_partition", "(id int, part int) WITH (partitioned_by = ARRAY['part'])", List.of("1, 10"))) {
// Alice sees 1 partition
assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(1L);
// Bob sees 1 partition
assertThat(computeScalar(bob, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(1L);
// Insert new partition as default session (no user)
assertUpdate("INSERT INTO " + table.getName() + " VALUES (2, 20)", 1);
// Alice and Bob still see 1 partition (cache not flushed)
assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(1L);
assertThat(computeScalar(bob, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(1L);
// Alice flushes cache
assertUpdate(alice, "CALL system.flush_metadata_cache(schema_name => CURRENT_SCHEMA, table_name => '" + table.getName() + "')");
// Alice sees 2 partitions, Bob still sees 1
assertThat(computeScalar(alice, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(2L);
assertThat(computeScalar(bob, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(1L);
// Bob flushes cache
assertUpdate(bob, "CALL system.flush_metadata_cache(schema_name => CURRENT_SCHEMA, table_name => '" + table.getName() + "')");
// Now Bob sees 2 partitions
assertThat(computeScalar(bob, "SELECT count(1) FROM \"" + table.getName() + "$partitions\""))
.isEqualTo(2L);
}
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| private void doFlushMetadataCache(ConnectorSession session, Optional<String> schemaName, Optional<String> tableName, List<String> partitionColumns, List<String> partitionValues) | ||
| { | ||
| if (cachingHiveMetastore.isEmpty() && glueCache.isEmpty()) { | ||
| if (!(hiveMetadataFactory instanceof ImpersonationCachingHiveMetastoreFactory) && cachingHiveMetastore.isEmpty() && glueCache.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't flush procedure also work for other types of caching, not only ImpersonationCaching
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I’m not sure which caching you’re referring to. Could you elaborate?
plugin/trino-hive/src/main/java/io/trino/plugin/hive/procedure/FlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/procedure/FlushMetadataCacheProcedure.java
Outdated
Show resolved
Hide resolved
|
CI hit #27082 |
d7f53d9 to
9b40b97
Compare
|
Rebased on master to resolve conflicts. |
| private final boolean impersonationEnabled; | ||
|
|
||
| private StaticHiveMetastoreFactory(HiveMetastore metastore) | ||
| private StaticHiveMetastoreFactory(HiveMetastore metastore, boolean impersonationEnabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actual always false in current code
Description
Fix flush_metadata_cache failure when metastore impersonation is enabled
Release notes
Summary by Sourcery
Enable flush_metadata_cache to function correctly when metastore impersonation is enabled by propagating an impersonationEnabled flag through Hive connector factories and modules, and updating the cache flush logic to use session-specific caching.
New Features:
Enhancements:
Tests: