Replace ClientContext/TabletLocator/MetadataServicer with AccumuloTableInfoFetcher facade#3449
Replace ClientContext/TabletLocator/MetadataServicer with AccumuloTableInfoFetcher facade#3449SethSmucker wants to merge 9 commits intointegrationfrom
Conversation
…Is (#2443) Create AccumuloTableInfoFetcher in core/connection-pool that centralizes Accumulo table metadata operations behind public APIs, replacing direct usage of ClientContext, ThriftClientTypes, TabletLocator, and MetadataServicer. Migrated callers: - BulkIngestMapFileLoader: replace Thrift RPC with getActiveCompactions() - TableSplitsCache: replace MetadataServicer with locate() API - BulkInputFormat: replace TabletLocator/ClientContext online path with locate() API; offline path deferred (uses KeyExtent, separate task) Also removes TabletLocator from import-control-accumulo.xml allowlist.
e9a4617 to
5f29ebf
Compare
.../connection-pool/src/main/java/datawave/core/common/connection/AccumuloTableInfoFetcher.java
Outdated
Show resolved
Hide resolved
| * Get the count of running major compactions across all tablet servers using the public {@code getActiveCompactions()} API. | ||
| * <p> | ||
| * Note: This counts only running compactions (not queued), which differs slightly from the original Thrift-based implementation that also counted queued | ||
| * compactions. This is acceptable because the MAJC_THRESHOLD default is 3000 (a high safety margin) and this is polled on each bulk load cycle. |
There was a problem hiding this comment.
There was a problem hiding this comment.
@SethSmucker are there any available APIs to get queued compactions? If so, can we just add it to maintain the historical behavior?
There was a problem hiding this comment.
Not that I could find for 2.1.* at least, but it looks like it will be available once we move to 4.0. For now I'll use the same facade pattern and we can swap out the internals once we hit 4.0, removing the facades one by one so we can test them on the high side. I'll push a draft for that soon, but if there's another way to do it that would work better I'm all ears
There was a problem hiding this comment.
Also, it looked like MetadataServicer might not need a class replacement, the public API versions of its usecase are fairly straight forward, so I think we can directly call them instead of having the original facade. We'll still have the one I mentioned above for the compactions, but just not the original one from the ticket. If it would be better to keep it as a centralized place (closer to what MetadataServicer was doing) I can revert it, but I'll go ahead and try out the direct method calls for now.
warehouse/core/src/main/java/datawave/mr/bulk/BulkInputFormat.java
Outdated
Show resolved
Hide resolved
- Remove implementation details from AccumuloTableInfoFetcher javadocs - Throw TableNotFoundException instead of TableDeletedException in BulkInputFormat - Add datawave-core-connection-pool to root pom dependencyManagement
…into callers Restore the original Thrift-based getMajorCompactionCount() in AccumuloTableInfoFetcher since there is no public Accumulo API for queued compactions (apache/accumulo#5965). Remove public API wrapper methods (tableExists, isTableOnline, getTabletLocations, getTableId) from the facade and inline them directly into BulkInputFormat and TableSplitsCache. The facade now only contains methods that require non-public APIs.
…text-facade-migration
…onalSecurityAgency/datawave into task/clientcontext-facade-migration
Summary
AccumuloTableInfoFetcherfacade in core/connection-pool centralizing Accumulo metadata operations behind public APIsBulkIngestMapFileLoader.getMajorCompactionCount()withgetActiveCompactions()APIMetadataServicerinTableSplitsCachewithlocate()APITabletLocator/ClientContextinBulkInputFormatonline path withlocate()APITabletLocatorfrom import-control-accumulo.xml allowlistPart of #2443