Replace ClientContext/TabletLocator/MetadataServicer with AccumuloTableInfoFetcher facade #3449

Open
SethSmucker wants to merge 9 commits into integration from task/clientcontext-facade-migration

@SethSmucker
Collaborator

Summary

  • Create AccumuloTableInfoFetcher facade in core/connection-pool centralizing Accumulo metadata operations behind public APIs
  • Replace Thrift RPC in BulkIngestMapFileLoader.getMajorCompactionCount() with getActiveCompactions() API
  • Replace MetadataServicer in TableSplitsCache with locate() API
  • Replace TabletLocator/ClientContext in BulkInputFormat online path with locate() API
  • Remove TabletLocator from import-control-accumulo.xml allowlist

Part of #2443

- Remove getClientContext utility from AccumuloConnectionFactory
- Update PushdownScheduler to use tableOperations().tableIdMap()

Fixes #3339
Part of #2443
…Is (#2443)

Create AccumuloTableInfoFetcher in core/connection-pool that centralizes
Accumulo table metadata operations behind public APIs, replacing direct
usage of ClientContext, ThriftClientTypes, TabletLocator, and
MetadataServicer.

Migrated callers:
- BulkIngestMapFileLoader: replace Thrift RPC with getActiveCompactions()
- TableSplitsCache: replace MetadataServicer with locate() API
- BulkInputFormat: replace TabletLocator/ClientContext online path with
  locate() API; offline path deferred (uses KeyExtent, separate task)

Also removes TabletLocator from import-control-accumulo.xml allowlist.
SethSmucker force-pushed the task/clientcontext-facade-migration branch from e9a4617 to 5f29ebf on March 2, 2026 20:25
* Get the count of running major compactions across all tablet servers using the public {@code getActiveCompactions()} API.
* <p>
* Note: This counts only running compactions (not queued), which differs slightly from the original Thrift-based implementation that also counted queued
* compactions. This is acceptable because the MAJC_THRESHOLD default is 3000 (a high safety margin) and this is polled on each bulk load cycle.
Collaborator

I'm not sure relying solely on active compactions will work for us. On some of the instances, the tservers never really stop compacting, so a threshold on the active count would either always trigger or never trigger. The queued count is what really illustrates the backlog. Thoughts, @ivakegg @hlgp?
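The concern above can be illustrated with a minimal sketch. This is not DataWave or Accumulo code: `CompactionStats` and `MajcThresholdSketch` are hypothetical names, and the server counts are made-up numbers. It assumes each tserver runs at most a fixed number of compactions at once, so the running count saturates while the queued count grows with the backlog.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical per-tserver snapshot; not an Accumulo or DataWave type. */
final class CompactionStats {
    final int running;
    final int queued;

    CompactionStats(int running, int queued) {
        this.running = running;
        this.queued = queued;
    }
}

final class MajcThresholdSketch {
    private MajcThresholdSketch() {}

    /** Threshold check that sees only running compactions. */
    static boolean overThresholdRunningOnly(List<CompactionStats> servers, int threshold) {
        int running = 0;
        for (CompactionStats s : servers) {
            running += s.running;
        }
        return running > threshold;
    }

    /** Threshold check that counts running plus queued compactions. */
    static boolean overThresholdWithQueued(List<CompactionStats> servers, int threshold) {
        int total = 0;
        for (CompactionStats s : servers) {
            total += s.running + s.queued;
        }
        return total > threshold;
    }

    public static void main(String[] args) {
        int threshold = 3000; // MAJC_THRESHOLD default quoted in the javadoc under review

        // Assumed numbers: 100 tservers, each saturated at 4 running compactions.
        List<CompactionStats> noBacklog = Collections.nCopies(100, new CompactionStats(4, 0));
        List<CompactionStats> backlogged = Collections.nCopies(100, new CompactionStats(4, 50));

        // The running-only check cannot tell the two clusters apart.
        System.out.println(overThresholdRunningOnly(noBacklog, threshold));  // false
        System.out.println(overThresholdRunningOnly(backlogged, threshold)); // false

        // Including queued compactions exposes the backlog.
        System.out.println(overThresholdWithQueued(noBacklog, threshold));   // false
        System.out.println(overThresholdWithQueued(backlogged, threshold));  // true
    }
}
```

With these assumed numbers the running total is 400 in both clusters, so only the check that includes queued compactions distinguishes them.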

Collaborator

@SethSmucker are there any available APIs to get queued compactions? If so, can we just add it to maintain the historical behavior?

Collaborator Author

Not that I could find, at least for 2.1.*, but it looks like one will be available once we move to 4.0. For now I'll use the same facade pattern so we can swap out the internals once we hit 4.0, removing the facades one by one so we can test them on the high side. I'll push a draft for that soon, but if there's another way to do it that would work better, I'm all ears.

Collaborator Author

Also, it looks like MetadataServicer might not need a class replacement. The public API versions of its use case are fairly straightforward, so I think we can call them directly instead of going through the original facade. We'll still have the one I mentioned above for the compactions, just not the original one from the ticket. If it would be better to keep a centralized place (closer to what MetadataServicer was doing) I can revert it, but I'll go ahead and try out the direct method calls for now.

SethSmucker and others added 7 commits March 23, 2026 12:12
- Remove implementation details from AccumuloTableInfoFetcher javadocs
- Throw TableNotFoundException instead of TableDeletedException in BulkInputFormat
- Add datawave-core-connection-pool to root pom dependencyManagement
…into callers

Restore the original Thrift-based getMajorCompactionCount() in
AccumuloTableInfoFetcher since there is no public Accumulo API for
queued compactions (apache/accumulo#5965).

Remove public API wrapper methods (tableExists, isTableOnline,
getTabletLocations, getTableId) from the facade and inline them
directly into BulkInputFormat and TableSplitsCache. The facade now
only contains methods that require non-public APIs.
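
The facade arrangement described in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual code: `MajorCompactionCounter`, `InternalRpcCounter`, and `BulkLoadGate` are hypothetical names, and the internal Thrift call is replaced by plain suppliers so the sketch stays self-contained. It shows how callers depend only on a narrow interface, so the implementation behind it could be swapped for a public API later without touching them.

```java
import java.util.function.IntSupplier;

/** Hypothetical facade interface; callers depend only on this. */
interface MajorCompactionCounter {
    int getMajorCompactionCount();
}

/**
 * Stand-in for an implementation backed by a non-public API.
 * The suppliers are placeholders for whatever the internal RPC would report.
 */
final class InternalRpcCounter implements MajorCompactionCounter {
    private final IntSupplier running;
    private final IntSupplier queued;

    InternalRpcCounter(IntSupplier running, IntSupplier queued) {
        this.running = running;
        this.queued = queued;
    }

    @Override
    public int getMajorCompactionCount() {
        // Counts running plus queued, matching the historical behavior discussed in review.
        return running.getAsInt() + queued.getAsInt();
    }
}

/** Hypothetical caller that never sees how the count is obtained. */
final class BulkLoadGate {
    private final MajorCompactionCounter counter;
    private final int threshold;

    BulkLoadGate(MajorCompactionCounter counter, int threshold) {
        this.counter = counter;
        this.threshold = threshold;
    }

    boolean shouldWait() {
        return counter.getMajorCompactionCount() > threshold;
    }
}

final class CompactionFacadeDemo {
    public static void main(String[] args) {
        // Today: the facade wraps the internal path (made-up counts).
        BulkLoadGate current = new BulkLoadGate(new InternalRpcCounter(() -> 10, () -> 3500), 3000);
        System.out.println(current.shouldWait()); // true

        // Later: a different implementation is dropped in; the caller is unchanged.
        BulkLoadGate swapped = new BulkLoadGate(() -> 5, 3000);
        System.out.println(swapped.shouldWait()); // false
    }
}
```

Keeping the interface to a single method is a design choice for the sketch: it mirrors the idea that the facade holds only what cannot yet be done through public APIs.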
…onalSecurityAgency/datawave into task/clientcontext-facade-migration
2 participants