Description:
The current implementation of the listAll() API in CompositeDirectory needs to be re-evaluated. CompositeDirectory is a hybrid directory backed by both a localDirectory and a RemoteSegmentDirectory. The listAll() API is used for file cleanup and for locating the latest commit (SegmentInfos), so its result must be correct.
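To illustrate why the listing matters for finding the latest commit: the last commit generation is derived from the segments_N names that listAll() returns (the call stack logged in this issue goes through SegmentInfos.getLastCommitGeneration). Below is a minimal standalone sketch of that lookup, not Lucene's actual code. The class and method names are hypothetical, and it assumes the generation suffix is base-36 encoded and that the listing contains no block files.

```java
// Hypothetical sketch: derive the latest commit generation from a directory listing.
// This is NOT Lucene's SegmentInfos code; it only mirrors the idea under stated assumptions.
final class LastCommitGenerationSketch {
    private static final String SEGMENTS_PREFIX = "segments_";

    /** Returns the highest generation among segments_N names, or -1 if there is none. */
    static long lastCommitGeneration(String[] files) {
        long max = -1;
        for (String file : files) {
            if (file.startsWith(SEGMENTS_PREFIX)) {
                // Assumption: the suffix after "segments_" is a base-36 number.
                long gen = Long.parseLong(
                    file.substring(SEGMENTS_PREFIX.length()), Character.MAX_RADIX);
                max = Math.max(max, gen);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // Commit-related names taken from the listings logged in this issue.
        String[] merged = {"_8.si", "segments_3", "segments_5", "segments_6", "write.lock"};
        String[] remoteOnly = {"_4.si", "segments_3"};
        System.out.println("merged listing  -> " + lastCommitGeneration(merged));
        System.out.println("remote listing  -> " + lastCommitGeneration(remoteOnly));
    }
}
```

With the merged listing from the log (segments_3, segments_5, segments_6) this yields generation 6, whereas the remote listing alone (segments_3) yields 3. Whether that difference is related to the flaky test is not established here.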
Issue:
When listAll() lists both local and remote files, some tests become flaky. For example, the test WarmIndexSegmentReplicationIT.testReplicationPostDeleteAndForceMerge() is affected.
Stack trace:
Feb 25, 2025 2:13:43 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#75,opensearch[node_t2][generic][T#3],5,TGRP-WarmIndexRemoteStoreSegmentReplicationIT]
java.lang.AssertionError: new global checkpoint [-1] is lower than previous one [8]
at __randomizedtesting.SeedInfo.seed([EA432349BB4BCDD6]:0)
at org.opensearch.index.seqno.ReplicationTracker.updateGlobalCheckpointOnPrimary(ReplicationTracker.java:1752)
at org.opensearch.index.seqno.ReplicationTracker.activatePrimaryMode(ReplicationTracker.java:1389)
at org.opensearch.index.shard.IndexShard.lambda$updateShardState$5(IndexShard.java:784)
at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4276)
at org.opensearch.index.shard.IndexShard$5.onResponse(IndexShard.java:4246)
at org.opensearch.index.shard.IndexShard.lambda$asyncBlockOperations$37(IndexShard.java:4197)
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
at org.opensearch.index.shard.IndexShardOperationPermits$1.doRun(IndexShardOperationPermits.java:157)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:994)
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
[2025-02-25T14:13:43,384][INFO ][o.o.i.s.CompositeDirectory] [node_t2] listAll() call stack (last 10 methods): [org.apache.lucene.store.FilterDirectory.listAll, org.apache.lucene.store.FilterDirectory.listAll, org.apache.lucene.index.SegmentInfos.getLastCommitGeneration, org.apache.lucene.index.SegmentInfos.getLastCommitSegmentsFileName, org.opensearch.index.shard.RemoteStoreRefreshListener.isRefreshAfterCommit, org.opensearch.index.shard.RemoteStoreRefreshListener.syncSegments, org.opensearch.index.shard.RemoteStoreRefreshListener.performAfterRefreshWithPermit, org.opensearch.index.shard.ReleasableRetryableRefreshListener.runAfterRefreshWithPermit, org.opensearch.index.shard.ReleasableRetryableRefreshListener.afterRefresh, org.apache.lucene.search.ReferenceManager.notifyRefreshListenersRefreshed]
[2025-02-25T14:13:43,385][INFO ][o.o.i.s.CompositeDirectory] [node_t2] listAll Composite Directory[Composite Directory @ c9e9359]: Local Directory files - [_0.cfe_block_0, _0.cfs_block_0, _0.si_block_0, _0_1.fnm_block_0, _0_1_Lucene90_0.dvd_block_0, _0_1_Lucene90_0.dvm_block_0, _1.cfe_block_0, _1.cfs_block_0, _1.si_block_0, _2.cfe_block_0, _2.cfs_block_0, _2.si_block_0, _3.cfe_block_0, _3.cfs_block_0, _3.si_block_0, _4.fdm_block_0, _4.fdt_block_0, _4.fdx_block_0, _4.fnm_block_0, _4.kdd_block_0, _4.kdi_block_0, _4.kdm_block_0, _4.nvd_block_0, _4.nvm_block_0, _4.si_block_0, _4_Lucene101_0.doc_block_0, _4_Lucene101_0.pos_block_0, _4_Lucene101_0.psm_block_0, _4_Lucene101_0.tim_block_0, _4_Lucene101_0.tip_block_0, _4_Lucene101_0.tmd_block_0, _4_Lucene90_0.dvd_block_0, _4_Lucene90_0.dvm_block_0, _5.cfe_block_0, _5.cfs_block_0, _5.si_block_0, _6.cfe_block_0, _6.cfs_block_0, _6.si_block_0, _7.cfe_block_0, _7.cfs_block_0, _7.si_block_0, _8.cfe, _8.cfs, _8.si, segments_3_block_0, segments_5, segments_6, write.lock]
[2025-02-25T14:13:43,385][INFO ][o.o.i.s.CompositeDirectory] [node_t2] Composite Directory[Composite Directory @ c9e9359]: Remote Directory files - [_6.cfe, _4_Lucene101_0.tim, _7.cfs, _4_Lucene101_0.tip, _4_Lucene90_0.dvd, _4.fdx, _6.cfs, _7.cfe, _7.si, _5.si, _4_Lucene101_0.doc, _4.nvm, _4.fnm, _5.cfs, _4.nvd, segments_3, _4_Lucene90_0.dvm, _4.fdt, _4_Lucene101_0.psm, _4_Lucene101_0.tmd, _4.kdm, _5.cfe, _4.kdi, _4.fdm, _4.si, _4_Lucene101_0.pos, _4.kdd, _6.si]
[2025-02-25T14:13:43,385][INFO ][o.o.i.s.CompositeDirectory] [node_t2] Composite Directory[Composite Directory @ c9e9359]: listAll() returns : [_4.fdm, _4.fdt, _4.fdx, _4.fnm, _4.kdd, _4.kdi, _4.kdm, _4.nvd, _4.nvm, _4.si, _4_Lucene101_0.doc, _4_Lucene101_0.pos, _4_Lucene101_0.psm, _4_Lucene101_0.tim, _4_Lucene101_0.tip, _4_Lucene101_0.tmd, _4_Lucene90_0.dvd, _4_Lucene90_0.dvm, _5.cfe, _5.cfs, _5.si, _6.cfe, _6.cfs, _6.si, _7.cfe, _7.cfs, _7.si, _8.cfe, _8.cfs, _8.si, segments_3, segments_5, segments_6, write.lock]
[2025-02-25T14:13:43,390][TRACE][o.o.i.r.c.PublishCheckpointAction] [node_t2] [shardId 0] Publishing replication checkpoint [ReplicationCheckpoint{shardId=[test-idx-1][0], primaryTerm=2, segmentsGen=6, version=31, size=3893, codec=Lucene101}]
[2025-02-25T14:13:43,390][TRACE][o.o.i.r.c.PublishCheckpointAction] [node_t2] [[test-idx-1][0]] op [indices:admin/publishCheckpoint] completed on primary for request [PublishCheckpointRequest{checkpoint=ReplicationCheckpoint{shardId=[test-idx-1][0], primaryTerm=2, segmentsGen=6, version=31, size=3893, codec=Lucene101}}]
[2025-02-25T14:13:43,390][DEBUG][o.o.i.r.c.PublishCheckpointAction] [node_t2] [shardId 0] Completed publishing checkpoint [ReplicationCheckpoint{shardId=[test-idx-1][0], primaryTerm=2, segmentsGen=6, version=31, size=3893, codec=Lucene101}], timing: 0
[2025-02-25T14:13:43,390][INFO ][o.o.i.r.WarmIndexRemoteStoreSegmentReplicationIT] [testReplicationPostDeleteAndForceMerge] Sandeep - 10
Feb 25, 2025 2:13:43 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#88,opensearch[node_t2][refresh][T#1],5,TGRP-WarmIndexRemoteStoreSegmentReplicationIT]
java.lang.AssertionError: global checkpoint is not up-to-date, expected: -1 but was: 8
at __randomizedtesting.SeedInfo.seed([EA432349BB4BCDD6]:0)
at org.opensearch.index.seqno.ReplicationTracker.invariant(ReplicationTracker.java:920)
at org.opensearch.index.seqno.ReplicationTracker.updateLocalCheckpoint(ReplicationTracker.java:1695)
at org.opensearch.index.shard.IndexShard.updateLocalCheckpointForShard(IndexShard.java:3197)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.updateLocalCheckpointForShard(TransportReplicationAction.java:1338)
at org.opensearch.action.support.replication.ReplicationOperation.updateCheckPoints(ReplicationOperation.java:341)
at org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:184)
at org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:178)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryResult.runPostReplicationActions(TransportReplicationAction.java:728)
at org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:178)
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
at org.opensearch.core.action.ActionListener$4.onResponse(ActionListener.java:182)
at org.opensearch.core.action.ActionListener.completeWith(ActionListener.java:355)
at org.opensearch.action.admin.indices.refresh.TransportShardRefreshAction.shardOperationOnPrimary(TransportShardRefreshAction.java:100)
at org.opensearch.action.admin.indices.refresh.TransportShardRefreshAction.shardOperationOnPrimary(TransportShardRefreshAction.java:57)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:1333)
at org.opensearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:150)
at org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.runWithPrimaryShardReference(TransportReplicationAction.java:654)
at org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.lambda$doRun$0(TransportReplicationAction.java:547)
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
at org.opensearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$36(IndexShard.java:4185)
at org.opensearch.core.action.ActionListener$3.onResponse(ActionListener.java:132)
at org.opensearch.action.support.ContextPreservingActionListener.onResponse(ContextPreservingActionListener.java:58)
at org.opensearch.index.shard.IndexShardOperationPermits$2.doRun(IndexShardOperationPermits.java:286)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:994)
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
Feb 25, 2025 2:13:43 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
WARNING: Uncaught exception in thread: Thread[#103,opensearch[node_t2][refresh][T#2],5,TGRP-WarmIndexRemoteStoreSegmentReplicationIT]
java.lang.AssertionError: global checkpoint is not up-to-date, expected: -1 but was: 8
at __randomizedtesting.SeedInfo.seed([EA432349BB4BCDD6]:0)
at org.opensearch.index.seqno.ReplicationTracker.invariant(ReplicationTracker.java:920)
at org.opensearch.index.seqno.ReplicationTracker.updateLocalCheckpoint(ReplicationTracker.java:1695)
at org.opensearch.index.shard.IndexShard.updateLocalCheckpointForShard(IndexShard.java:3197)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.updateLocalCheckpointForShard(TransportReplicationAction.java:1338)
at org.opensearch.action.support.replication.ReplicationOperation.updateCheckPoints(ReplicationOperation.java:341)
at org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:184)
at org.opensearch.action.support.replication.ReplicationOperation$1.onResponse(ReplicationOperation.java:178)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryResult.runPostReplicationActions(TransportReplicationAction.java:728)
at org.opensearch.action.support.replication.ReplicationOperation.handlePrimaryResult(ReplicationOperation.java:178)
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
at org.opensearch.core.action.ActionListener$4.onResponse(ActionListener.java:182)
at org.opensearch.core.action.ActionListener.completeWith(ActionListener.java:355)
at org.opensearch.indices.replication.checkpoint.PublishCheckpointAction.shardOperationOnPrimary(PublishCheckpointAction.java:194)
at org.opensearch.indices.replication.checkpoint.PublishCheckpointAction.shardOperationOnPrimary(PublishCheckpointAction.java:52)
at org.opensearch.action.support.replication.TransportReplicationAction$PrimaryShardReference.perform(TransportReplicationAction.java:1333)
at org.opensearch.action.support.replication.ReplicationOperation.execute(ReplicationOperation.java:150)
at org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.runWithPrimaryShardReference(TransportReplicationAction.java:654)
at org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.lambda$doRun$0(TransportReplicationAction.java:547)
at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
at org.opensearch.index.shard.IndexShard.lambda$wrapPrimaryOperationPermitListener$36(IndexShard.java:4185)
at org.opensearch.core.action.ActionListener$3.onResponse(ActionListener.java:132)
at org.opensearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:310)
at org.opensearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:255)
at org.opensearch.index.shard.IndexShard.acquirePrimaryOperationPermit(IndexShard.java:4156)
at org.opensearch.action.support.replication.TransportReplicationAction.acquirePrimaryOperationPermit(TransportReplicationAction.java:1262)
at org.opensearch.action.support.replication.TransportReplicationAction$AsyncPrimaryAction.doRun(TransportReplicationAction.java:544)
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
at org.opensearch.action.support.replication.TransportReplicationAction.handlePrimaryRequest(TransportReplicationAction.java:483)
at org.opensearch.wlm.WorkloadManagementTransportInterceptor$RequestHandler.messageReceived(WorkloadManagementTransportInterceptor.java:63)
at org.opensearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:108)
at org.opensearch.transport.TransportService$7.doRun(TransportService.java:1048)
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:994)
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1583)
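The three listAll() log lines in the trace above are consistent with a merge that drops local names containing _block_, adds every remote name, and returns the sorted union. The sketch below reproduces that behavior as inferred from the logs only. It is not the actual CompositeDirectory code, and the class name, method name, and the _block_ marker rule are assumptions.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical sketch of the merge suggested by the logged listings.
// It is NOT the real CompositeDirectory implementation.
final class CompositeListingSketch {
    // Assumption: locally cached block files carry this marker, as in "_7.si_block_0".
    private static final String BLOCK_MARKER = "_block_";

    /** Returns the sorted union of non-block local names and all remote names. */
    static String[] mergeListings(String[] localFiles, String[] remoteFiles) {
        TreeSet<String> merged = new TreeSet<>(); // sorted, de-duplicated
        for (String name : localFiles) {
            if (!name.contains(BLOCK_MARKER)) {
                merged.add(name); // keep only full local files
            }
        }
        merged.addAll(Arrays.asList(remoteFiles));
        return merged.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // A small subset of the names logged in this issue.
        String[] local = {
            "_0.si_block_0", "_7.si_block_0", "_8.cfe", "_8.cfs", "_8.si",
            "segments_3_block_0", "segments_5", "segments_6", "write.lock"
        };
        String[] remote = {"_7.si", "segments_3"};
        System.out.println(Arrays.toString(mergeListings(local, remote)));
    }
}
```

Note that under this rule a file that exists locally only as a block file and is absent from the remote listing (such as the _0 to _3 segment files in the log) does not appear in the result at all, while segments_5 and segments_6 appear even though the remote listing only contains segments_3.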
Additional context:
This issue may have broader implications for warm index functionality and remote store operations. It's crucial to ensure that the listAll() API provides accurate and consistent results across all scenarios.
Expected behavior
All tests should pass with a correct implementation of listAll().
Part of this meta issue: #13149