Implement LeaderOnlyTokenCrawler #6160
Conversation
 *
 * @param buffer The buffer to write events to
 */
void setBuffer(Buffer<Record<Event>> buffer);
I will remove this method (setBuffer) and add the buffer as a parameter to writeBatchToBuffer in a new revision.
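A minimal sketch of what that revision could look like (the combined signature and package paths are my assumptions, not the final code):

```java
import java.util.List;
import org.opensearch.dataprepper.model.buffer.Buffer;
import org.opensearch.dataprepper.model.event.Event;
import org.opensearch.dataprepper.model.record.Record;

// Hypothetical revision: the buffer travels with each write call instead of
// being injected once through setBuffer().
public interface LeaderOnlyTokenCrawlerClient {
    /**
     * Writes a batch of items as events to the supplied buffer.
     *
     * @param batch  the items to write
     * @param buffer the buffer to write events to
     */
    void writeBatchToBuffer(List<ItemInfo> batch, Buffer<Record<Event>> buffer);
}
```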
long startTime = System.currentTimeMillis();
Instant lastCheckpointTime = Instant.now();

try {
Why do we need this try-catch?
Yes, the outer try-catch is not needed since we already have a try-catch around processBatch. I will remove it.
Iterator<ItemInfo> itemIterator = client.listItems(lastToken);

while (itemIterator.hasNext()) {
How come we switched from a do-while to a while loop? Ref
If itemIterator.hasNext() is false, then client.listItems() returned an empty iterator and there are no items to process, so there's no need to enter the loop at all. That's why a regular while loop is appropriate here.
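Illustrative sketch of the point (the batch list is assumed from the surrounding code):

```java
// A plain while loop naturally skips processing when listItems() returns an
// empty iterator, which is the desired behavior; a do-while would need an
// extra hasNext() guard before the first next() call.
Iterator<ItemInfo> itemIterator = client.listItems(lastToken);
List<ItemInfo> batch = new ArrayList<>();
while (itemIterator.hasNext()) {
    batch.add(itemIterator.next());
}
```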
    throw new RuntimeException("Crawl operation failed", e);
}

log.info("Crawl completed in {} ms", System.currentTimeMillis() - startTime);
Should we be recording the crawl time metric the same way as the other crawler classes?
https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/saas-source-plugins/source-crawler/src/main/java/org/opensearch/dataprepper/plugins/source/source_crawler/base/TokenPaginationCrawler.java#L90
Included the crawl metric.
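For reference, a sketch of the timer pattern used by the other crawler classes; the metric name "crawlingTime" is an assumption (see the linked TokenPaginationCrawler for the actual one):

```java
import java.util.concurrent.TimeUnit;
import io.micrometer.core.instrument.Timer;
import org.opensearch.dataprepper.metrics.PluginMetrics;

// Records crawl duration through PluginMetrics, mirroring the pattern in
// TokenPaginationCrawler.
class CrawlTimingSketch {
    private final Timer crawlingTimer;

    CrawlTimingSketch(final PluginMetrics pluginMetrics) {
        this.crawlingTimer = pluginMetrics.timer("crawlingTime");
    }

    void timedCrawl(final Runnable doCrawl) {
        final long startTime = System.currentTimeMillis();
        doCrawl.run(); // list items token by token and process batches
        crawlingTimer.record(System.currentTimeMillis() - startTime, TimeUnit.MILLISECONDS);
    }
}
```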
private final LeaderOnlyTokenCrawlerClient client;
private final PluginMetrics pluginMetrics;
@Setter
private boolean acknowledgementsEnabled;
Is this being set from the Source plugin?
Yes
private void processBatch(List<ItemInfo> batch,
                          LeaderPartition leaderPartition,
                          EnhancedSourceCoordinator coordinator) {
    if (acknowledgementsEnabled && acknowledgementSetManager != null) {
If we expect acknowledgementSetManager to always exist when acknowledgementsEnabled is true, does it make sense to remove this null check? I am referring to WorkerScheduler. Open to your feedback.
Yes, you're right. Since acknowledgementSetManager is required when acknowledgements are enabled, the null check is redundant. I will remove it.
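Something like the following (helper method names are hypothetical):

```java
// With acknowledgementSetManager guaranteed non-null whenever
// acknowledgements are enabled, as in WorkerScheduler, the guard
// reduces to the flag alone.
if (acknowledgementsEnabled) {
    processBatchWithAcknowledgments(batch, leaderPartition, coordinator);
} else {
    processBatchWithoutAcknowledgments(batch, leaderPartition, coordinator);
}
```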
} else {
    // On failure: give up partition
    log.error("Batch processing failed for token: {}", lastToken);
    coordinator.giveUpPartition(leaderPartition);
I'm not sure if we should be giving up the partition, since giving up indicates the partition is shutting down, and if this is the leader partition, it would indicate the entire plugin is shutting down, which is not the case:

"Should be called by the source when it is shutting down to indicate that it will no longer be able to perform work on partitions"
I followed documentdb's implementation, which also gives up the partition on negative acknowledgment https://github.com/opensearch-project/data-prepper/blob/main/data-prepper-plugins/mongodb/src/main/java/org/opensearch/dataprepper/plugins/mongo/stream/StreamAcknowledgementManager.java#L103
@alparish what happens if acknowledgements do not come back? Do you have a retry mechanism?
I agree with @bbenner7635. Giving up is probably not right here. Please test it locally to confirm the behavior.
Ideally, acknowledgement timeouts should be very large (or INT_MAX), because retrying events that have not been acknowledged means duplicate events will be sent. And if buffer or sink issues caused the acknowledgements to time out, then injecting the same events again will make the problem worse.
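For illustration, a sketch of creating the acknowledgement set with a deliberately large timeout (the Duration value is illustrative, not a recommended number, and acknowledgementSetManager is assumed in scope):

```java
import java.time.Duration;
import org.opensearch.dataprepper.model.acknowledgements.AcknowledgementSet;

// A very large timeout avoids re-injecting un-acknowledged events as
// duplicates while the buffer or sink is struggling.
final Duration acknowledgmentTimeout = Duration.ofHours(2);
final AcknowledgementSet acknowledgementSet = acknowledgementSetManager.create(
        success -> {
            if (success) {
                // checkpoint only after the sink has confirmed the events
            }
        },
        acknowledgmentTimeout);
```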
AcknowledgementSet acknowledgementSet = acknowledgementSetManager.create(
        success -> {
            if (success) {
                // On success: update checkpoint
Do we need these plugin metrics?
Added these metrics
success -> {
    if (success) {
        // On success: update checkpoint
        updateLeaderProgressState(leaderPartition, lastToken, coordinator);
String is not a primitive type, and updateLeaderProgressState here is inside an async callback which might be invoked a while later.
Is it possible that lastToken has already changed to a different value by then?
The callback is processed right after writing to the buffer, and before processing the next batch. Since we only update lastToken when processing a new batch, its value will remain consistent during the acknowledgment handling for the current batch.
I see. Then it could be a big performance risk, because it will wait until the events are ingested before moving to the next batch.
Say the buffer wait time is 30 seconds; does that mean we can only process 50 events every 30 seconds?
If that is the case, shall we increase the batch size?
If we need to increase the batch size, we need a reasonably high number to ensure a minimum throughput while not breaching the buffer size.
@san81 Do you have any recommendation for the batch size?
It is hard to give just one number. I would say to really test with different batch sizes and see what gives the best performance. The factors to consider are:
- Vendor API support for max page size, and latency with respect to the increase in page size.
- Size of the API response payload with respect to the increase in page size. Just giving an example here: for S3-based pipelines, OSI pipelines process at least 20Mbps per OCU.
- In these 3P connector pipelines, I don't think we will ever fill up the buffer. Each OCU comes with an 8GB buffer size. The biggest bottleneck is the vendor API latency and network latency.
If that is the case, I think we can easily go up to a 5000-event batch for now, and we will do a load test to verify it works without problems. The max event size is no more than 10KB, so the total size is at most 50MB, way less than 8GB.
After rethinking it, we should make this configurable per connector.
For Okta, each API call can return a maximum of 100 events, so a 5000-event batch requires 50 calls, which can easily run into a failure mode where one API failure blocks the other 49 calls.
A reasonable balance is 5-10 calls.
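A hypothetical sketch of a per-connector setting (the property name and default value are assumptions, not part of this PR):

```java
import com.fasterxml.jackson.annotation.JsonProperty;

// Hypothetical per-connector configuration; each connector picks a default
// that fits its API page limits (e.g. Okta's 100-items-per-call cap keeps
// a batch within 5-10 calls).
public class CrawlerBatchConfig {
    @JsonProperty("batch_size")
    private int batchSize = 500;

    public int getBatchSize() {
        return batchSize;
    }
}
```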
    processBatch(batch, leaderPartition, coordinator);
} catch (Exception e) {
    batchesFailedCounter.increment();
    log.error("Failed to process batch ending with token {}", lastToken, e);
Using NOISY is preferred for this case since we are printing the entire stack trace.
I'm not sure about this. This is happening at a batch level (not per-event level). And it should be more of a defensive programming catch.
} else {
    // On failure: give up partition
    log.error("Batch processing failed for token: {}", lastToken);
    coordinator.giveUpPartition(leaderPartition);
Something about this seems to mix concerns. We already have four crawlers. Each one has a different way to crawl. This change is still the token crawler, but it is using only the leader node. We should decouple the approach for crawling from the nodes that make use of it. Otherwise, we may end up with 8 crawlers.
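One possible shape for that decoupling, sketched with hypothetical names: keep the pagination strategy separate from the execution mode and compose the two.

```java
// Hypothetical decomposition: "how to paginate" is written once and reused
// by both leader-only and worker-partition execution modes, so adding a
// mode does not double the number of crawlers.
interface PaginationStrategy {
    void crawl(String startToken);
}

interface ExecutionMode {
    void execute(PaginationStrategy strategy, String startToken);
}

class TokenPagination implements PaginationStrategy {
    @Override
    public void crawl(final String startToken) {
        // list items page by page, starting from startToken
    }
}

class LeaderOnlyExecution implements ExecutionMode {
    @Override
    public void execute(final PaginationStrategy strategy, final String startToken) {
        strategy.crawl(startToken); // the leader does all the work itself
    }
}
```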
@Named
public class LeaderOnlyTokenCrawler implements Crawler {
    private static final Logger log = LoggerFactory.getLogger(LeaderOnlyTokenCrawler.class);
    private static final Duration BUFFER_WRITE_TIMEOUT = Duration.ofSeconds(15);
Shouldn't the client be providing these values?
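For example, the client interface could expose the timeout itself (interface and method names here are hypothetical, and the default is only illustrative):

```java
import java.time.Duration;

// Sketch: each client supplies its own write timeout instead of the crawler
// hard-coding one.
public interface TimeoutAwareCrawlerClient {
    default Duration getBufferWriteTimeout() {
        return Duration.ofSeconds(15);
    }
}
```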
import java.util.concurrent.TimeUnit;

@Named
public class LeaderOnlyTokenCrawler implements Crawler {
This should not use the raw type. You need Crawler<...>.
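Something along these lines, assuming Crawler is parameterized on its client type like the other implementations (the exact type argument depends on Crawler's declaration):

```java
// Sketch: parameterize on the client type instead of using the raw
// Crawler interface.
@Named
public class LeaderOnlyTokenCrawler implements Crawler<LeaderOnlyTokenCrawlerClient> {
    // ...
}
```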
Description
This PR introduces the LeaderOnlyTokenCrawler, a performance-optimized implementation that eliminates redundant API calls by processing complete event content in the leader thread without worker partitions.
Issues Resolved
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.