Skip to content

KAFKA-19280: Fix NoSuchElementException in UnifiedLog#19717

Merged
lucasbru merged 3 commits into
apache:trunkfrom
lucasbru:unstable_offset_metadata_fix
May 17, 2025
Merged

KAFKA-19280: Fix NoSuchElementException in UnifiedLog#19717
lucasbru merged 3 commits into
apache:trunkfrom
lucasbru:unstable_offset_metadata_fix

Conversation

@lucasbru

@lucasbru lucasbru commented May 14, 2025

Copy link
Copy Markdown
Member

In FETCH requests and TXN_OFFSET_COMMIT requests, on current trunk we
run into a race condition inside UnifiedLog, causing a
NoSuchElementException in
UnifiedLog.fetchLastStableOffsetMetadata(UnifiedLog.java:651).

The cause is that the line a performing an isPresent check on a
volatile Optional before accessing it in get, leaving the door open to
a race condition when the optional changes between isPresent and
get. This change takes a copy of the volatile variable first.

Reviewers: Mickael Maison mickael.maison@gmail.com, wilmerdooley wilmer@snovon.com

@github-actions github-actions Bot added storage Pull requests that target the storage module small Small PRs labels May 14, 2025
@lucasbru lucasbru requested a review from mimaison May 14, 2025 10:01
@lucasbru lucasbru marked this pull request as ready for review May 15, 2025 09:59
@lucasbru lucasbru requested a review from dajac May 15, 2025 09:59
@lucasbru lucasbru changed the title KAFKA-19280: draft fix KAFKA-19280: Fix NoSuchElementException in UnifiedLog May 15, 2025

@mimaison mimaison left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR (and catching the issue!). Looks good overall, just left a couple of suggestions

LogOffsetMetadata highWatermarkMetadata = fetchHighWatermarkMetadata();
if (firstUnstableOffsetMetadata.isPresent() && firstUnstableOffsetMetadata.get().messageOffset < highWatermarkMetadata.messageOffset) {
LogOffsetMetadata lom = firstUnstableOffsetMetadata.get();
Optional<LogOffsetMetadata> firstUnstableOffsetMetadataCopy = firstUnstableOffsetMetadata;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a comment or update the comment just above?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

public long lastStableOffset() {
if (firstUnstableOffsetMetadata.isPresent() && firstUnstableOffsetMetadata.get().messageOffset < highWatermark()) {
return firstUnstableOffsetMetadata.get().messageOffset;
Optional<LogOffsetMetadata> firstUnstableOffsetMetadataCopy = firstUnstableOffsetMetadata;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth adding a comment. There are a few places in this class where we say we cache a value to avoid concurrency issues.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@lucasbru

Copy link
Copy Markdown
Member Author

Thanks for the review @mimaison. Ready for re-review

@mimaison mimaison left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lucasbru lucasbru merged commit bff1602 into apache:trunk May 17, 2025
24 of 25 checks passed
@wilmerdooley

Copy link
Copy Markdown

I opened #22609 to address this. The details are in the PR; happy to adjust based on what you'd prefer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

small Small PRs storage Pull requests that target the storage module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants