Skip to content

DOCSP-34008: Split large change events #374

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Dec 27, 2024

Conversation

norareidy
Copy link
Collaborator

@norareidy norareidy commented Dec 12, 2024

Pull Request Info

PR Reviewing Guidelines

JIRA - https://jira.mongodb.org/browse/DOCSP-34008

Staging Links

  • fundamentals/crud/read-operations/change-streams
  • Self-Review Checklist

    • Is this free of any warnings or errors in the RST?
    • Did you run a spell-check?
    • Did you run a grammar-check?
    • Are all the links working?
    • Are the facets and meta keywords accurate?

    Copy link

    netlify bot commented Dec 12, 2024

    Deploy Preview for mongodb-docs-csharp ready!

    Name Link
    🔨 Latest commit 9a03ab7
    🔍 Latest deploy log https://app.netlify.com/sites/mongodb-docs-csharp/deploys/676ed0ae7a68590008bc5935
    😎 Deploy Preview https://deploy-preview-374--mongodb-docs-csharp.netlify.app
    📱 Preview on mobile
    Toggle QR Code...

    QR Code

    Use your smartphone camera to open QR code link.

    To edit notification comments on pull requests, go to your Netlify site configuration.

    Copy link
    Collaborator

    @mcmorisi mcmorisi left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    LGTM with some nits!

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    If your application generates change events that exceed 16 MB in size, the
    {+driver-short+} returns an error. To avoid this error, you can use
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Nit: Do we know the type of error it returns?

    Copy link
    Collaborator Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    The server error is BSONObjectTooLarge, I'll include that

    var pipeline = new EmptyPipelineDefinition<ChangeStreamDocument<Restaurant>>()
    .ChangeStreamSplitLargeEvent();

    using (var cursor = await _restaurantsCollection.WatchAsync(pipeline))
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Nit: Not limited to this PR but it looks like some examples on this page use collection and some use _restaurantsCollection – should we standardize the name?

    Comment on lines 214 to 216
    To learn more about splitting large change events, see the
    :manual:`$changeStreamSplitLargeEvent </reference/operator/aggregation/changeStreamSplitLargeEvent/>`
    page in the {+mdb-server+} manual.
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Nit based on how other Server doc links on this page word this.

    Suggested change
    To learn more about splitting large change events, see the
    :manual:`$changeStreamSplitLargeEvent </reference/operator/aggregation/changeStreamSplitLargeEvent/>`
    page in the {+mdb-server+} manual.
    To learn more about splitting large change events, see
    :manual:`$changeStreamSplitLargeEvent </reference/operator/aggregation/changeStreamSplitLargeEvent/>`
    in the {+mdb-server+} manual.

    @norareidy norareidy requested review from a team and papafe and removed request for a team December 12, 2024 22:02

    If your application generates change events that exceed 16 MB in size, the
    server returns a ``BSONObjectTooLarge`` error. To avoid this error, you can use
    the ``$changeStreamSplitLargeEvent`` pipeline stage to split the events
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Here we talk about the $changeStreamSplitLargeEvent, but we do not explicitly say that we have it as a step in the typed aggregation api, called ChangeStreamSplitLargeEvent. We mention this only in the code at the bottom of this section. I think it would be better to make this clear from the start

    the ``$changeStreamSplitLargeEvent`` pipeline stage to split the events
    into smaller fragments.

    After you receive the change stream event fragments, you can use the
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    I think it would be better to show how to use ChangeStreamSplitLargeEvent first (as done in the last code section) an then show how to merge fragments, as this is a secondary aspect.

    Copy link
    Collaborator Author

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Since I use the helper methods in the ChangeStreamSplitLargeEvent example, I decided to introduce the helper methods first - but the actual split event code example is the focus of the section so it does make sense to include it first. Updated!

    @@ -65,6 +65,115 @@ await cursor.ForEachAsync(change =>
    }
    // end-change-stream-pipeline

    // start-split-event-helpers-sync
    // Fetches the next complete change stream event
    private static ChangeStreamDocument<TDocument> GetNextChangeStreamEvent<TDocument>(
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    @BorisDog, can you take a look at the code examples? It seems you've worked on the implementation of this so you can better spot errors


    using (var cursor = collection.Watch(pipeline))
    {
    using (var enumerator = cursor.ToEnumerable().GetEnumerator())
    Copy link

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    @BorisDog do we need to do all of this here?

    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Short answer: unfortunately yes.
    We just demonstrate how to merge the events if the user chooses to. Our spec does not prescribe any merging functionality, so we did not implement a shortcut for that in our API. We also didn't have any requests for that yet.

    @norareidy norareidy requested review from BorisDog and papafe December 19, 2024 15:20

    using (var cursor = collection.Watch(pipeline))
    {
    using (var enumerator = cursor.ToEnumerable().GetEnumerator())
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Short answer: unfortunately yes.
    We just demonstrate how to merge the events if the user chooses to. Our spec does not prescribe any merging functionality, so we did not implement a shortcut for that in our API. We also didn't have any requests for that yet.

    change events that exceed the 16 MB limit. The code prints the
    change document for each event and calls helper methods to
    reassemble any event fragments:

    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    We probably should mention that fragments reassembling is optional (but probably needed), and the split events can be watched as usual events as they arrive.

    private static async Task<ChangeStreamDocument<TDocument>> GetNextChangeStreamEvent<TDocument>(
    IAsyncCursor<ChangeStreamDocument<TDocument>> changeStreamCursor)
    {
    var changeStreamEvent = changeStreamCursor.Current.First();
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    The Async path needs to be enumerated differently,
    The whole batch returned by Current needs to be inspected. There are few alternatives to do so, I'll share an example later if you don't find one beforehand.

    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Shared the code sample in slack.

    Copy link
    Collaborator

    @BorisDog BorisDog left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Some changes requested

    @norareidy norareidy requested a review from BorisDog December 26, 2024 19:49
    Copy link
    Collaborator

    @BorisDog BorisDog left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    LGTM! + minor formatting comment.

    .ChangeStreamSplitLargeEvent();

    using var cursor = collection.Watch(pipeline);
    foreach (var completeEvent in GetNextChangeStreamEvent(cursor.ToEnumerable().GetEnumerator()))
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Unnecessary one space indentation in foreach block.

    .ChangeStreamSplitLargeEvent();

    using var cursor = await collection.WatchAsync(pipeline);
    await foreach (var completeEvent in GetNextChangeStreamEventAsync(cursor))
    Copy link
    Collaborator

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Unnecessary one space indentation in foreach block.

    @norareidy norareidy merged commit e721d24 into mongodb:master Dec 27, 2024
    6 checks passed
    @norareidy norareidy deleted the DOCSP-34008-split-event branch December 27, 2024 16:14
    norareidy added a commit that referenced this pull request Dec 27, 2024
    * DOCSP-34008: Split large change events
    
    * fixes
    
    * edits
    
    * fixes to async code
    
    * fixes to sync code
    
    * wording
    
    * admonition
    
    * small fix
    
    * MM feedback
    
    * FP feedback
    
    * tech feedback
    
    * edits
    
    * final tech feedback
    
    (cherry picked from commit e721d24)
    norareidy added a commit that referenced this pull request Dec 27, 2024
    * DOCSP-34008: Split large change events
    
    * fixes
    
    * edits
    
    * fixes to async code
    
    * fixes to sync code
    
    * wording
    
    * admonition
    
    * small fix
    
    * MM feedback
    
    * FP feedback
    
    * tech feedback
    
    * edits
    
    * final tech feedback
    
    (cherry picked from commit e721d24)
    norareidy added a commit that referenced this pull request Dec 27, 2024
    * DOCSP-34008: Split large change events
    
    * fixes
    
    * edits
    
    * fixes to async code
    
    * fixes to sync code
    
    * wording
    
    * admonition
    
    * small fix
    
    * MM feedback
    
    * FP feedback
    
    * tech feedback
    
    * edits
    
    * final tech feedback
    
    (cherry picked from commit e721d24)
    mongoKart pushed a commit to mongoKart/docs-csharp that referenced this pull request May 16, 2025
    * DOCSP-34008: Split large change events
    
    * fixes
    
    * edits
    
    * fixes to async code
    
    * fixes to sync code
    
    * wording
    
    * admonition
    
    * small fix
    
    * MM feedback
    
    * FP feedback
    
    * tech feedback
    
    * edits
    
    * final tech feedback
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    None yet
    Projects
    None yet
    Development

    Successfully merging this pull request may close these issues.

    4 participants