Replace old value after null merges in nested properties when batching merges/updates #620


Draft · wants to merge 15 commits into main

Conversation

@fabioh8010 (Contributor) commented Mar 3, 2025

Details

#615 (comment)

This PR:

  • Applies the first part of the solution described in the proposal:

    Change the Onyx.merge()/Onyx.update() batching logic (applyMerge/fastMerge/mergeObject) to replace the current value of a nested property after a null change to it, so that the subsequent updates of that batch can fully reset the nested property's data.

  • Removes JSON_PATCH usage from the native provider. Onyx will now always be responsible for handling the merging strategy, since we now have customizations (introduced by this very PR) that are impossible to replicate with JSON_PATCH.
  • Changes our merging strategy (defined in applyMerge/fastMerge/mergeObject) to take two additional flags: isBatchingMergeChanges and shouldReplaceMarkedObjects. The isBatchingMergeChanges flag is used when batching any queued merge changes and enables special logic that handles replacing old values after null merges. The shouldReplaceMarkedObjects flag is used when applying these batched merge changes to the existing value, and relies on that same logic to effectively replace the old values we want replaced. A sketch of the idea follows this list.
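
A minimal sketch of the marking strategy, with hypothetical names, omitting details such as key deletion and array handling (the real logic lives in applyMerge/fastMerge/mergeObject):

```typescript
// Hypothetical sketch of the two-pass strategy, not the actual Onyx implementation.
type PlainObject = Record<PropertyKey, unknown>;

const REPLACE_MARKER = Symbol('replaceMarker');

function isPlainObject(value: unknown): value is PlainObject {
    return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function mergeObject(target: PlainObject | null, source: PlainObject, isBatchingMergeChanges: boolean, shouldReplaceMarkedObjects: boolean): PlainObject {
    const result: PlainObject = {...(target ?? {})};

    for (const key of Object.keys(source)) {
        const sourceValue = source[key];

        if (!isPlainObject(sourceValue)) {
            result[key] = sourceValue;
            continue;
        }

        if (isBatchingMergeChanges && result[key] === null) {
            // Pass 1: an earlier change in this batch nulled this property, so we
            // mark the new object: it must fully replace the stored value later.
            result[key] = {...sourceValue, [REPLACE_MARKER]: true};
        } else if (shouldReplaceMarkedObjects && sourceValue[REPLACE_MARKER]) {
            // Pass 2: a marked object wins outright instead of being deep-merged.
            const {[REPLACE_MARKER]: _marker, ...replacement} = sourceValue;
            result[key] = replacement;
        } else {
            const targetValue = result[key];
            result[key] = mergeObject(isPlainObject(targetValue) ? targetValue : null, sourceValue, isBatchingMergeChanges, shouldReplaceMarkedObjects);
        }
    }

    return result;
}
```

For example, batching the changes [{a: null}, {a: {b: 1}}] with isBatchingMergeChanges marks {b: 1}; applying the batched result over a stored {a: {b: 1, c: 2}} with shouldReplaceMarkedObjects then yields {a: {b: 1}} rather than the deep-merged {a: {b: 1, c: 2}}.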

Related Issues

#615

Automated Tests

Unit tests were added to cover these changes.

Manual Tests

Same as Expensify/App#55199.

Author Checklist

  • I linked the correct issue in the ### Related Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android / native
    • Android / Chrome
    • iOS / native
    • iOS / Safari
    • MacOS / Chrome / Safari
    • MacOS / Desktop
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that the left part of a conditional rendering a React component is a boolean and NOT a string, e.g. myBool && <MyComponent />.
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.js or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If a new component is created I verified that:
    • A similar component doesn't exist in the codebase
    • All props are defined accurately and each prop has a /** comment above it */
    • The file is named correctly
    • The component has a clear name that is non-ambiguous and the purpose of the component can be inferred from the name alone
    • The only data being stored in the state is data necessary for rendering and nothing else
    • If we are not using the full Onyx data that we loaded, I've added the proper selector in order to ensure the component only re-renders when the data it is using changes
    • For Class Components, any internal methods passed to components event handlers are bound to this properly so there are no scoping issues (i.e. for onClick={this.submit} the method this.submit should be bound to this in the constructor)
    • Any internal methods bound to this are necessary to be bound (i.e. avoid this.submit = this.submit.bind(this); if this.submit is never passed to a component event handler like onClick)
    • All JSX used for rendering exists in the render method
    • The component has the minimum amount of code necessary for its purpose, and it is broken down into smaller components in order to separate concerns and functions
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.
  • I have checked off every checkbox in the PR author checklist, including those that don't apply to this PR.

Screenshots/Videos

Android: Native
Screen.Recording.2025-03-25.at.22.13.11-compressed.mov
Android: mWeb Chrome

I'm having problems with my emulators when opening the Chrome app (they crash instantly), so I couldn't record videos for this platform.

iOS: Native
Screen.Recording.2025-03-25.at.22.29.49-compressed.mov
iOS: mWeb Safari
Screen.Recording.2025-03-25.at.22.34.02-compressed.mov
MacOS: Chrome / Safari
Screen.Recording.2025-03-25.at.22.36.15-compressed.mov
Screen.Recording.2025-03-25.at.22.37.28-compressed.mov
MacOS: Desktop
Screen.Recording.2025-03-25.at.22.43.39-compressed.mov

@fabioh8010 (Contributor, Author)

Strangely, the "should replace the old value after a null merge in a nested property when batching updates" and "should replace the old value after a null merge in a nested property when batching merges" tests are partially passing, which I was not expecting.

Maybe the original fastMerge/mergeObject logic is partially working somehow? I need to investigate.

@fabioh8010 (Contributor, Author)

My tests were "wrong"; now they output the errors in the two scenarios for both Onyx.update() and Onyx.merge().

@fabioh8010 fabioh8010 changed the title [WIP] Replace old value after null merges in nested properties when batching merges/updates Replace old value after null merges in nested properties when batching merges/updates Mar 30, 2025
@fabioh8010 fabioh8010 marked this pull request as ready for review March 30, 2025 18:51
@fabioh8010 fabioh8010 requested a review from a team as a code owner March 30, 2025 18:51
@fabioh8010 (Contributor, Author)

cc @chrispader @neil-marcellini

@melvin-bot melvin-bot bot requested review from lakchote and removed request for a team March 30, 2025 18:51
@lakchote (Contributor) left a comment

Left some comments

// To achieve this, we first mark these nested objects with an internal flag. With the desired objects
// marked, when calling this method again with "shouldReplaceMarkedObjects" set to true we can proceed
// to effectively replace them in the next condition.
if (isBatchingMergeChanges && targetValue === null) {
@lakchote (Contributor)

Could we do something like this?

Suggested change
if (isBatchingMergeChanges && targetValue === null) {
if (isBatchingMergeChanges && !targetValue) {

@chrispader (Contributor)

We should explicitly check for null. undefined values are not supposed to delete a key. Effectively, they should never be in store anyway, but i think making this explicit makes more sense

@chrispader (Contributor)

We did that previously, but actually for this logic, we might want to rethink that.

If we merge a nested object with an existing key, we would expect undefined values to be set, rather than discarded.

So in this case I would still explicitly check with if (targetValue == null) (which also matches undefined values).
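
For reference, the two comparison styles behave like this:

```typescript
// `== null` matches both null and undefined; `=== null` matches only null.
const values: unknown[] = [null, undefined, 0, ''];
for (const v of values) {
    console.log(v, v == null, v === null);
}
// null      -> true,  true
// undefined -> true,  false
// 0         -> false, false
// ''        -> false, false
```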

@chrispader (Contributor) left a comment

LGTM! I added a few comments around the SQLite part and tests, but this has a GO from me!

Comment on lines +80 to +104
return this.multiGet(nonNullishPairsKeys).then((storagePairs) => {
    // multiGet() is not guaranteed to return the data in the same order we asked with "nonNullishPairsKeys",
    // so we use a map to associate keys to their existing values correctly.
    const existingMap = new Map<OnyxKey, OnyxValue<OnyxKey>>();
    // eslint-disable-next-line @typescript-eslint/prefer-for-of
    for (let i = 0; i < storagePairs.length; i++) {
        existingMap.set(storagePairs[i][0], storagePairs[i][1]);
    }

    const newPairs: KeyValuePairList = [];

    // eslint-disable-next-line @typescript-eslint/prefer-for-of
    for (let i = 0; i < nonNullishPairs.length; i++) {
        const key = nonNullishPairs[i][0];
        const newValue = nonNullishPairs[i][1];

        const existingValue = existingMap.get(key) ?? {};

        const mergedValue = utils.fastMerge(existingValue, newValue, true, false, true);

        newPairs.push([key, mergedValue]);
    }

    return this.multiSet(newPairs);
});
@chrispader (Contributor) commented Mar 31, 2025

I'm really not sure about the performance implications of this change on native.

I know it is hard to have SQLite handle merges through ON CONFLICT DO UPDATE and JSON_PATCH, but this was supposed to add significant performance gains, since we can basically offload all of the merging (at least with the existing values in store) to the low-level C++ implementation of SQLite. Ideally we would want to let SQLite handle as much of the merging/batching as possible.

I see, though, that we already had a lot of exceptions to this before, and that we still always needed to "pre-merge" in order to broadcast the update, so I think it should be fine for now. In a bigger redesign we could tackle and improve all of this, to make sure that each platform is used to its fullest.

Still, do we have any benchmarks on how this affects the performance of simple Onyx.merge operations?
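
For context, the SQL-side merge being discussed has roughly this shape (a hedged reconstruction; the table and column names may differ from the actual provider):

```typescript
// Roughly how the native provider let SQLite apply deltas itself. JSON_PATCH
// implements RFC 7396 merge patch, so a null in the delta deletes a key, but
// there is no way to mark a whole subtree for wholesale replacement, which is
// the customization this PR needs.
const MERGE_UPSERT_SQL = `
    INSERT INTO keyvaluepairs (record_key, valueJSON)
    VALUES (?, JSON(?))
    ON CONFLICT DO UPDATE
    SET valueJSON = JSON_PATCH(valueJSON, JSON(excluded.valueJSON));
`;
```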

@neil-marcellini (Contributor)

Yes I think it would be a good idea to measure performance in the app before and after this change to make sure it doesn't cause a large performance hit.

@marcaaron (Contributor)

+1 but I am less convinced that this is "fine for now". It feels like we are undoing work that we had a good reason to do at some point. I trust that we are moving in a good direction, but would rather let some benchmarks do the talking.

@neil-marcellini neil-marcellini self-requested a review April 1, 2025 21:14
@neil-marcellini (Contributor) left a comment

I agree with the changes others suggested. I would like to see the tests cleaned up a little and agree with Chris that we should measure the performance impact of this.

I'm also going to request a review from @marcaaron now that he is back and because he has a good perspective about how Onyx should work.

Comment on lines +80 to +104
return this.multiGet(nonNullishPairsKeys).then((storagePairs) => {
// multiGet() is not guaranteed to return the data in the same order we asked with "nonNullishPairsKeys",
// so we use a map to associate keys to their existing values correctly.
const existingMap = new Map<OnyxKey, OnyxValue<OnyxKey>>();
// eslint-disable-next-line @typescript-eslint/prefer-for-of
for (let i = 0; i < storagePairs.length; i++) {
existingMap.set(storagePairs[i][0], storagePairs[i][1]);
}

const newPairs: KeyValuePairList = [];

// eslint-disable-next-line @typescript-eslint/prefer-for-of
for (let i = 0; i < nonNullishPairs.length; i++) {
const key = nonNullishPairs[i][0];
const newValue = nonNullishPairs[i][1];

const existingValue = existingMap.get(key) ?? {};

const mergedValue = utils.fastMerge(existingValue, newValue, true, false, true);

newPairs.push([key, mergedValue]);
}

return this.multiSet(newPairs);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think it would be a good idea to measure performance in the app before and after this change to make sure it doesn't cause a large performance hit.

@marcaaron (Contributor)

now we have some customizations (this very PR) that are impossible to replicate with JSON_PATCH

Can you please update the description to mention which customizations are not compatible with JSON_PATCH? I will review today, thanks!

@marcaaron (Contributor) commented Apr 2, 2025

Is it possible to reword some of this for clarity?

Changes our merging strategy (defined in applyMerge/fastMerge/mergeObject) to have two additional flags: isBatchingMergeChanges and shouldReplaceMarkedObjects. The isBatchingMergeChanges flag will be used when batching any queued merge changes, as it has a special logic to handle the replacement of old values after null merges.

Specifically, "handle the replacement of old values after null merges" didn't quite make sense to me.

The shouldReplaceMarkedObjects flag will be used when applying this batched merge changes to the existing value, using this special logic to effectively replace the old values we desire.

A part of me also finds the description a bit confusing. Maybe you can also explain what a "marked object" is in the description? Thanks!

@marcaaron (Contributor) left a comment

This is generally looking good. Thanks for the work on it @fabioh8010 @chrispader. My main feedback (that @neil-marcellini already called out) would be to make sure that JSON_PATCH in the native provider is something we can confidently get rid of. If so, then 👍.

@@ -307,8 +307,10 @@ function merge<TKey extends OnyxKey>(key: TKey, changes: OnyxMergeInput<TKey>):
}

try {
// We first only merge the changes, so we can provide these to the native implementation (SQLite uses only delta changes in "JSON_PATCH" to merge)
@marcaaron (Contributor)

A part of me is confused about why we are getting rid of JSON_PATCH. Another part believes that this was a benefit since when we do this we end up passing a smaller delta change instead of a potentially large object and therefore it should be more efficient (hand wave there as I do not entirely recall anymore).

@@ -346,11 +348,12 @@ function merge<TKey extends OnyxKey>(key: TKey, changes: OnyxMergeInput<TKey>):
return Promise.resolve();
}

// For providers that can't handle delta changes, we need to merge the batched changes with the existing value beforehand.
@marcaaron (Contributor)

flagging this since I still have questions about JSON_PATCH before we remove it...

Comment on lines +80 to +104
return this.multiGet(nonNullishPairsKeys).then((storagePairs) => {
// multiGet() is not guaranteed to return the data in the same order we asked with "nonNullishPairsKeys",
// so we use a map to associate keys to their existing values correctly.
const existingMap = new Map<OnyxKey, OnyxValue<OnyxKey>>();
// eslint-disable-next-line @typescript-eslint/prefer-for-of
for (let i = 0; i < storagePairs.length; i++) {
existingMap.set(storagePairs[i][0], storagePairs[i][1]);
}

const newPairs: KeyValuePairList = [];

// eslint-disable-next-line @typescript-eslint/prefer-for-of
for (let i = 0; i < nonNullishPairs.length; i++) {
const key = nonNullishPairs[i][0];
const newValue = nonNullishPairs[i][1];

const existingValue = existingMap.get(key) ?? {};

const mergedValue = utils.fastMerge(existingValue, newValue, true, false, true);

newPairs.push([key, mergedValue]);
}

return this.multiSet(newPairs);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 but I am less convinced that this is "fine for now". It feels like we are undoing work that we had a good reason to do at some point. I trust that we are moving in a good direction, but would rather let some benchmarks do the talking.

@neil-marcellini (Contributor)

@fabioh8010 please request our reviews again when it's ready

@fabioh8010 (Contributor, Author)

@fabioh8010 please request our reviews again when it's ready

Sure, I started working through the comments today and am preparing a better explanation for you before requesting.

@fabioh8010 (Contributor, Author)

Updates:

  • Simplified comments in general
  • Simplified the logic of applyMerge() and created a separate function for merging batched changes
  • Simplified and better structured the tests related to these changes.
  • Addressed some of the comments.

TODO

  • Add a better explanation of the custom merging strategy
  • Do benchmarks to evaluate performance before/after this solution, targeting SQLiteProvider.multiMerge()
  • Address remaining comments not related to the items above

@neil-marcellini (Contributor)

@fabioh8010 what is the latest update? How is it coming along?

@fabioh8010 (Contributor, Author)

@neil-marcellini Sorry for the delay, I was working on two high-priority design docs and it was a bit difficult to focus on this, but now I have more capacity and will address the remaining items today or tomorrow at the latest.

@fabioh8010 (Contributor, Author)

Performance Analysis

For this analysis I tested on Android (the native provider is the only one with substantial changes) and profiled Storage.multiMerge and Storage.mergeItem using the Performance API; the sketch below shows the shape of each measurement.
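
A sketch of how each measurement can be taken (hypothetical harness; `storage` stands in for Onyx's storage layer and `pairs` for the generated test records):

```typescript
type KeyValuePair = [string, unknown];

// Hypothetical harness: time one multiMerge call with the Performance API.
async function profileMultiMerge(storage: {multiMerge: (pairs: KeyValuePair[]) => Promise<unknown>}, pairs: KeyValuePair[]): Promise<number> {
    performance.mark('multiMerge-start');
    await storage.multiMerge(pairs);
    performance.mark('multiMerge-end');
    performance.measure('multiMerge', 'multiMerge-start', 'multiMerge-end');

    const [entry] = performance.getEntriesByName('multiMerge');
    performance.clearMarks();
    performance.clearMeasures();
    return entry.duration; // milliseconds
}
```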

Baseline (main)

| Test | Method | Duration (ms) |
| --- | --- | --- |
| Merging 1000 records with mergeCollection() | Storage.multiMerge | 83 |
| Merging 1 record with mergeCollection() | Storage.multiMerge | 4 |
| Merging the same record 10 times with merge() | Storage.mergeItem (uses Storage.multiMerge under the hood) | 3 |

Delta (this PR)

| Test | Method | Duration (ms) |
| --- | --- | --- |
| Merging 1000 records with mergeCollection() | Storage.multiMerge | 118 (+42.17%) |
| Merging 1 record with mergeCollection() | Storage.multiMerge | 37 (+89.19%) |
| Merging the same record 10 times with merge() | Storage.mergeItem (uses Storage.setItem under the hood) | 30 (+86.67%) |

As you can see, the operations are noticeably more expensive now, probably due to having to use this.multiGet() and then this.multiSet() instead of JSON_PATCH.

I will try to look for another solution for the native provider; maybe I can keep JSON_PATCH and only run additional SQL operations for the properties we need to fully replace.
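
The hybrid idea could look roughly like this (hypothetical SQL sketch; the paths of the properties to replace would come from the markers collected while batching):

```typescript
// Keep the cheap delta merge for everything else...
const PATCH_SQL = `
    INSERT INTO keyvaluepairs (record_key, valueJSON)
    VALUES (?, JSON(?))
    ON CONFLICT DO UPDATE
    SET valueJSON = JSON_PATCH(valueJSON, JSON(excluded.valueJSON));
`;

// ...then overwrite each marked subtree wholesale, e.g. path = '$.a.b'.
// Caveat: JSON_REPLACE only rewrites paths that already exist; JSON_SET would
// also create missing ones, so the right function depends on how the patch
// treated the nulled key.
const REPLACE_SQL = `
    UPDATE keyvaluepairs
    SET valueJSON = JSON_REPLACE(valueJSON, ?, JSON(?))
    WHERE record_key = ?;
`;
```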

@fabioh8010 (Contributor, Author) commented Apr 25, 2025

Updates:

  • Investigating and experimenting with a way to keep the JSON_PATCH operation and then execute the object replacements with separate SQL operations.
  • Also investigating why a simple setItem operation takes a whole 30 ms even on main, while mergeItem takes around 3-4 ms on main. I feel that a SQL "SET" operation should take the same or even less execution time than a JSON_PATCH one. This might be because of executeAsync vs executeBatchAsync; I will confirm when I'm back.

@fabioh8010 (Contributor, Author)

Just FYI that I will be OOO next week and will return on May 5th 🌴

@neil-marcellini neil-marcellini marked this pull request as draft April 28, 2025 15:45
@fabioh8010 (Contributor, Author)

Updates:

  • Resuming work on this one this week.

@fabioh8010 (Contributor, Author)

Updates:

  • Designing/working on the JSON_PATCH + separate JSON_REPLACE operations solution.

@fabioh8010 (Contributor, Author)

Updates:

  • I think I have a working solution for JSON_PATCH + separate JSON_REPLACE operations approach, will do a cleanup, more testing and finally the benchmarks.

@chrispader (Contributor)

Updates:

  • I think I have a working solution for JSON_PATCH + separate JSON_REPLACE operations approach, will do a cleanup, more testing and finally the benchmarks.

Cool! Let me know if you need any help with the Onyx storage layer or SQLite-related stuff. Since I'm the maintainer of react-native-nitro-sqlite, I might be able to help there! 🙌🏼

@fabioh8010 (Contributor, Author)

Updates:

  • I think I have a working solution for JSON_PATCH + separate JSON_REPLACE operations approach, will do a cleanup, more testing and finally the benchmarks.

Cool! Lmk if you need any help with the Onyx storage layer or SQLite related stuff, since i'm the maintainer of react-native-nitro-sqlite i might be able to help there! 🙌🏼

For sure man, thanks! I need to do some cleanup first, and for now I won't merge main in yet since I don't want to pollute the benchmarks.

@fabioh8010 (Contributor, Author)

Updates:

  • Did the benchmarks: the logic I created to generate the JSON_REPLACE operations causes some performance impact when merging big collections (like the "Merging 1000 records with mergeCollection()" case), because we need to traverse the objects again to search for the markers.
  • To improve that, I moved this logic into mergeObject so we can reuse the object traversal that already happens there, removing the performance impact. A sketch of the idea follows this list.
  • It seems to be working but needs more testing. I also started a discussion with @chrispader in DM.
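
Conceptually, the improvement amounts to collecting the JSON paths during the merge itself instead of re-walking the merged result (hypothetical sketch, not the actual code):

```typescript
type PlainObject = Record<string, unknown>;

// Hypothetical: mergeObject already visits every nested property, so record
// the JSON path of each subtree that must be fully replaced as we go; the
// SQLite provider can then emit its JSON_REPLACE operations with no extra walk.
function mergeAndCollectPaths(target: PlainObject, source: PlainObject, path: string, replacedPaths: string[]): PlainObject {
    const result: PlainObject = {...target};

    for (const key of Object.keys(source)) {
        const childPath = `${path}.${key}`;
        const sourceValue = source[key];

        if (typeof sourceValue !== 'object' || sourceValue === null || Array.isArray(sourceValue)) {
            result[key] = sourceValue;
            continue;
        }

        if (result[key] === null) {
            // An earlier change nulled this property: the new subtree replaces
            // the stored value, so remember where to JSON_REPLACE it.
            replacedPaths.push(childPath);
            result[key] = sourceValue;
        } else {
            const existing = result[key];
            result[key] = mergeAndCollectPaths(typeof existing === 'object' && existing !== null ? (existing as PlainObject) : {}, sourceValue as PlainObject, childPath, replacedPaths);
        }
    }

    return result;
}

// Usage:
// const paths: string[] = [];
// const merged = mergeAndCollectPaths(existingValue, batchedChange, '$', paths);
// `paths` now holds e.g. ['$.a.b'] for the JSON_REPLACE statements.
```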

@fabioh8010 (Contributor, Author)

Updates:

  • I had a call with @chrispader yesterday and we agree that the new JSON_PATCH + JSON_REPLACE approach seems to be the right direction now. He suggested some improvements to the code and he'll be supporting me to get to the final code ❤️
  • Today I reviewed Chris's suggestions (my solution branch is here, which I will later merge into this one, and Chris's PR is here). It looks great, but we have some failing tests that we need to check.
  • Meanwhile I'm working on updating both the Onyx and E/App testing branches to use the latest changes (with Nitro SQLite), and will do the final benchmarks once our Onyx code is ready.

@fabioh8010 (Contributor, Author)

Updates:

  • @chrispader made updates to his PR and the tests are now passing.
  • I will check them tomorrow, and we'll probably be able to merge and continue with the main development here.

@fabioh8010 (Contributor, Author)

Updates:

  • I will try to get back to this as soon as I can; I have been working on the Critical Test Drive feature/issues since last week.
