-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
(EAI-152) check for change to chunkAlgoHash when updating embeddings #580
base: main
Are you sure you want to change the base?
Conversation
packages/mongodb-rag-core/src/contentStore/updateEmbeddedContent.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment about caching.
and beyond that, sorry if i'm dense, but what changes are you making?
packages/mongodb-rag-core/src/contentStore/updateEmbeddedContent.ts
Outdated
Show resolved
Hide resolved
packages/mongodb-rag-core/src/contentStore/MongoDbEmbeddedContentStore.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment about the $lookup implementation
packages/mongodb-rag-core/src/contentStore/MongoDbEmbeddedContentStore.ts
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see comment about typing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small suggested tweaks + one question about the caching behavior
packages/mongodb-rag-core/src/contentStore/MongoDbEmbeddedContentStore.ts
Outdated
Show resolved
Hide resolved
@@ -276,3 +295,231 @@ describe("updateEmbeddedContent", () => { | |||
}); | |||
}); | |||
}); | |||
|
|||
// These tests use "mongodb-memory-server", not mockEmbeddedContentStore | |||
describe("updateEmbeddedContent", () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't have two describe
blocks with the same title. Can you either combine into a single block or disambiguate them with more descriptive titles?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a more specific describe block for this for now, but this test file should be refactored. These test cases were placed in their own describe block bc the original describe block used mocks for the page and embedding stores and we are moving to using mongodb-memory-server. I created a ticket to capture that work - https://jira.mongodb.org/browse/EAI-935
packages/mongodb-rag-core/src/contentStore/updateEmbeddedContent.ts
Outdated
Show resolved
Hide resolved
const chunkAlgoHashes = new Map<string, string>(); | ||
const chunkAlgoHash = getHashForFunc( | ||
chunkAlgoHashes, | ||
chunkPage, | ||
chunkOptions | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the purpose of the chunkAlgoHashes
caching here.
- We define an empty cache and only use it for this one call right after.
- That call only gets the current
chunkPage
function (that we import into this file)
It seems like the cache will never actually be used? Am I missing something?
packages/mongodb-rag-core/src/contentStore/updateEmbeddedContent.ts
Outdated
Show resolved
Hide resolved
Co-authored-by: Nick Larew <[email protected]>
Jira: https://jira.mongodb.org/browse/EAI-152
Changes