
[Enhancement] Reduce memory consumption greatly, and speed up processing slightly, by processing the file asynchronously and inserting into the vectordb in chunks rather than all at once #213


Description

@MarcAmick

Currently, RAG creates embeddings for a file while holding all of them in memory, then bulk-inserts them into the vectordb in a single operation. When embedding very large files, this consumes a vast amount of memory. When memory limits are set in an AKS/EKS environment, the pod is more likely to hit the limit, causing it to crash and restart. This is largely solved by changing the logic so that the file is broken up into chunks, each chunk is embedded separately, and each chunk is asynchronously bulk-inserted as its embedding completes. Less memory is consumed because each chunk's embeddings are released from memory once inserted. It may also slightly increase speed: by the time the last bulk insert runs, the database has already received most of the document, so that final insert is much smaller and completes quickly. If any chunk results in an error, the entire document is removed from the db, so there is no chance of a partial file in the vectordb.
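A minimal sketch of this flow, assuming hypothetical `embed_chunk`, `insert_embeddings`, and `delete_document` helpers in place of the project's actual embedding model and vectordb client (names and chunk size are illustrative, not the real API):

```python
from __future__ import annotations

import asyncio
from collections.abc import Iterator
from typing import Optional

CHUNK_SIZE = 256  # units (e.g. lines or tokens) per chunk; tune to the memory budget


def read_chunks(path: str, chunk_size: int = CHUNK_SIZE) -> Iterator[list[str]]:
    """Stream the file in fixed-size chunks so it is never fully in memory."""
    buf: list[str] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            buf.append(line)
            if len(buf) >= chunk_size:
                yield buf
                buf = []
    if buf:
        yield buf


async def embed_chunk(chunk: list[str]) -> list[list[float]]:
    # Hypothetical stand-in for the embedding model call.
    return [[0.0] * 8 for _ in chunk]


async def insert_embeddings(doc_id: str, vectors: list[list[float]]) -> None:
    # Hypothetical stand-in for one bulk insert into the vectordb.
    await asyncio.sleep(0)


async def delete_document(doc_id: str) -> None:
    # Hypothetical stand-in: remove every vector already stored for this document.
    await asyncio.sleep(0)


async def ingest(path: str, doc_id: str) -> None:
    """Embed and insert chunk by chunk; roll back the whole document on failure."""
    prev_insert: Optional[asyncio.Task] = None
    try:
        for chunk in read_chunks(path):
            vectors = await embed_chunk(chunk)
            # Overlap this chunk's embedding with the previous chunk's insert,
            # but keep at most one insert in flight so memory stays bounded.
            if prev_insert is not None:
                await prev_insert
            prev_insert = asyncio.create_task(insert_embeddings(doc_id, vectors))
        if prev_insert is not None:
            await prev_insert  # the final, much smaller insert
    except Exception:
        await delete_document(doc_id)  # no partial document left in the vectordb
        raise


if __name__ == "__main__":
    # Tiny demo: write a sample file, then ingest it.
    with open("example.txt", "w", encoding="utf-8") as f:
        f.writelines(f"line {i}\n" for i in range(1000))
    asyncio.run(ingest("example.txt", doc_id="doc-1"))
```

Capping the pipeline at one in-flight insert keeps roughly two chunks' worth of embeddings in memory at any time; widening that cap would trade memory for throughput.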

Please see the pull request addressing this enhancement: #214
