fastest way to create and store embeddings for data in a SQLite table #189

Closed
punkish opened this issue Dec 26, 2024 · 5 comments
Labels
question Further information is requested stale

Comments

punkish commented Dec 26, 2024

I have a SQLite table (simplified schema below) with ~950K rows:

CREATE TABLE t (id INTEGER PRIMARY KEY, fulltext TEXT);

What is the quickest and easiest way to generate embeddings and store them in a vectors table? I am running Ollama locally with llama 3.2 and the nomic-embed-text embedding model. I would like to do this once and then create a web interface to query the data. Additionally, as new rows get added to the table t, I would like to TRIGGER embeddings for them as well. Is that possible? If yes, any hint would be very welcome.
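For the bulk step, one option is to request many embeddings per HTTP call instead of one call per row. The following is a minimal sketch, assuming Ollama is reachable on its default port (11434) and that the `/api/embed` endpoint is used with an array `input`. `embedBatch` and its `fetchImpl` option are hypothetical names introduced here for illustration; they are not part of embedJs.

```javascript
// Sketch: embed many texts per HTTP call using Ollama's /api/embed endpoint.
// Assumptions: Ollama listens on its default port 11434 and the
// nomic-embed-text model has already been pulled. embedBatch is a
// hypothetical helper, not part of embedJs.
async function embedBatch(texts, options = {}) {
    const {
        baseUrl = 'http://localhost:11434',
        model = 'nomic-embed-text',
        fetchImpl = fetch, // injectable so the helper can be exercised offline
    } = options;

    const response = await fetchImpl(`${baseUrl}/api/embed`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        // `input` accepts an array, so one request covers the whole batch
        body: JSON.stringify({ model, input: texts }),
    });
    if (!response.ok) {
        throw new Error(`Ollama returned HTTP ${response.status}`);
    }

    const { embeddings } = await response.json();
    if (!Array.isArray(embeddings) || embeddings.length !== texts.length) {
        throw new Error('Unexpected number of embeddings in response');
    }
    return embeddings; // one number[] per input text, in input order
}
```

A call such as `await embedBatch(rows.map((row) => row.fulltext))` would then return one vector per row. A good batch size depends on document length and available memory, so it is worth measuring rather than assuming.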

punkish commented Dec 28, 2024

Is there a way to use transactions when inserting a lot of embeddings into a libSQL db? Right now I am using the pattern below, and it is really slow (I have about a million documents for which I need to insert embeddings). I'd like to insert them in transactions of 5000 at a time.

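// Fetch one batch of documents
const rows = db.prepare(`SELECT fulltext FROM t LIMIT 5000`).all();

// Each addLoader call is awaited in turn, so documents are processed one at a time
for (const row of rows) {
    await app.addLoader(new TextLoader({ text: row.fulltext }));
}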
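One way to get "transactions of 5000 at a time" is to bypass the per-document loader and write the vectors directly. This is a rough sketch under several assumptions: `client` is created with `createClient()` from `@libsql/client`, whose `batch()` method runs a list of statements in one transaction; the `vectors` table and its columns are hypothetical; and `embed` is any async function that maps an array of strings to an array of vectors. Note that the rows need their `id` as well as `fulltext`, unlike the SELECT in the question.

```javascript
// Sketch: write embeddings in chunks, one transaction per chunk.
// Assumptions: `client` comes from @libsql/client's createClient(); the
// `vectors` table and its columns are hypothetical, e.g.
//   CREATE TABLE vectors (id INTEGER PRIMARY KEY, embedding F32_BLOB(768));
// `embed` is any async function mapping string[] to number[][].

// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
    const out = [];
    for (let i = 0; i < items.length; i += size) {
        out.push(items.slice(i, i + size));
    }
    return out;
}

async function storeEmbeddings(client, rows, embed, batchSize = 5000) {
    let written = 0;
    for (const batch of chunk(rows, batchSize)) {
        const vectors = await embed(batch.map((row) => row.fulltext));
        const statements = batch.map((row, i) => ({
            sql: 'INSERT INTO vectors (id, embedding) VALUES (?, vector32(?))',
            // vector32() is given the vector as a JSON array string
            args: [row.id, JSON.stringify(vectors[i])],
        }));
        // batch() runs every statement inside a single transaction
        await client.batch(statements, 'write');
        written += batch.length;
    }
    return written;
}
```

If a chunk fails, only that chunk's transaction is rolled back, so a rerun can resume from the last `id` already present in `vectors`.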

adhityan added the question label Dec 30, 2024

This issue is stale because it has been open for 14 days with no activity.

github-actions bot added the stale label Jan 14, 2025

This issue was closed because it has been inactive for 30 days since being marked as stale.

adhityan reopened this Feb 19, 2025
adhityan removed the stale label Feb 19, 2025

github-actions bot commented Mar 6, 2025

This issue is stale because it has been open for 14 days with no activity.

github-actions bot added the stale label Mar 6, 2025

github-actions bot commented Apr 5, 2025

This issue was closed because it has been inactive for 30 days since being marked as stale.

github-actions bot closed this as completed Apr 5, 2025