This project implements a fully local Retrieval-Augmented Generation (RAG) pipeline using:
- 🦙 Ollama — for local embeddings and reasoning with LLMs
- 🧩 ChromaDB — as a vector database
- 📄 LangChain — to load, split, and process PDF documents
The goal is to transform a PDF into a semantic vector knowledge base that can later be queried by an LLM such as `llama2` for contextual answers.
Clone the repository and set up a Python virtual environment:

```bash
git clone https://github.com/javsan77/Local-RAG-with-Chroma-and-Ollama.git
cd Local-RAG-with-Chroma-and-Ollama

python3 -m venv venv
source venv/bin/activate   # Linux/Mac
venv\Scripts\activate      # Windows

pip install -r requirements.txt
```
Download Ollama from ollama.com/download and start the local server:

```bash
ollama serve
```

Then pull the required models:

```bash
# Embedding model
ollama pull nomic-embed-text

# LLM for reasoning and Q&A
ollama pull llama2
```
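As a quick sanity check that the embedding model is reachable from Python, you can embed a short test string. This snippet is only an illustration and is not part of the repository:

```python
from langchain_ollama import OllamaEmbeddings

# Embed a short test string with the local nomic-embed-text model served by Ollama.
vector = OllamaEmbeddings(model="nomic-embed-text").embed_query("hello world")
print(len(vector))  # dimensionality of the embedding vector
```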
Place your PDF file as `documento.pdf` in the project root and run:

```bash
python rag_setup.py
```
📁 This will create a local Chroma database inside `chroma_db/`, where each document chunk is stored as a semantic vector.
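For reference, the ingestion step looks roughly like the sketch below. This is a simplified illustration of the pipeline (chunk sizes and variable names are assumptions), not the exact contents of `rag_setup.py`:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_ollama import OllamaEmbeddings
from langchain_community.vectorstores import Chroma

# Load the PDF and split it into overlapping chunks (sizes are illustrative).
pages = PyPDFLoader("documento.pdf").load()
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_documents(pages)

# Embed each chunk with the local nomic-embed-text model and persist to chroma_db/.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
Chroma.from_documents(chunks, embeddings, persist_directory="chroma_db")
print(f"Stored {len(chunks)} chunks in chroma_db/")
```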
Before integrating the model into your app, test it manually:

```bash
ollama run llama2
```

Then type something like:

```
Hello, what can you do?
```

To exit interactive mode, press `Ctrl + C`.
This ensures that Ollama and the `llama2` model are running correctly before connecting them to your RAG pipeline.
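Once `llama2` answers in the terminal, the same model can be queried with retrieved context. A minimal sketch, assuming the `chroma_db/` directory created earlier (the question, prompt wording, and `k` value are illustrative and not taken from the repository):

```python
from langchain_community.vectorstores import Chroma
from langchain_ollama import ChatOllama, OllamaEmbeddings

# Re-open the persisted store with the same embedding model used at ingest time.
embeddings = OllamaEmbeddings(model="nomic-embed-text")
db = Chroma(persist_directory="chroma_db", embedding_function=embeddings)

# Retrieve the chunks most similar to the question and hand them to llama2 as context.
question = "What is this document about?"
docs = db.similarity_search(question, k=4)
context = "\n\n".join(doc.page_content for doc in docs)

llm = ChatOllama(model="llama2")
answer = llm.invoke(f"Answer using only this context:\n\n{context}\n\nQuestion: {question}")
print(answer.content)
```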
Requirements:
- Python 3.10+
- Ollama running locally (`ollama serve`)
- Downloaded models: `nomic-embed-text`, `llama2`
- Python dependencies from `requirements.txt`
Example:

```
langchain
langchain-community
langchain-ollama
pypdf
chromadb
```
Javier Sanchez · Backend Developer | AI & Data Enthusiast · 🔗 GitHub
MIT License © 2025