Spring AI RAG Demo with Ollama 0.6.2 and PGVector

This project demonstrates Retrieval-Augmented Generation (RAG) with Spring AI, Ollama, and PGVector. The application serves as a personal assistant that answers questions about Spring Boot by referencing the Spring Boot Reference Documentation PDF.

Features

  • Uses Spring AI for RAG implementation
  • Integrates with Ollama for LLM capabilities
  • Stores and retrieves vector embeddings using PGVector
  • Automatically processes and ingests Spring Boot documentation
  • Provides a REST API for question answering

Architecture

RAG architecture (diagram): a question is embedded, matching document chunks are retrieved from PGVector, and the question plus retrieved context is sent to the Ollama model.

Document Ingestion Pipeline

Document ingestion pipeline (diagram): the PDF is read with Apache Tika, split into chunks, embedded, and stored in PGVector.

Prerequisites

  • Java 21
  • Docker and Docker Compose
  • Ollama installed locally
  • Maven

Setup Instructions

  1. Install Ollama

  2. Pull the Deepseek Model

    ollama pull deepseek-r1:8b
    ollama run deepseek-r1:8b
    

    Note: If you skip this step, the application will automatically pull the model when it first starts, which might take a few minutes.

  3. Start PGVector Database

    docker-compose up -d

    This will start a PostgreSQL database with the PGVector extension on port 5432.

  4. Build the Application

    ./mvnw clean install

Running the Application

  1. Start the Spring Boot Application

    ./mvnw spring-boot:run
  2. The application will automatically:

    • Initialize the vector store schema
    • Load and process the Spring Boot reference PDF (see the ingestion sketch after this list)
    • Start the REST API server
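
The ingestion step is implemented in DocumentIngestionService. Below is a minimal sketch of how such a service can be written with Spring AI's TikaDocumentReader and TokenTextSplitter; the class shape and the PDF location are illustrative assumptions, not the project's exact source.

    import java.util.List;

    import org.springframework.ai.document.Document;
    import org.springframework.ai.reader.tika.TikaDocumentReader;
    import org.springframework.ai.transformer.splitter.TokenTextSplitter;
    import org.springframework.ai.vectorstore.VectorStore;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.boot.ApplicationArguments;
    import org.springframework.boot.ApplicationRunner;
    import org.springframework.core.io.Resource;
    import org.springframework.stereotype.Component;

    @Component
    class DocumentIngestionService implements ApplicationRunner {

        private final VectorStore vectorStore;

        // Hypothetical classpath location for the bundled reference PDF.
        @Value("classpath:/docs/spring-boot-reference.pdf")
        private Resource pdf;

        DocumentIngestionService(VectorStore vectorStore) {
            this.vectorStore = vectorStore;
        }

        @Override
        public void run(ApplicationArguments args) {
            // Read the PDF with Apache Tika, split it into token-sized
            // chunks, and store the chunk embeddings in PGVector.
            List<Document> documents = new TikaDocumentReader(pdf).get();
            List<Document> chunks = new TokenTextSplitter().apply(documents);
            vectorStore.add(chunks);
        }
    }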

Usage

Send questions about Spring Boot to the API endpoint:

curl -X POST http://localhost:8080/api/chat \
     -H "Content-Type: text/plain" \
     -d "How to develop a spring boot ai application?"

Using the Postman client for testing

  • request sent (screenshot)

  • response generated (screenshot)

Technical Details

  • Vector Database: PGVector (PostgreSQL with vector extension)

    • Database: vectordb
    • Username: aadi
    • Password: aadi
    • Port: 5432
  • LLM Configuration (see the properties sketch after this list):

    • Model: deepseek-r1:8b
    • Base URL: http://localhost:11434
    • Initialization timeout: 5 minutes
    • Auto-pulls model if not available locally
    • Pull strategy: when_missing
  • Document Processing:

    • Uses Apache Tika for PDF reading
    • Implements text splitting for optimal chunk size
    • Automatically ingests documentation on startup
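
The values above translate into application.properties entries roughly like the sketch below (property names follow Spring AI 1.0.0-M6 conventions; verify against the project's actual file):

    # PGVector datasource (credentials from Technical Details)
    spring.datasource.url=jdbc:postgresql://localhost:5432/vectordb
    spring.datasource.username=aadi
    spring.datasource.password=aadi
    spring.ai.vectorstore.pgvector.initialize-schema=true

    # Ollama chat model
    spring.ai.ollama.base-url=http://localhost:11434
    spring.ai.ollama.chat.options.model=deepseek-r1:8b
    spring.ai.ollama.init.pull-model-strategy=when_missing
    spring.ai.ollama.init.timeout=5m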

Project Structure

  • ChatController: Handles REST API requests
  • DocumentIngestionService: Processes and stores documentation
  • application.properties: Contains configuration for Ollama and PGVector
  • compose.yml: Docker composition for the PGVector database (a sketch follows below)
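
A plausible sketch of compose.yml, using the database values from Technical Details (the image tag is an assumption):

    services:
      pgvector:
        image: 'pgvector/pgvector:pg16'   # assumed tag; any PGVector-enabled Postgres image works
        environment:
          POSTGRES_DB: vectordb
          POSTGRES_USER: aadi
          POSTGRES_PASSWORD: aadi
        ports:
          - '5432:5432'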

Troubleshooting

  1. Ensure Ollama is running and accessible at http://localhost:11434
  2. Verify that the PostgreSQL container is running: docker ps
  3. Check application logs for any initialization errors
  4. Ensure the deepseek-r1 model is pulled in Ollama (ollama list shows the installed models)

Dependencies

  • Spring Boot 3.4.3
  • Spring AI (version 1.0.0-M6; Maven coordinates sketched below)
  • PGVector
  • Apache Tika
  • Spring Boot Docker Compose Support
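
For reference, the Maven coordinates would look roughly like this sketch (artifact IDs follow the Spring AI 1.0.0-M6 naming; check the project's pom.xml for the authoritative list):

    <dependencyManagement>
      <dependencies>
        <dependency>
          <groupId>org.springframework.ai</groupId>
          <artifactId>spring-ai-bom</artifactId>
          <version>1.0.0-M6</version>
          <type>pom</type>
          <scope>import</scope>
        </dependency>
      </dependencies>
    </dependencyManagement>

    <dependencies>
      <!-- Ollama chat model -->
      <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
      </dependency>
      <!-- PGVector vector store -->
      <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
      </dependency>
      <!-- Apache Tika document reader -->
      <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-tika-document-reader</artifactId>
      </dependency>
      <!-- Starts compose.yml automatically during development -->
      <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-docker-compose</artifactId>
        <optional>true</optional>
      </dependency>
    </dependencies>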
