apiantonio/LLM_powered_chatbot_DIEM

DIEM Chatbot - LLM-Powered Virtual Assistant

An intelligent RAG-based conversational assistant for the Department of Information Engineering, Electrical Engineering and Applied Mathematics (DIEM) at the University of Salerno.


📖 Overview

The DIEM Chatbot is a production-grade, Retrieval-Augmented Generation (RAG) system designed to serve students, faculty, and external visitors of the DIEM department. It answers natural-language questions by retrieving grounded information from the department's official web sources, eliminating the need to manually navigate dozens of web pages.

The system is built around an agentic LLM pipeline with four specialized search tools, a smart conversational memory, multilingual support, and multiple safety layers including scope-aware guardrails.


🖥️ Interface Preview

DIEM Chatbot Streamlit UI

✨ Key Features

  • Agentic RAG with Parallel Tool Calling: The LLM autonomously decides which knowledge collections to query and fires multiple tool calls in a single turn when needed
  • Multi-Collection Knowledge Base: Documents are organized into three Chroma vector store collections: persone (faculty), offerta_formativa (degree programs), and dipartimento (department info)
  • Incremental Web Crawling: A multi-threaded BFS crawler automatically scrapes and indexes HTML pages and PDFs from the DIEM ecosystem (diem.unisa.it, docenti.unisa.it, corsi.unisa.it)
  • Smart Conversational Memory: Semantic similarity filtering and automatic summarization keep conversation context relevant without overloading the context window
  • Query Optimization: Coreference resolution rewrites ambiguous follow-up questions, and multi-query expansion improves retrieval recall
  • Cross-Encoder Reranking: A dedicated CrossEncoder model reranks retrieved candidates before passing them to the LLM
  • Guardrails: Input/output safety checks (injection, toxicity, PII, hallucination, code generation) via a dedicated Groq LLM, with scope-awareness to handle out-of-domain questions
  • Meta-Query Handling: Greetings, thanks, and identity questions are handled without knowledge retrieval and without polluting conversation memory
  • Automatic Fallback: If a specialized collection returns no results, an internal search_all cross-collection fallback activates transparently
  • RAGAS Evaluation: A fully automated evaluation pipeline with a robust Judge LLM (retry + JSON-repair) and export to JSON/CSV/Excel
  • Multilingual: Tool queries are always sent in Italian (for retrieval accuracy); responses are generated in the user's language
  • Streamlit Web UI: A polished chat interface with trace inspection, session management, and suggested starter questions
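The Automatic Fallback behaviour listed above can be illustrated with a small sketch. This is not the project's implementation: `search_with_fallback`, the `SearchFn` type, and the toy in-memory collections are hypothetical names introduced here, and only the collection names (persone, offerta_formativa, dipartimento) come from this README.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a per-collection search: query -> list of text hits.
SearchFn = Callable[[str], List[str]]


def search_with_fallback(query: str, collection: str,
                         searchers: Dict[str, SearchFn]) -> List[str]:
    """Query one collection; if it returns nothing, query all the others."""
    hits = searchers[collection](query)
    if hits:
        return hits
    # Fallback: merge hits from the remaining collections,
    # keeping first-seen order and dropping duplicates.
    merged: List[str] = []
    for name, search in searchers.items():
        if name == collection:
            continue
        for hit in search(query):
            if hit not in merged:
                merged.append(hit)
    return merged


# Toy collections with made-up contents, for illustration only.
toy_collections: Dict[str, SearchFn] = {
    "persone": lambda q: ["persone-doc"] if "docente" in q.lower() else [],
    "offerta_formativa": lambda q: ["offerta-doc"] if "corso" in q.lower() else [],
    "dipartimento": lambda q: ["dipartimento-doc"] if "sede" in q.lower() else [],
}

if __name__ == "__main__":
    # The query targets the wrong collection, so the fallback kicks in.
    print(search_with_fallback("Dove si trova la sede?", "persone", toy_collections))
```

In the real system the search functions would presumably wrap Chroma retrievers; plain callables keep the sketch runnable without a vector store.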

πŸ“ Project Structure

.
├── src/
│   ├── app.py                    # Streamlit web application entry point
│   ├── agent/
│   │   ├── agent.py              # RAGAgent facade and RAGAgentFactory
│   │   ├── agent_main.py         # CLI entry point and REPL
│   │   ├── callbacks.py          # Observability and interaction logging
│   │   ├── guardrails.py         # Input/output safety checks
│   │   ├── guardrails_config/    # NeMo Guardrails prompts and rail definitions
│   │   ├── llm_providers.py      # LLM provider abstraction (Ollama/Groq/HuggingFace)
│   │   ├── memory.py             # SmartConversationMemory with semantic filtering
│   │   ├── prompts.py            # System prompts (agent + meta queries)
│   │   └── tools/                # LangChain tools (search_persone, search_offerta_formativa, etc.)
│   ├── config/
│   │   ├── settings.py           # Centralized configuration (dataclasses + env vars)
│   │   └── logging_config.py     # Logging setup
│   ├── evaluation/
│   │   ├── config.py             # Evaluation configuration
│   │   ├── dataset.py            # Dataset builder (loads questions, runs agent, builds RAGAS dataset)
│   │   ├── eval_main.py          # CLI entry point for evaluation
│   │   ├── judge.py              # Robust Judge LLM (retry + JSON-repair)
│   │   └── runner.py             # Evaluation orchestrator (RAGAS metrics + report export)
│   ├── ingestion/
│   │   ├── indexer.py            # Multi-collection chunking, embedding, and Chroma indexing
│   │   ├── registry.py           # Incremental indexing registry (SHA-256 based)
│   │   ├── router.py             # Document routing and metadata extraction
│   │   └── scheduler_main.py     # Ingestion pipeline CLI (scrape / index / verify / full)
│   ├── retrieval/
│   │   └── engine.py             # QueryOptimizer, CrossEncoderReranker, RetrievalEngine
│   └── scraping/
│       ├── factories.py          # HTML and PDF rule factories
│       ├── interfaces.py         # Abstract base classes (CleaningRule, PdfFilterRule, UrlClassifier)
│       ├── persistence.py        # HTML document and PDF ledger persistence
│       ├── scrapers.py           # Multi-threaded BFS crawler (UnisaCrawler)
│       └── rules/                # Concrete rule implementations (HTML content, PDF, URL classifiers)
└── data/
    ├── raw/                      # Crawled HTML files and PDF links
    ├── vectorstore/              # ChromaDB persistence and parent docstore
    └── evaluation/               # Evaluation question sets (JSON)
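The tree describes ingestion/registry.py as a SHA-256 based incremental indexing registry. The sketch below shows one way such a registry could work: it stores a content hash per document and reports whether a document is new or changed. The class name `HashRegistry`, its methods, and the JSON file layout are assumptions made for illustration; the actual module may be organized differently.

```python
import hashlib
import json
from pathlib import Path


class HashRegistry:
    """Hypothetical registry mapping a document id to the SHA-256 of its content."""

    def __init__(self, path: str) -> None:
        self.path = Path(path)
        # Load previously recorded hashes if the registry file exists.
        self.hashes = json.loads(self.path.read_text()) if self.path.exists() else {}

    def needs_indexing(self, doc_id: str, content: bytes) -> bool:
        """Return True (and record the new hash) if the content is new or changed."""
        digest = hashlib.sha256(content).hexdigest()
        if self.hashes.get(doc_id) == digest:
            return False
        self.hashes[doc_id] = digest
        return True

    def save(self) -> None:
        """Persist the recorded hashes so the next run can skip unchanged documents."""
        self.path.write_text(json.dumps(self.hashes, indent=2))
```

Only documents for which `needs_indexing` returns True would then be chunked and embedded again, which is what makes repeated runs incremental.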

πŸ› οΈ Tech Stack

Component Technology
LLM Backend Ollama (Nemotron, Qwen), Groq (Llama 3.3 70B), HuggingFace
Agent Framework LangChain (create_agent, tool calling)
Vector Store ChromaDB
Embeddings Qwen3-Embedding-0.6B
Reranker Qwen3-Reranker-0.6B
Web Crawling AsyncHtmlLoader, BeautifulSoup4, Requests
Web UI Streamlit
Guardrails Groq (Llama 3.3 70B) with custom prompt-based rails
Evaluation RAGAS framework
Configuration Environment variables + Python dataclasses

🚀 Getting Started

Prerequisites

  • Python 3.10+
  • Ollama running locally (or valid Groq API keys)
  • ChromaDB dependencies

Installation

# Clone the repository
git clone <repo-url>
cd <repo-directory>

Configuration

Create a .env file in the project root:

# LLM Provider (ollama | groq | huggingface)
LLM_PROVIDER=ollama
LLM_MODEL=nemotron-3-super:cloud
OLLAMA_BASE_URL=http://localhost:11434

# Groq API Keys (optional, for guardrails and/or chat)
GROQ_CHAT_API_KEY=your_key_here
GROQ_REWRITER_API_KEY=your_key_here
GROQ_GUARDRAILS_API_KEY=your_key_here

# Embedding Model
EMBEDDING_MODEL=Qwen/Qwen3-Embedding-0.6B
RERANKER_MODEL=Qwen/Qwen3-Reranker-0.6B

# Vector Store
CHROMA_PERSIST_DIR=data/vectorstore/chroma
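As a rough illustration of how these variables might be consumed, the sketch below reads them into a frozen dataclass. The class `LLMSettings`, the helper `_env`, and the validation step are hypothetical and are not taken from src/config/settings.py; the variable names and defaults are the ones documented in this README. The sketch reads the process environment only, so loading the .env file itself (for example with python-dotenv's `load_dotenv`) is assumed to happen beforehand.

```python
import os
from dataclasses import dataclass, field

VALID_PROVIDERS = ("ollama", "groq", "huggingface")


def _env(name: str, default: str):
    """Return a factory that reads an environment variable at instantiation time."""
    return lambda: os.getenv(name, default)


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical settings object; field defaults mirror the README's values."""
    provider: str = field(default_factory=_env("LLM_PROVIDER", "ollama"))
    model: str = field(default_factory=_env("LLM_MODEL", "nemotron-3-super:cloud"))
    base_url: str = field(default_factory=_env("OLLAMA_BASE_URL", "http://localhost:11434"))
    temperature: float = field(
        default_factory=lambda: float(os.getenv("LLM_TEMPERATURE", "0.0"))
    )

    def __post_init__(self) -> None:
        # Fail early on a provider name the README does not list.
        if self.provider not in VALID_PROVIDERS:
            raise ValueError(f"Unsupported LLM_PROVIDER: {self.provider!r}")


if __name__ == "__main__":
    print(LLMSettings())
```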

1. Build the Knowledge Base

# Full pipeline: crawl the DIEM website and index everything
python -m ingestion.scheduler_main --mode full

# Or run steps individually:
python -m ingestion.scheduler_main --mode scrape   # crawling only
python -m ingestion.scheduler_main --mode index    # indexing only
python -m ingestion.scheduler_main --mode verify   # verify collections

2. Launch the Web UI

cd src
python -m streamlit run app.py

3. Use the CLI (Interactive REPL)

cd src
python -m agent.agent_main

# Single query mode
python -m agent.agent_main --single-query "What degree programs does DIEM offer?"

# Disable scope guardrail (for testing)
python -m agent.agent_main --no-scope-guard

📊 Evaluation

The project includes a full automated evaluation pipeline using the RAGAS framework.

Prepare a Question Set

Create data/evaluation/questions.json:

{
  "dataset_name": "DIEM Evaluation Set",
  "samples": [
    {
      "question": "What degree programs are offered by DIEM?",
      "ground_truth": "DIEM offers degree programs in..."
    }
  ]
}
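Before launching a long evaluation run it can help to check that the question set has the shape shown above. The following is a minimal validation sketch; the function name `load_question_set` and its error messages are illustrative and are not part of the project's evaluation/dataset.py.

```python
import json


def load_question_set(text: str) -> list[dict]:
    """Parse a question-set JSON string and check the fields shown in the README."""
    data = json.loads(text)
    samples = data.get("samples") if isinstance(data, dict) else None
    if not isinstance(samples, list) or not samples:
        raise ValueError("'samples' must be a non-empty list")
    for index, sample in enumerate(samples):
        if not isinstance(sample, dict):
            raise ValueError(f"sample {index} is not an object")
        for key in ("question", "ground_truth"):
            value = sample.get(key)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"sample {index} has a missing or empty '{key}'")
    return samples


if __name__ == "__main__":
    example = """
    {
      "dataset_name": "DIEM Evaluation Set",
      "samples": [
        {"question": "What degree programs are offered by DIEM?",
         "ground_truth": "DIEM offers degree programs in..."}
      ]
    }
    """
    print(len(load_question_set(example)), "sample(s) loaded")
```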

Run Evaluation

cd src
python -m evaluation.eval_main --input data/evaluation/questions.json

# Options:
# --output results/run_01     # custom output directory
# --no-guardrails             # disable guardrails during evaluation
# --log-level DEBUG           # verbose logging

The runner produces a timestamped JSON report, CSV, and a styled Excel workbook with per-sample metrics and aggregated scores for: Context Precision, Context Recall, Response Relevancy, Faithfulness, and Factual Correctness.
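The Judge LLM is described as robust thanks to retry and JSON-repair. A minimal sketch of that idea follows, assuming the judge returns a JSON object that may be wrapped in extra text or contain trailing commas. The names `repair_json` and `call_with_retry` are hypothetical, and the repair heuristics here are deliberately simple; evaluation/judge.py may use different ones.

```python
import json
import re
from typing import Callable


def repair_json(raw: str) -> dict:
    """Extract the outermost {...} span and drop trailing commas before parsing."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    candidate = raw[start:end + 1]
    # Remove commas that directly precede a closing brace or bracket.
    # Simplification: this would also alter such sequences inside string values.
    candidate = re.sub(r",\s*([}\]])", r"\1", candidate)
    return json.loads(candidate)


def call_with_retry(call: Callable[[], str], max_attempts: int = 3) -> dict:
    """Invoke the judge up to max_attempts times until its output parses."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return repair_json(call())
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
    raise RuntimeError(f"judge failed after {max_attempts} attempts") from last_error


if __name__ == "__main__":
    replies = iter(["I cannot answer", 'Here you go: {"score": 0.5,}'])
    print(call_with_retry(lambda: next(replies)))
```

Here `call` stands in for whatever function sends the judging prompt to the LLM and returns its raw text.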


βš™οΈ Configuration Reference

All settings are managed via environment variables or Python dataclasses in src/config/settings.py. Key parameters:

| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | ollama | LLM backend (ollama, groq, huggingface) |
| LLM_MODEL | nemotron-3-super:cloud | Model name |
| LLM_TEMPERATURE | 0.0 | Generation temperature |
| EMBEDDING_MODEL | Qwen/Qwen3-Embedding-0.6B | HuggingFace embedding model |
| RERANKER_MODEL | Qwen/Qwen3-Reranker-0.6B | CrossEncoder reranker |
| RERANKER_TOP_N | 5 | Number of documents after reranking |
| MEMORY_MAX_TURNS | 10 | Max conversation turns to retain |
| MEMORY_SIMILARITY_THRESHOLD | 0.55 | Cosine similarity cutoff for memory filtering |
| MAX_TOOL_CALLS | 3 | Max tool invocations per agent turn |
| MAX_DEPTH | 5 | Crawler BFS depth |
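To make the two memory parameters concrete, the sketch below shows how MEMORY_MAX_TURNS and MEMORY_SIMILARITY_THRESHOLD could interact: keep only the most recent turns, then keep those whose embedding is similar enough to the current query. The function names and the toy two-dimensional vectors are assumptions; in the real system the vectors would come from the embedding model, and SmartConversationMemory may combine the steps differently (it also summarizes, which is not shown).

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_relevant_turns(query_vec: Sequence[float],
                          turns: List[Tuple[str, Sequence[float]]],
                          threshold: float = 0.55,
                          max_turns: int = 10) -> List[str]:
    """turns is ordered oldest first; each item is (text, embedding)."""
    recent = turns[-max_turns:]  # cap on retained turns
    return [text for text, vec in recent if cosine(query_vec, vec) >= threshold]


if __name__ == "__main__":
    history = [("turn A", [1.0, 0.0]), ("turn B", [0.0, 1.0]), ("turn C", [1.0, 1.0])]
    print(select_relevant_turns([1.0, 0.0], history))
```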

πŸ›‘οΈ Guardrails

The system implements a three-stage safety pipeline powered by Llama 3.3 70B via Groq:

  1. Input Check: Blocks prompt injections, toxic language, manipulative instructions, and out-of-scope questions before any retrieval occurs
  2. Meta Check: Identifies conversational messages (greetings, thanks, identity questions) and routes them to a lightweight direct-LLM handler, keeping conversation memory clean
  3. Output Check: Scans generated responses for inappropriate content, code blocks, or sensitive data (fiscal codes, IBANs) before returning them to the user
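The order of the three stages can be sketched as a small control flow. This is an illustration only: the project performs these checks with an LLM, whereas the sketch substitutes plain callables for the input and meta checks and simple regular expressions for the output check. All names (`run_turn`, `output_is_safe`, `REFUSAL`) are hypothetical, and skipping the output check for meta replies is an assumption.

```python
import re
from typing import Callable, Tuple

# Pattern-based stand-ins for the LLM output check (assumption, not the project's method).
FISCAL_CODE = re.compile(r"\b[A-Z]{6}\d{2}[A-Z]\d{2}[A-Z]\d{3}[A-Z]\b")  # Italian fiscal code
IBAN_IT = re.compile(r"\bIT\d{2}[A-Z]\d{10}[A-Z0-9]{12}\b")              # Italian IBAN
CODE_FENCE = "`" * 3                                                      # markdown code fence
REFUSAL = "Sorry, I cannot help with that request."


def output_is_safe(text: str) -> bool:
    """Reject answers containing a fiscal code, an Italian IBAN, or a code block."""
    return not (FISCAL_CODE.search(text) or IBAN_IT.search(text) or CODE_FENCE in text)


def run_turn(message: str,
             input_ok: Callable[[str], bool],
             is_meta: Callable[[str], bool],
             meta_reply: Callable[[str], str],
             agent_reply: Callable[[str], str]) -> Tuple[str, str]:
    """Return (route, reply) after applying the three stages in order."""
    if not input_ok(message):            # 1. Input Check, before any retrieval
        return "blocked_input", REFUSAL
    if is_meta(message):                 # 2. Meta Check, bypasses retrieval and memory
        return "meta", meta_reply(message)
    answer = agent_reply(message)        # RAG agent runs only for in-scope questions
    if not output_is_safe(answer):       # 3. Output Check, before returning to the user
        return "blocked_output", REFUSAL
    return "answer", answer
```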

πŸ“ Knowledge Scope

The chatbot answers questions grounded exclusively in content from the DIEM web sources indexed by the crawler (diem.unisa.it, docenti.unisa.it, corsi.unisa.it).

External links and out-of-scope questions (e.g., general knowledge, other universities) are explicitly detected and declined.
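For links specifically, an in-scope test can be as simple as a hostname check against the domains the crawler covers. The sketch below assumes the three domains named in the Key Features section and treats their subdomains as in scope; the function name `in_scope` and that subdomain rule are illustrative choices, and the chatbot's actual scope detection is LLM-based.

```python
from urllib.parse import urlparse

# Domains named in the Key Features section of this README.
ALLOWED_DOMAINS = ("diem.unisa.it", "docenti.unisa.it", "corsi.unisa.it")


def in_scope(url: str) -> bool:
    """True if the URL's host is an allowed domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain)
               for domain in ALLOWED_DOMAINS)


if __name__ == "__main__":
    for candidate in ("https://www.diem.unisa.it/home", "https://www.unisa.it/"):
        print(candidate, in_scope(candidate))
```

Matching on the parsed hostname rather than on a substring avoids accepting look-alike hosts that merely contain an allowed domain.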


License

This project is licensed under the PolyForm Noncommercial License 1.0.0.

Copyright (c) 2026 Antonio Apicella, Ivan Luigi Cipriano, Simone Faraulo, Antonio Graziosi

Permission is granted for personal, educational, and research use. Any commercial advantage or monetary compensation derived from the use, reproduction, or distribution of this software is strictly prohibited without explicit written authorization from the authors.
