An intelligent RAG-based conversational assistant for the Department of Information Engineering, Electrical Engineering and Applied Mathematics (DIEM) at the University of Salerno.
The DIEM Chatbot is a production-grade, Retrieval-Augmented Generation (RAG) system designed to serve students, faculty, and external visitors of the DIEM department. It answers natural-language questions by retrieving grounded information from the department's official web sources β eliminating the need to manually navigate dozens of web pages.
The system is built around an agentic LLM pipeline with four specialized search tools, a smart conversational memory, multilingual support, and multiple safety layers including scope-aware guardrails.
- Agentic RAG with Parallel Tool Calling β The LLM autonomously decides which knowledge collections to query and fires multiple tool calls in a single turn when needed
- Multi-Collection Knowledge Base β Documents are organized into three Chroma vector store collections:
persone(faculty),offerta_formativa(degree programs), anddipartimento(department info) - Incremental Web Crawling β A multi-threaded BFS crawler automatically scrapes and indexes HTML pages and PDFs from the DIEM ecosystem (
diem.unisa.it,docenti.unisa.it,corsi.unisa.it) - Smart Conversational Memory β Semantic similarity filtering and automatic summarization keep conversation context relevant without overloading the context window
- Query Optimization β Coreference resolution rewrites ambiguous follow-up questions, and multi-query expansion improves retrieval recall
- Cross-Encoder Reranking β A dedicated CrossEncoder model reranks retrieved candidates before passing them to the LLM
- Guardrails β Input/output safety checks (injection, toxicity, PII, hallucination, code generation) via a dedicated Groq LLM, with scope-awareness to handle out-of-domain questions
- Meta-Query Handling β Greetings, thanks, and identity questions are handled without knowledge retrieval and without polluting conversation memory
- Automatic Fallback β If a specialized collection returns no results, an internal
search_allcross-collection fallback activates transparently - RAGAS Evaluation β A fully automated evaluation pipeline with a robust Judge LLM (retry + JSON-repair) and export to JSON/CSV/Excel
- Multilingual β Tool queries are always sent in Italian (for retrieval accuracy); responses are generated in the user's language
- Streamlit Web UI β A polished chat interface with trace inspection, session management, and suggested starter questions
.
βββ src/
β βββ app.py # Streamlit web application entry point
β βββ agent/
β β βββ agent.py # RAGAgent facade and RAGAgentFactory
β β βββ agent_main.py # CLI entry point and REPL
β β βββ callbacks.py # Observability and interaction logging
β β βββ guardrails.py # Input/output safety checks
β β βββ guardrails_config/ # NeMo Guardrails prompts and rail definitions
β β βββ llm_providers.py # LLM provider abstraction (Ollama/Groq/HuggingFace)
β β βββ memory.py # SmartConversationMemory with semantic filtering
β β βββ prompts.py # System prompts (agent + meta queries)
β β βββ tools/ # LangChain tools (search_persone, search_offerta_formativa, etc.)
β βββ config/
β β βββ settings.py # Centralized configuration (dataclasses + env vars)
β β βββ logging_config.py # Logging setup
β βββ evaluation/
β β βββ config.py # Evaluation configuration
β β βββ dataset.py # Dataset builder (loads questions, runs agent, builds RAGAS dataset)
β β βββ eval_main.py # CLI entry point for evaluation
β β βββ judge.py # Robust Judge LLM (retry + JSON-repair)
β β βββ runner.py # Evaluation orchestrator (RAGAS metrics + report export)
β βββ ingestion/
β β βββ indexer.py # Multi-collection chunking, embedding, and Chroma indexing
β β βββ registry.py # Incremental indexing registry (SHA-256 based)
β β βββ router.py # Document routing and metadata extraction
β β βββ scheduler_main.py # Ingestion pipeline CLI (scrape / index / verify / full)
β βββ retrieval/
β β βββ engine.py # QueryOptimizer, CrossEncoderReranker, RetrievalEngine
β βββ scraping/
β βββ factories.py # HTML and PDF rule factories
β βββ interfaces.py # Abstract base classes (CleaningRule, PdfFilterRule, UrlClassifier)
β βββ persistence.py # HTML document and PDF ledger persistence
β βββ scrapers.py # Multi-threaded BFS crawler (UnisaCrawler)
β βββ rules/ # Concrete rule implementations (HTML content, PDF, URL classifiers)
βββ data/
βββ raw/ # Crawled HTML files and PDF links
βββ vectorstore/ # ChromaDB persistence and parent docstore
βββ evaluation/ # Evaluation question sets (JSON)
| Component | Technology |
|---|---|
| LLM Backend | Ollama (Nemotron, Qwen), Groq (Llama 3.3 70B), HuggingFace |
| Agent Framework | LangChain (create_agent, tool calling) |
| Vector Store | ChromaDB |
| Embeddings | Qwen3-Embedding-0.6B |
| Reranker | Qwen3-Reranker-0.6B |
| Web Crawling | AsyncHtmlLoader, BeautifulSoup4, Requests |
| Web UI | Streamlit |
| Guardrails | Groq (Llama 3.3 70B) with custom prompt-based rails |
| Evaluation | RAGAS framework |
| Configuration | Environment variables + Python dataclasses |
- Python 3.10+
- Ollama running locally (or valid Groq API keys)
- ChromaDB dependencies
# Clone the repository
git clone <repo-url>
cd <repo-directory>Create a .env file in the project root:
# LLM Provider (ollama | groq | huggingface)
LLM_PROVIDER=ollama
LLM_MODEL=nemotron-3-super:cloud
OLLAMA_BASE_URL=http://localhost:11434
# Groq API Keys (optional, for guardrails and/or chat)
GROQ_CHAT_API_KEY=your_key_here
GROQ_REWRITER_API_KEY=your_key_here
GROQ_GUARDRAILS_API_KEY=your_key_here
# Embedding Model
EMBEDDING_MODEL=Qwen/Qwen3-Embedding-0.6B
RERANKER_MODEL=Qwen/Qwen3-Reranker-0.6B
# Vector Store
CHROMA_PERSIST_DIR=data/vectorstore/chroma# Full pipeline: crawl the DIEM website and index everything
python -m ingestion.scheduler_main --mode full
# Or run steps individually:
python -m ingestion.scheduler_main --mode scrape # crawling only
python -m ingestion.scheduler_main --mode index # indexing only
python -m ingestion.scheduler_main --mode verify # verify collectionscd src
python -m streamlit run app.pycd src
python -m agent.agent_main
# Single query mode
python -m agent.agent_main --single-query "What degree programs does DIEM offer?"
# Disable scope guardrail (for testing)
python -m agent.agent_main --no-scope-guardThe project includes a full automated evaluation pipeline using the RAGAS framework.
Create data/evaluation/questions.json:
{
"dataset_name": "DIEM Evaluation Set",
"samples": [
{
"question": "What degree programs are offered by DIEM?",
"ground_truth": "DIEM offers degree programs in..."
}
]
}cd src
python -m evaluation.eval_main --input data/evaluation/questions.json
# Options:
# --output results/run_01 # custom output directory
# --no-guardrails # disable guardrails during evaluation
# --log-level DEBUG # verbose loggingThe runner produces a timestamped JSON report, CSV, and a styled Excel workbook with per-sample metrics and aggregated scores for: Context Precision, Context Recall, Response Relevancy, Faithfulness, and Factual Correctness.
All settings are managed via environment variables or Python dataclasses in src/config/settings.py. Key parameters:
| Variable | Default | Description |
|---|---|---|
LLM_PROVIDER |
ollama |
LLM backend (ollama, groq, huggingface) |
LLM_MODEL |
nemotron-3-super:cloud |
Model name |
LLM_TEMPERATURE |
0.0 |
Generation temperature |
EMBEDDING_MODEL |
Qwen/Qwen3-Embedding-0.6B3 |
HuggingFace embedding model |
RERANKER_MODEL |
Qwen/Qwen3-Reranker-0.6B |
CrossEncoder reranker |
RERANKER_TOP_N |
5 |
Number of documents after reranking |
MEMORY_MAX_TURNS |
10 |
Max conversation turns to retain |
MEMORY_SIMILARITY_THRESHOLD |
0.55 |
Cosine similarity cutoff for memory filtering |
MAX_TOOL_CALLS |
3 |
Max tool invocations per agent turn |
MAX_DEPTH |
5 |
Crawler BFS depth |
The system implements a three-stage safety pipeline powered by Llama 3.3 70B via Groq:
- Input Check β Blocks prompt injections, toxic language, manipulative instructions, and out-of-scope questions before any retrieval occurs
- Meta Check β Identifies conversational messages (greetings, thanks, identity questions) and routes them to a lightweight direct-LLM handler, keeping conversation memory clean
- Output Check β Scans generated responses for inappropriate content, code blocks, or sensitive data (fiscal codes, IBANs) before returning them to the user
The chatbot answers questions grounded exclusively in content from:
https://www.diem.unisa.itβ Department homepage, labs, calls, research, organizationhttps://docenti.unisa.it/β DIEM faculty profiles, courses, research, office hourshttps://corsi.unisa.it/β Degree program pages, study plans, regulations
External links and out-of-scope questions (e.g., general knowledge, other universities) are explicitly detected and declined.
This project is licensed under the PolyForm Noncommercial License 1.0.0.
Copyright (c) 2026 Antonio Apicella, Ivan Luigi Cipriano, Simone Faraulo, Antonio Graziosi
Permission is granted for personal, educational, and research use. Any commercial advantage or monetary compensation derived from the use, reproduction, or distribution of this software is strictly prohibited without explicit written authorization from the authors.
