AI-driven clinical chatbot that references trusted medical content from MedlinePlus through a custom Retrieval-Augmented Generation pipeline backed by a high-performance vector database to deliver concise, evidence-backed medical answers in real time.
- React + TailwindCSS – Modern, responsive web interface built with Vite for fast development
- FastAPI Backend – API layer orchestrating retrieval and LLM calls
- Custom RAG Pipeline – Parses and indexes MedlinePlus XML into a local FAISS vector database
- FAISS + SentenceTransformers – Semantic search over L2 distance, with cached embeddings for sub-second retrieval
- Evidence-Backed Responses – Every answer cites ranked MedlinePlus sources with relevance scores
- LLM Integration – Works seamlessly with OpenAI, DeepSeek, Groq, or any compatible API
- Local Knowledge Base – Operates independently of external databases after setup
This chatbot leverages MedlinePlus — a health information service of the U.S. National Library of Medicine (NLM), part of the National Institutes of Health (NIH). MedlinePlus provides high-quality, reliable, and up-to-date information on diseases, conditions, medications, and wellness topics. All content is evidence-based, written in plain language, and reviewed by medical experts.
For more details, visit medlineplus.gov.
```
/frontend            # React-based UI (Vite + TailwindCSS)
/server              # FastAPI backend (RAG + LLM orchestration)
/server/mplus_*.xml  # MedlinePlus XML medical dataset
```
1. Navigate to the backend:

   ```bash
   cd server
   ```

2. (Optional) Create and activate a virtual environment:

   ```bash
   python -m venv venv
   source venv/bin/activate  # Windows: venv\Scripts\activate
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Set your API key securely: create a `.env` file in `/server`:

   ```
   OPENAI_API_KEY=your-api-key
   ```

5. (Optional) Add an updated MedlinePlus XML dataset: place the `.xml` file (e.g. `mplus_topics_2025-07-19.xml`) in `/server`.

6. Start the backend:

   ```bash
   uvicorn main:app --reload
   ```
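At startup, the backend reads the key set in the `.env` step from the environment. A minimal, dependency-free sketch of that idea (how the project's actual `main.py` loads the key is an assumption; `python-dotenv`'s `load_dotenv()` is the usual way to pull a `.env` file into `os.environ`):

```python
import os

# python-dotenv's load_dotenv() would normally populate os.environ from the
# .env file; setdefault simulates that step here for illustration.
os.environ.setdefault("OPENAI_API_KEY", "your-api-key")

api_key = os.environ["OPENAI_API_KEY"]
print("API key loaded (first 4 chars):", api_key[:4])
```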
On the first run, the backend will:

- Parse the MedlinePlus XML topics
- Generate embeddings
- Build a FAISS index
- Cache the results as `faiss_index.bin` and `documents.json`
Backend runs at: http://127.0.0.1:8000
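The first-run indexing can be sketched without the heavy dependencies. The element and attribute names below (`health-topic`, `title`, `full-summary`) are assumptions about the MedlinePlus XML schema, and a plain JSON dump stands in for the FAISS index cache:

```python
import json
import xml.etree.ElementTree as ET

# Toy stand-in for the MedlinePlus topics file.
XML = """
<health-topics>
  <health-topic title="Pneumonia" url="https://medlineplus.gov/pneumonia.html">
    <full-summary>Pneumonia is an infection in one or both of the lungs.</full-summary>
  </health-topic>
  <health-topic title="Asthma" url="https://medlineplus.gov/asthma.html">
    <full-summary>Asthma is a chronic disease that affects your airways.</full-summary>
  </health-topic>
</health-topics>
"""

def parse_topics(xml_text):
    """Extract title, URL, and summary text for each topic."""
    root = ET.fromstring(xml_text)
    docs = []
    for topic in root.iter("health-topic"):
        docs.append({
            "title": topic.get("title"),
            "url": topic.get("url"),
            "text": topic.findtext("full-summary", default="").strip(),
        })
    return docs

docs = parse_topics(XML)
# The real pipeline would now embed each doc with SentenceTransformers, add
# the vectors to a faiss.IndexFlatL2, and cache with faiss.write_index();
# here we only show the documents.json sidecar.
with open("documents.json", "w") as f:
    json.dump(docs, f, indent=2)
print(len(docs), "topics cached")
```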
1. Navigate to the frontend:

   ```bash
   cd frontend
   ```

2. Install dependencies:

   ```bash
   npm install
   ```

3. Start the development server:

   ```bash
   npm run dev
   ```
Frontend runs at: http://localhost:5173
- Open the app in your browser.
- Ask a clinical question, e.g.: "What are the symptoms of pneumonia?"
- CuraLinkAI will:
  - Retrieve relevant content from the FAISS index
  - Pass it to the LLM for synthesis
  - Return a clear, concise answer with cited sources
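Conceptually, the retrieve-and-synthesize step looks like the following dependency-free sketch. The three-dimensional vectors and document titles are toy stand-ins for real SentenceTransformers embeddings, and the prompt format is illustrative, not the project's actual template:

```python
import math

# Toy corpus with hand-made 3-d "embeddings"; in the real pipeline these come
# from SentenceTransformers and live in a FAISS IndexFlatL2.
DOCS = [
    {"title": "Pneumonia", "vec": [1.0, 0.0, 0.2]},
    {"title": "Asthma", "vec": [0.0, 1.0, 0.1]},
    {"title": "Influenza", "vec": [0.8, 0.1, 0.3]},
]

def l2(a, b):
    """Euclidean distance, matching FAISS IndexFlatL2 semantics."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query_vec, k=2):
    """Return the k nearest documents as (title, distance) pairs."""
    ranked = sorted(DOCS, key=lambda d: l2(query_vec, d["vec"]))
    return [(d["title"], round(l2(query_vec, d["vec"]), 3)) for d in ranked[:k]]

def build_prompt(question, hits):
    """Assemble an LLM prompt that cites the retrieved sources."""
    sources = "\n".join(f"[{i + 1}] {title} (distance {dist})"
                        for i, (title, dist) in enumerate(hits))
    return f"Answer using only these sources:\n{sources}\n\nQuestion: {question}"

hits = retrieve([0.95, 0.0, 0.2])  # pretend this vector embeds the question
print(build_prompt("What are the symptoms of pneumonia?", hits))
```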
- Customized RAG implementation from scratch
- Add more sources
- Offline functionality
- Advanced filtering by specialty or evidence level
- Dockerized setup for streamlined deployment and reproducibility
Retrieval-Augmented Generation (RAG) combines the strengths of information retrieval and large language models to deliver accurate, trustworthy, and explainable answers.
Instead of relying solely on the LLM's internal knowledge — which can be outdated or hallucinated — RAG fetches up-to-date, domain-specific information from a trusted source (in this case, MedlinePlus) before generating a response.
This approach ensures:
- Accuracy – Answers are grounded in verified medical resources
- Explainability – Responses include citations to original sources
- Adaptability – Knowledge base can be updated without retraining the model
- Reduced Hallucination – Minimizes fabricated or misleading medical advice
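The adaptability point can be made concrete: adding a new topic means embedding it and inserting it into the index, with no model retraining. A toy sketch (the letter-frequency `toy_embed` is a hypothetical stand-in for SentenceTransformers, and the list is a stand-in for the FAISS index):

```python
def toy_embed(text):
    """Stand-in embedding: a 26-dim letter-frequency vector."""
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - 97] += 1.0
    return vec

index = []  # list of (title, vector) pairs; FAISS would hold the vectors

def add_topic(title, text):
    # Updating the knowledge base is just an index insert -- the LLM's
    # weights are never touched.
    index.append((title, toy_embed(text)))

add_topic("Pneumonia", "Infection of the lungs")
add_topic("Measles", "A very contagious viral illness")
print([title for title, _ in index])
```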
This project is for experimental and research purposes only. It is not a substitute for professional medical advice.
