
A web app that lets users upload PDFs and interact with them through a Q&A interface. The application extracts text from PDFs, generates embeddings, stores them in a FAISS database, and retrieves relevant information to provide context-aware answers using a large language model.


HemalDholakiya12/PDFChat


PDFChat

A web-based application that allows users to upload PDF files and interact with them via a question-and-answer interface. This application parses the PDF, generates embeddings for the text, stores them in a vector database (FAISS), and retrieves relevant information using semantic search to provide contextual answers with an AI language model.

Features

  • Upload PDFs and extract text.
  • Text chunking and embedding generation.
  • Vector storage with FAISS for efficient similarity search.
  • Answer generation using the Llama3 model hosted via Groq.
  • Intuitive UI for chatting with your PDFs.
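The chunking step can be illustrated with a minimal sketch. This is not the app's code: the project uses RecursiveCharacterTextSplitter, while the hypothetical `split_into_chunks` below is a simplified stand-in that cuts fixed-size character windows with an overlap. The `chunk_size` and `overlap` defaults are assumptions, not values taken from the repository.

```python
def split_into_chunks(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows of at most chunk_size that overlap by `overlap`.

    Simplified stand-in for a real text splitter; it ignores paragraph and
    sentence boundaries, which a recursive splitter would try to respect.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in split_into_chunks("abcdefghij", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, which tends to help retrieval.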

Tech Stack

  • Frontend: Next.js
  • Backend: FastAPI
  • Text Processing: PyMuPDFLoader, RecursiveCharacterTextSplitter
  • Embeddings: HuggingFace MiniLM Model
  • Vector Search: FAISS
  • AI Model: Llama3 (via Groq)
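To show what the vector search component does conceptually, here is a rough standard-library sketch of brute-force similarity search over stored embeddings. It does not use FAISS or the MiniLM model: the vectors are toy values, the function names `cosine_similarity` and `top_k` are hypothetical, and the choice of cosine similarity is an assumption, since the README does not state which metric the FAISS index uses.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], stored: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k stored chunks most similar to the query vector.

    `stored` holds (chunk_text, embedding) pairs. A vector store such as FAISS
    does the same job with optimized index structures instead of a full sort.
    """
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings" (assumed values for illustration only).
    store = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    print(top_k([1.0, 0.1], store, k=2))
```

In the real pipeline the embeddings come from the HuggingFace MiniLM model and the ranking is delegated to FAISS.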

How It Works

The web application follows a structured process to handle user-uploaded PDFs and respond to queries. Here’s a high-level flow of the PDF processing and question-answering pipeline:

flowchart TD
    A[User Uploads PDF] --> B[Read PDF as Bytes using UploadFile]
    B --> C[Write Bytes to Temporary File using tempfile]
    C --> D[Load PDF using PyMuPDFLoader]
    D --> E[Split Text into Chunks using RecursiveCharacterTextSplitter]
    E --> F[Generate Embeddings using HuggingFace MiniLM]
    F --> G[Store Embeddings in FAISS Vector Store]
    G --> H[Create Retriever from FAISS]
    H --> I[Initialize LLM - Groq LLaMA3-8B]
    I --> J[Create QA Chain using RetrievalQA]

    K[User Asks a Question] --> L[Use Retriever to find relevant chunks]
    L --> M[Send Question and Chunks to LLaMA3]
    M --> N[Generate Answer using LLM]
    N --> O[Return Answer as JSON Response]
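
The two flows in the diagram can be sketched in plain Python. This is a sketch under stated assumptions, not the repository's implementation: `load_via_temp_file` and `answer_question` are hypothetical names, the loader, retriever, and LLM are passed in as plain callables instead of PyMuPDFLoader, the FAISS retriever, and the Groq client, and the prompt wording and the `"answer"` JSON key are assumptions. Deleting the temporary file after loading is also an assumption about cleanup.

```python
import json
import os
import tempfile
from typing import Any, Callable


def load_via_temp_file(pdf_bytes: bytes, load: Callable[[str], Any]) -> Any:
    """Write uploaded bytes to a temporary .pdf file, pass its path to `load`, then delete it.

    Path-based PDF loaders need a file on disk, which is why the upload
    is written out before loading.
    """
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as tmp:
        tmp.write(pdf_bytes)
        path = tmp.name
    try:
        return load(path)
    finally:
        os.remove(path)  # assumed cleanup step, not confirmed by the README


def answer_question(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
) -> str:
    """Retrieve relevant chunks, build a prompt, call the model, and return a JSON string."""
    chunks = retrieve(question)
    context = "\n\n".join(chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return json.dumps({"answer": generate(prompt)})


if __name__ == "__main__":
    # Stand-ins for the retriever and the LLM (hypothetical, for illustration).
    fake_retrieve = lambda q: ["chunk one", "chunk two"]
    fake_generate = lambda p: f"seen {p.count('chunk')} chunks"
    print(answer_question("What?", fake_retrieve, fake_generate))
```

Passing the retriever and the model as callables keeps the sketch independent of any library; in the app, the RetrievalQA chain wires these pieces together.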