A comprehensive study repository for exploring LangChain, RAG (Retrieval-Augmented Generation), embeddings, and semantic vector search techniques with practical implementations.
This repository contains hands-on experiments and production-ready implementations of modern AI techniques, focusing on:
- LangChain Framework: Building sophisticated LLM applications
- RAG (Retrieval-Augmented Generation): Enhancing LLM responses with relevant context
- Vector Embeddings: Converting text into semantic representations
- Semantic Search: Finding relevant documents using meaning, not just keywords
- Document Processing: PDF parsing, chunking, and vectorization strategies
- PDF Processing: Extract and process text from PDF documents
- Vector Embeddings: Convert text chunks into semantic vectors using OpenAI
- Similarity Search: Find relevant content using semantic similarity (see the sketch after this list)
- Interactive Chat: Build a conversational interface with context-aware responses
- Structured Data: Work with JSON-based FAQ datasets
- Multi-Category Search: Handle different types of questions (product, service, technical)
- Production-Ready Chatbot: Implement a robust FAQ answering system
- Context Retrieval: Smart document retrieval for accurate responses
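To make the core idea concrete, here is a minimal sketch of embeddings plus similarity search with LangChain and OpenAI. The FAQ snippets and in-memory vector store are purely illustrative (this repository persists its vectors in PostgreSQL), and import paths may vary with your LangChain version:

```typescript
import { OpenAIEmbeddings } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Turn a few text chunks into vectors and search them by meaning.
async function main() {
  const embeddings = new OpenAIEmbeddings(); // reads OPENAI_API_KEY from env

  const store = await MemoryVectorStore.fromTexts(
    [
      "Our product ships within 3 business days.",
      "Refunds are processed in 5 to 7 days.",
      "The API rate limit is 100 requests per minute.",
    ],
    [{ category: "product" }, { category: "service" }, { category: "technical" }],
    embeddings
  );

  // Semantic similarity search: matches by meaning, not by keywords.
  const results = await store.similaritySearch("How long does delivery take?", 1);
  console.log(results[0].pageContent);
}

main().catch(console.error);
```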
- TypeScript - Type-safe JavaScript development
- LangChain - Framework for building LLM applications
- OpenAI GPT-4 - Advanced language model for text generation
- OpenAI Embeddings - Text-to-vector conversion
- PostgreSQL - Vector database for storing embeddings
- Drizzle ORM - Type-safe database operations
- PDF-Parse - PDF document processing
- Node.js 18+
- PostgreSQL database
- OpenAI API key
- Clone the repository
  - `git clone https://github.com/Natanaelvich/langchain-rag-embeddings-study.git`
  - `cd langchain-rag-embeddings-study`
- Install dependencies
  - `npm install`
- Set up environment variables
  - `cp .env.example .env`
  - Edit `.env` with your OpenAI API key and database credentials
- Run the examples (a sketch of the underlying pipeline follows these steps)
  - PDF Processing & Chat: `npm run dev src/01-introduction/gpt-embeddings-pdf.ts`
  - FAQ Chatbot: `npm run dev src/02-real-world-faq/chat-faq.ts`
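Both examples follow the same load → chunk → embed pipeline. The sketch below shows roughly what the PDF path looks like; the file name, chunk sizes, and import paths are illustrative assumptions, not the exact code in `src/01-introduction/`:

```typescript
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { OpenAIEmbeddings } from "@langchain/openai";

// Load a PDF, split it into overlapping chunks, and embed each chunk.
async function loadAndEmbedPdf(path: string) {
  const docs = await new PDFLoader(path).load();

  const splitter = new RecursiveCharacterTextSplitter({
    chunkSize: 1000,   // characters per chunk (illustrative values)
    chunkOverlap: 200, // overlap preserves context across chunk boundaries
  });
  const chunks = await splitter.splitDocuments(docs);

  const embeddings = new OpenAIEmbeddings();
  const vectors = await embeddings.embedDocuments(
    chunks.map((chunk) => chunk.pageContent)
  );

  return { chunks, vectors }; // ready to be written to the vector store
}

// "example.pdf" is a placeholder; drop your own PDF into tmp/pdf/.
loadAndEmbedPdf("tmp/pdf/example.pdf").then(({ chunks }) =>
  console.log(`Embedded ${chunks.length} chunks`)
);
```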
src/
├── 01-introduction/             # Basic embeddings and PDF processing
│   ├── gpt-embeddings-pdf.ts    # Interactive chat with PDF content
│   ├── load-embeddings-pdf.ts   # PDF loading and vectorization
│   └── search-embeddings-pdf.ts # Vector search implementation
├── 02-real-world-faq/           # Production FAQ system
│   ├── chat-faq.ts              # Interactive FAQ chatbot
│   ├── load-faq-data.ts         # FAQ data loading and processing
│   └── search-faq.ts            # FAQ-specific search logic
└── schema.ts                    # Database schema definitions
tmp/
├── agents-data/                 # Sample data for agents
├── faq-data/                    # FAQ datasets (product, service, technical)
└── pdf/                         # PDF documents for processing
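The `schema.ts` file defines how chunks and their vectors are persisted in PostgreSQL. As a rough sketch of what such a table can look like with Drizzle ORM and pgvector (assuming a recent `drizzle-orm` with pgvector support; the actual table and column names in this repo may differ):

```typescript
import { pgTable, serial, text, vector } from "drizzle-orm/pg-core";

// Hypothetical table: one row per text chunk, with its 1536-dimension
// OpenAI embedding stored in a pgvector column.
export const embeddings = pgTable("embeddings", {
  id: serial("id").primaryKey(),
  content: text("content").notNull(),
  embedding: vector("embedding", { dimensions: 1536 }).notNull(),
});
```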
- `npm run dev` - Start development server with hot reload
- `npm run build` - Build TypeScript to JavaScript
- `npm run start` - Run built application
- `npm run test` - Run test suite
- `npm run lint` - Check code quality
- `npm run format` - Format code with Prettier
- `npm run studio` - Open Drizzle Studio for database management
- Start with `01-introduction/` to understand basic concepts
- Learn about embeddings and vector search
- Build your first RAG application (a minimal end-to-end sketch follows this list)
- Explore `02-real-world-faq/` for production patterns
- Understand structured data processing
- Implement multi-category search
- Customize the implementations for your use case
- Add new data sources and processing pipelines
- Optimize performance and accuracy
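For that first RAG application, the end-to-end flow is retrieve → augment → generate. A minimal sketch, assuming a vector store already populated as in the earlier examples (the model name and prompt wording are illustrative):

```typescript
import { ChatOpenAI } from "@langchain/openai";
import { MemoryVectorStore } from "langchain/vectorstores/memory";

// Retrieval-Augmented Generation in three steps:
// 1. retrieve the chunks most similar to the question,
// 2. stuff them into the prompt as context,
// 3. ask the model to answer using only that context.
async function answer(question: string, store: MemoryVectorStore) {
  const docs = await store.similaritySearch(question, 3);
  const context = docs.map((doc) => doc.pageContent).join("\n---\n");

  const llm = new ChatOpenAI({ model: "gpt-4o-mini" }); // illustrative model name
  const response = await llm.invoke(
    `Answer using only the context below.\n\n` +
      `Context:\n${context}\n\n` +
      `Question: ${question}`
  );
  return response.content;
}
```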
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.
Star this repository if you found it helpful for your AI/ML journey!