
title: Recipe Recommendation Chatbot
emoji: 🥘
colorFrom: indigo
colorTo: pink
sdk: docker
pinned: false
license: mit

🥘 Recipe Recommendation Chatbot

An AI-powered recipe recommendation system with an intelligent chat interface, built using RAG (Retrieval-Augmented Generation) and a modern streaming architecture. Features intelligent caching for a 70-90% reduction in API costs and memory-optimized streaming for production-grade performance.

✨ Key Features

🤖 Advanced AI Capabilities

  • Multi-Provider LLM Support: OpenAI (GPT-5), Google (Gemini-2.5), Anthropic (Claude), Ollama (Local models)
  • RAG Pipeline: Context-aware recipe recommendations with vector search (a minimal sketch follows this list)
  • Intent Classification: Smart routing for greetings, recipe queries, and cooking advice
  • Streaming Responses: Real-time chat with memory-optimized architecture
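
A minimal sketch of how a RAG turn can be assembled: retrieve context, build a prompt, stream the completion. The helper names here are illustrative rather than the repo's actual API, and the OpenAI client is shown only because it is one of the supported providers.

import os

from openai import OpenAI  # any supported provider could sit behind this step


def retrieve_context(query: str) -> list[str]:
    """Stand-in for the vector-search step (see the ChromaDB sketch further below)."""
    return ["Jollof rice: long-grain rice simmered in a spiced tomato and pepper base."]


def stream_answer(query: str):
    context = "\n".join(retrieve_context(query))
    client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
    stream = client.chat.completions.create(
        model="gpt-5-nano",  # model name taken from the configuration section below
        messages=[
            {"role": "system", "content": f"Recommend recipes using this context:\n{context}"},
            {"role": "user", "content": query},
        ],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content
        if delta:
            yield delta  # forward tokens as they arrive instead of buffering the full reply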

🚀 Performance Optimizations

  • Intelligent Caching: 70-90% API cost reduction with multi-layer caching
  • Memory Safety: O(1) memory usage during streaming (no memory leaks)
  • Real-time Monitoring: Cache performance and system health monitoring
  • Production Ready: Comprehensive optimization and documentation

πŸ—„οΈ Data & Storage

  • Vector Databases: ChromaDB (local) and MongoDB Atlas (cloud) support (see the ChromaDB sketch after this list)
  • Recipe Mining: Automated scraping from multiple Nigerian recipe sources
  • Smart Embedding: Optimized embedding models for recipe similarity
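
A minimal sketch of the local vector-store side using ChromaDB's Python client with its default embedding function; the repo's actual collection name, embedding model, and metadata schema may differ.

import chromadb

# In-memory client for experimentation; chromadb.PersistentClient(path="...") persists to disk.
client = chromadb.Client()
recipes = client.get_or_create_collection("recipes")  # hypothetical collection name

# Index a few recipe snippets; ChromaDB embeds them with its default model.
recipes.add(
    ids=["r1", "r2"],
    documents=[
        "Jollof rice: rice simmered in a spiced tomato and pepper base.",
        "Egusi soup: melon-seed soup with leafy greens, often served with pounded yam.",
    ],
)

# Similarity search: the query text is embedded and compared against the stored recipes.
results = recipes.query(query_texts=["spicy tomato rice dish"], n_results=2)
print(results["documents"][0])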

🚀 Quick Start

Prerequisites

  • Python 3.9+ (backend)
  • Node.js 18+ (frontend)
  • API keys for your chosen LLM provider

1. Backend Setup

# Navigate to backend
cd backend

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env
# Edit .env with your API keys and preferences

# Run the backend server
uvicorn app:app --reload --host 127.0.0.1 --port 8080
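
Once the server is running, FastAPI's interactive API docs should be reachable at /docs by default (assuming the app does not disable them):

# Verify the backend is up
curl http://127.0.0.1:8080/docs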

2. Frontend Setup

# Navigate to frontend
cd frontend

# Install dependencies
npm install
# or
yarn install

# Configure environment
cp .env.example .env
# Configure API endpoint
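# A hypothetical example of the endpoint setting (the actual variable name is
# defined in .env.example and may differ):
# NEXT_PUBLIC_API_URL=http://127.0.0.1:8080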

# Run the development server
npm run dev
# or  
yarn dev

3. Access the Application

Once both servers are running, open the frontend at http://localhost:3000 (the Next.js default dev port); it calls the backend API at http://127.0.0.1:8080.

πŸ“ Project Architecture

chatbot/
├── 🎯 backend/              # FastAPI backend with RAG pipeline
│   ├── handlers/            # Request handlers (chat, health, jobs)
│   ├── services/            # Core services (LLM, caching, routing)
│   ├── config/              # Configuration and settings
│   ├── data_mining/         # Recipe scrapers and data collection
│   ├── utils/               # Utilities and helpers
│   ├── docs/                # Comprehensive backend documentation
│   └── tests/               # Backend test suite
│
├── 🖼️ frontend/             # Next.js frontend application
│   ├── app/                 # Next.js app directory
│   ├── components/          # React components
│   ├── services/            # API integration services
│   ├── hooks/               # Custom React hooks
│   └── types/               # TypeScript type definitions
│
├── 📚 docs/                 # Project-wide documentation
│   ├── architecture.md      # System architecture overview
│   ├── api-documentation.md # API reference
│   └── deployment.md        # Deployment guides
│
└── 🚀 deploy-to-hf.sh       # HuggingFace Spaces deployment script

πŸ—οΈ System Architecture

Backend Architecture (Handler-Based Design)

  • Smart Response Router: Intent-based routing with cache-first lookup (see the sketch after this list)
  • Memory-Optimized Streaming: Real-time responses without memory leaks
  • Intelligent Caching: Multi-layer caching (LLM, embeddings, search results)
  • Multi-Provider Support: Unified interface for OpenAI, Google, Anthropic, Ollama
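
A minimal sketch of the cache-first, intent-based routing idea; the intent labels come from the feature list above, but the function names and cache structure are illustrative rather than the repo's actual implementation.

import hashlib


def classify_intent(message: str) -> str:
    """Toy intent classifier; the real service likely uses an LLM or a trained model."""
    text = message.lower()
    if any(greeting in text for greeting in ("hi", "hello", "hey")):
        return "greeting"
    if "recipe" in text or "cook" in text:
        return "recipe_query"
    return "cooking_advice"


def generate_with_rag(message: str, intent: str) -> str:
    """Stand-in for the RAG pipeline (vector search + LLM call)."""
    return f"[{intent}] answer for: {message}"


CACHE: dict[str, str] = {}  # in practice a TTL-aware, multi-layer cache


def route(message: str) -> str:
    key = hashlib.sha256(message.strip().lower().encode()).hexdigest()
    if key in CACHE:  # cache-first lookup: skip the LLM entirely on a hit
        return CACHE[key]
    intent = classify_intent(message)
    answer = ("Hello! Ask me about Nigerian recipes."
              if intent == "greeting"
              else generate_with_rag(message, intent))
    CACHE[key] = answer
    return answer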

Frontend Architecture (Modern React/Next.js)

  • Next.js 15: Latest React framework with TypeScript support
  • Real-time Streaming: AI SDK integration for streaming responses
  • State Management: Zustand for efficient state handling
  • Modern UI: Responsive design with Tailwind CSS

🔧 Configuration

LLM Provider Options

OpenAI (Best Value)

LLM_PROVIDER=openai
OPENAI_API_KEY=your_api_key
OPENAI_MODEL=gpt-5-nano  # Best value: $1/month for 30K queries

Google Gemini (Best Free Tier)

LLM_PROVIDER=google  
GOOGLE_API_KEY=your_api_key
GOOGLE_MODEL=gemini-2.5-flash  # Excellent free tier, then $2/month

Ollama (Privacy/Self-Hosting)

LLM_PROVIDER=ollama
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL=llama3.1:8b  # 4.7GB download, 8GB RAM

📚 For detailed configuration guides, see Backend Documentation

📚 Documentation

📖 Getting Started

🚀 Performance & Optimization

🔧 Configuration & Setup

🛠️ Development & Troubleshooting

🧪 Testing & Validation

Performance Validation

  • Cache Hit Rate: 87.5% in testing scenarios
  • Memory Stability: Tested with 1000+ consecutive requests
  • API Cost Reduction: ~79% with intelligent caching
  • Response Times: <50ms average for cached responses

🚀 Deployment

HuggingFace Spaces (Recommended)

./deploy-to-hf.sh your-hf-space-name

Local Production

# Backend
cd backend
uvicorn app:app --host 0.0.0.0 --port 8080

# Frontend  
cd frontend
npm run build && npm start

Docker Deployment

The backend includes Docker configuration for containerized deployment.

🔒 Security & Production Features

  • Input Sanitization: XSS protection and length validation (see the sketch after this list)
  • Environment Variable Protection: Secure API key management
  • CORS Configuration: Frontend integration protection
  • Health Monitoring: Real-time system status and alerting
  • Error Handling: Graceful degradation and structured logging
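
A minimal sketch of the length-validation and escaping step; the limit and function name here are placeholders, since the actual rules live in the backend configuration.

import html

MAX_MESSAGE_LENGTH = 2000  # illustrative limit


def sanitize_message(raw: str) -> str:
    """Reject empty or oversized input and escape HTML-significant characters."""
    text = raw.strip()
    if not text:
        raise ValueError("Message is empty.")
    if len(text) > MAX_MESSAGE_LENGTH:
        raise ValueError(f"Message exceeds {MAX_MESSAGE_LENGTH} characters.")
    return html.escape(text)  # basic XSS hardening before the text reaches templates or logs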

📊 Performance Features

Cost Optimization

  • Intelligent Caching: 70-90% reduction in LLM API costs
  • Cache Warming: Automatic caching of popular queries
  • Usage Monitoring: Real-time token usage and cost tracking

Memory Management

  • Stream-First Architecture: No response accumulation (see the sketch after this list)
  • Constant Memory Usage: O(1) memory during streaming
  • Automatic Cleanup: TTL-based cache eviction
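
A minimal sketch of the stream-first idea with FastAPI: chunks are forwarded to the client as they arrive and never collected into one growing string, so per-request memory stays constant. The endpoint path and payload shape are illustrative, not the repo's actual API.

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def token_stream(message: str):
    """Stand-in for the LLM stream; yields chunks instead of accumulating them."""
    for chunk in ("Try ", "jollof ", "rice."):
        yield chunk


@app.post("/chat")
async def chat(message: str):
    # Each chunk is sent to the client and then discarded (no response accumulation).
    return StreamingResponse(token_stream(message), media_type="text/plain")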

Real-time Performance

  • Streaming Responses: Real-time chat experience
  • Cache-First Lookup: <100ms for cached responses
  • Health Monitoring: Real-time system metrics

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for:

  • Development setup and workflow
  • Code standards and best practices
  • Testing requirements and procedures
  • Pull request process

👥 Team

GenAI PLG 4 - Andela Community Program

📄 License

This project is licensed under the MIT License.
