AI-Powered Invoice Data Extraction and Management System
An intelligent invoice management system that automates the extraction, processing, and management of invoice data from various file formats using AI. Built for Swipe's internship assignment.
- AI-Powered Extraction: Uses Google Document AI + Gemini LLM
- Multi-Format Support: PDF, Images (JPG, PNG), Excel, CSV
- Three-Tab Dashboard: Invoices, Products, Customers
- Real-Time Sync: Redux-powered cross-tab updates
- Smart Validation: Missing field detection and consistency checks
- Batch Processing: Upload multiple files simultaneously
- Tax-aware calculations with automatic totals
- Bank detail extraction
- Amount in words conversion
- Customer purchase history tracking
- Product quantity aggregation
- Inline editing with cascade updates
- Status indicators (OK, Incomplete, Mismatch)
┌──────────────┐         ┌──────────────┐
│    React     │────────▶│   FastAPI    │
│   Frontend   │         │   Backend    │
│ (TypeScript) │◀────────│   (Python)   │
└──────────────┘         └──────────────┘
       │                        │
       │                        ├─────▶ Google Document AI
       │                        │
       ├─ Redux Store           ├─────▶ Google Gemini LLM
       │                        │
       ├─ Mantine UI            └─────▶ Pandas (Excel)
       │
       └─ React Dropzone
- Framework: React 18 with TypeScript
- State Management: Redux Toolkit
- UI Library: Mantine UI v7
- Tables: Mantine React Table
- File Upload: React Dropzone
- Framework: FastAPI (Python 3.12)
- OCR: Google Document AI
- LLM: Google Gemini 2.5 Flash
- Data Processing: Pandas
- Server: Uvicorn
- Node.js 18+
- Python 3.12+
- Google Cloud account with Document AI enabled
- Gemini API key
# Clone repository
git clone https://github.com/YOUR_USERNAME/swipe-invoice-system.git
cd swipe-invoice-system/frontend
# Install dependencies
npm install
# Start development server
npm start

cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment variables
cp .env.example .env
# Edit .env and add your API keys
# Add Google Cloud credentials
# Place service-account-key.json in backend directory
# Start server
python main.py

The backend will run on http://localhost:8000
The frontend will run on http://localhost:3000
GEMINI_API_KEY=your_gemini_api_key_here
PROJECT_ID=your_google_cloud_project_id
LOCATION=us
PROCESSOR_ID=your_document_ai_processor_id

- Create a Google Cloud project
- Enable Document AI API
- Create a Document AI processor (Invoice Parser)
- Create a service account and download JSON key
- Save as service-account-key.json in the backend directory
- Get API key from Google AI Studio
- Add to the .env file
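
As a minimal sketch of how the backend might read this configuration, the snippet below checks the four variables from the example above and fails early if one is missing. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project. The sketch assumes the values are already present in the process environment (for instance via a dotenv loader); it does not read the .env file itself.

```python
import os
from dataclasses import dataclass

# Variable names come from the .env example; everything else is illustrative.
REQUIRED_VARS = ("GEMINI_API_KEY", "PROJECT_ID", "LOCATION", "PROCESSOR_ID")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the backend configuration."""
    gemini_api_key: str
    project_id: str
    location: str
    processor_id: str


def load_settings(env=None) -> Settings:
    """Read the required variables, raising a clear error if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        gemini_api_key=env["GEMINI_API_KEY"],
        project_id=env["PROJECT_ID"],
        location=env["LOCATION"],
        processor_id=env["PROCESSOR_ID"],
    )
```

Calling `load_settings()` once at startup would surface a missing key before the first upload rather than in the middle of an extraction request.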
- Drag and drop files onto the upload area
- Or click to browse and select files
- Supports: PDF, JPG, PNG, XLSX, XLS, CSV
- Invoices Tab: View all invoices with details
- Products Tab: See aggregated product data
- Customers Tab: Track customer purchase history
- Click the edit icon (pencil) in any table
- Make changes inline
- Changes automatically sync across all tabs
- Red text indicates missing required fields
- Status badges show data completeness
- Expand invoice rows for detailed breakdown
swipe-invoice-system/
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   │   ├── Dashboard.tsx
│   │   │   └── UploadArea.tsx
│   │   ├── features/
│   │   │   └── data/
│   │   │       └── dataSlice.ts
│   │   ├── store/
│   │   │   └── store.ts
│   │   └── types/
│   │       └── types.ts
│   └── package.json
│
├── backend/
│   ├── main.py              # FastAPI application
│   ├── docai.py             # Document AI integration
│   ├── llm.py               # Gemini LLM extraction
│   ├── validator.py         # Data validation
│   └── requirements.txt
│
└── README.md
- Upload → User drops files in upload area
- Route → Backend routes files by type (PDF/Image → DocAI, Excel → Pandas)
- Extract → AI extracts structured data using Gemini prompts
- Validate → Validator checks completeness and consistency
- Store → Redux dispatches action to update state
- Sync → All tabs automatically update with new data
- Edit → User can edit data with cascade updates across tabs
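
The Route and Validate steps can be sketched in Python roughly as follows. This illustrates the flow described above and is not the project's actual code: `route_file`, `invoice_status`, and `REQUIRED_FIELDS` are hypothetical names, and sending CSV files to Pandas and accepting `.jpeg` are assumptions. The status strings match the dashboard indicators (OK, Incomplete, Mismatch), and the 1.0 tolerance mirrors the consistency rule used elsewhere in this README.

```python
from pathlib import Path

# Extensions follow the supported formats; the CSV -> Pandas mapping
# and the .jpeg alias are assumptions.
DOCAI_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png"}
PANDAS_EXTENSIONS = {".xlsx", ".xls", ".csv"}

# Hypothetical minimal set of required fields; the real validator may check more.
REQUIRED_FIELDS = ("customerName", "totalAmount")


def route_file(filename: str) -> str:
    """Pick an extraction path from the file extension (case-insensitive)."""
    ext = Path(filename).suffix.lower()
    if ext in DOCAI_EXTENSIONS:
        return "docai"
    if ext in PANDAS_EXTENSIONS:
        return "pandas"
    raise ValueError(f"Unsupported file type: {ext or filename}")


def invoice_status(invoice: dict, tolerance: float = 1.0) -> str:
    """Return 'Incomplete', 'Mismatch', or 'OK' for one extracted invoice."""
    # Completeness: every required field must be present and non-empty.
    if any(invoice.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return "Incomplete"
    # Consistency: line items plus tax should match the stated total.
    items_total = sum(item["amount"] for item in invoice.get("items", []))
    calculated = items_total + invoice.get("taxTotal", 0.0)
    if abs(invoice["totalAmount"] - calculated) >= tolerance:
        return "Mismatch"
    return "OK"
```

Checking completeness before consistency means an invoice with no stated total is reported as Incomplete rather than as a Mismatch.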
// Products are matched by name across invoices
// Quantities are summed, prices updated if newer data available
if (existing) {
existing.quantity += item.quantity;
if (price > 0) existing.unitPrice = price;
}

// Customer totals calculated by summing all their invoices
cust.totalPurchaseAmount = invoices
.filter(inv => inv.customerName === cust.name)
.reduce((sum, inv) => sum + inv.totalAmount, 0);

// Invoice is consistent if calculated total matches stated total
const calculated = items.reduce((sum, i) => sum + i.amount, 0) + taxTotal;
const variance = Math.abs(statedTotal - calculated);
isConsistent = variance < 1.0;

- OCR accuracy depends on image quality
- Excel format must have standard column headers
- Phone number formatting assumes Indian format
- Service charges (shipping, making) are filtered from products
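
To illustrate how the service-charge filter interacts with product aggregation, here is a Python sketch of the same matching rule shown in the snippets above: match by name, sum quantities, and take the newer price when it is positive. The keyword list, the `name` and `unitPrice` item keys, and the function names are assumptions for illustration. The project's real filter may work differently.

```python
# Hypothetical keyword list; the README only names shipping and making charges.
SERVICE_KEYWORDS = ("shipping", "making")


def is_service_charge(name: str) -> bool:
    """Treat a line item as a service charge if its name contains a keyword."""
    lowered = name.lower()
    return any(keyword in lowered for keyword in SERVICE_KEYWORDS)


def aggregate_products(invoices: list[dict]) -> dict[str, dict]:
    """Merge line items across invoices into one entry per product name."""
    products: dict[str, dict] = {}
    for invoice in invoices:
        for item in invoice.get("items", []):
            name = item["name"].strip()
            if is_service_charge(name):
                continue  # service charges never appear in the Products tab
            price = item.get("unitPrice", 0.0)
            existing = products.get(name)
            if existing:
                existing["quantity"] += item["quantity"]
                if price > 0:
                    existing["unitPrice"] = price  # later invoices win
            else:
                products[name] = {
                    "name": name,
                    "quantity": item["quantity"],
                    "unitPrice": price,
                }
    return products
```

Substring matching is deliberately crude: a genuine product whose name contains "making" would also be dropped, which is one reason this filter is listed as a limitation.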
cd frontend
npm run build
vercel --prod

cd backend
# Add Procfile: web: python main.py
# Deploy to Railway or Render

Medidala Aditya
- Institution: VIT Vellore (B.Tech IT, 2022-26)
- Email: djaditya200@gmail.com