pedro-lucinda/file_reader

File Upload API - Full Stack Application

A full-stack file upload and search application with a FastAPI backend and a Next.js frontend. Upload PDF, DOCX, TXT, or CSV files, search their content with SQLite FTS5, and view the extracted text in a responsive interface.

🏗️ Project Structure

file_upload_api/
├── server/                 # Backend FastAPI application
│   ├── app/               # Application code
│   │   ├── api/           # API routes and exception handlers
│   │   │   ├── routes.py  # Main API endpoints
│   │   │   └── exception_handlers.py
│   │   ├── services/      # Business logic services
│   │   │   ├── file_service.py
│   │   │   ├── file_processor.py
│   │   │   └── search_service.py
│   │   ├── models.py      # Pydantic models
│   │   ├── database.py    # Database operations
│   │   ├── config.py      # Configuration management
│   │   ├── core.py        # FastAPI app setup
│   │   └── utils.py       # Utility functions
│   ├── tests/             # Test suite
│   ├── data/              # Database and file storage
│   │   ├── app.db         # SQLite database
│   │   └── blobs/         # Uploaded files storage
│   ├── main.py            # FastAPI application entry point
│   ├── requirements.txt   # Python dependencies
│   ├── pytest.ini         # Test configuration
│   └── FLOW_DOCUMENTATION.md
├── client/                # Frontend Next.js application
│   ├── src/               # Source code
│   │   ├── app/           # Next.js app router
│   │   │   ├── api/       # API routes (proxy to backend)
│   │   │   ├── globals.css
│   │   │   ├── layout.tsx
│   │   │   └── page.tsx
│   │   ├── components/    # React components
│   │   │   ├── elements/  # Reusable UI elements
│   │   │   ├── modules/   # Feature-specific components
│   │   │   └── ui/        # Base UI components
│   │   ├── hooks/         # Custom React hooks
│   │   ├── services/      # API service layer
│   │   ├── store/         # State management (Zustand)
│   │   ├── types/         # TypeScript type definitions
│   │   └── utils/         # Utility functions
│   ├── public/            # Static assets
│   ├── package.json       # Node.js dependencies
│   ├── next.config.ts     # Next.js configuration
│   ├── tailwind.config.js # Tailwind CSS configuration
│   └── tsconfig.json      # TypeScript configuration
└── README.md              # This file

Flow

[Figure: application flow diagram (Mermaid)]

🚀 Quick Start

Prerequisites

  • Python 3.9+
  • Node.js 18+
  • npm or yarn
  • SQLite 3 (usually included with Python)

1. Start the Backend Server

# Navigate to server directory
cd server

# Install dependencies
pip install -r requirements.txt

# Start the FastAPI server
uvicorn main:app --reload

With uvicorn's default host and port, the API will be available at http://127.0.0.1:8000

2. Start the Frontend Client

# Navigate to client directory
cd client

# Install dependencies
npm install

# Start the development server
npm run dev

The frontend will be available at: http://localhost:3000

✨ Features

File Upload & Processing

  • Drag & Drop Interface: Intuitive file upload with visual feedback
  • Multiple File Types: Support for PDF, DOCX, TXT, and CSV files
  • Background Processing: Files are indexed asynchronously for fast uploads
  • Progress Tracking: Real-time upload progress and indexing status

Search & Discovery

  • Full-Text Search: Powered by SQLite FTS5 with BM25 ranking
  • Advanced Queries: Support for phrase searches and term exclusion
  • Highlighted Results: Search terms are highlighted in result snippets
  • Real-time Search: Instant search results as you type

File Management

  • File Viewer: View extracted text content with syntax highlighting
  • File Metadata: Display file information, upload date, and processing status
  • File Deletion: Remove files and all associated data
  • Pagination: Efficient handling of large file collections

User Experience

  • Responsive Design: Works seamlessly on desktop and mobile
  • Error Handling: Comprehensive error states and user feedback
  • Loading States: Smooth loading indicators throughout the app
  • Modern UI: Clean, professional interface built with shadcn/ui

📋 Available Commands

Backend Commands (from server/ directory)

cd server

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest

# Start server directly
uvicorn main:app --reload

# Run with specific host/port
uvicorn main:app --host 0.0.0.0 --port 8000 --reload

Frontend Commands (from client/ directory)

cd client

# Install dependencies
npm install

# Start development server
npm run dev

# Build for production
npm run build

# Start production server
npm start

# Run tests
npm test

# Run linting
npm run lint

🔧 API Endpoints

File Management

  • POST /api/v1/files - Upload a file (supports PDF, DOCX, TXT, CSV)
  • GET /api/v1/files - List all files with pagination (?limit=20&offset=0)
  • GET /api/v1/files/{file_id} - Get file metadata and status
  • GET /api/v1/files/{file_id}/content - Get extracted text content of a file
  • DELETE /api/v1/files/{file_id} - Delete a file and all associated data
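
As an illustration of the limit/offset pagination on the list endpoint, here is a minimal client-side sketch that walks every page. The base URL (port 8000 on localhost) and the `"files"` key in the JSON response are assumptions, not confirmed by the API; adjust them to the actual response model.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: backend running locally on uvicorn's default port.
BASE_URL = "http://localhost:8000/api/v1"


def files_url(limit=20, offset=0, base_url=BASE_URL):
    """Build the list-files URL with pagination parameters."""
    return f"{base_url}/files?{urlencode({'limit': limit, 'offset': offset})}"


def fetch_json(url):
    """Perform a GET request and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_files(limit=20, fetch=fetch_json, base_url=BASE_URL):
    """Yield file records page by page until a short page is returned."""
    offset = 0
    while True:
        page = fetch(files_url(limit, offset, base_url))
        items = page["files"]  # hypothetical response key
        yield from items
        if len(items) < limit:
            return
        offset += limit
```

The `fetch` parameter is injectable so the paging logic can be exercised without a running server.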

Search

  • GET /api/v1/search?q={query} - Full-text search with FTS5 (?limit=20&offset=0)
    • Supports phrases in quotes: "exact phrase"
    • Supports exclusion: -term
    • Returns highlighted snippets with [ ] markers
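
The search behaviour described above can be sketched with Python's built-in `sqlite3` module, provided the underlying SQLite build includes FTS5. The table name `chunks`, its columns, and the `to_fts5` query translator are hypothetical; the project's actual schema and query parsing may differ. FTS5 itself has no `-term` syntax, so this sketch rewrites exclusions into `NOT` clauses.

```python
import re
import sqlite3

# Matches an optional leading "-" followed by a quoted phrase or a bare word.
TOKEN = re.compile(r'(-?)(?:"([^"]*)"|(\S+))')


def to_fts5(user_query):
    """Translate `"exact phrase" -term word` into an FTS5 MATCH expression."""
    include, exclude = [], []
    for neg, phrase, word in TOKEN.findall(user_query):
        term = (phrase or word).replace('"', '""')  # escape embedded quotes
        if term:
            (exclude if neg else include).append(f'"{term}"')
    if not include:
        raise ValueError("query needs at least one positive term")
    expr = " AND ".join(include)
    if exclude:
        expr = f"({expr}) NOT ({' OR '.join(exclude)})"
    return expr


def build_index(rows):
    """Create an in-memory FTS5 table and load (filename, content) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE chunks USING fts5(filename UNINDEXED, content)"
    )
    conn.executemany("INSERT INTO chunks (filename, content) VALUES (?, ?)", rows)
    return conn


def search(conn, user_query, limit=20, offset=0):
    """Return (filename, snippet) rows, best BM25 match first."""
    sql = """
        SELECT filename, snippet(chunks, 1, '[', ']', '...', 8)
        FROM chunks
        WHERE chunks MATCH ?
        ORDER BY bm25(chunks)   -- lower bm25() values rank higher
        LIMIT ? OFFSET ?
    """
    return conn.execute(sql, (to_fts5(user_query), limit, offset)).fetchall()
```

Passing the expression as a bound parameter keeps user input out of the SQL text itself.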

Health & Monitoring

  • GET /api/v1/health - Health check with database connectivity and disk usage
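
A health check of this kind could look roughly like the sketch below, using only the standard library. The response field names and the `"healthy"`/`"unhealthy"` values are assumptions for illustration, not the endpoint's documented schema.

```python
import shutil
import sqlite3


def health_check(db_path, files_dir):
    """Report database connectivity and disk usage for the storage directory."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("SELECT 1").fetchone()  # trivial round-trip query
        finally:
            conn.close()
        db_ok = True
    except sqlite3.Error:
        db_ok = False

    usage = shutil.disk_usage(files_dir)
    return {
        "status": "healthy" if db_ok else "unhealthy",  # hypothetical values
        "database": db_ok,
        "disk": {
            "total_bytes": usage.total,
            "used_bytes": usage.used,
            "free_bytes": usage.free,
            "used_percent": round(usage.used / usage.total * 100, 1),
        },
    }
```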

🛠️ Development

Backend Development

The backend is built with:

  • FastAPI - Modern, fast web framework with automatic OpenAPI docs
  • SQLite - Database with FTS5 for full-text search
  • Pydantic - Data validation and serialization
  • PyMuPDF - PDF text extraction
  • python-docx - DOCX text extraction
  • aiosqlite - Async SQLite operations
  • python-multipart - File upload handling

Key features:

  • ✅ Async/await support throughout
  • ✅ Background task processing for file indexing
  • ✅ Full-text search with FTS5 and BM25 ranking
  • ✅ File type validation (PDF, DOCX, TXT, CSV)
  • ✅ Text chunking for improved search relevance
  • ✅ Comprehensive error handling with custom exception handlers
  • ✅ Health monitoring with disk usage tracking
  • ✅ CORS configuration for frontend integration
  • ✅ Type hints throughout with Pydantic models
  • ✅ Structured logging
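
To illustrate the text chunking mentioned above, here is a minimal sketch that splits extracted text into overlapping word windows, so each indexed row is small enough to produce focused search hits. The word-based strategy, the function name `chunk_text`, and the default sizes are assumptions; the project's `file_processor.py` may chunk differently.

```python
def chunk_text(text, chunk_size=200, overlap=40):
    """Split text into chunks of `chunk_size` words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

The overlap means a phrase that straddles a chunk boundary still appears intact in at least one chunk.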

Frontend Development

The frontend is built with:

  • Next.js 15 - React framework with App Router
  • TypeScript - Type safety throughout
  • Tailwind CSS 4 - Modern utility-first styling
  • shadcn/ui - High-quality UI components
  • Zustand - Lightweight state management
  • Lucide React - Beautiful icons
  • React 19 - Latest React features

Key features:

  • ✅ Modern React patterns with hooks
  • ✅ Type-safe API integration
  • ✅ Responsive design with Tailwind CSS
  • ✅ File upload with drag & drop
  • ✅ Real-time search with debouncing
  • ✅ File content viewer with syntax highlighting
  • ✅ Error boundaries and loading states
  • ✅ Optimistic UI updates
  • ✅ Custom hooks for data fetching
  • ✅ Service layer for API abstraction

🧪 Testing

Backend Tests

cd server
pytest                    # Run all tests
pytest -v                # Verbose output
pytest tests/test_api.py # Run specific test file
pytest --cov=app         # Run with coverage

Frontend Tests

cd client
npm test                 # Run tests
npm run test:watch       # Run tests in watch mode
npm run test:coverage    # Run with coverage

📦 Deployment

Backend Deployment

  1. Environment Setup:

    cd server
    pip install -r requirements.txt
  2. Configuration:

    • Environment variables can be set via .env file or system environment
    • Key settings: FILES_DIR, MAX_FILE_SIZE, CORS_ORIGINS
  3. Run the Server:

    # Development
    uvicorn main:app --reload
    
    # Production
    uvicorn main:app --host 0.0.0.0 --port 8000 --workers 4
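
For the configuration step above, the following sketch shows one way the three settings could be read from the environment. The defaults are assumptions taken from elsewhere in this README (`data/blobs` from the project tree, the 10MB limit, the frontend at http://localhost:3000), and the comma-separated format for `CORS_ORIGINS` is also an assumption; check `app/config.py` for the real behaviour.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    files_dir: str
    max_file_size: int  # bytes
    cors_origins: tuple


def load_settings(env=None):
    """Build Settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    origins = env.get("CORS_ORIGINS", "http://localhost:3000")
    return Settings(
        files_dir=env.get("FILES_DIR", "data/blobs"),
        max_file_size=int(env.get("MAX_FILE_SIZE", 10 * 1024 * 1024)),
        # Assumed format: comma-separated list of origins.
        cors_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
    )
```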

Frontend Deployment

  1. Build:

    cd client
    npm install
    npm run build
  2. Deploy:

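    • Managed hosting: Deploy to Vercel, Netlify, or a similar platform that builds and serves Next.js apps (the app's API routes need a server runtime, so the .next/ build output cannot be served as a plain static site)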
    • Node.js hosting: Use npm start for server-side rendering
  3. Environment Variables:

    # Set backend URL for production
    NEXT_PUBLIC_API_URL=https://your-api-domain.com/api/v1

🔒 Security Features

  • File size limits (configurable, 10MB default)
  • File type validation (PDF, DOCX, TXT, CSV only)
  • Input sanitization and validation with Pydantic
  • Proper error handling with custom exception handlers
  • CORS configuration for secure cross-origin requests
  • Health monitoring with system resource tracking
  • SQL injection protection with parameterized queries
  • File path sanitization to prevent directory traversal
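
Several of the checks listed above (file type validation, size limit, path sanitization) can be sketched as follows. The names `validate_upload`, `storage_path`, and `UploadRejected` are hypothetical, and validating by extension alone is an assumption; the actual service may also inspect content types.

```python
from pathlib import Path, PurePosixPath

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".txt", ".csv"}
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10MB default from this README


class UploadRejected(ValueError):
    """Raised when an upload fails validation."""


def validate_upload(filename, size_bytes):
    """Return a sanitized file name, or raise UploadRejected."""
    # Keep only the final path component so "../" segments are discarded.
    safe_name = PurePosixPath(filename.replace("\\", "/")).name
    if PurePosixPath(safe_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise UploadRejected(f"unsupported file type: {safe_name!r}")
    if size_bytes > MAX_FILE_SIZE:
        raise UploadRejected("file exceeds the size limit")
    return safe_name


def storage_path(files_dir, safe_name):
    """Resolve the target path and confirm it stays inside files_dir."""
    base = Path(files_dir).resolve()
    target = (base / safe_name).resolve()
    if base not in target.parents:
        raise UploadRejected("path escapes the storage directory")
    return target
```

The second check in `storage_path` is deliberately redundant: even if a caller skips `validate_upload`, a traversal attempt is still rejected.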
