DocumentProcessor - Document OCR Processing Component

DocumentProcessor is a reusable .NET component that provides advanced document OCR (Optical Character Recognition) capabilities with machine learning-based field extraction. It enables applications to process multiple types of financial document images (receipts, invoices, bills, and other financial documents) and extract structured data with high accuracy.

Features

Multi-Document Type Support: Process receipts, invoices, bills, and general financial documents
Automatic Document Classification: Identifies document type with confidence levels
Document OCR Processing: Extract text and structured data from document images
Extended Field Extraction: Document-type-specific fields with confidence scores
- Receipts: vendor, date, total, tax, line items, payment method, cashier, register number
- Invoices: invoice number, due date, payment terms, customer info, PO number, billing details
- Bills: account number, billing period, previous balance, current charges, amount due
- Common Fields: vendor name, address, date, totals, taxes, discounts, shipping, line items
Machine Learning Field Extraction: Identify and extract specific fields using transformer models
Multi-Phase Pipeline: Separate preprocessing, OCR, and inference stages for optimal control
RESTful API: Easy integration via HTTP endpoints
Blazor WebAssembly Component: Ready-to-use UI component for document processing
Python OCR Service: High-accuracy OCR using PaddleOCR, Tesseract, and transformer models
Configurable Preprocessing: Adjustable image enhancement for optimal OCR results

Technology Stack

.NET 10.0: Backend API and component library
ASP.NET Core: RESTful API services
Blazor WebAssembly: Interactive document processing UI component
Python 3.12: OCR and machine learning pipeline
PaddleOCR / Tesseract: Text detection and recognition
Transformer Models: Multiple commercially-licensed vision-language models for field extraction
- Donut (MIT), IDEFICS2 (Apache 2.0), Phi-3-Vision (MIT)
- InternVL (MIT), Qwen2-VL (Apache 2.0)
ImageMagick: Image preprocessing pipeline

Prerequisites

.NET Development

.NET 10.0 SDK or later

Python Requirements for OCR Service

Python version must be 3.12 for all environments
Windows users: The Ninja build system must not be on your PATH, as it can cause build failures for some Python dependencies. If you encounter build errors, ensure Ninja is not present in your PATH (e.g., from Visual Studio installations)

System Dependencies

ImageMagick: Required for image preprocessing
- Ubuntu/Debian: sudo apt-get install imagemagick
- macOS: brew install imagemagick
- Windows: Download from ImageMagick Downloads
Tesseract OCR (optional, fallback engine):
- Ubuntu/Debian: sudo apt-get install tesseract-ocr tesseract-ocr-eng
- macOS: brew install tesseract
- Windows: Download from Tesseract GitHub

Building the Project

1. Clone the Repository

git clone https://github.com/richardforrestbarker/DocumentProcessor.git
cd DocumentProcessor

2. Build the .NET Solution

# Restore NuGet packages and build
dotnet restore
dotnet build

3. Set Up the Python OCR Service

cd Ocr

# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install -r requirements.txt

# Return to root
cd ..

For detailed Python setup instructions, see the OCR Service README.

Project Structure

DocumentProcessor/
├── Api/                      # ASP.NET Core API service
│   ├── DocumentController.cs # Document processing endpoints
│   └── OcrServiceExtensions.cs # Service registration
├── Data/                     # Shared data models and interfaces
│   ├── IDocumentProcessor.cs # Document processor interface
│   ├── OcrConfiguration.cs   # OCR configuration model
│   └── Messages/             # Request/response DTOs
├── Wasm/                     # Blazor WebAssembly components
│   ├── DocumentProcessing.razor # Main document processing component
│   └── ClientSideDocumentProcessor.cs # Client-side implementation
├── Example/                  # Example Blazor web application
│   ├── Components/Pages/     # Blazor pages
│   │   └── Home.razor        # Home page with DocumentProcessingView
│   ├── Program.cs            # Application startup
│   └── appsettings.json      # Configuration including API URL
├── Ocr/                      # Python OCR service
│   ├── cli.py                # Command-line interface
│   ├── src/                  # Python source code
│   ├── requirements.txt      # Python dependencies
│   └── README.md             # OCR service documentation
├── Tests/                    # Unit and integration tests
└── ServiceDefaults/          # Shared service configuration

Using DocumentProcessor in Your Application

The DocumentProcessor component can be integrated into your .NET application in multiple ways:

Option 1: As a Service (API Integration)

Run the API service and make HTTP requests to the document processing endpoints:

dotnet run --project Api/Api.csproj

The API will be available at https://localhost:5001 (or http://localhost:5000).

Option 2: As a NuGet Package Reference

Add the Data and Api projects to your solution and reference them:

<ItemGroup>
  <ProjectReference Include="..\DocumentProcessor\Data\Data.csproj" />
  <ProjectReference Include="..\DocumentProcessor\Api\Api.csproj" />
</ItemGroup>

Then register the services in your Program.cs:

using Api.Ocr;

var builder = WebApplication.CreateBuilder(args);

// Add DocumentProcessor OCR services
builder.Services.AddOcrDocumentProcessing(builder.Configuration);

var app = builder.Build();

Option 3: Blazor WebAssembly Component

To use the interactive document processing UI in your Blazor application:

Reference the Wasm project:

<ItemGroup>
  <ProjectReference Include="..\DocumentProcessor\Wasm\Wasm.csproj" />
</ItemGroup>

Add the component to your page:

@page "/process-document"
<DocumentProcessing />

Running the Example Application

The repository includes a complete example Blazor web application that demonstrates how to use the DocumentProcessor component with client-side rendering. The Example application includes the DocumentProcessingView component and is pre-configured to work with the API.

Prerequisites

Before running the Example application, ensure you have:

Built the solution (see "Building the Project" section above)
Set up the Python OCR service (see "Set Up the Python OCR Service" section above)

Running the Example

The Example application requires two processes to be running:

1. Start the API Service

In a terminal window, start the API service:

dotnet run --project Api/Api.csproj

The API will be available at https://localhost:7415.

2. Start the Example Application

In another terminal window, start the Example application:

dotnet run --project Example/Example.csproj

The Example application will be available at https://localhost:7256. Open this URL in your browser to access the document processing interface.

Using the Example Application

The home page displays the DocumentProcessingView component
Upload a document image (receipt, invoice, or form)
Adjust preprocessing settings (deskew, denoise, contrast) and preview the results
Continue to OCR to extract text
Continue to Inference to extract structured fields
Accept the final result when satisfied

The Example application demonstrates:

Client-side rendering with Blazor WebAssembly
Integration with both the Api and Wasm projects
Proper configuration of appsettings.json for API communication
Usage of the DocumentProcessingView component
Complete document processing workflow

API Endpoints

The DocumentController provides the following endpoints:

POST /api/document/preprocess

Run preprocessing on an image without DPI resampling. Returns base64-encoded preprocessed image.

Request Body:

{
  "imageBase64": "base64-encoded-image",
  "filename": "document.jpg",
  "jobId": "optional-job-id",
  "denoise": false,
  "deskew": true,
  "fuzzPercent": 30,
  "deskewThreshold": 40,
  "contrastType": "sigmoidal",
  "contrastStrength": 3.0,
  "contrastMidpoint": 120
}

POST /api/document/ocr

Run OCR on a preprocessed image with DPI resampling and safety checks.

Request Body:

{
  "imageBase64": "base64-encoded-preprocessed-image",
  "jobId": "optional-job-id",
  "ocrEngine": "paddle",
  "targetDpi": 300,
  "device": "auto"
}

POST /api/document/inference

Run model inference on OCR results to extract structured fields.

Request Body:

{
  "ocrResult": { /* OCR result object */ },
  "imageBase64": "base64-encoded-image",
  "jobId": "optional-job-id",
  "model": "naver-clova-ix/donut-base-finetuned-cord-v2",
  "modelType": "donut",
  "device": "auto"
}

GET /api/document/status/{jobId}

Get the status of a document processing job.

Configuration

OCR Configuration

Configure OCR settings in your appsettings.json:

{
  "Ocr": {
    "model_name_or_path": "microsoft/layoutlmv3-base",
    "device": "auto",
    "ocr_engine": "paddle",
    "detection_mode": "word",
    "box_normalization_scale": 1000,
    "python_service_path": "./Ocr/cli.py",
    "temp_storage_path": "./temp/documents",
    "max_file_size": 10485760,
    "temp_file_ttl_hours": 1,
    "enable_gpu": true,
    "min_confidence_threshold": 0.8
  }
}

Configuration Options

Option	Description	Default
`model_name_or_path`	HuggingFace model name or local path	`microsoft/layoutlmv3-base`
`device`	Compute device: `auto`, `cuda`, `cpu`	`auto`
`ocr_engine`	OCR engine: `paddle` or `tesseract`	`paddle`
`detection_mode`	Detection mode: `word` or `line`	`word`
`box_normalization_scale`	Bounding box scale for models	`1000`
`python_service_path`	Path to Python CLI script	`./Ocr/cli.py`
`temp_storage_path`	Temporary file storage location	`./temp/documents`
`max_file_size`	Maximum upload size in bytes	`10485760` (10MB)
`temp_file_ttl_hours`	Temporary file retention time	`1` hour
`enable_gpu`	Enable GPU acceleration	`true`
`min_confidence_threshold`	Minimum field confidence (0.0-1.0)	`0.8`

API Integration Configuration

If your application needs to integrate with external barcode APIs, configure them in the Application.Integrations section of appsettings.json. The DocumentProcessor component itself focuses on OCR and document processing, but can be extended to support barcode lookups.

Extension Points

The DocumentProcessor component is designed to be extensible:

Custom Document Processors

Implement the IDocumentProcessor interface to create custom processing pipelines:

public interface IDocumentProcessor
{
    Task<PreprocessingResult> PreprocessImageAsync(PreprocessingRequest request);
    Task<OcrResult> RunOcrAsync(OcrRequest request);
    Task<InferenceResult> RunInferenceAsync(InferenceRequest request);
    Task<JobStatus?> GetJobStatusAsync(string jobId);
}

Service Registration

Register your custom implementation in Program.cs:

// Use the built-in implementation
builder.Services.AddOcrDocumentProcessing(builder.Configuration);

// Or register a custom implementation
builder.Services.AddSingleton<IDocumentProcessor, MyCustomDocumentProcessor>();

Document OCR Pipeline

The DocumentProcessor includes an advanced OCR pipeline that uses machine learning to extract structured data from document images (receipts, invoices, forms). It combines optical character recognition with layout-aware transformer models (LayoutLMv3, Donut, IDEFICS2) to accurately identify and extract fields like vendor names, dates, amounts, and line items.

Architecture

The DocumentProcessor uses a hybrid architecture:

API Service (C#/.NET): Handles HTTP requests, validation, and orchestrates the Python OCR pipeline
Python OCR Service: Performs image processing, OCR, and ML-based field extraction
Blazor Component: Provides interactive UI for document processing with live preview
GPU Acceleration: Supports CUDA-enabled GPUs for faster processing (falls back to CPU)

                    ┌─────────────────────────┐
                    │   Blazor Component      │
                    │  (DocumentProcessing)   │
                    └───────────┬─────────────┘
                                │ HTTP POST
                                ▼
                    ┌─────────────────────────┐
                    │   .NET API Service      │
                    │  (DocumentController)   │
                    │  - Request validation   │
                    │  - Base64 handling      │
                    └───────────┬─────────────┘
                                │ Process.Start
                                │ (Python subprocess)
                                ▼
                    ┌─────────────────────────┐
                    │   Python OCR CLI        │
                    │   (Ocr/cli.py)          │
                    │  - Image preprocessing  │
                    │  - PaddleOCR / Tesseract│
                    │  - Model inference      │
                    │  - Field extraction     │
                    └─────────────────────────┘

OCR Pipeline Stages

Image Preprocessing (using ImageMagick CLI via shell scripts)
- Deskewing (rotation correction)
- Contrast enhancement
- Grayscale conversion
- Remove background
- Denoising
- Convert to TIFF format (optimal for Tesseract)
- Fix resolution to 300 DPI
Text Detection & OCR
- PaddleOCR (primary, high accuracy)
- Tesseract (fallback)
- Word-level bounding boxes with confidence scores
Layout Analysis
- LayoutLMv3 model for document understanding
- Token-to-box mapping (normalized 0-1000 scale)
- Visual and textual feature fusion
Field Extraction
- Vendor name detection
- Date parsing (multiple formats)
- Amount extraction (total, subtotal, tax)
- Line item grouping
- Currency detection

Python OCR Service Setup

The Python OCR service is located in the Ocr/ directory. See Ocr/README.md for complete setup instructions.

Quick Start

cd Ocr

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Test the installation
python cli.py version

GPU Support

For CUDA GPU acceleration on Linux:

pip install paddlepaddle-gpu

For CPU-only or macOS:

pip install paddlepaddle

Using the Document Processor

Via API

Preprocess a document:

curl -X POST http://localhost:5000/api/document/preprocess \
  -H "Content-Type: application/json" \
  -d '{
    "imageBase64": "base64-encoded-image-data",
    "filename": "document.jpg",
    "deskew": true,
    "denoise": false
  }'

Run OCR:

curl -X POST http://localhost:5000/api/document/ocr \
  -H "Content-Type: application/json" \
  -d '{
    "imageBase64": "base64-encoded-preprocessed-image",
    "ocrEngine": "paddle",
    "targetDpi": 300
  }'

Extract fields:

curl -X POST http://localhost:5000/api/document/inference \
  -H "Content-Type: application/json" \
  -d '{
    "ocrResult": {...},
    "imageBase64": "base64-encoded-image",
    "modelType": "donut"
  }'

Via Python CLI

Process a single document:

cd Ocr
python cli.py process --image document.jpg --output result.json

Process with preprocessing options:

python cli.py process \
  --image document.jpg \
  --output result.json \
  --ocr-engine paddle \
  --device cuda \
  --denoise \
  --deskew

Process multi-page document:

python cli.py process \
  --image page1.jpg \
  --image page2.jpg \
  --output result.json

Debug mode (saves intermediate images):

python cli.py process \
  --image document.jpg \
  --output result.json \
  --debug \
  --debug-output-dir ./debug_output

Separate Pipeline Commands

The CLI supports separate commands for each phase of the OCR pipeline, which is useful for the document processing live view feature:

Preprocess only (without DPI resampling):

python cli.py preprocess \
  --image receipt.jpg \
  --output-format base64 \
  --deskew \
  --denoise \
  --fuzz-percent 30 \
  --contrast-type sigmoidal

OCR only (with DPI resampling):

python cli.py ocr \
  --image preprocessed.png \
  --ocr-engine paddle \
  --target-dpi 300 \
  --output ocr_result.json

Inference only (on OCR results):

python cli.py inference \
  --ocr-result ocr_result.json \
  --image preprocessed.png \
  --model naver-clova-ix/donut-base-finetuned-cord-v2 \
  --model-type donut

CLI Options

Option	Description	Default
`--image, -i`	Path to receipt image (can specify multiple)	Required
`--output, -o`	Output JSON file path	stdout
`--model, -m`	LayoutLMv3 model name or path	microsoft/layoutlmv3-base
`--ocr-engine`	OCR engine: `paddle` or `tesseract`	paddle
`--device`	Inference device: `auto`, `cuda`, `cpu`	auto
`--denoise`	Apply denoising preprocessing	false
`--deskew`	Apply deskew correction	false
`--job-id`	Custom job identifier	auto-generated
`--debug`	Enable debug mode: save intermediary images for each processing step	false
`--debug-output-dir`	Directory to save debug output files	./debug_output

OCR Output Format

The OCR system returns structured JSON with the following schema. All fields include confidence levels:

{
  "job_id": "abc123",
  "status": "done",
  "document_type": {
    "value": "invoice",
    "confidence": 0.92,
    "box": null
  },
  "pages": [
    {
      "page_number": 1,
      "raw_ocr_text": "COMPANY NAME\n123 Main St...",
      "words": [
        {
          "text": "COMPANY",
          "box": { "x0": 100, "y0": 50, "x1": 200, "y1": 80 },
          "confidence": 0.98
        }
      ]
    }
  ],
  "vendor_name": {
    "value": "COMPANY NAME",
    "confidence": 0.95,
    "box": { "x0": 100, "y0": 50, "x1": 300, "y1": 80 }
  },
  "date": {
    "value": "2024-01-15",
    "confidence": 0.92,
    "box": { "x0": 400, "y0": 50, "x1": 550, "y1": 80 }
  },
  "total_amount": {
    "value": "1250.00",
    "confidence": 0.96,
    "box": { "x0": 400, "y0": 500, "x1": 500, "y1": 530 }
  },
  "subtotal": {
    "value": "1150.00",
    "confidence": 0.94,
    "box": { "x0": 400, "y0": 450, "x1": 500, "y1": 480 }
  },
  "tax_amount": {
    "value": "100.00",
    "confidence": 0.93,
    "box": { "x0": 400, "y0": 475, "x1": 500, "y1": 505 }
  },
  "currency": {
    "value": "USD",
    "confidence": 0.90,
    "box": null
  },
  "invoice_number": {
    "value": "INV-12345",
    "confidence": 0.88,
    "box": { "x0": 50, "y0": 100, "x1": 150, "y1": 130 }
  },
  "due_date": {
    "value": "2024-02-15",
    "confidence": 0.85,
    "box": { "x0": 450, "y0": 100, "x1": 550, "y1": 130 }
  },
  "customer_name": {
    "value": "Client Company",
    "confidence": 0.87,
    "box": { "x0": 50, "y0": 150, "x1": 250, "y1": 180 }
  },
  "line_items": [
    {
      "description": "Consulting Services",
      "quantity": 10,
      "unit_price": "100.00",
      "line_total": "1000.00",
      "confidence": 0.89,
      "box": { "x0": 50, "y0": 300, "x1": 550, "y1": 330 }
    }
  ]
}

Document-Specific Fields:

Receipts: payment_method, cashier_name, register_number
Invoices: invoice_number, due_date, payment_terms, customer_name, customer_address, po_number
Bills: account_number, billing_period, previous_balance, current_charges, amount_due
All Documents: document_type, vendor_name, merchant_address, date, total_amount, subtotal, tax_amount, currency, line_items, discount, shipping, notes

Bounding Box Format

All bounding boxes are normalized to a 0-1000 scale for consistency with LayoutLM models:

x0: Left edge (0-1000)
y0: Top edge (0-1000)
x1: Right edge (0-1000)
y1: Bottom edge (0-1000)

Output Format

The OCR pipeline returns structured JSON with extracted fields:

{
  "job_id": "abc123",
  "status": "done",
  "pages": [
    {
      "page_number": 1,
      "raw_ocr_text": "STORE NAME\n123 Main St...",
      "words": [
        {
          "text": "STORE",
          "box": { "x0": 100, "y0": 50, "x1": 200, "y1": 80 },
          "confidence": 0.98
        }
      ]
    }
  ],
  "vendor_name": {
    "value": "STORE NAME",
    "confidence": 0.95,
    "box": { "x0": 100, "y0": 50, "x1": 300, "y1": 80 }
  },
  "date": {
    "value": "2024-01-15",
    "confidence": 0.92
  },
  "total_amount": {
    "value": "25.99",
    "confidence": 0.96
  },
  "line_items": [
    {
      "description": "Product 1",
      "quantity": 1.0,
      "unit_price": 12.99,
      "line_total": 12.99,
      "confidence": 0.89
    }
  ]
}

Running Tests

.NET Tests

dotnet test

Python Tests

cd Ocr

# Run all tests
python -m pytest tests/

# Run unit tests only (fast, no dependencies)
python -m pytest tests/test_cli_unit.py

# Run with coverage
python -m pytest tests/ --cov=. --cov-report=html

Troubleshooting

No OCR results returned:

Ensure Python dependencies are installed: pip install -r Ocr/requirements.txt
Check that PaddleOCR or Tesseract is working: python -c "from paddleocr import PaddleOCR; print('OK')"
Verify image is readable and in supported format (JPEG, PNG, TIFF)

GPU not being used:

Check CUDA installation: python -c "import torch; print(torch.cuda.is_available())"
Ensure paddlepaddle-gpu is installed instead of paddlepaddle
Set enable_gpu: true in appsettings.json

Low accuracy:

Try enabling preprocessing options: --denoise --deskew
Ensure image is high resolution (300 DPI recommended)
Use well-lit, non-blurry images
Adjust preprocessing parameters (contrast, fuzz percentage)

Process timeout:

First run downloads models (~500MB), subsequent runs are faster
Increase timeout in configuration if using CPU
Consider using GPU acceleration for better performance

Blazor Component Usage

The DocumentProcessing.razor component provides an interactive UI for document processing with live preview:

Upload a document image: Select a file from your device
Adjust preprocessing settings: Modify deskew, denoise, contrast parameters
Preview results: See preprocessed image before running OCR
Run OCR: Extract text with bounding boxes
Extract fields: Get structured data (vendor, date, amounts, line items)

The component handles the entire workflow through the three-phase pipeline (preprocess → OCR → inference).

Advanced Topics

Building the Python OCR Service

The Python OCR service can be built as a standalone package or containerized:

cd Ocr

# Install dependencies
pip install -r requirements.txt

# Run tests
python -m pytest tests/

# Test CLI
python cli.py version
python cli.py process --help

# Build Docker image (optional)
docker build -t document-processor-ocr .

Supported ML Models

The component supports multiple transformer models for field extraction:

Model	Type	License	Best For
Donut	OCR-free	MIT	Fast processing, receipt-specific
IDEFICS2	Multimodal	Apache 2.0	High accuracy, flexible
LayoutLMv3	Token classification	-	Custom fine-tuning

See Ocr/README.md for model-specific configuration.

Performance

Typical performance on a document (1-2 pages, 300 DPI):

Hardware	Preprocessing	OCR	Inference	Total
CPU only	1-2s	2-4s	8-15s	11-21s
GPU (CUDA)	1-2s	1-2s	1-3s	3-7s

Example Integration

Here's a complete example of integrating DocumentProcessor into an ASP.NET Core application:

// Program.cs
using Api.Ocr;

var builder = WebApplication.CreateBuilder(args);

// Add DocumentProcessor services
builder.Services.AddOcrDocumentProcessing(builder.Configuration);

// Add controllers
builder.Services.AddControllers();

var app = builder.Build();

app.MapControllers();
app.Run();

// YourController.cs
[ApiController]
[Route("api/[controller]")]
public class MyDocumentController : ControllerBase
{
    private readonly IDocumentProcessor _processor;
    
    public MyDocumentController(IDocumentProcessor processor)
    {
        _processor = processor;
    }
    
    [HttpPost("process")]
    public async Task<IActionResult> ProcessDocument([FromBody] ProcessRequest request)
    {
        // Preprocess
        var preprocessed = await _processor.PreprocessImageAsync(new PreprocessingRequest
        {
            ImageBase64 = request.ImageBase64,
            Deskew = true,
            Denoise = true
        });
        
        // OCR
        var ocrResult = await _processor.RunOcrAsync(new OcrRequest
        {
            ImageBase64 = preprocessed.PreprocessedImageBase64,
            OcrEngine = "paddle"
        });
        
        // Extract fields
        var inference = await _processor.RunInferenceAsync(new InferenceRequest
        {
            OcrResult = ocrResult,
            ImageBase64 = preprocessed.PreprocessedImageBase64,
            ModelType = "donut"
        });
        
        return Ok(inference);
    }
}

License

[Specify your license here]

Contributing

Contributions are welcome! Please see the contribution guidelines for this project.

Support

For issues, questions, or feature requests, please open an issue on the GitHub repository.

Name		Name	Last commit message	Last commit date
Latest commit History 38 Commits
Api		Api
Data		Data
Example		Example
Ocr		Ocr
ServiceDefaults		ServiceDefaults
Tests.E2E		Tests.E2E
Tests		Tests
Wasm		Wasm
.gitignore		.gitignore
.tmp_dotnet_test.log		.tmp_dotnet_test.log
BROWSER_PREPROCESSING_MIGRATION_PLAN.md		BROWSER_PREPROCESSING_MIGRATION_PLAN.md
DocumentProcessor.sln		DocumentProcessor.sln
README.md		README.md
copilot-instructions.md		copilot-instructions.md

Folders and files

Latest commit

History

Repository files navigation

DocumentProcessor - Document OCR Processing Component

Features

Technology Stack

Prerequisites

.NET Development

Python Requirements for OCR Service

System Dependencies

Building the Project

1. Clone the Repository

2. Build the .NET Solution

3. Set Up the Python OCR Service

Project Structure

Using DocumentProcessor in Your Application

Option 1: As a Service (API Integration)

Option 2: As a NuGet Package Reference

Option 3: Blazor WebAssembly Component

Running the Example Application

Prerequisites

Running the Example

1. Start the API Service

2. Start the Example Application

Using the Example Application

API Endpoints

POST /api/document/preprocess

POST /api/document/ocr

POST /api/document/inference

GET /api/document/status/{jobId}

Configuration

OCR Configuration

Configuration Options

API Integration Configuration

Extension Points

Custom Document Processors

Service Registration

Document OCR Pipeline

Architecture

OCR Pipeline Stages

Python OCR Service Setup

Quick Start

GPU Support

Using the Document Processor

Via API

Via Python CLI

Separate Pipeline Commands

CLI Options

OCR Output Format

Bounding Box Format

Output Format

Running Tests

.NET Tests

Python Tests

Troubleshooting

Blazor Component Usage

Advanced Topics

Building the Python OCR Service

Supported ML Models

Performance

Example Integration

License

Contributing

Support

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages