Updated readme with poppler dependency #1

Status: Open. Wants to merge 3 commits into base: main.
65 changes: 63 additions & 2 deletions README.md
@@ -19,6 +19,63 @@ This application uses Azure Document Intelligence and Azure OpenAI services to a
- Azure Document Intelligence service
- Azure OpenAI service with GPT-4o deployment

#### Poppler Installation (Required for pdf2image)
This project uses the `pdf2image` library, which depends on Poppler. Install Poppler as follows:

**For Windows:**
1. Download the latest Poppler binary from the [poppler-windows releases](https://github.com/oschwartz10612/poppler-windows/releases).
2. Unzip the downloaded archive to a location under **Program Files** (e.g., `C:\Program Files\poppler`).
3. Add the Poppler `bin` directory (e.g., `C:\Program Files\poppler\poppler-xx\bin`) to your system PATH:
   - Press **Win + R**, type `sysdm.cpl`, and press **Enter**.
   - Go to the **Advanced** tab and click **Environment Variables**.
   - Under **System variables**, select the **Path** variable and click **Edit**.
   - Click **New** and add the path to the Poppler `bin` folder.
   - Click **OK** to close all dialogs.
4. To verify the installation, open a Command Prompt and run:
```
pdftoppm -v
```

**For macOS:**
1. Install Homebrew if it's not already installed. In Terminal, run:
```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```
2. Install Poppler using Homebrew:
```bash
brew install poppler
```
3. Verify the installation by running:
```bash
pdftoppm -v
```
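
The same check can be run from Python before the application tries to render a PDF. The following is a minimal sketch, not part of this project: the helper names `find_poppler` and `poppler_version` are hypothetical, and it only assumes that the `pdftoppm` executable shipped with Poppler should be reachable through PATH.

```python
import shutil
import subprocess


def find_poppler(tool="pdftoppm"):
    """Return the full path to a Poppler tool found on PATH, or None if it is missing."""
    return shutil.which(tool)


def poppler_version(tool_path):
    """Run `<tool> -v` and return whatever version banner it prints."""
    result = subprocess.run(
        [tool_path, "-v"], capture_output=True, text=True, check=False
    )
    # The banner may arrive on stderr or stdout depending on the build,
    # so check both streams.
    return (result.stderr or result.stdout).strip()


if __name__ == "__main__":
    path = find_poppler()
    if path is None:
        print("pdftoppm not found on PATH; install Poppler or fix PATH.")
    else:
        print(f"Found {path}")
        print(poppler_version(path))
```

If the script reports that `pdftoppm` is missing even though Poppler is installed, the most likely cause is that the `bin` directory was not added to PATH or the terminal was not restarted afterwards.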

### Virtual Environment Setup

We recommend creating a virtual environment for the project:

**On Windows:**
1. Open a Command Prompt in the project directory.
2. Create a virtual environment using:
```
python -m venv venv
```
3. Activate the virtual environment:
```
venv\Scripts\activate
```

**On macOS/Linux:**
1. Open a Terminal in the project directory.
2. Create a virtual environment using:
```bash
python3 -m venv venv
```
3. Activate the virtual environment:
```bash
source venv/bin/activate
```
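
To confirm that the virtual environment is actually active before installing dependencies, a short Python check can help. This is a sketch under the assumption that the environment was created with the standard `venv` module as shown above; the function name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run the activate step first.")
```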

### Installation Steps
1. Clone the repository
2. Install required dependencies:
@@ -48,7 +105,7 @@ This application uses Azure Document Intelligence and Azure OpenAI services to a
AZURE_OPENAI_ENDPOINT="https://your-openai-resource.openai.azure.com"
```

c. Note that `AZURE_OPENAI_API_KEY` should be the same as `OPENAI_KEY` and `AZURE_OPENAI_ENDPOINT` should be the same as `OPENAI_ENDPOINT`. These duplicated variables are needed due to the way the OpenAI library looks for environment variables.
c. Note that `AZURE_OPENAI_API_KEY` should be the same as `OPENAI_KEY` and `AZURE_OPENAI_ENDPOINT` should be the same as `OPENAI_ENDPOINT`.
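
A quick way to catch a mismatch between the duplicated variables is to compare them at startup. The sketch below is illustrative only and is not part of this project: `check_duplicated_vars` is a hypothetical helper, and it assumes the values from `.env` have already been loaded into the process environment by whatever mechanism the application uses.

```python
import os

# Pairs of variables that the README says must hold the same value.
PAIRS = [
    ("AZURE_OPENAI_API_KEY", "OPENAI_KEY"),
    ("AZURE_OPENAI_ENDPOINT", "OPENAI_ENDPOINT"),
]


def check_duplicated_vars(env=None):
    """Return a list of human-readable problems; an empty list means all pairs match."""
    env = os.environ if env is None else env
    problems = []
    for azure_name, plain_name in PAIRS:
        azure_value = env.get(azure_name)
        plain_value = env.get(plain_name)
        missing = [
            name
            for name, value in ((azure_name, azure_value), (plain_name, plain_value))
            if value is None
        ]
        if missing:
            problems.append("missing: " + ", ".join(missing))
        elif azure_value != plain_value:
            problems.append(f"{azure_name} differs from {plain_name}")
    return problems


if __name__ == "__main__":
    for problem in check_duplicated_vars():
        print(problem)
```

Passing a plain dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.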

### Required Files
The following files are necessary for running the application:
@@ -88,7 +145,11 @@ The application will be accessible at http://localhost:5000 in your web browser.
### Processing Method Selection Dropdown
![Processing Method Selection](static/screens/2.png)

### Results for Handwritten Multilingual Tests
![image](https://github.com/user-attachments/assets/845bd41b-4373-4a35-a072-e30d184e212c)


## Troubleshooting
- If you encounter environment variable errors, ensure your `.env` file contains all required variables
- For PDF rendering issues, ensure you have the necessary system dependencies for pdf2image
- For PDF rendering issues, ensure you have the necessary system dependencies for pdf2image (see Poppler installation above)
- Check application logs for detailed error information