Review main-README.md
#113
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -2,15 +2,15 @@ | |
|
|
||
| Welcome! Content Understanding is a solution that analyzes and comprehends various media content—including **documents, images, audio, and video**—and transforms it into structured, organized, and searchable data. | ||
|
|
||
| Content Understanding is now a Generally Available (GA) service with the release of the `2025-11-01` API version. | ||
| Content Understanding is now a Generally Available (GA) service with the release of the `2025-11-01` API version. | ||
|
|
||
| - The samples in this repository default to the latest GA API version: `2025-11-01`. | ||
| - We will provide more samples for new functionalities in the GA API versions soon. For details on the updates in the current GA release, see the [Content Understanding What's New Document page](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/whats-new). | ||
| - As of November 2025, the `2025-11-01` API version is now available in a broader range of [regions](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/language-region-support). | ||
| - We will provide more samples for new functionalities in the GA API versions soon. For details on the updates in the current GA release, please see the [Content Understanding What's New Document page](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/whats-new). | ||
| - As of November 2025, the `2025-11-01` API version is available in a broader range of [regions](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/language-region-support). | ||
| - To access sample code for version `2025-05-01-preview`, please check out the corresponding Git tag `2025-05-01-preview` or download it directly from the [release page](https://github.com/Azure-Samples/azure-ai-content-understanding-python/releases/tag/2025-05-01-preview). | ||
| - To access sample code for version `2024-12-01-preview`, please check out the corresponding Git tag `2024-12-01-preview` or download it directly from the [release page](https://github.com/Azure-Samples/azure-ai-content-understanding-python/releases/tag/2024-12-01-preview). | ||
|
|
||
| 👉 If you are looking for **.NET samples**, check out [this repo](https://github.com/Azure-Samples/azure-ai-content-understanding-dotnet/). | ||
| 👉 If you are looking for **.NET samples**, please check out [this repo](https://github.com/Azure-Samples/azure-ai-content-understanding-dotnet/). | ||
|
|
||
|
|
||
| ## Getting Started | ||
|
|
||
|
|
@@ -22,18 +22,18 @@ Run this repository virtually by using GitHub Codespaces, which opens a web-base | |
|
|
||
| [](https://github.com/codespaces/new?skip_quickstart=true&machine=basicLinux32gb&repo=899687170&ref=main&geo=UsEast&devcontainer_path=.devcontainer%2Fdevcontainer.json) | ||
|
|
||
| After clicking the link above, follow these steps to set up the Codespace: | ||
| After clicking the link above, please follow these steps to set up the Codespace: | ||
|
|
||
| 1. Create a new Codespace by selecting the `main` branch, your preferred Codespace region, and the 2-core machine type, as shown in the screenshot below. | ||
|  | ||
|  | ||
| 2. Once the Codespace is ready, open the terminal and follow the instructions in the **Configure Azure AI service resource** section to set up a valid Content Understanding resource. | ||
|
|
||
|
|
||
| ### Local Environment | ||
|
|
||
| 1. Ensure the following tools are installed: | ||
| 1. Please ensure the following tools are installed: | ||
|
|
||
| * [Azure Developer CLI (azd)](https://aka.ms/install-azd) | ||
| * [Python 3.11+](https://www.python.org/downloads/) | ||
| * [Azure Developer CLI (azd)](https://aka.ms/install-azd) | ||
| * [Python 3.11+](https://www.python.org/downloads/) | ||
| * [Git LFS](https://git-lfs.com/) | ||
|
|
||
|
|
||
| 2. Create a new directory named `azure-ai-content-understanding-python` and clone this template into it using the `azd` CLI: | ||
|
|
@@ -49,7 +49,7 @@ After clicking the link above, follow these steps to set up the Codespace: | |
| cd azure-ai-content-understanding-python | ||
| ``` | ||
|
|
||
| - **Important:** If you use `git clone`, you must install Git LFS and run `git lfs pull` to download sample files in the `data` directory: | ||
| - **Important:** If you use `git clone`, please install Git LFS and run `git lfs pull` to download sample files in the `data` directory: | ||
|
|
||
|
|
||
| ```bash | ||
| git lfs install | ||
|
|
@@ -58,7 +58,7 @@ After clicking the link above, follow these steps to set up the Codespace: | |
|
|
||
| 3. Set Up Dev Container Environment | ||
|
|
||
| - Install the following tools that support development containers: | ||
| - Please install the following tools that support development containers: | ||
|
|
||
|
|
||
| - **Visual Studio Code** | ||
| Download and install [Visual Studio Code](https://code.visualstudio.com/). | ||
|
|
@@ -69,70 +69,73 @@ After clicking the link above, follow these steps to set up the Codespace: | |
|
|
||
| - **Docker** | ||
| Install [Docker Desktop](https://www.docker.com/products/docker-desktop/) for Windows, macOS, or Linux. Docker manages and runs the container environment. | ||
| - Start Docker and ensure it is running in the background. | ||
| - Please start Docker and ensure it is running in the background. | ||
|
|
||
|
|
||
| - Open the project and start the Dev Container: | ||
|
|
||
| - Open the project folder in VS Code. | ||
| - Open the project folder in VS Code. | ||
| - Press `F1` or `Ctrl+Shift+P`, then type and select: | ||
|
|
||
| ``` | ||
| Dev Containers: Reopen in Container | ||
| ``` | ||
| Alternatively, click the green icon in the lower-left corner of VS Code and select **Reopen in Container**. | ||
| ``` | ||
| Alternatively, click the green icon in the lower-left corner of VS Code and select **Reopen in Container**. | ||
| - VS Code will detect the `.devcontainer` folder, build the development container, and install the necessary dependencies. | ||
|
|
||
| -  | ||
|
|
||
| ## Configure Azure AI Service Resource | ||
|
|
||
| ### Step 1: Create Azure AI Foundry Resource | ||
|
|
||
| First, create an Azure AI Foundry resource that will host both the Content Understanding service and the required model deployments. | ||
| First, please create an Azure AI Foundry resource that will host both the Content Understanding service and the required model deployments. | ||
|
|
||
| 1. Follow the steps in the [Azure Content Understanding documentation](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/) to create an Azure AI Foundry resource. | ||
|
|
||
| 2. Obtain your Foundry resource's endpoint URL from Azure Portal: | ||
|
|
||
| 1. Follow the steps in the [Azure Content Understanding documentation](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/) to create an Azure AI Foundry resource | ||
| 2. Get your Foundry resource's endpoint URL from Azure Portal: | ||
| - Go to [Azure Portal](https://portal.azure.com/) | ||
| - Navigate to your Azure AI Foundry resource | ||
| - Go to **Resource Management** > **Keys and Endpoint** | ||
| - Go to [Azure Portal](https://portal.azure.com/) | ||
| - Navigate to your Azure AI Foundry resource | ||
| - Go to **Resource Management** > **Keys and Endpoint** | ||
| - Copy the **Endpoint** URL (typically `https://<your-resource-name>.services.ai.azure.com/`) | ||
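As a quick sanity check on the copied value, the typical endpoint shape above can be tested in Python. This is a minimal sketch and not part of the repository: the function name `looks_like_foundry_endpoint` is hypothetical, and it only checks the `https://<your-resource-name>.services.ai.azure.com/` pattern described above, so a valid endpoint with a different host would be reported as non-matching.

```python
from urllib.parse import urlparse

# Host suffix taken from the typical endpoint pattern quoted in the README.
TYPICAL_SUFFIX = ".services.ai.azure.com"


def looks_like_foundry_endpoint(url: str) -> bool:
    """Return True if url has the shape https://<name>.services.ai.azure.com/."""
    parsed = urlparse(url.strip())
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(TYPICAL_SUFFIX):
        return False
    resource_name = host[: -len(TYPICAL_SUFFIX)]
    # The resource name must be a single, non-empty DNS label,
    # and nothing but an optional trailing slash may follow the host.
    return bool(resource_name) and "." not in resource_name and parsed.path in ("", "/")
```

A `False` result is only a hint to re-check the value copied from **Keys and Endpoint**; it does not prove the endpoint is invalid.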
|
|
||
|
|
||
| **⚠️ Important: Grant Required Permissions** | ||
|
|
||
| After creating your Azure AI Foundry resource, you must grant yourself the **Cognitive Services User** role to enable API calls for setting default GPT deployments: | ||
| After creating your Azure AI Foundry resource, please grant yourself the **Cognitive Services User** role to enable API calls for setting default GPT deployments: | ||
|
|
||
| 1. Go to [Azure Portal](https://portal.azure.com/) | ||
| 2. Navigate to your Azure AI Foundry resource | ||
| 3. Go to **Access Control (IAM)** in the left menu | ||
| 4. Click **Add** > **Add role assignment** | ||
| 5. Select the **Cognitive Services User** role | ||
| 6. Assign it to yourself (or the user/service principal that will run the samples) | ||
| 1. Go to [Azure Portal](https://portal.azure.com/) | ||
| 2. Navigate to your Azure AI Foundry resource | ||
| 3. Go to **Access Control (IAM)** in the left menu | ||
| 4. Click **Add** > **Add role assignment** | ||
| 5. Select the **Cognitive Services User** role | ||
| 6. Assign it to yourself (or the user/service principal that will run the samples) | ||
|
|
||
|
|
||
| > **Note:** This role assignment is required even if you are the owner of the resource. Without this role, you will not be able to call the Content Understanding API to configure model deployments for prebuilt analyzers. | ||
|
|
||
| ### Step 2: Deploy Required Models | ||
|
|
||
| **⚠️ Important:** The prebuilt analyzers require model deployments. You must deploy these models before using prebuilt analyzers: | ||
| - `prebuilt-documentSearch`, `prebuilt-audioSearch`, `prebuilt-videoSearch` require **GPT-4.1-mini** and **text-embedding-3-large** | ||
| - Other prebuilt analyzers like `prebuilt-invoice`, `prebuilt-receipt` require **GPT-4.1** and **text-embedding-3-large** | ||
| **⚠️ Important:** The prebuilt analyzers require model deployments. Please deploy these models before using prebuilt analyzers: | ||
|
|
||
| 1. **Deploy GPT-4.1:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **gpt-4.1** | ||
| - Complete the deployment with your preferred settings | ||
| - `prebuilt-documentSearch`, `prebuilt-audioSearch`, `prebuilt-videoSearch` require **GPT-4.1-mini** and **text-embedding-3-large** | ||
| - Other prebuilt analyzers like `prebuilt-invoice` and `prebuilt-receipt` require **GPT-4.1** and **text-embedding-3-large** | ||
|
|
||
| 1. **Deploy GPT-4.1:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **gpt-4.1** | ||
| - Complete the deployment with your preferred settings | ||
| - Note the deployment name (by convention, use `gpt-4.1`) | ||
|
|
||
| 2. **Deploy GPT-4.1-mini:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **gpt-4.1-mini** | ||
| - Complete the deployment with your preferred settings | ||
| 2. **Deploy GPT-4.1-mini:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **gpt-4.1-mini** | ||
| - Complete the deployment with your preferred settings | ||
| - Note the deployment name (by convention, use `gpt-4.1-mini`) | ||
|
|
||
| 3. **Deploy text-embedding-3-large:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **text-embedding-3-large** | ||
| - Complete the deployment with your preferred settings | ||
| 3. **Deploy text-embedding-3-large:** | ||
| - In Azure AI Foundry, go to **Deployments** > **Deploy model** > **Deploy base model** | ||
| - Search for and select **text-embedding-3-large** | ||
| - Complete the deployment with your preferred settings | ||
| - Note the deployment name (by convention, use `text-embedding-3-large`) | ||
|
|
||
| For more information on deploying models, see [Deploy models in Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-openai). | ||
| For more information on deploying models, please see [Deploy models in Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-openai). | ||
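The analyzer-to-model requirements in Step 2 can be written down as a small lookup, which makes it easy to see which deployments are still missing before calling a prebuilt analyzer. This is an illustrative sketch only: the names `MINI_ANALYZERS`, `required_models`, and `missing_models` are hypothetical, the code assumes deployments are named after their models (the convention suggested above), and it treats every analyzer outside the three search analyzers as needing GPT-4.1, which follows the README's wording but is not verified against the service.

```python
# Analyzers that, per Step 2, need GPT-4.1-mini rather than GPT-4.1.
MINI_ANALYZERS = {
    "prebuilt-documentSearch",
    "prebuilt-audioSearch",
    "prebuilt-videoSearch",
}


def required_models(analyzer_id: str) -> list[str]:
    """Return the model names Step 2 lists for the given prebuilt analyzer."""
    chat_model = "gpt-4.1-mini" if analyzer_id in MINI_ANALYZERS else "gpt-4.1"
    # Every prebuilt analyzer in the list also needs the embedding model.
    return [chat_model, "text-embedding-3-large"]


def missing_models(analyzer_id: str, deployed: set[str]) -> list[str]:
    """Return the required models that are not in the set of deployed names."""
    return [m for m in required_models(analyzer_id) if m not in deployed]
```

For example, `missing_models("prebuilt-invoice", {"gpt-4.1-mini", "text-embedding-3-large"})` reports that `gpt-4.1` still has to be deployed.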
|
|
||
|
|
||
| ### Step 3: Configure Environment Variables | ||
|
|
||
|
|
@@ -165,15 +168,14 @@ Choose one of the following options to configure your environment: | |
|
|
||
| #### Set Environment Variables with API Key (Alternative) | ||
|
|
||
|
|
||
| 1. Copy the sample environment file: | ||
|
|
||
| ```bash | ||
| cp notebooks/.env.sample notebooks/.env | ||
| ``` | ||
|
|
||
| 2. Edit `notebooks/.env` and set your credentials. | ||
| - Replace `<your-resource-name>` and `<your-azure-ai-api-key>` with your actual values. These can be found in your AI Services resource under **Resource Management** > **Keys and Endpoint**. | ||
| 2. Edit `notebooks/.env` and set your credentials. | ||
| - Replace `<your-resource-name>` and `<your-azure-ai-api-key>` with your actual values. These can be found in your AI Services resource under **Resource Management** > **Keys and Endpoint**. | ||
| - If you used different deployment names in Step 2, update the deployment variables accordingly: | ||
|
|
||
|
|
||
| ```env | ||
|
|
@@ -184,7 +186,7 @@ Choose one of the following options to configure your environment: | |
| TEXT_EMBEDDING_3_LARGE_DEPLOYMENT=text-embedding-3-large | ||
| ``` | ||
|
|
||
| > ⚠️ **Note:** If you skip the token authentication step above, you must set `AZURE_AI_API_KEY` in your `.env` file. Get your API key from Azure Portal by navigating to your Foundry resource > **Resource Management** > **Keys and Endpoint**. | ||
| > ⚠️ **Note:** If you skip the token authentication step above, you must set `AZURE_AI_API_KEY` in your `.env` file. You can obtain your API key from Azure Portal by navigating to your Foundry resource > **Resource Management** > **Keys and Endpoint**. | ||
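Before opening a notebook, it can help to confirm that `notebooks/.env` actually contains a real key rather than the placeholder. The sketch below is an assumption-laden illustration, not code from the repository: `parse_env` and `api_key_is_set` are hypothetical helpers, the parser handles only simple `KEY=value` lines (it is not a full dotenv implementation), and the only variable names it relies on are `AZURE_AI_API_KEY` and `TEXT_EMBEDDING_3_LARGE_DEPLOYMENT`, which appear in the text above.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def api_key_is_set(values: dict[str, str]) -> bool:
    """Return True if AZURE_AI_API_KEY is present and not a <placeholder>."""
    key = values.get("AZURE_AI_API_KEY", "")
    return bool(key) and not key.startswith("<")
```

A typical call would read the file first, for example `api_key_is_set(parse_env(open("notebooks/.env").read()))`; the path is the one used in the copy step above.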
|
|
||
|
|
||
| ## Open a Jupyter Notebook and Follow Step-by-Step Guidance | ||
|
|
||
|
|
@@ -217,11 +219,11 @@ Azure AI Content Understanding is a new Generative AI-based [Azure AI service](h | |
|
|
||
| > **Note:** The following samples are currently targeting Preview.2 (API version `2025-05-01-preview`) and will be updated to the GA API version (`2025-11-01`) soon. | ||
|
|
||
| - [Azure Search with Content Understanding](https://github.com/Azure-Samples/azure-ai-search-with-content-understanding-python) | ||
| - [Azure Search with Content Understanding](https://github.com/Azure-Samples/azure-ai-search-with-content-understanding-python) | ||
| - [Azure Content Understanding with OpenAI](https://github.com/Azure-Samples/azure-ai-content-understanding-with-azure-openai-python) | ||
|
|
||
|
|
||
| ## Notes | ||
|
|
||
| * **Trademarks** - This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow [Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general). Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third-party’s policies. | ||
|
|
||
| * **Data Collection** - The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft’s privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices. | ||
| * **Data Collection** - The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft’s privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices. | ||
|
||
categories: [Formatting]
categories: [Grammar, Clarity]
categories: [Grammar, Clarity]