Commit 169d911

Readme updates.
1 parent d18bb7e commit 169d911

3 files changed

+48
-4
lines changed

README.md

+44-1
@@ -23,6 +23,7 @@
 * [Set Up OpenAI Embeddings Language Processing](#set-up-classification-via-openai-embeddings)
 * [Set Up OpenAI Whisper Language Processing](#set-up-audio-transcripts-generation-via-openai-whisper)
 * [Set Up Azure AI Language Processing](#set-up-text-to-speech-via-microsoft-azure)
+* [Set Up AWS Language Processing](#set-up-text-to-speech-via-amazon-polly)
 * [Set Up Azure AI Vision Image Processing](#set-up-image-processing-features-via-microsoft-azure)
 * [Set Up OpenAI DALL·E Image Processing](#set-up-image-generation-via-openai)
 * [Set Up OpenAI Moderation Language Processing](#set-up-comment-moderation-via-openai-moderation)
@@ -45,7 +46,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * Generate new images on demand to use in-content or as a featured image using [OpenAI's DALL·E 3 API](https://platform.openai.com/docs/guides/images)
 * Generate transcripts of audio files using [OpenAI's Whisper API](https://platform.openai.com/docs/guides/speech-to-text)
 * Moderate incoming comments for sensitive content using [OpenAI's Moderation API](https://platform.openai.com/docs/guides/moderation)
-* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech)
+* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech) or [Amazon Polly](https://aws.amazon.com/polly/)
 * Classify post content using [IBM Watson's Natural Language Understanding API](https://www.ibm.com/watson/services/natural-language-understanding/) and [OpenAI's Embedding API](https://platform.openai.com/docs/guides/embeddings)
 * BETA: Recommend content based on overall site traffic via [Microsoft Azure's AI Personalizer API](https://azure.microsoft.com/en-us/services/cognitive-services/personalizer/) *(note that this service has been [deprecated by Microsoft](https://learn.microsoft.com/en-us/azure/ai-services/personalizer/) and as such, will no longer work. We are looking to replace this with a new provider to maintain the same functionality (see [issue#392](https://github.com/10up/classifai/issues/392))*
 * Generate image alt text, image tags, and smartly crop images using [Microsoft Azure's AI Vision API](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/)
@@ -77,6 +78,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * To utilize the Azure AI Vision Image Processing functionality or Text to Speech Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account.
 * To utilize the Azure OpenAI Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account and you will need to [apply](https://aka.ms/oai/access) for OpenAI access.
 * To utilize the Google Gemini Language Processing functionality, you will need an active [Google Gemini](https://ai.google.dev/tutorials/setup) account.
+* To utilize the AWS Language Processing functionality, you will need an active [AWS](https://console.aws.amazon.com/) account.
 
 ## Pricing
 
@@ -399,6 +401,47 @@ Note that [OpenAI](https://platform.openai.com/docs/guides/speech-to-text) can c
 * Click the button to preview the generated speech audio for the post.
 * View the post on the front-end and see a read-to-me feature has been added
 
+## Set Up Text to Speech (via Amazon Polly)
+
+### 1. Sign up for AWS (Amazon Web Services)
+
+* [Register for a AWS account](https://aws.amazon.com/free/) or sign into your existing one.
+* Sign in to the AWS Management Console and open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/)
+* Create IAM User (If you don't have any IAM user)
+* In the navigation pane, choose **Users** and then click **Create user**
+* On the **Specify user details** page, under User details, in User name, enter the name for the new user.
+* Click **Next**
+* On the **Set permissions** page, under Permissions options, select **Attach policies directly**
+* Under **Permissions policies**, search for the policy **polly** and select **AmazonPollyFullAccess** Policy
+* Click **Next**
+* On the **Review and create** page, Review all of the choices you made up to this point. When you are ready to proceed, Click **Create user**.
+* In the navigation pane, choose **Users**
+* Choose the name of the user for which you want to create access keys, and then choose the **Security credentials** tab.
+* In the **Access keys** section, click **Create access key**.
+* On the **Access key best practices & alternatives** page, select **Application running outside AWS**
+* Click **Next**
+* On the **Retrieve access key** page, choose **Show** to reveal the value of your user's secret access key.
+* Copy and save the credentials in a secure location on your computer or click "Download .csv file" to save the access key ID and secret access key to a `.csv` file.
+
+### 2. Configure AWS credentials under Tools > ClassifAI > Language Processing > Text to Speech
+
+* Select **Amazon Polly** in the provider dropdown.
+* In the `AWS access key` field, enter the `Access key
+` copied from above.
+* In the `AWS secret access key` field, enter your `Secret access key` copied from above.
+* In the `AWS Region` field, enter your AWS region value eg: `us-east-1`
+* Click **Save Changes** (the page will reload).
+* If connected successfully, a new dropdown with the label "Voices" will be displayed.
+* Select a voice and voice engine as per your choice.
+* Select a post type that should use this service.
+
+### 3. Using the Text to Speech service
+
+* Assuming the post type selected is "post", create a new post and publish it.
+* After a few seconds, a "Preview" button will appear under the ClassifAI settings panel.
+* Click the button to preview the generated speech audio for the post.
+* View the post on the front-end and see a read-to-me feature has been added
+
 ## Set Up Image Processing features (via Microsoft Azure)
 
 Note that [Azure AI Vision](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#image-requirements) can analyze and crop images that meet the following requirements:
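
Review note on the new README section: step 1 attaches the managed **AmazonPollyFullAccess** policy. A narrower policy may be enough, assuming the feature only lists voices and synthesizes speech, which is inferred from the "Voices" dropdown and the generated audio described in the steps and not confirmed against the plugin code. The sketch below builds such a policy document; `build_polly_policy` is a hypothetical helper name, while `polly:DescribeVoices` and `polly:SynthesizeSpeech` are existing IAM actions.

```python
import json


def build_polly_policy():
    """Return a narrow IAM policy document for Amazon Polly as a dict.

    Assumption: only voice listing and speech synthesis are needed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "polly:DescribeVoices",    # populate the voice dropdown
                    "polly:SynthesizeSpeech",  # generate the audio
                ],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print JSON that could be pasted into the IAM policy editor.
    print(json.dumps(build_polly_policy(), indent=2))
```

If the plugin turns out to call other Polly operations, the `Action` list would need to grow accordingly.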

includes/Classifai/Providers/AWS/AmazonPolly.php

+2-2
@@ -45,7 +45,7 @@ public function render_provider_fields() {
 
 add_settings_field(
     'access_key_id',
-    esc_html__( 'AWS access key ID', 'classifai' ),
+    esc_html__( 'AWS access key', 'classifai' ),
     [ $this->feature_instance, 'render_input' ],
     $this->feature_instance->get_option_name(),
     $this->feature_instance->get_option_name() . '_section',
@@ -58,7 +58,7 @@ public function render_provider_fields() {
 'description' => sprintf(
     wp_kses(
         /* translators: %1$s is replaced with the OpenAI sign up URL */
-        __( 'Enter the AWS access key ID. Please follow the steps given <a title="AWS documentation" href="%1$s">here</a> to generate AWS credentials.', 'classifai' ),
+        __( 'Enter the AWS access key. Please follow the steps given <a title="AWS documentation" href="%1$s">here</a> to generate AWS credentials.', 'classifai' ),
         [
             'a' => [
                 'href' => [],
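
Review note: the field relabelled in this file collects the AWS access key, and the README steps in this commit add a secret access key and a region such as `us-east-1`. The following is a minimal, language-neutral sketch (written in Python, not the plugin's PHP) of a sanity check for those three values before saving. `validate_polly_settings` and `REGION_PATTERN` are hypothetical names, and the region pattern is an assumption based on common region codes, not an official AWS rule.

```python
import re

# Assumed shape of a region code: two letters, one or more lowercase
# words, then a number, e.g. "us-east-1" or "ap-southeast-2".
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


def validate_polly_settings(access_key, secret_key, region):
    """Return a list of human-readable problems; empty means the values look plausible."""
    errors = []
    if not access_key.strip():
        errors.append("AWS access key is required.")
    if not secret_key.strip():
        errors.append("AWS secret access key is required.")
    if not REGION_PATTERN.match(region.strip()):
        errors.append(
            f"AWS Region {region!r} does not look like a region code such as us-east-1."
        )
    return errors


if __name__ == "__main__":
    # Placeholder values only; never hard-code real credentials.
    print(validate_polly_settings("EXAMPLE-KEY", "EXAMPLE-SECRET", "us-east-1"))
```

A check like this only catches empty or malformed input; whether the credentials actually work is still decided by the request to AWS.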

readme.txt

+2-1
@@ -23,7 +23,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * Expand or condense text content using [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/chat), [Microsoft Azure's OpenAI service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) or [Google's Gemini API](https://ai.google.dev/docs/gemini_api_overview)
 * Generate new images on demand to use in-content or as a featured image using [OpenAI's DALL·E 3 API](https://platform.openai.com/docs/guides/images)
 * Generate transcripts of audio files using [OpenAI's Whisper API](https://platform.openai.com/docs/guides/speech-to-text)
-* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech)
+* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech) or [Amazon Polly](https://aws.amazon.com/polly/)
 * Classify post content using [IBM Watson's Natural Language Understanding API](https://www.ibm.com/watson/services/natural-language-understanding/) and [OpenAI's Embedding API](https://platform.openai.com/docs/guides/embeddings)
 * BETA: Recommend content based on overall site traffic via [Microsoft Azure's AI Personalizer API](https://azure.microsoft.com/en-us/services/cognitive-services/personalizer/) _(note that this service has been deprecated by Microsoft and as such, will no longer work. We are looking to replace this with a new provider to maintain the same functionality)_
 * Generate image alt text, image tags, and smartly crop images using [Microsoft Azure's AI Vision API](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/)
@@ -37,6 +37,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * To utilize the Azure AI Vision Image Processing functionality or Text to Speech Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account.
 * To utilize the Azure OpenAI Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account and you will need to [apply](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUNTZBNzRKNlVQSFhZMU9aV09EVzYxWFdORCQlQCN0PWcu) for OpenAI access.
 * To utilize the Google Gemini Language Processing functionality, you will need an active [Google Gemini](https://ai.google.dev/tutorials/setup) account.
+* To utilize the AWS Language Processing functionality, you will need an active [AWS](https://console.aws.amazon.com/) account.
 
 == Upgrade Notice ==
 