README.md (+44 −1)
@@ -23,6 +23,7 @@
 * [Set Up OpenAI Embeddings Language Processing](#set-up-classification-via-openai-embeddings)
 * [Set Up OpenAI Whisper Language Processing](#set-up-audio-transcripts-generation-via-openai-whisper)
 * [Set Up Azure AI Language Processing](#set-up-text-to-speech-via-microsoft-azure)
+* [Set Up AWS Language Processing](#set-up-text-to-speech-via-amazon-polly)
 * [Set Up Azure AI Vision Image Processing](#set-up-image-processing-features-via-microsoft-azure)
 * [Set Up OpenAI DALL·E Image Processing](#set-up-image-generation-via-openai)
 * [Set Up OpenAI Moderation Language Processing](#set-up-comment-moderation-via-openai-moderation)
@@ -45,7 +46,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * Generate new images on demand to use in-content or as a featured image using [OpenAI's DALL·E 3 API](https://platform.openai.com/docs/guides/images)
 * Generate transcripts of audio files using [OpenAI's Whisper API](https://platform.openai.com/docs/guides/speech-to-text)
 * Moderate incoming comments for sensitive content using [OpenAI's Moderation API](https://platform.openai.com/docs/guides/moderation)
-* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech)
+* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech) or [Amazon Polly](https://aws.amazon.com/polly/)
 * Classify post content using [IBM Watson's Natural Language Understanding API](https://www.ibm.com/watson/services/natural-language-understanding/) and [OpenAI's Embedding API](https://platform.openai.com/docs/guides/embeddings)
 * BETA: Recommend content based on overall site traffic via [Microsoft Azure's AI Personalizer API](https://azure.microsoft.com/en-us/services/cognitive-services/personalizer/) *(note that this service has been [deprecated by Microsoft](https://learn.microsoft.com/en-us/azure/ai-services/personalizer/) and as such, will no longer work. We are looking to replace this with a new provider to maintain the same functionality (see [issue#392](https://github.com/10up/classifai/issues/392)))*
 * Generate image alt text, image tags, and smartly crop images using [Microsoft Azure's AI Vision API](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/)
@@ -77,6 +78,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * To utilize the Azure AI Vision Image Processing functionality or Text to Speech Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account.
 * To utilize the Azure OpenAI Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account and you will need to [apply](https://aka.ms/oai/access) for OpenAI access.
 * To utilize the Google Gemini Language Processing functionality, you will need an active [Google Gemini](https://ai.google.dev/tutorials/setup) account.
+* To utilize the AWS Language Processing functionality, you will need an active [AWS](https://console.aws.amazon.com/) account.

 ## Pricing
@@ -399,6 +401,47 @@ Note that [OpenAI](https://platform.openai.com/docs/guides/speech-to-text) can c
 * Click the button to preview the generated speech audio for the post.
 * View the post on the front-end and see a read-to-me feature has been added

+## Set Up Text to Speech (via Amazon Polly)
+
+### 1. Sign up for AWS (Amazon Web Services)
+
+* [Register for an AWS account](https://aws.amazon.com/free/) or sign into your existing one.
+* Sign in to the AWS Management Console and open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).
+* Create an IAM user (if you don't already have one):
+* In the navigation pane, choose **Users** and then click **Create user**.
+* On the **Specify user details** page, under **User details**, in **User name**, enter the name for the new user.
+* Click **Next**.
+* On the **Set permissions** page, under **Permissions options**, select **Attach policies directly**.
+* Under **Permissions policies**, search for **polly** and select the **AmazonPollyFullAccess** policy.
+* Click **Next**.
+* On the **Review and create** page, review all of the choices you made up to this point. When you are ready to proceed, click **Create user**.
+* In the navigation pane, choose **Users**.
+* Choose the name of the user for which you want to create access keys, and then choose the **Security credentials** tab.
+* In the **Access keys** section, click **Create access key**.
+* On the **Access key best practices & alternatives** page, select **Application running outside AWS**.
+* Click **Next**.
+* On the **Retrieve access key** page, choose **Show** to reveal the value of your user's secret access key.
+* Copy and save the credentials in a secure location on your computer, or click **Download .csv file** to save the access key ID and secret access key to a `.csv` file (a quick way to verify these keys is sketched below).
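Before entering these keys into ClassifAI, it can help to confirm they actually reach Polly. Below is a minimal sketch, assuming Python with the `boto3` SDK installed; the credential values and region are placeholders, and the script is not part of the plugin or of this commit.

```python
# Smoke test: can the new IAM access keys call Amazon Polly?
# Assumes `pip install boto3`; replace the placeholder credentials below.
import boto3

polly = boto3.client(
    "polly",
    region_name="us-east-1",                 # any region where Polly is available
    aws_access_key_id="YOUR_ACCESS_KEY_ID",  # from the "Retrieve access key" page
    aws_secret_access_key="YOUR_SECRET_ACCESS_KEY",
)

# Listing voices only needs read access, so it is a safe first call.
voices = polly.describe_voices(LanguageCode="en-US")
print([voice["Id"] for voice in voices["Voices"]])
```

If this prints a list of voice IDs, the keys and region are usable; an access-denied error usually means the **AmazonPollyFullAccess** policy was not attached to the user.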
+
+### 2. Configure AWS credentials under Tools > ClassifAI > Language Processing > Text to Speech
+
+* Select **Amazon Polly** in the provider dropdown.
+* In the `AWS access key` field, enter the `Access key` copied from above.
+* In the `AWS secret access key` field, enter your `Secret access key` copied from above.
+* In the `AWS Region` field, enter your AWS region value, e.g. `us-east-1`.
+* Click **Save Changes** (the page will reload).
+* If connected successfully, a new dropdown with the label "Voices" will be displayed.
+* Select a voice and voice engine of your choice.
+* Select a post type that should use this service.
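For context, the region, voice, and voice engine chosen here map onto parameters of Polly's speech synthesis API. The sketch below shows that kind of request directly with `boto3`; the voice ID, engine, and output format are illustrative choices, not values read from ClassifAI's code.

```python
# Illustrative Polly request using the same kinds of values as the settings above.
# Assumes AWS credentials are already configured (env vars or ~/.aws/credentials).
import boto3

polly = boto3.client("polly", region_name="us-east-1")  # matches the "AWS Region" setting

response = polly.synthesize_speech(
    Text="This is the text the read-to-me audio would be generated from.",
    VoiceId="Joanna",    # corresponds to an entry in the "Voices" dropdown
    Engine="neural",     # voice engine: "standard" or "neural"
    OutputFormat="mp3",
)

# The audio comes back as a stream; write it to a file to listen to it.
with open("preview.mp3", "wb") as audio_file:
    audio_file.write(response["AudioStream"].read())
```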
+
+### 3. Using the Text to Speech service
+
+* Assuming the post type selected is "post", create a new post and publish it.
+* After a few seconds, a "Preview" button will appear under the ClassifAI settings panel.
+* Click the button to preview the generated speech audio for the post.
+* View the post on the front-end and see a read-to-me feature has been added.
+
 ## Set Up Image Processing features (via Microsoft Azure)

 Note that [Azure AI Vision](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/home#image-requirements) can analyze and crop images that meet the following requirements:
readme.txt (+2 −1)
@@ -23,7 +23,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * Expand or condense text content using [OpenAI's ChatGPT API](https://platform.openai.com/docs/guides/chat), [Microsoft Azure's OpenAI service](https://azure.microsoft.com/en-us/products/ai-services/openai-service) or [Google's Gemini API](https://ai.google.dev/docs/gemini_api_overview)
 * Generate new images on demand to use in-content or as a featured image using [OpenAI's DALL·E 3 API](https://platform.openai.com/docs/guides/images)
 * Generate transcripts of audio files using [OpenAI's Whisper API](https://platform.openai.com/docs/guides/speech-to-text)
-* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech)
+* Convert text content into audio and output a "read-to-me" feature on the front-end to play this audio using [Microsoft Azure's Text to Speech API](https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/text-to-speech) or [Amazon Polly](https://aws.amazon.com/polly/)
 * Classify post content using [IBM Watson's Natural Language Understanding API](https://www.ibm.com/watson/services/natural-language-understanding/) and [OpenAI's Embedding API](https://platform.openai.com/docs/guides/embeddings)
 * BETA: Recommend content based on overall site traffic via [Microsoft Azure's AI Personalizer API](https://azure.microsoft.com/en-us/services/cognitive-services/personalizer/) _(note that this service has been deprecated by Microsoft and as such, will no longer work. We are looking to replace this with a new provider to maintain the same functionality)_
 * Generate image alt text, image tags, and smartly crop images using [Microsoft Azure's AI Vision API](https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/)
@@ -37,6 +37,7 @@ Tap into leading cloud-based services like [OpenAI](https://openai.com/), [Micro
 * To utilize the Azure AI Vision Image Processing functionality or Text to Speech Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account.
 * To utilize the Azure OpenAI Language Processing functionality, you will need an active [Microsoft Azure](https://signup.azure.com/signup) account and you will need to [apply](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUNTZBNzRKNlVQSFhZMU9aV09EVzYxWFdORCQlQCN0PWcu) for OpenAI access.
 * To utilize the Google Gemini Language Processing functionality, you will need an active [Google Gemini](https://ai.google.dev/tutorials/setup) account.
+* To utilize the AWS Language Processing functionality, you will need an active [AWS](https://console.aws.amazon.com/) account.