Commit 3656a3e

Merge pull request #21 from 0xrushi/feat/docupdate
feat: add eleven labs doc
2 parents e0a8a30 + ee84fc2 commit 3656a3e

File tree

2 files changed: +132 -0 lines changed
  • docs/user-guide/backend
  • i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend

docs/user-guide/backend/tts.md

Lines changed: 66 additions & 0 deletions
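@@ -296,3 +296,69 @@ MiniMax provides an online TTS service; models such as `speech-02-turbo` have powerful TT
pronunciation_dict: ''
```
`voice_id` selects the voice timbre; for the list of supported voices, see [the section on querying available voice IDs in the official documentation](https://platform.minimaxi.com/document/get_voice). `pronunciation_dict` defines custom pronunciation rules: for example, to pronounce `牛肉` as `neuro`, define the rule in the format shown in the example.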
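
## ElevenLabs TTS (Online, API Key Required)
> Available since version `v1.2.1`

ElevenLabs provides high-quality, natural-sounding text-to-speech, with support for multiple languages and voice cloning.

### Features
- **High-quality audio**: industry-leading speech synthesis quality
- **Multi-language support**: supports English, Chinese, Japanese, Korean, and other languages
- **Voice cloning**: upload audio samples to clone a voice
- **Rich voice library**: a range of preset voices and community voices
- **Real-time generation**: low-latency speech synthesis
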
### Configuration Steps
1. **Register and get an API key**
   - Visit [ElevenLabs](https://elevenlabs.io/) and register an account
   - Get your API key from the ElevenLabs dashboard

2. **Choose a voice**
   - Browse the available voices in the ElevenLabs dashboard
   - Copy the Voice ID of the voice you prefer
   - You can also upload audio samples for voice cloning

3. **Configure `conf.yaml`**
   In the `elevenlabs_tts` section of the configuration file, fill in the parameters in the following format:

```yaml
elevenlabs_tts:
  api_key: 'your_elevenlabs_api_key'    # Required: your ElevenLabs API key
  voice_id: 'JBFqnCBsd6RMkjVDRZzb'      # Required: ElevenLabs voice ID
  model_id: 'eleven_multilingual_v2'    # Model ID (default: eleven_multilingual_v2)
  output_format: 'mp3_44100_128'        # Output audio format (default: mp3_44100_128)
  stability: 0.5                        # Voice stability (0.0 to 1.0, default: 0.5)
  similarity_boost: 0.5                 # Voice similarity boost (0.0 to 1.0, default: 0.5)
  style: 0.0                            # Voice style exaggeration (0.0 to 1.0, default: 0.0)
  use_speaker_boost: true               # Enable speaker boost for better quality (default: true)
```
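
To illustrate what these settings control, here is a minimal Python sketch of how such a mapping could be turned into a request against the public ElevenLabs text-to-speech REST endpoint. This diff does not show how the project itself calls ElevenLabs (it may use the official SDK), so the helper name `build_tts_request` and the fallback defaults are assumptions made for illustration only. The request is built but not sent.

```python
import json
import urllib.parse
import urllib.request

# Assumed public REST endpoint; the project may use the official SDK instead.
API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(cfg: dict, text: str) -> urllib.request.Request:
    """Build (but do not send) a TTS request from an `elevenlabs_tts` mapping.

    Hypothetical helper: fallback values mirror the defaults documented above.
    """
    query = urllib.parse.urlencode(
        {"output_format": cfg.get("output_format", "mp3_44100_128")}
    )
    url = f"{API_BASE}/{urllib.parse.quote(cfg['voice_id'])}?{query}"
    body = {
        "text": text,
        "model_id": cfg.get("model_id", "eleven_multilingual_v2"),
        "voice_settings": {
            "stability": cfg.get("stability", 0.5),
            "similarity_boost": cfg.get("similarity_boost", 0.5),
            "style": cfg.get("style", 0.0),
            "use_speaker_boost": cfg.get("use_speaker_boost", True),
        },
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"xi-api-key": cfg["api_key"], "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call and, on success, yield the audio bytes in the requested `output_format`; doing so consumes quota on the account that owns the API key.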

### Parameter Descriptions
- **api_key** (required): your ElevenLabs API key
- **voice_id** (required): the unique identifier of the voice, found in the ElevenLabs dashboard
- **model_id**: the TTS model to use. Available options:
  - `eleven_multilingual_v2` (default) - supports multiple languages
  - `eleven_monolingual_v1` - English only
  - `eleven_turbo_v2` - faster generation
- **output_format**: the audio output format. Common options:
  - `mp3_44100_128` (default) - MP3, 44.1kHz, 128kbps
  - `mp3_44100_192` - MP3, 44.1kHz, 192kbps
  - `pcm_16000` - PCM, 16kHz
  - `pcm_22050` - PCM, 22.05kHz
  - `pcm_24000` - PCM, 24kHz
  - `pcm_44100` - PCM, 44.1kHz
- **stability**: controls voice consistency (0.0 = more variation, 1.0 = more consistent)
- **similarity_boost**: increases similarity to the original voice (0.0 to 1.0)
- **style**: controls style exaggeration (0.0 = neutral, 1.0 = more expressive)
- **use_speaker_boost**: enables speaker boost to improve audio quality
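
The `output_format` values listed above follow a `codec_samplerate[_bitrate]` naming pattern. As a rough sketch, the string can be split into its parts like this; `AudioFormat` and `parse_output_format` are hypothetical names, not part of the project or of the ElevenLabs SDK, and the parser only covers values shaped like the ones listed here.

```python
from typing import NamedTuple, Optional


class AudioFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the value has no bitrate part (the PCM options)


def parse_output_format(value: str) -> AudioFormat:
    """Split a value such as 'mp3_44100_128' or 'pcm_16000' into its components."""
    parts = value.split("_")
    well_formed = (
        len(parts) in (2, 3)
        and bool(parts[0])
        and all(p.isdigit() for p in parts[1:])
    )
    if not well_formed:
        raise ValueError(f"unrecognized output_format: {value!r}")
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(parts[0], int(parts[1]), bitrate)
```

For example, `parse_output_format("mp3_44100_192")` yields the codec `mp3`, a 44100 Hz sample rate, and a 192 kbps bitrate, while a value without a sample rate raises `ValueError`.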
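
### Usage Tips
- **Voice selection**: try the preset voices first, then consider voice cloning for a custom voice
- **Parameter tuning**: adjust `stability` and `similarity_boost` for the best results
- **Cost management**: ElevenLabs charges by usage, so test before heavy use
- **Network requirements**: a stable network connection is required for the service to be available

:::tip
ElevenLabs offers free trial credits, so you can test the quality before purchasing a paid plan.
:::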

i18n/en/docusaurus-plugin-content-docs/current/user-guide/backend/tts.md

Lines changed: 66 additions & 0 deletions
@@ -304,3 +304,69 @@ minimax_tts:
pronunciation_dict: ''
```
The `voice_id` parameter selects the voice. See the [voice ID query section in the official documentation](https://platform.minimaxi.com/document/get_voice) for the complete list of supported voices. The `pronunciation_dict` parameter supports custom pronunciation rules: for example, you can define a rule to pronounce "牛肉" as "neuro" using the format shown in the example.

## ElevenLabs TTS (Online, API Key Required)
> Available since version `v1.2.1`

ElevenLabs provides high-quality, natural-sounding text-to-speech with support for multiple languages and voice cloning capabilities.

### Features
- **High-Quality Audio**: Industry-leading speech synthesis quality
- **Multi-language Support**: Supports English, Chinese, Japanese, Korean, and many other languages
- **Voice Cloning**: Upload audio samples to clone voices
- **Rich Voice Library**: Multiple preset voices and community voices available
- **Real-time Generation**: Low-latency speech synthesis

### Configuration Steps
1. **Register and Get API Key**
   - Visit [ElevenLabs](https://elevenlabs.io/) to register an account
   - Get your API key from the ElevenLabs dashboard

2. **Choose a Voice**
   - Browse available voices in the ElevenLabs dashboard
   - Copy the Voice ID of your preferred voice
   - You can also upload audio samples for voice cloning

3. **Configure `conf.yaml`**
   In the `elevenlabs_tts` section of your configuration file, enter parameters as follows:

```yaml
elevenlabs_tts:
  api_key: 'your_elevenlabs_api_key'    # Required: Your ElevenLabs API key
  voice_id: 'JBFqnCBsd6RMkjVDRZzb'      # Required: ElevenLabs Voice ID
  model_id: 'eleven_multilingual_v2'    # Model ID (default: eleven_multilingual_v2)
  output_format: 'mp3_44100_128'        # Output audio format (default: mp3_44100_128)
  stability: 0.5                        # Voice stability (0.0 to 1.0, default: 0.5)
  similarity_boost: 0.5                 # Voice similarity boost (0.0 to 1.0, default: 0.5)
  style: 0.0                            # Voice style exaggeration (0.0 to 1.0, default: 0.0)
  use_speaker_boost: true               # Enable speaker boost for better quality (default: true)
```
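
Before starting the server, it can help to sanity-check the values entered in this section. The following is a minimal sketch of such a check, assuming the section has already been loaded into a plain Python dict; `validate_elevenlabs_config` is a hypothetical helper, not a function shipped by the project, and it only enforces the required keys and the 0.0 to 1.0 ranges stated in the comments above.

```python
from typing import Any, Dict, List


def validate_elevenlabs_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the mapping looks valid."""
    problems: List[str] = []

    # api_key and voice_id are the two required settings.
    for key in ("api_key", "voice_id"):
        if not cfg.get(key):
            problems.append(f"{key} is required")

    # The three tuning knobs are optional but must lie in [0.0, 1.0] when set.
    for key in ("stability", "similarity_boost", "style"):
        value = cfg.get(key)
        if value is None:
            continue  # not set: the documented default applies
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(f"{key} must be a number between 0.0 and 1.0, got {value!r}")

    boost = cfg.get("use_speaker_boost")
    if boost is not None and not isinstance(boost, bool):
        problems.append(f"use_speaker_boost must be true or false, got {boost!r}")

    return problems
```

Returning a list of messages rather than raising on the first error lets a caller report every misconfigured field at once.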

### Parameter Descriptions
- **api_key** (required): Your ElevenLabs API key
- **voice_id** (required): Unique identifier for the voice, found in your ElevenLabs dashboard
- **model_id**: TTS model to use. Available options:
  - `eleven_multilingual_v2` (default) - Supports multiple languages
  - `eleven_monolingual_v1` - English only
  - `eleven_turbo_v2` - Faster generation
- **output_format**: Audio output format. Common options:
  - `mp3_44100_128` (default) - MP3, 44.1kHz, 128kbps
  - `mp3_44100_192` - MP3, 44.1kHz, 192kbps
  - `pcm_16000` - PCM, 16kHz
  - `pcm_22050` - PCM, 22.05kHz
  - `pcm_24000` - PCM, 24kHz
  - `pcm_44100` - PCM, 44.1kHz
- **stability**: Controls voice consistency (0.0 = more variable, 1.0 = more consistent)
- **similarity_boost**: Enhances similarity to the original voice (0.0 to 1.0)
- **style**: Controls style exaggeration (0.0 = neutral, 1.0 = more expressive)
- **use_speaker_boost**: Enables speaker boost for improved audio quality
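
The `pcm_*` output formats are raw sample data without a file header, so most players cannot open them directly. As a sketch, the bytes can be wrapped in a WAV container with Python's standard `wave` module. The mono, 16-bit defaults below are assumptions about the PCM layout and should be checked against the ElevenLabs documentation; `pcm_to_wav` is a hypothetical helper name.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate_hz: int,
               channels: int = 1, sample_width_bytes: int = 2) -> bytes:
    """Wrap headerless PCM samples in a WAV container and return the file bytes.

    Assumes mono 16-bit samples by default; adjust if the service differs.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

For `pcm_16000` output the call would be `pcm_to_wav(audio_bytes, 16000)`, with the sample rate taken from the number in the format name.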

### Usage Tips
- **Voice Selection**: Try preset voices first, then consider voice cloning for custom voices
- **Parameter Tuning**: Adjust `stability` and `similarity_boost` for optimal results
- **Cost Management**: ElevenLabs charges based on usage, so test before heavy use
- **Network Requirements**: A stable internet connection is required for the service to be available

:::tip
ElevenLabs offers free trial credits, so you can test the quality before purchasing a paid plan.
:::
