TTSFM is a free, OpenAI-compatible text-to-speech stack powered by the openai.fm backend. It ships with Python clients, a REST API, and a web playground.
pip install ttsfm # core client
pip install ttsfm[web] # client + Flask web app

TTSFM offers two Docker image variants to suit different needs:
docker run -p 8000:8000 dbcccc/ttsfm:latest

Includes ffmpeg for advanced features:
- ✅ MP3 auto-combine for long text
- ✅ Speed adjustment (0.25x - 4.0x)
- ✅ Additional audio formats (AAC, FLAC, OPUS)

docker run -p 8000:8000 dbcccc/ttsfm:v3.4.0-alpha1-slim

Minimal image without ffmpeg:
- ✅ Basic TTS (MP3/WAV)
- ✅ WAV auto-combine (simple concatenation)
- ❌ No MP3 auto-combine
- ❌ No speed adjustment
- ❌ No format conversion
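
The two feature lists above can be condensed into a small lookup, for example to decide which image to deploy. This is only a sketch: `VARIANT_FEATURES` and `pick_variant` are hypothetical names that are not part of TTSFM, the feature labels are shorthand for the bullets above, and it assumes the full image also covers everything the slim image does.

```python
# Hypothetical helper, not part of TTSFM: encodes the feature lists above.
BASIC = {"basic_tts", "wav_auto_combine"}
FFMPEG_ONLY = {"mp3_auto_combine", "speed_adjustment", "format_conversion"}

VARIANT_FEATURES = {
    "full": BASIC | FFMPEG_ONLY,  # image that bundles ffmpeg (assumed superset)
    "slim": BASIC,                # image without ffmpeg
}


def pick_variant(required):
    """Return 'slim' if it covers every required feature, otherwise 'full'."""
    required = set(required)
    unknown = required - VARIANT_FEATURES["full"]
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return "slim" if required <= VARIANT_FEATURES["slim"] else "full"
```

With this sketch, `pick_variant(["basic_tts"])` selects the slim image, while any request that includes `"speed_adjustment"` falls back to the full image.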
The container exposes the web playground at http://localhost:8000 and an OpenAI-style endpoint at /v1/audio/speech.
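
For callers who prefer not to install the client, the endpoint can also be reached with the Python standard library. The sketch below reuses the JSON body from this README's curl example; the helper names `build_speech_request` and `synthesize` are hypothetical, and it assumes the container is listening on localhost:8000 and returns raw audio bytes in the response body.

```python
import json
import urllib.request


def build_speech_request(text, voice="alloy", base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for /v1/audio/speech."""
    payload = {"model": "gpt-4o-mini-tts", "input": text, "voice": voice}
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned bytes to out_path."""
    # Assumption: the response body is the audio file itself.
    with urllib.request.urlopen(build_speech_request(text, **kwargs)) as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

Calling `synthesize("Hello world!")` with the container running should write the response to `speech.mp3`. Keeping request construction separate from sending makes the payload easy to inspect without network access.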

from ttsfm import TTSClient, AudioFormat, Voice

client = TTSClient()

# Basic usage
response = client.generate_speech(
    text="Hello from TTSFM!",
    voice=Voice.ALLOY,
    response_format=AudioFormat.MP3,
)
response.save_to_file("hello")  # -> hello.mp3

# With speed adjustment (requires ffmpeg)
response = client.generate_speech(
    text="This will be faster!",
    voice=Voice.NOVA,
    response_format=AudioFormat.MP3,
    speed=1.5,  # 1.5x speed (0.25 - 4.0)
)
response.save_to_file("fast")  # -> fast.mp3

ttsfm "Hello, world" --voice nova --format mp3 --output hello.mp3

curl -X POST http://localhost:8000/v1/audio/speech -H "Content-Type: application/json" -d '{"model":"gpt-4o-mini-tts","input":"Hello world!","voice":"alloy"}' --output speech.mp3

- Browse the full API reference and operational notes in the web documentation (or see ttsfm-web/templates/docs.html).
- Read the architecture overview for component diagrams.
- Contributions are welcome; see CONTRIBUTING.md for guidelines.
TTSFM is released under the MIT License.