A FastAPI service that answers incoming Twilio voice calls with an OpenAI Realtime model. The HTTP endpoint returns TwiML that directs Twilio Media Streams into a WebSocket, which relays audio bidirectionally between the caller and OpenAI. The bridge uses the current Realtime API format with server-side VAD and PCM μ-law audio for high quality and smooth interruption handling.
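The TwiML handoff described above can be sketched as a plain string builder. This is an illustrative sketch, not the actual implementation in `app/api.py`; only the `/media-stream` path comes from this README:

```python
# Minimal sketch of the TwiML that hands a call off to a Media Stream.
# The /media-stream path matches this README; the helper itself is illustrative.
def media_stream_twiml(host: str) -> str:
    """Return TwiML telling Twilio to open a WebSocket to the bridge."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response>"
        "<Connect>"
        f'<Stream url="wss://{host}/media-stream" />'
        "</Connect>"
        "</Response>"
    )

print(media_stream_twiml("abcd1234.ngrok.io"))
```

Twilio parses this response and immediately upgrades the call into a bidirectional media WebSocket against the given `wss://` URL.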
- Python 3.13 (managed via uv)
- Twilio account with a programmable voice number
- OpenAI API key with Realtime access
- ngrok (or alternative tunnel) for exposing the local service
- Install dependencies: `uv sync`
- Copy the sample environment file and populate the variables: `cp .env.example .env`. The service automatically loads `.env` at startup.
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | OpenAI credential used to open the realtime WebSocket |
| `OPENAI_REALTIME_MODEL` (optional) | Override model (default `gpt-realtime`) |
| `OPENAI_TEMPERATURE` (optional) | Model temperature for response randomness (default `0.8`) |
| `OPENAI_RESPONSE_VOICE` (optional) | Voice name returned by OpenAI (default `alloy`) |
| `OPENAI_SYSTEM_PROMPT` (optional) | System instructions read before the conversation |
| `OPENAI_LOG_EVENT_TYPES` (optional) | Comma-separated list of OpenAI event types to log for debugging |
| `OPENAI_SHOW_TIMING_MATH` (optional) | Enable detailed timing logs for audio interruption debugging |
| `TWILIO_INTRO_VOICE` (optional) | Voice Twilio uses for the introductory IVR message |
Load the variables by exporting them or by keeping them in `.env` (automatically loaded).
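A minimal sketch of how these variables might be read, assuming `os.getenv` with the defaults listed in the table above. The `Settings` class and `load_settings` name are hypothetical; the real `app/config.py` may be structured differently:

```python
import os
from dataclasses import dataclass

# Hypothetical settings loader mirroring the variable table above;
# the real app/config.py may differ in structure and naming.
@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    model: str
    temperature: float
    voice: str

def load_settings() -> Settings:
    api_key = os.getenv("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        model=os.getenv("OPENAI_REALTIME_MODEL", "gpt-realtime"),
        temperature=float(os.getenv("OPENAI_TEMPERATURE", "0.8")),
        voice=os.getenv("OPENAI_RESPONSE_VOICE", "alloy"),
    )

# Demonstrate the defaults with a placeholder key.
os.environ.setdefault("OPENAI_API_KEY", "sk-test")
os.environ.pop("OPENAI_REALTIME_MODEL", None)
print(load_settings().model)
```

Failing fast when `OPENAI_API_KEY` is missing keeps misconfiguration errors at startup rather than mid-call.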
- Start ngrok to expose the FastAPI service: `ngrok http 8000`. Note the `https://` URL (e.g., `https://abcd1234.ngrok.io`); Twilio will use this for webhooks and media streaming.
- In another terminal, launch the bridge (either command works): `uv run python main.py` or `uv run uvicorn app:app --host 0.0.0.0 --port 8000 --reload`
- Optionally watch verbose logs by setting `LOG_LEVEL=debug` before running uvicorn.
Railway provides an easy way to deploy the Twilio-to-OpenAI service with automatic HTTPS and a public URL.
- Railway account (railway.app)
- Railway CLI installed: `npm install -g @railway/cli` or `curl -fsSL https://railway.app/install.sh | sh`
- Log in to Railway:

  ```
  railway login
  ```

- Initialize and deploy your project:

  ```
  railway init
  railway up
  ```

- Set required environment variables:

  ```
  # Required: Your OpenAI API key
  railway variables --set "OPENAI_API_KEY=sk-your-openai-key-here"

  # Optional: Customize your AI assistant
  railway variables --set "OPENAI_SYSTEM_PROMPT=You are a helpful AI assistant..."
  railway variables --set "OPENAI_RESPONSE_VOICE=alloy"
  railway variables --set "OPENAI_TEMPERATURE=0.8"
  railway variables --set "TWILIO_INTRO_VOICE=Google.en-US-Chirp3-HD-Aoede"
  ```
- Get your public URL:

  ```
  railway domain
  ```

  This will show your Railway URL (e.g., `https://your-project-production.up.railway.app`).
- Go to railway.app and select your project
- Navigate to the Variables tab
- Add the following variables:
  - `OPENAI_API_KEY`: Your OpenAI API key
  - `OPENAI_SYSTEM_PROMPT`: (Optional) Custom system prompt
  - `OPENAI_RESPONSE_VOICE`: (Optional) Default: `alloy`
  - `OPENAI_TEMPERATURE`: (Optional) Default: `0.8`
  - `TWILIO_INTRO_VOICE`: (Optional) Default: `Google.en-US-Chirp3-HD-Aoede`
- In your Twilio Console, go to Phone Numbers → Manage → Active Numbers
- Select your Twilio phone number
- Update A CALL COMES IN webhook to: `POST https://your-railway-url.up.railway.app/incoming-call`
- Save the configuration
Your service is now live and accessible via the Railway URL!
- Navigate to Phone Numbers → Manage → Active Numbers, select your number, and update A CALL COMES IN to:
  - Webhook: `POST https://<ngrok-domain>/incoming-call`
- Save the configuration. When the call connects, Twilio will open a secure WebSocket to `wss://<ngrok-domain>/media-stream` based on the TwiML response returned by `app.api.incoming_call`.
- The bridge negotiates a session with `output_modalities` set to audio only, using server-side VAD for natural conversation flow.
- Audio format is PCM μ-law for optimal compatibility with Twilio's media streams.
- The bridge implements intelligent interruption handling that truncates responses when speech is detected.
- Adjust `OPENAI_SYSTEM_PROMPT` to control tone and `OPENAI_TEMPERATURE` for response variability.
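A session negotiated along these lines might look like the payload below. This is a sketch, not the exact message sent by `app/bridge.py`; the field names follow the OpenAI Realtime API's `session.update` event, but exact names can vary between Realtime API versions, and the `alloy`/`0.8` values are just the defaults from the configuration table:

```python
import json

# Hypothetical session.update payload reflecting the bullets above:
# audio-only output, server-side VAD, and mu-law audio in both directions.
session_update = {
    "type": "session.update",
    "session": {
        "output_modalities": ["audio"],            # audio-only responses
        "turn_detection": {"type": "server_vad"},  # server-side VAD
        "input_audio_format": "g711_ulaw",         # PCM mu-law from Twilio
        "output_audio_format": "g711_ulaw",        # PCM mu-law back to Twilio
        "voice": "alloy",                          # default from the config table
        "temperature": 0.8,                        # default from the config table
    },
}

# The bridge would send this as a JSON text frame right after connecting.
print(json.dumps(session_update)[:40])
```

Sending this once at connection time means every subsequent `response` from the model already arrives in a Twilio-compatible format, with no transcoding in the bridge.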
- Call your Twilio number from any phone.
- The FastAPI logs should show an `Incoming stream started` entry followed by OpenAI event processing.
- When the OpenAI realtime session initializes, the assistant issues its scripted greeting automatically, then responds to the caller in near real time with natural interruption handling.
- Try interrupting the assistant mid-response to test the speech detection and truncation features.
- End the call to close both WebSocket connections gracefully.
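The interruption behavior exercised above can be sketched as follows. The `input_audio_buffer.speech_started` event, `response.cancel` message, and Twilio's `clear` media message come from the public Realtime and Media Streams protocols; the handler itself is illustrative, not the code in `app/bridge.py`:

```python
import json

def on_openai_event(raw: str, stream_sid: str):
    """Illustrative handler: when the caller starts speaking, cancel the
    in-flight OpenAI response and flush Twilio's buffered playback audio."""
    event = json.loads(raw)
    if event.get("type") == "input_audio_buffer.speech_started":
        cancel = {"type": "response.cancel"}                  # -> OpenAI socket
        clear = {"event": "clear", "streamSid": stream_sid}   # -> Twilio socket
        return cancel, clear
    return None  # other events need no interruption handling here

result = on_openai_event('{"type": "input_audio_buffer.speech_started"}', "MZxxxx")
print(result)
```

Flushing Twilio's buffer matters because the caller would otherwise keep hearing audio that was queued before the interruption was detected.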
Run Python compilation checks or test suites with uv:
```
uv run pytest                            # if tests are added
uv run python -m compileall app main.py
uv run ruff check .
uv run ruff format
```

```
app/
  __init__.py   # FastAPI factory and router mount
  api.py        # HTTP TwiML webhook and Twilio media WebSocket
  bridge.py     # Bidirectional audio bridge with interruption handling and initial greeting
  config.py     # Settings loader for environment variables
main.py         # uvicorn entry point
```
- Server-side VAD: Uses OpenAI's voice activity detection for natural conversation flow
- Interruption Handling: Automatically truncates assistant responses when user starts speaking
- Audio Quality: PCM μ-law format ensures optimal compatibility with Twilio
- Concurrent Processing: Separate tasks handle Twilio→OpenAI and OpenAI→Twilio streams
- Robust Error Handling: Graceful connection cleanup and comprehensive logging
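The concurrent-processing design can be sketched with plain asyncio. The queues below stand in for the two live WebSocket connections, and the `pump` helper is illustrative, not the actual task code in `app/bridge.py`:

```python
import asyncio

async def pump(source: asyncio.Queue, sink: asyncio.Queue, label: str):
    """Relay frames from one endpoint to the other until a None sentinel."""
    while True:
        frame = await source.get()
        if frame is None:
            break
        await sink.put(f"{label}:{frame}")

async def main():
    # Stand-ins for the Twilio and OpenAI WebSockets, preloaded with frames.
    caller_in, caller_out = asyncio.Queue(), asyncio.Queue()
    for q, frames in ((caller_in, ["audio1", "audio2"]), (caller_out, ["reply1"])):
        for f in frames:
            q.put_nowait(f)
        q.put_nowait(None)  # sentinel: connection closed
    to_openai, to_twilio = asyncio.Queue(), asyncio.Queue()
    # Two independent tasks, one per direction, running concurrently.
    await asyncio.gather(
        pump(caller_in, to_openai, "twilio_to_openai"),
        pump(caller_out, to_twilio, "openai_to_twilio"),
    )
    return to_openai.qsize(), to_twilio.qsize()

sizes = asyncio.run(main())
print(sizes)  # -> (2, 1)
```

Because each direction runs in its own task, a slow consumer on one leg never blocks audio flowing on the other, which is what keeps interruptions responsive.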