---
title: Voice Simulation Runs
description: Test your Voice Agent's interaction capabilities with realistic voice simulations across thousands of scenarios.
---

## Test voice agents at scale with simulated conversations

Run tests with datasets that contain multiple scenarios for your voice agent, so you can evaluate its performance across different situations.

<Steps>

<Step title="Create a dataset for testing">
Configure your agent dataset template with the following (see the sketch after this list):
- **Agent scenarios**: Define the specific situations to test (e.g., "Update address", "Order an iPhone")
- **Expected steps**: List the actions and responses you expect from the agent
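
For illustration, a single dataset entry could pair a scenario with its expected steps roughly like this. This is a minimal sketch only: the field names `scenario` and `expected_steps` are placeholders, not the platform's actual schema, which is defined by your dataset template.

```python
# Illustrative sketch only: "scenario" and "expected_steps" are placeholder
# field names, not the platform's real dataset schema.
dataset = [
    {
        "scenario": "Update address: the caller has moved and wants their "
                    "shipping address changed on the account",
        "expected_steps": [
            "Agent verifies the caller's identity",
            "Agent asks for the new address and reads it back",
            "Agent confirms that the address has been updated",
        ],
    },
    {
        "scenario": "Order an iPhone: the caller wants to buy the latest model",
        "expected_steps": [
            "Agent asks which model, color, and storage size the caller wants",
            "Agent quotes the price and confirms the payment method",
            "Agent provides an order confirmation",
        ],
    },
]
```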

</Step>

<Step title="Set up the test run">
- Navigate to your voice agent and click **Test**
- **Simulated session** mode will be pre-selected (voice agents can't be tested in single-turn mode)
- Select your agent dataset from the dropdown
- Choose relevant evaluators

<Note>
  Only built-in evaluators are currently supported for voice simulation runs. Custom evaluators will be available soon.
</Note>

</Step>

<Step title="Trigger the test run">
Click **Trigger test run** to start. The system calls your voice agent and simulates a conversation for each scenario.
</Step>

<Step title="Review results">
Each session runs end-to-end for thorough evaluation:
- View detailed results for every scenario
- Text-based evaluators assess the turn-by-turn call transcription (see the sketch after this list)
- Audio-based evaluators analyze the call recording
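
To make the distinction concrete, a text-based check operates on the transcript rather than the audio. The sketch below is a deliberately naive keyword check over an assumed `(speaker, text)` turn list; the built-in evaluators are more sophisticated, so treat this purely as an illustration of the kind of input they work with.

```python
# Assumed transcript shape: a list of (speaker, text) turns from the call.
transcript = [
    ("simulated_user", "Hi, I moved recently and need to update my address."),
    ("agent", "Sure, can you confirm the name and date of birth on the account?"),
    ("simulated_user", "Jane Doe, 4th of March 1990."),
    ("agent", "Thanks, you're verified. What's the new address?"),
]

def agent_mentions(transcript, keywords):
    """Naive text-based check: did the agent say all of the given keywords?"""
    agent_text = " ".join(
        text.lower() for speaker, text in transcript if speaker == "agent"
    )
    return all(keyword.lower() in agent_text for keyword in keywords)

print(agent_mentions(transcript, ["verified"]))         # True
print(agent_mentions(transcript, ["order", "iphone"]))  # False
```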

</Step>

<Step title="Inspect individual entries">
Click any entry to see detailed results for that specific scenario.

By default, test runs evaluate these performance metrics from the call recording (the sketch after this list shows roughly how they can be derived):
- **Avg latency**: How long the agent took to respond
- **Talk ratio**: The agent's talk time compared to the simulation agent's talk time
- **Avg pitch**: The average pitch of the agent's responses
- **Words per minute**: The agent's speech rate
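
As a rough illustration, the sketch below computes avg latency, talk ratio, and words per minute from hypothetical diarized segments of a recording. The segment format and the exact calculations are assumptions for this example, not the platform's actual pipeline (avg pitch would additionally require the raw audio signal).

```python
# Hypothetical diarized segments: (speaker, start_sec, end_sec, text).
segments = [
    ("simulated_user", 0.0, 3.2, "Hi, I'd like to update my address."),
    ("agent", 3.9, 9.5, "Of course. Can you confirm the name on the account, please?"),
    ("simulated_user", 10.1, 12.4, "It's Jane Doe."),
    ("agent", 13.0, 21.8, "Thanks, Jane. What's the new address you'd like on file?"),
]

agent_time = sum(end - start for spk, start, end, _ in segments if spk == "agent")
user_time = sum(end - start for spk, start, end, _ in segments if spk == "simulated_user")
talk_ratio = agent_time / user_time  # agent talk time vs. simulation agent talk time

# Avg latency: gap between the end of a user turn and the start of the agent's reply.
gaps = [
    segments[i + 1][1] - segments[i][2]
    for i in range(len(segments) - 1)
    if segments[i][0] == "simulated_user" and segments[i + 1][0] == "agent"
]
avg_latency = sum(gaps) / len(gaps)

# Words per minute: agent word count over the agent's total speaking time.
agent_words = sum(len(text.split()) for spk, _, _, text in segments if spk == "agent")
words_per_minute = agent_words / (agent_time / 60)

print(f"talk ratio: {talk_ratio:.2f}, avg latency: {avg_latency:.2f}s, wpm: {words_per_minute:.0f}")
```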

</Step>

</Steps>