Commit b863450

Updated readmes.
1 parent b5cd4cd commit b863450

7 files changed

+375
-40
lines changed

README.md

Lines changed: 18 additions & 10 deletions
@@ -1,16 +1,14 @@
11
# .NET Library for OpenAI WebRTC End Point
22

3-
This repository contains a .NET library for interacting with [OpenAI's real-time WebRTC API](https://platform.openai.com/docs/guides/realtime-webrtc). It provides helper classes to negotiate peer connections, send and receive Opus audio frames and exchange control messages over a data channel.
3+
This repository contains a .NET library for interacting with [OpenAI's real-time WebRTC API](https://platform.openai.com/docs/guides/realtime-webrtc). It provides helper classes to negotiate peer connections, send and receive OPUS audio frames and exchange control messages over a data channel.
44

55
## Features
66

7-
- Establish a `RTCPeerConnection` with OpenAI using a REST based signalling helper.
7+
- Establish an `RTCPeerConnection` (WebRTC) with OpenAI using a REST-based signalling helper.
88
- Send audio samples or pipe them from existing SIPSorcery media end points.
9-
- Receive audio and transcript events via the data channel.
9+
- Receive transcript and other events via the data channel.
1010
- `DataChannelMessenger` class to assist with sending session updates, function call results and response prompts.
11-
- Designed to work with dependency injection or standalone instances.
12-
13-
The solution files are located under `src/` and `examples/`.
11+
- Designed to work with dependency injection (ASP.NET) or standalone applications (Console & WinForms).
1412

1513
## Installation
1614

@@ -24,7 +22,7 @@ dotnet add package SIPSorcery.OpenAI.WebRTC
2422

2523
### Console/WinForms Direct WebRTC Connection to OpenAI Realtime End Point
2624

27-
See GetStarted example for full source.
25+
See [GetStarted](https://github.com/sipsorcery-org/SIPSorcery.OpenAI.WebRTC/tree/main/examples/GetStarted) example for full source.
2826

2927
```csharp
3028
using SIPSorcery.OpenAIWebRTC;
@@ -88,16 +86,26 @@ webrtcEndPoint.OnDataChannelMessage += (dc, message) =>
8886

8987
```
9088

89+
Example Output:
90+
91+
```
92+
[20:45:29 INF] AI ✅: Hello! How can I assist you today?
93+
[20:45:40 INF] ME ✅: Tell me a nursery rhyme and use as many emojis as you can in the transcription.
94+
[20:45:44 INF] AI ✅: 🍼🎶 Humpty Dumpty sat on a wall, 🥚⬆️🌉 Humpty Dumpty had a great fall. 🥚💥⤵️ All the king's horses 🐎👑 and all the king's men 👨‍✈️👑 couldn't put Humpty together again! 🥚❌⚒️🐣
95+
[20:46:06 INF] AI ✅: You're welcome! 😊 Anytime!
96+
[20:46:06 INF] ME ✅: Thank you.
97+
```
98+
9199
### ASP.NET WebRTC Bridge: Browser <- ASP.NET Bridge -> OpenAI Realtime End Point
92100

93-
See BrowserBridge example for full source.
101+
See [BrowserBridge](https://github.com/sipsorcery-org/SIPSorcery.OpenAI.WebRTC/tree/main/examples/BrowserBridge) example for full source.
94102

95103
```csharp
96104
using SIPSorcery.OpenAIWebRTC;
97105
using SIPSorcery.OpenAIWebRTC.Models;
98106

99107
// Set up an ASP.NET web socket to listen for connections.
100-
// The web socket is NOT used for the connection to OpenAI. It's a covenience signalling channel to allow the browser
108+
// The web socket is NOT used for the connection to OpenAI. It's a convenience signalling channel to allow the browser
101109
// to establish a WebRTC connection with the ASP.NET app.
102110
103111
app.Map("/ws", async (HttpContext context,
@@ -165,4 +173,4 @@ Each example folder contains its own README with usage instructions.
165173

166174
## License
167175

168-
Distributed under the BSD 3‑Clause license with an additional BDS BY‑NC‑SA restriction. See [LICENSE.md](LICENSE.md) for details.
176+
Distributed under the BSD 3‑Clause license with an additional BDS BY‑NC‑SA restriction. See [LICENSE.md](https://github.com/sipsorcery-org/SIPSorcery.OpenAI.WebRTC/blob/main/LICENSE.md) for details.

examples/AliceAndBob/README.md

File mode changed: 100644 → 100755
Lines changed: 90 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,13 +1,97 @@
1-
# AliceAndBob Example
1+
# AliceAndBob - OpenAI Realtime WebRTC Demo
22

3-
Runs two OpenAI WebRTC sessions ("Alice" and "Bob") and pipes their audio
4-
between each other. A simple OpenGL scope visualises which side is speaking.
3+
**AliceAndBob** is a demonstration project that connects two [OpenAI Realtime API](https://platform.openai.com/docs/guides/realtime) WebRTC sessions—nicknamed *Alice* and *Bob*—and pipes their audio between one another. This simulates a live, bi-directional conversation between two AI agents. A Windows-based OpenGL audio visualisation displays which side is currently speaking.
54

6-
## Usage
5+
---
6+
7+
## 🎯 Features
8+
9+
- Initiates and maintains two OpenAI WebRTC sessions.
10+
- Forwards audio from Alice to Bob and vice versa.
11+
- Uses [SIPSorcery](https://github.com/sipsorcery/sipsorcery) WebRTC libraries and [NAudio](https://github.com/naudio/NAudio) for media handling.
12+
- Visualises audio signal strength in real-time using a WinForms OpenGL scope.
13+
- Shows how to work with:
14+
- WebRTC connections
15+
- OpenAI Realtime API
16+
- Audio frame decoding and manipulation
17+
- Session updates and message prompting via data channels
18+
19+
---
20+
21+
## 🛠 Requirements
22+
23+
- **Operating System**: Windows
24+
- **.NET Version**: [.NET 8.0 SDK](https://dotnet.microsoft.com/download/dotnet/8.0)
25+
- **Audio**: A working input/output device (e.g., microphone, speakers)
26+
- **API Access**: OpenAI API key with access to Realtime features
27+
28+
---
29+
30+
## 🚀 Getting Started
31+
32+
### 1. Clone the Repository
33+
34+
```bash
35+
git clone https://github.com/sipsorcery-org/SIPSorcery.OpenAI.WebRTC.git
36+
cd SIPSorcery.OpenAI.WebRTC/examples/AliceAndBob
37+
```
38+
39+
### 2. Set Your OpenAI API Key
40+
41+
Set the `OPENAI_API_KEY` environment variable in your terminal:
42+
43+
```bash
44+
set OPENAI_API_KEY=<your-openai-api-key>
45+
```
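The application needs the key at startup. A minimal sketch of how a console program might read and sanity-check it, assuming only the `OPENAI_API_KEY` variable name from the step above; the `ReadApiKey` helper is hypothetical and not part of the library. It also strips surrounding double quotes, because the Windows `set` command stores them as part of the value when they are typed.

```csharp
using System;

// Hypothetical helper: returns the key, or null when it is missing or blank.
// Surrounding double quotes are stripped because `set VAR="value"` in cmd.exe keeps them.
static string? ReadApiKey(Func<string, string?> getEnv)
{
    var key = getEnv("OPENAI_API_KEY");
    return string.IsNullOrWhiteSpace(key) ? null : key.Trim().Trim('"');
}

var openAiKey = ReadApiKey(Environment.GetEnvironmentVariable);

if (openAiKey is null)
{
    Console.Error.WriteLine("OPENAI_API_KEY is not set; set it before running the example.");
}
else
{
    // Never print the key itself; its length is enough to confirm it was picked up.
    Console.WriteLine($"OPENAI_API_KEY found ({openAiKey.Length} characters).");
}
```

Passing the environment lookup in as a delegate keeps the check easy to exercise without touching the real process environment.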
46+
47+
### 3. Run the Application
748

849
```bash
9-
export OPENAI_API_KEY="<your OpenAI key>"
1050
dotnet run
1151
```
1252

13-
You'll need a Windows machine with audio devices and .NET 8.0 installed.
53+
You should see a WinForms window showing two audio scopes—one for Alice and one for Bob—while the agents begin a real-time audio exchange.
54+
55+
---
56+
57+
## 📷 Application Preview
58+
59+
`Program.cs` does the following:
60+
61+
- Creates a WinForms UI thread for audio scope visualization.
62+
- Starts two OpenAI WebRTC sessions (`Alice` and `Bob`).
63+
- Establishes WebRTC peer connections.
64+
- Routes audio received from Alice to Bob, and vice versa.
65+
- Uses the library's `DataChannelMessenger` to:
66+
- Send a `response.create` event from Alice saying “Hi!”
67+
- Change Bob’s voice using a `session.update` event.
68+
- Decodes and visualizes incoming audio frames.
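
The scope needs some per-frame measure of signal strength to show which side is speaking. How the example computes it is not shown here; one common choice is the root-mean-square (RMS) level of the decoded 16-bit PCM samples. The sketch below illustrates that technique only. `RmsLevel` and the 160-sample frame size are assumptions for illustration.

```csharp
using System;

// Hypothetical helper: RMS level of 16-bit PCM samples, normalised to the range 0..1.
static double RmsLevel(ReadOnlySpan<short> samples)
{
    if (samples.IsEmpty)
    {
        return 0.0;
    }

    double sumSquares = 0.0;
    foreach (short s in samples)
    {
        double v = s / 32768.0; // scale to -1..1 before squaring
        sumSquares += v * v;
    }

    return Math.Sqrt(sumSquares / samples.Length);
}

// Two synthetic frames: digital silence and a constant full-scale signal.
short[] silence = new short[160];
short[] fullScale = new short[160];
Array.Fill(fullScale, short.MaxValue);

Console.WriteLine($"silence level:    {RmsLevel(silence):F3}");
Console.WriteLine($"full-scale level: {RmsLevel(fullScale):F3}");
```

Comparing the two sessions' levels frame by frame is enough to decide which scope to highlight.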
69+
70+
---
71+
72+
## 📦 Key Technologies
73+
74+
| Component | Role |
75+
|-------------------|--------------------------------------------------|
76+
| `SIPSorcery` | WebRTC signaling, media transport, SDP |
77+
| `NAudio` | Windows audio capture/playback (used internally) |
78+
| `OpenAI Realtime` | API endpoints for streaming AI audio responses |
79+
| `WinForms` | UI for displaying real-time audio signal |
80+
| `OpenGL` | Audio waveform visualization |
81+
| `Serilog` | Structured logging to console |
82+
83+
---
84+
85+
## 🧩 Files of Interest
86+
87+
| File | Description |
88+
|-----------------------|--------------------------------------------------|
89+
| `Program.cs` | Main entry point. Starts sessions and handles logic. |
90+
| `FormAudioScope.cs` | Audio visualisation using WinForms and OpenGL. |
91+
| `WebRTCEndPoint.cs` | WebRTC peer connection wrapper for OpenAI. |
92+
93+
---
94+
95+
## 📝 License
96+
97+
This project is licensed under the **BSD 3-Clause License** with an additional **BY-NC-SA** restriction. See [`LICENSE.md`](./LICENSE.md) for full terms.

examples/DependencyInjection/README.md

File mode changed: 100644 → 100755
Lines changed: 63 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,70 @@
1-
# DependencyInjection Example
1+
# ASP.NET OpenAI WebRTC Get Started Example
22

3-
Illustrates registering `IWebRTCEndPoint` with the .NET dependency injection
4-
container. Audio is sent to OpenAI and transcripts are printed to the console.
3+
This example demonstrates how to set up and run a basic WebRTC application that interacts with OpenAI's [Real-time WebRTC API](https://platform.openai.com/docs/guides/realtime-webrtc). This version is designed for ASP.NET Core and Dependency Injection (DI) environments, using `HttpClientFactory` and DI-compliant service wiring.
4+
5+
> ⚠️ **Note:** This demo does not include echo cancellation. If your Windows audio device doesn't provide echo cancellation, ChatGPT may end up talking to itself. To avoid this, use a headset.
6+
7+
## Features
8+
9+
- Initializes WebRTC connections to OpenAI’s Realtime endpoint.
10+
- Sends and receives audio using Windows audio devices.
11+
- Uses `Microsoft.Extensions.DependencyInjection` for DI and `Serilog` for logging.
12+
- Sends a "Say Hi!" message to trigger the conversation once the connection is established.
13+
- Shows how to listen for final transcript messages from the assistant.
514

615
## Usage
716

17+
Set your OpenAI API key in the environment and run:
18+
819
```bash
9-
export OPENAI_API_KEY="<your OpenAI key>"
20+
set OPENAI_API_KEY=<your_openai_key>
1021
dotnet run
1122
```
23+
24+
### Requirements
25+
26+
- Windows machine with audio devices
27+
- [.NET 8.0 SDK](https://dotnet.microsoft.com/en-us/download)
28+
- Valid [OpenAI API Key](https://platform.openai.com/account/api-keys)
29+
30+
## Program Structure
31+
32+
- Uses `IServiceCollection` to register dependencies.
33+
- Adds `AddOpenAIRealtimeWebRTC(openAiKey)` to DI container.
34+
- Resolves `IWebRTCEndPoint` from `IServiceProvider`.
35+
- Sends audio from local device and handles real-time responses via DataChannel.
36+
- Clean shutdown with Ctrl+C.
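
The Ctrl+C shutdown in the last step can be sketched with standard .NET APIs only. This is an illustration of the pattern, not the example's actual code: `WaitForShutdownAsync` is a hypothetical helper, and the automatic cancellation after 200 ms exists only so the sketch terminates on its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: completes when the token is cancelled and reports whether it was.
static async Task<bool> WaitForShutdownAsync(CancellationToken token)
{
    try
    {
        await Task.Delay(Timeout.Infinite, token);
    }
    catch (OperationCanceledException)
    {
        // Expected: cancellation is the normal way out of the wait.
    }

    return token.IsCancellationRequested;
}

using var cts = new CancellationTokenSource();

Console.CancelKeyPress += (_, e) =>
{
    e.Cancel = true; // keep the process alive so cleanup code can run
    cts.Cancel();
};

// Demo only: cancel automatically so the sketch ends without a key press.
cts.CancelAfter(TimeSpan.FromMilliseconds(200));

Console.WriteLine("Press Ctrl+C to exit.");
bool clean = await WaitForShutdownAsync(cts.Token);
Console.WriteLine(clean ? "Shutting down cleanly." : "Unexpected exit.");
```

Setting `e.Cancel = true` is the important detail: without it the runtime terminates the process immediately and any code that closes the peer connection never runs.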
37+
38+
## Code Highlights
39+
40+
### Dependency Injection Setup
41+
42+
```csharp
43+
var services = new ServiceCollection();
44+
services.AddLogging(builder => builder.AddSerilog(dispose: true));
45+
services.AddOpenAIRealtimeWebRTC(openAiKey);
46+
using var provider = services.BuildServiceProvider();
47+
var webrtcEndPoint = provider.GetRequiredService<IWebRTCEndPoint>();
48+
```
49+
50+
### Trigger Assistant Response
51+
52+
```csharp
53+
webrtcEndPoint.DataChannelMessenger.SendResponseCreate(RealtimeVoicesEnum.shimmer, "Say Hi!");
54+
```
55+
56+
### Handle Final Transcripts
57+
58+
```csharp
59+
webrtcEndPoint.OnDataChannelMessage += (dc, message) =>
60+
{
61+
if (message is RealtimeServerEventResponseAudioTranscriptDone done)
62+
{
63+
Log.Information($"Transcript done: {done.Transcript}");
64+
}
65+
};
66+
```
67+
68+
## License
69+
70+
Licensed under the BSD 3-Clause "New" or "Revised" License with the additional BDS BY-NC-SA restriction. See `LICENSE.md`.

examples/GetPaid/README.md

File mode changed: 100644 → 100755
Lines changed: 91 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,98 @@
1-
# GetPaid Example
1+
# OpenAI WebRTC Demo: Payment Request via Function Calling
22

3-
Shows how local function calls can be used to create a mock payment request.
4-
The application asks the OpenAI agent to call a function and then returns a
5-
payment ID before continuing the conversation.
3+
This example demonstrates how to use [OpenAI's real-time WebRTC API](https://platform.openai.com/docs/guides/realtime-webrtc)
4+
in conjunction with **function calling** to simulate a sales assistant that creates payment requests.
5+
6+
## Features
7+
8+
- WebRTC audio stream using local Windows audio devices.
9+
- Realtime voice-to-text and assistant responses.
10+
- Uses OpenAI's function calling to trigger a local function for generating payment requests.
11+
- JSON messages are exchanged over WebRTC data channels.
12+
13+
## Requirements
14+
15+
- .NET 8
16+
- A Windows environment (uses `WindowsAudioEndPoint`).
17+
- An OpenAI API key with access to the real-time WebRTC feature.
618

719
## Usage
820

21+
1. Clone the repo or copy the example.
22+
2. Set your OpenAI API key as an environment variable:
23+
924
```bash
10-
export OPENAI_API_KEY="<your OpenAI key>"
25+
set OPENAI_API_KEY=your_openai_key
1126
dotnet run
1227
```
28+
29+
## What It Does
30+
31+
- Connects to OpenAI's WebRTC endpoint.
32+
- Sends and receives audio using your Windows microphone and speaker.
33+
- Initiates a conversation with the assistant.
34+
- Configures a local tool named `create_payment_request` that the assistant can call.
35+
- When the assistant calls the tool, a simulated payment request is generated locally.
36+
- The tool's response is sent back to the assistant to continue the conversation.
37+
38+
## Function Calling Example
39+
40+
```json
41+
{
42+
"name": "create_payment_request",
43+
"arguments": {
44+
"amount": 49.95,
45+
"currency": "USD"
46+
}
47+
}
48+
```
49+
50+
This will trigger the local C# method `CreatePaymentRequest`, which generates a mock payment request and responds with a result like:
51+
52+
```
53+
New payment request order ID is X1234
54+
```
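
The routing from a function-call payload to the local method can be sketched with `System.Text.Json` alone. This assumes the payload has exactly the shape shown in the JSON above, with `arguments` as a nested object; the live API may deliver it differently, for example as a JSON-encoded string. `Dispatch` and this simplified `CreatePaymentRequest` are hypothetical stand-ins, not the example's actual methods.

```csharp
using System;
using System.Text.Json;

// Hypothetical local handler: returns the same mock result as described above.
// A real implementation would use the amount and currency to create the request.
static string CreatePaymentRequest(decimal amount, string currency)
{
    const string orderId = "X1234"; // fixed mock ID
    return $"New payment request order ID is {orderId}";
}

// Hypothetical dispatcher: parses the payload and calls the matching local handler.
static string Dispatch(string json)
{
    using var doc = JsonDocument.Parse(json);
    var root = doc.RootElement;
    var name = root.GetProperty("name").GetString();
    var args = root.GetProperty("arguments");

    return name switch
    {
        "create_payment_request" => CreatePaymentRequest(
            args.GetProperty("amount").GetDecimal(),
            args.GetProperty("currency").GetString() ?? "USD"),
        _ => $"Unknown function: {name}"
    };
}

var payload = @"{""name"":""create_payment_request"",""arguments"":{""amount"":49.95,""currency"":""USD""}}";
Console.WriteLine(Dispatch(payload));
```

Returning a message for unknown names rather than throwing lets the result be sent back to the assistant, which can then recover in conversation.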
55+
56+
## Relevant Code Snippets
57+
58+
### Registering the Tool
59+
60+
```csharp
61+
new RealtimeTool
62+
{
63+
Name = "create_payment_request",
64+
Description = "Creates a payment request.",
65+
Parameters = new RealtimeToolParameters
66+
{
67+
Properties = new Dictionary<string, RealtimeToolProperty>
68+
{
69+
{ "amount", new RealtimeToolProperty { Type = "number" } },
70+
{ "currency", new RealtimeToolProperty { Type = "string" } }
71+
},
72+
Required = new List<string> { "amount", "currency" }
73+
}
74+
}
75+
```
76+
77+
### Function Execution
78+
79+
```csharp
80+
private static RealtimeClientEventConversationItemCreate CreatePaymentRequest(...)
81+
{
82+
string orderID = "X1234";
83+
return new RealtimeClientEventConversationItemCreate
84+
{
85+
Output = $"New payment request order ID is {orderID}",
86+
...
87+
};
88+
}
89+
```
90+
91+
## Notes
92+
93+
- This is a demonstration app. The payment request is mocked and not tied to any payment system.
94+
- You can adapt the function to integrate with a real API or database.
95+
96+
## License
97+
98+
BSD 3-Clause + BY-NC-SA restriction — see LICENSE.md.

examples/GetStarted/Program.cs

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -110,11 +110,12 @@ static async Task Main()
110110
var log = message switch
111111
{
112112
RealtimeServerEventSessionUpdated sessionUpdated => $"Session updated: {sessionUpdated.ToJson()}",
113-
RealtimeServerEventConversationItemInputAudioTranscriptionDelta inputDelta => $"ME ⌛: {inputDelta.Delta?.Trim()}",
113+
//RealtimeServerEventConversationItemInputAudioTranscriptionDelta inputDelta => $"ME ⌛: {inputDelta.Delta?.Trim()}",
114114
RealtimeServerEventConversationItemInputAudioTranscriptionCompleted inputTranscript => $"ME ✅: {inputTranscript.Transcript?.Trim()}",
115-
RealtimeServerEventResponseAudioTranscriptDelta responseDelta => $"AI ⌛: {responseDelta.Delta?.Trim()}",
115+
//RealtimeServerEventResponseAudioTranscriptDelta responseDelta => $"AI ⌛: {responseDelta.Delta?.Trim()}",
116116
RealtimeServerEventResponseAudioTranscriptDone responseTranscript => $"AI ✅: {responseTranscript.Transcript?.Trim()}",
117-
_ => $"Received {message.Type} -> {message.GetType().Name}"
117+
//_ => $"Received {message.Type} -> {message.GetType().Name}"
118+
_ => string.Empty
118119
};
119120

120121
if (log != string.Empty)
