🎙️ Gemini Live Voice Chat

A real-time, bidirectional voice chat application powered by Google's Gemini Live API. Hold fluid, natural conversations with Gemini using your voice, through a glassmorphic UI that adapts to mobile browsers.

✨ Key Features

Real-time Voice Conversation: Talk naturally to Gemini with continuous, low-latency audio streaming.
Improved Transcription System:
- Dual-Channel Transcription: See your own words (blue bubbles) and Gemini's responses (white bubbles) in real-time.
- Smart Text Merging: Frontend logic merges streaming text chunks so the transcript does not stutter or show duplicated words.
Premium Glassmorphism UI:
- Translucent panels with background blur.
- Deep, dynamic gradients inspired by modern aesthetics.
- Clean chat bubble interface for clear conversation flow.
Mobile First Experience:
- Adaptive layout (100dvh) that fills the visible viewport on mobile browsers, even as browser toolbars show and hide.
- Sticky controls that never get lost.
- Optimized touch targets.
Camera & Screen Sharing: Toggle your camera or share your screen to give Gemini real-time visual context for code reviews or troubleshooting.
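
To make the Smart Text Merging idea above more concrete, here is a minimal sketch of how overlapping transcript chunks could be merged. It is not taken from this project's source: the function name `mergeChunk` and the four cases it handles are assumptions about what a streaming transcript might send (cumulative updates, repeated chunks, partially overlapping chunks, and plain continuations).

```javascript
// Hypothetical helper (not from this repo): merge a newly received transcript
// chunk into the text already shown in a chat bubble, without duplicating words.
function mergeChunk(current, incoming) {
  if (!incoming) return current;
  if (!current) return incoming;

  // Case 1: cumulative update. The chunk restates everything shown so far.
  if (incoming.startsWith(current)) return incoming;

  // Case 2: repeated chunk. Its text is already at the end of the bubble.
  if (current.endsWith(incoming)) return current;

  // Case 3: partial overlap. Find the longest suffix of `current` that is
  // also a prefix of `incoming`, and append only the new part.
  const maxOverlap = Math.min(current.length, incoming.length);
  for (let size = maxOverlap; size > 0; size--) {
    if (current.endsWith(incoming.slice(0, size))) {
      return current + incoming.slice(size);
    }
  }

  // Case 4: no overlap. Treat the chunk as a plain continuation.
  return current + incoming;
}

// Usage: fold a stream of chunks into one bubble's text.
const chunks = ["Hello", "Hello wor", "world", " again"];
const text = chunks.reduce(mergeChunk, "");
console.log(text); // prints: Hello world again
```

One trade-off in this sketch: a one-character overlap is treated as a duplicate, so two chunks that legitimately end and start with the same letter would lose a character. Requiring a minimum overlap length is one way to reduce that risk.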

🚀 Getting Started

Prerequisites

Docker installed.
A Google Gemini API Key (get it from Google AI Studio).

Installation

Clone the repository:

git clone https://github.com/calebrio02/Gemini-Live-API
cd Gemini-Live-API

Configure Environment: Create a .env file in the root directory:

# .env
GEMINI_API_KEY=your_api_key_here
PORT=3600
DEFAULT_VOICE=Kore
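
For illustration, a server could validate these three variables at startup along the following lines. This is a sketch, not the project's actual code: `loadConfig` is a hypothetical name, and the fallback values (port 3600, voice Kore) simply mirror the example above. It works on any plain object, however the variables reach the process.

```javascript
// Hypothetical config loader (not from this repo): reads the three variables
// shown in the .env example and fails fast on obviously bad values.
function loadConfig(env = process.env) {
  const apiKey = env.GEMINI_API_KEY;
  if (!apiKey || apiKey === "your_api_key_here") {
    throw new Error("GEMINI_API_KEY is missing or still set to the placeholder");
  }

  // Assumed default: fall back to the port used in the example.
  const port = Number.parseInt(env.PORT ?? "3600", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  // Assumed default: fall back to the voice used in the example.
  return { apiKey, port, defaultVoice: env.DEFAULT_VOICE || "Kore" };
}

// Usage with an explicit object instead of process.env:
const config = loadConfig({ GEMINI_API_KEY: "example-key", PORT: "3600" });
console.log(config.port, config.defaultVoice); // prints: 3600 Kore
```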

Run with Docker:

```
docker-compose up --build
```

Access the App: Open your browser (Chrome/Edge recommended) and go to: http://localhost:3600

Note: For mobile devices on the same network, use your computer's local IP address (e.g., http://192.168.1.x:3600). Browsers only allow microphone and camera access in secure contexts (HTTPS or localhost), so you may need to set up HTTPS or allow the insecure origin in your browser.
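
If the microphone does not work over a local IP address, a quick check like the one below can tell you whether the secure-context restriction is the cause. It is a sketch with a hypothetical function name; `isSecureContext` and `navigator.mediaDevices.getUserMedia` are standard browser APIs, and the function takes a window-like object so it can also be tried outside a browser.

```javascript
// Hypothetical pre-flight check (not from this repo). `win` is a window-like
// object; in a real page you would pass `window`.
function microphoneAvailability(win) {
  // Pages served over plain HTTP from a LAN address are not secure contexts.
  if (!win.isSecureContext) {
    return { ok: false, reason: "insecure-context" };
  }
  const devices = win.navigator && win.navigator.mediaDevices;
  if (!devices || typeof devices.getUserMedia !== "function") {
    return { ok: false, reason: "unsupported" };
  }
  return { ok: true, reason: null };
}

// Simulated page opened over plain HTTP on a LAN address:
const result = microphoneAvailability({ isSecureContext: false, navigator: {} });
console.log(result.reason); // prints: insecure-context
```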

🛠️ Tech Stack

Backend: Node.js, Express, ws (WebSocket), Gemini Multimodal Live API.
Frontend: Vanilla JavaScript, CSS3 (Glassmorphism), WebSocket API, Web Audio API.
Infrastructure: Docker, Docker Compose.
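
As an example of the glue between the Web Audio API and the Live API: Web Audio delivers samples as 32-bit floats in the range -1 to 1, while the Live API takes raw 16-bit PCM audio as input. The sketch below shows one common way to do that conversion. The function name is hypothetical, and whether this project converts on the frontend or the backend is not stated here, so treat it as an illustration rather than the project's implementation.

```javascript
// Hypothetical helper (not from this repo): convert Web Audio samples
// (Float32, range -1..1) into 16-bit signed little-endian PCM bytes.
function floatTo16BitPCM(samples) {
  const buffer = new ArrayBuffer(samples.length * 2);
  const view = new DataView(buffer);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples do not wrap around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative values scale toward -32768, positive values toward 32767.
    const int16 = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
    view.setInt16(i * 2, int16, true); // true = little-endian
  }
  return new Uint8Array(buffer);
}

// Usage: silence, full-scale positive, full-scale negative, and a clipped sample.
const pcm = floatTo16BitPCM(new Float32Array([0, 1, -1, 2]));
console.log(pcm.length); // prints: 8 (two bytes per sample)
```

The resulting bytes would still need to be resampled to the rate the API expects and encoded for transport (for example as base64 inside a JSON message); both steps are left out of this sketch.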

📝 Usage Guide

Start Chat: Click the microphone button to begin.
Speak: Talk naturally. The "Listening..." indicator will pulse.
Read: Watch the conversation unfold in the transcript view.
- User Bubbles (Right): Your speech, transcribed by Gemini.
- AI Bubbles (Left): Gemini's audio response, transcribed in real-time.
Controls:
- Toggle Transcript: Show/hide the text history.
- Camera: Share your camera stream.
- Screen Share: Share your screen/window for troubleshooting.
- Settings: Change the voice (Kore, Fenrir, Aoede, etc.) or the system prompt.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Built with ❤️ using Gemini API and Google Antigravity
