gianpaj/sexyvoice


SexyVoice.ai - AI Voice Generation Platform

Create stunning AI-generated voices and clone your own voice with advanced machine learning technology

🌐 Live Demo β€’ πŸ—ΊοΈ Roadmap β€’ πŸš€ Quick Start β€’ ✨ Features β€’ πŸ› οΈ Tech Stack β€’ βš™οΈ DevOps Guide


🌟 About

  • Generate AI voices in 24+ supported languages and locales
  • Major multilingual voice families from Google Gemini (gpro) and xAI Grok (xai)
  • Voice cloning support across 20+ supported languages

SexyVoice.ai is an AI voice generation platform for creating realistic voices and cloning your own voice. It is aimed at content creators, developers, and business users who need professional-grade audio, and it offers featured Gemini and Grok voices plus custom voice cloning.

✨ Features

🎯 Core Functionality

  • AI Voice Generation: Create realistic voices powered by Google Gemini, xAI Grok, and additional TTS models
  • Voice Cloning: Clone your own voice with as little as 10 seconds of audio
  • Voice Selection System: Choose from featured Gemini voices like achernar, aoede, kore, puck, sulafat, and zephyr, plus Grok voices like ara, eve, leo, rex, and sal
  • Multi-language Support: Generate speech in 24+ supported languages and locales, with broad multilingual coverage for generation, cloning, and real-time voice experiences
  • Audio Transcription: Transcribe audio files to text offline in 99+ languages with optional translation to English using Whisper AI

🔐 User Experience

  • Secure Authentication: Sign in with Google (more login options coming soon)
  • Credit-based System: Fair usage tracking with transparent pricing
  • Profile Management: Personalized dashboard and settings
  • Audio History: Track and manage all your generated content

🌍 Platform Features

  • Responsive Design: Optimized for desktop and mobile devices
  • International Support: Full i18n implementation powered by next-intl for global accessibility (EN/ES/DE/DA/IT/FR)
  • Localized Site Banners: Shared banner system for promos and announcements across landing, blog, and dashboard with independent dismiss state and one visible banner at a time
  • Rate Limiting: Fair usage policies to ensure platform stability
  • Real-time Updates: Live audio generation with progress tracking
  • Public Tools: Free utility tools including audio transcription and format conversion

🛠️ Tech Stack

Frontend

  • Next.js 16 - React framework with App Router and TypeScript
  • next-intl - Internationalization for Next.js App Router; messages in apps/web/messages/*.json; getMessages() for server components, useTranslations() for client components
  • React 19 - Server Components (RSCs), Suspense, and Server Actions
  • Tailwind CSS 3 - Utility-first CSS framework
  • shadcn/ui - Modern component library
  • Radix UI - Headless component primitives

Backend & Database

  • Supabase - Authentication and PostgreSQL database with SSR support
  • Drizzle ORM - Type-safe database operations (planned)
  • Cloudflare R2 - Scalable audio file storage with global CDN

DevOps & Monitoring

  • Vercel - Deployment and hosting platform
  • Sentry - Error tracking and performance monitoring
  • PostHog - Product analytics and feature flags
  • Axiom - Structured request logging for API routes
  • Stripe - Payment processing and subscription management

Development Tools

Repository Layout

  • apps/web - Next.js web app deployed to Vercel.
  • apps/docs - Mintlify docs app for docs.sexyvoice.ai.
  • scripts - operational scripts kept outside the web app as @sexyvoice/scripts.
  • docs - internal engineering and operational docs.

Root commands are orchestrated with Turborepo. Use package filters when you only want one app, for example pnpm --filter @sexyvoice/web dev.

🚀 Getting Started

Prerequisites

Installation

  1. Clone the repository

    git clone https://github.com/gianpaj/sexyvoice.git
    cd sexyvoice
  2. Install dependencies

    pnpm install
  3. Set up environment variables

    cp apps/web/.env.example apps/web/.env.local

    Fill in the required environment variables as defined in apps/web/.env.example:

    • Supabase
      • NEXT_PUBLIC_SUPABASE_URL
      • NEXT_PUBLIC_SUPABASE_ANON_KEY
      • SUPABASE_SERVICE_ROLE_KEY - For admin access to Supabase (used in Telegram bot cronjob)
    • Redis (Upstash)
      • KV_REST_API_URL
      • KV_REST_API_TOKEN
    • Cloudflare R2 storage
      • R2_ACCESS_KEY_ID
      • R2_SECRET_ACCESS_KEY
      • R2_BUCKET_NAME
      • R2_SPEECH_API_BUCKET_NAME - Dedicated bucket for /api/v1/speech generated audio
      • R2_ENDPOINT - Your Cloudflare R2 endpoint URL (https://xxx.r2.cloudflarestorage.com)
    • Third-party AI services
      • REPLICATE_API_TOKEN - Your Replicate API token for AI voice generation
      • FAL_KEY - Your fal.ai API key for voice cloning
      • GOOGLE_GENERATIVE_AI_API_KEY - Your Google Generative AI API key for text-to-speech and text enhancement (automatically adds emotion tags)
      • XAI_API_KEY - Your xAI API key for Grok TTS voice generation
    • Real-time Calls (LiveKit)
      • LIVEKIT_URL
      • LIVEKIT_API_KEY
      • LIVEKIT_API_SECRET
    • Stripe
      • STRIPE_SECRET_KEY
      • STRIPE_WEBHOOK_SECRET
      • STRIPE_PRICING_ID - Stripe pricing ID for Pricing table
      • STRIPE_PUBLISHABLE_KEY - for Stripe Pricing table
      • STRIPE_TOPUP_5_PRICE_ID
      • STRIPE_TOPUP_10_PRICE_ID
      • STRIPE_TOPUP_99_PRICE_ID
    • Banner and promotion configuration
      • NEXT_PUBLIC_PROMO_ENABLED - Enables promo banners and bonus-credit pricing
      • NEXT_PUBLIC_ACTIVE_PROMO_BANNER - Active promo banner id from apps/web/messages/*.json and apps/web/lib/banners/registry.ts
      • NEXT_PUBLIC_ACTIVE_ANNOUNCEMENT_BANNER - Active announcement banner id from apps/web/messages/*.json and apps/web/lib/banners/registry.ts
      • NEXT_PUBLIC_PROMO_TRANSLATIONS - Legacy fallback for active promo banner selection
      • NEXT_PUBLIC_PROMO_THEME - Banner theme (pink, orange, blue)
      • NEXT_PUBLIC_PROMO_COUNTDOWN_END_DATE - Optional countdown end date for promo banners that support it
      • NEXT_PUBLIC_PROMO_ID - Promo identifier still used by Stripe metadata and credit bonus flows
      • NEXT_PUBLIC_PROMO_BONUS_STARTER
      • NEXT_PUBLIC_PROMO_BONUS_STANDARD
      • NEXT_PUBLIC_PROMO_BONUS_PRO
    • Telegram cronjob
      • TELEGRAM_WEBHOOK_URL - for daily stats notifications
      • CRON_SECRET - For securing the API route - See Managing Cron Jobs
    • Axiom logging (optional)
      • AXIOM_TOKEN - Your Axiom API token for structured request logging on /api/v1/speech
    • API key security
      • API_KEY_HMAC_SECRET - Secret used to HMAC-SHA256 hash API keys before storing them in the database. Generate with openssl rand -hex 32. Without this, keys fall back to plain SHA-256 (acceptable in development, never in production).
    • Vercel Edge Config (optional)
      • EDGE_CONFIG - Your Vercel Edge Config connection string (automatically set when you link an Edge Config to your project)
    • Additional optional variables for analytics and monitoring (Crisp, PostHog)

    For the full environment variable reference, deployment setup, infrastructure notes, and operational guidance, see DevOps Guide.

  4. Set up Supabase

    • Create a new project at Supabase
    • Run database migrations:
    cd apps/web
    supabase db push
    cd ../..
  5. Start the development server

    pnpm dev
  6. Open your browser

    The web app runs through Portless at https://sv.dev. The portless.json app name only sets the route name; PORTLESS_TLD=dev in apps/web/package.json makes the route use .dev instead of the default .localhost.

    If you previously installed the Portless startup service, it may restart the default .localhost proxy on port 443 and block .dev. Remove the service before starting the app:

    sudo portless service uninstall
    sudo portless proxy stop
    pnpm dev

Banner System

The app uses a shared banner system for both promotions and announcements:

  • apps/web/components/banner.tsx renders the banner UI
  • apps/web/lib/banners/registry.ts defines supported banners
  • apps/web/lib/banners/resolve-banner.ts resolves the single visible banner per placement
  • apps/web/app/[lang]/actions/banners.ts handles dismissal cookies

Banner copy is localized in apps/web/messages/*.json. Only one banner is shown at a time, and each banner has its own dismiss cookie.
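
The "one banner at a time, one dismiss cookie per banner" rule can be sketched roughly as follows. This is an illustration only, not the code in apps/web/lib/banners/resolve-banner.ts: the Banner shape, the banner_dismissed_<id> cookie naming, and the choice to prefer promos over announcements are all assumptions made for the example.

```typescript
// Hypothetical shapes: the real registry in apps/web/lib/banners/registry.ts may differ.
type BannerKind = "promo" | "announcement";

interface Banner {
  id: string;
  kind: BannerKind;
}

// Assumed cookie naming scheme: one cookie per banner id.
function dismissCookieName(bannerId: string): string {
  return `banner_dismissed_${bannerId}`;
}

// Returns the single banner to show, or null when every candidate is dismissed.
// Assumption: a visible promo takes priority over announcements.
function resolveVisibleBanner(
  candidates: Banner[],
  cookies: ReadonlyMap<string, string>,
): Banner | null {
  const visible = candidates.filter(
    (banner) => !cookies.has(dismissCookieName(banner.id)),
  );
  const promo = visible.find((banner) => banner.kind === "promo");
  return promo ?? visible[0] ?? null;
}
```

Because each banner is tracked by its own cookie, dismissing one banner in this sketch lets the next candidate become visible instead of hiding all banners.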

🧪 Development

Available Scripts

| Command | Description |
| --- | --- |
| pnpm dev | Start all workspace dev tasks |
| pnpm --filter @sexyvoice/web dev | Start only the web app dev server |
| pnpm build | Build workspace apps with Turbo |
| pnpm test | Run test suites |
| pnpm test:ui | Run Vitest UI for the web app |
| pnpm lint | Lint codebase with Biome |
| pnpm type-check | Run TypeScript type checking |
| pnpm format | Format code with Biome |
| pnpm check-translations | Validate all locale files have matching keys |
| pnpm build:content | Build web app content layer |
| pnpm clean | Check unused dependencies with Knip |
| pnpm fixall | Run all fixes: lint, format, and check |
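
To make the check-translations idea concrete, here is a minimal sketch of comparing the keys of two locale message objects. It is not the project's actual script: the function names are hypothetical, and it assumes messages are nested JSON objects whose leaves are strings, as is typical for apps/web/messages/*.json files used with next-intl.

```typescript
// Messages are assumed to be nested objects whose leaves are translated strings.
type Messages = { [key: string]: string | Messages };

// Flattens nested message keys into dot-separated paths, e.g. "banner.title".
function flattenKeys(messages: Messages, prefix = ""): string[] {
  const paths: string[] = [];
  for (const key of Object.keys(messages)) {
    const value = messages[key];
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "string") {
      paths.push(path);
    } else {
      paths.push(...flattenKeys(value, path));
    }
  }
  return paths;
}

// Reports keys missing from, or extra in, a locale compared with the reference locale.
function compareLocale(
  reference: Messages,
  locale: Messages,
): { missing: string[]; extra: string[] } {
  const referenceKeys = flattenKeys(reference);
  const localeKeys = flattenKeys(locale);
  const referenceSet = new Set(referenceKeys);
  const localeSet = new Set(localeKeys);
  return {
    missing: referenceKeys.filter((key) => !localeSet.has(key)),
    extra: localeKeys.filter((key) => !referenceSet.has(key)),
  };
}
```

A real check would load every locale file, compare each against one reference locale (for example en.json), and exit with a non-zero status when either list is non-empty.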

Testing

Run the test suite:

pnpm test

For the Vitest UI during development:

pnpm test:ui

Database Operations

Generate TypeScript types from Supabase Cloud Database:

pnpm run generate-supabase-types

Push schema changes to Supabase:

cd apps/web
supabase db push
cd ../..

Fetch database migrations:

cd apps/web
supabase migration fetch
cd ../..

Backup database and schema:

export SUPABASE_DB_URL=postgresql://postgres:xxx@db.yyyy.supabase.co:5432/postgres
sh ./scripts/db_backups.sh

Mintlify

The docs site remains the Mintlify project for docs.sexyvoice.ai.

  • In Mintlify Git Settings, point the project to this monorepo repository and the production branch.
  • Enable Mintlify monorepo mode.
  • Set the docs path to /apps/docs with no trailing slash.
  • Keep the existing custom domain and GitHub App installation attached to the repository/branch used for docs deployments.

Video Generation

Generate waveform videos for audio files using seewav:

pip3 install seewav
seewav your_audio.mp3 --color '0.8,0.0,0.4'

WAV to MP3

# Convert WAV to MP3 with specific audio settings
# -i input.wav: Input file
# -acodec libmp3lame: Use LAME MP3 encoder
# -q:a 2: Variable bit rate quality (0=highest, 9=lowest)
# -ar 24000: Set audio sample rate to 24kHz
# -ac 1: Set audio channels to mono (1 channel)
# output.mp3: Output file
ffmpeg -i input.wav -acodec libmp3lame -q:a 2 -ar 24000 -ac 1 output.mp3

# For high quality MP3
ffmpeg -i input.wav -acodec libmp3lame -q:a 0 -ar 44100 -ac 2 output-high-quality.mp3

# For lowest quality MP3 possible
ffmpeg -i input.wav -acodec libmp3lame -q:a 9 -ar 8000 -ac 1 output-lowest.mp3
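
If you need to run these conversions from Node rather than a shell, the three commands above can be expressed as presets. The sketch below only builds the argument list; the preset names and the buildMp3Args helper are made up for this example and are not part of the repository.

```typescript
// Quality presets mirroring the three ffmpeg commands above.
// The preset names are invented for this example.
type Mp3Preset = "speech" | "high" | "lowest";

const PRESETS: Record<Mp3Preset, { quality: number; sampleRate: number; channels: number }> = {
  speech: { quality: 2, sampleRate: 24000, channels: 1 },
  high: { quality: 0, sampleRate: 44100, channels: 2 },
  lowest: { quality: 9, sampleRate: 8000, channels: 1 },
};

// Builds the ffmpeg argument list for a WAV to MP3 conversion.
function buildMp3Args(input: string, output: string, preset: Mp3Preset): string[] {
  const { quality, sampleRate, channels } = PRESETS[preset];
  return [
    "-i", input,
    "-acodec", "libmp3lame",
    "-q:a", String(quality),
    "-ar", String(sampleRate),
    "-ac", String(channels),
    output,
  ];
}
```

The returned array can be passed to child_process.spawn("ffmpeg", args), which avoids shell quoting problems with file names that contain spaces.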

🔒 Security

SexyVoice.ai implements multiple security layers:

  • Authentication: Secure OAuth integration with Supabase Auth
  • Data Protection: Row-level security (RLS) policies in PostgreSQL
  • API Security: Rate limiting and request validation
  • File Security: Secure R2 storage with access controls
  • Error Handling: Comprehensive error tracking with Sentry
  • Environment Isolation: Separate configurations for development and production
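
As one concrete case of the API security layer, the API_KEY_HMAC_SECRET setting described under Installation hashes API keys with HMAC-SHA256 before storage and falls back to plain SHA-256 when the secret is absent. A minimal sketch of that behavior with Node's built-in crypto module follows. The hashApiKey name and the hex output encoding are assumptions, not taken from the codebase.

```typescript
import { createHash, createHmac } from "node:crypto";

// Hashes an API key for storage.
// With a secret: HMAC-SHA256 keyed by that secret.
// Without a secret: plain SHA-256 (acceptable in development only).
// The function name and hex encoding are assumptions for this sketch.
function hashApiKey(apiKey: string, hmacSecret?: string): string {
  if (hmacSecret) {
    return createHmac("sha256", hmacSecret).update(apiKey).digest("hex");
  }
  return createHash("sha256").update(apiKey).digest("hex");
}
```

A caller would typically pass process.env.API_KEY_HMAC_SECRET as the second argument. The point of the keyed variant is that a leaked table of hashes cannot be checked against guessed keys without also knowing the secret.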

🀝 Contributing

We welcome contributions!

  • Report bugs
  • Suggest features
  • Submit pull requests
  • Review the DevOps Guide for environment variables, deployment, infrastructure, and operational setup changes

Setup

Zed with Cspell extension

npm install -g cspell @cspell/dict-de-de @cspell/dict-es-es
asdf reshim nodejs
cspell link add @cspell/dict-de-de
cspell link add @cspell/dict-es-es
# restart Zed language server

📄 License

This project is licensed under the MIT License.

🔗 Links

🏗️ Project Status

SexyVoice.ai is actively developed and maintained. Check the roadmap for upcoming features and improvements.

Current Version

  • ✅ Core voice generation functionality
  • ✅ Voice cloning with custom audio samples
  • ✅ User authentication and profiles
  • ✅ Credit system and payment processing
  • ✅ Website multi-language support (EN/ES/DE/DA/IT/FR) via next-intl
  • ✅ Audio transcription and translation tool
  • ✅ Real-time AI voice calls with configurable AI agents
  • ✅ API access

Supported Voice Families and Languages

Google Gemini (gpro) multilingual voices

Primary Gemini voices currently exposed in the app:

  • achernar
  • aoede
  • autonoe
  • callirrhoe
  • despina
  • erinome
  • gacrux
  • kore
  • puck
  • sulafat
  • zephyr

These multilingual Gemini voices support style prompting and the following language/locale set:

| Language | BCP-47 Code | Language | BCP-47 Code |
| --- | --- | --- | --- |
| Arabic (Egyptian) | ar-EG | German (Germany) | de-DE |
| English (US) | en-US | Spanish (US) | es-US |
| French (France) | fr-FR | Hindi (India) | hi-IN |
| Indonesian (Indonesia) | id-ID | Italian (Italy) | it-IT |
| Japanese (Japan) | ja-JP | Korean (Korea) | ko-KR |
| Portuguese (Brazil) | pt-BR | Russian (Russia) | ru-RU |
| Dutch (Netherlands) | nl-NL | Polish (Poland) | pl-PL |
| Thai (Thailand) | th-TH | Turkish (Turkey) | tr-TR |
| Vietnamese (Vietnam) | vi-VN | Romanian (Romania) | ro-RO |
| Ukrainian (Ukraine) | uk-UA | Bengali (Bangladesh) | bn-BD |
| English (India) | en-IN & hi-IN bundle | Marathi (India) | mr-IN |
| Tamil (India) | ta-IN | Telugu (India) | te-IN |

xAI Grok (xai) expressive voices

Primary Grok voices currently exposed in the app:

  • ara
  • eve
  • leo
  • rex
  • sal

These voices support expressive inline tags like [laugh] and wrapping tags like <fast>...</fast>, plus automatic language detection and the following language/locale options:

| Language / Locale | Code | Language / Locale | Code |
| --- | --- | --- | --- |
| English | en | Japanese | ja |
| Arabic (Egypt) | ar-EG | Korean | ko |
| Arabic (Saudi Arabia) | ar-SA | Portuguese (Brazil) | pt-BR |
| Arabic (United Arab Emirates) | ar-AE | Portuguese (Portugal) | pt-PT |
| Bengali | bn | Russian | ru |
| Chinese (Simplified) | zh | Spanish (Spain) | es-ES |
| French | fr | Spanish (Mexico) | es-MX |
| German | de | Turkish | tr |
| Hindi | hi | Vietnamese | vi |
| Indonesian | id | Italian | it |
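
The two tag styles mentioned before the table (inline tags such as [laugh] and wrapping tags such as <fast>...</fast>) can be composed with small string helpers. This is only an illustration of the syntax: the helper names are hypothetical, and laugh and fast are the only tag names taken from this README, so do not assume other names are supported.

```typescript
// Inline tags mark a one-off sound at a point in the text, e.g. [laugh].
function inlineTag(name: string): string {
  return `[${name}]`;
}

// Wrapping tags apply a delivery style to the enclosed text, e.g. <fast>...</fast>.
function wrapTag(name: string, text: string): string {
  return `<${name}>${text}</${name}>`;
}

// Example input text for a Grok voice, using only the tags named in this README.
const script = `That was unexpected ${inlineTag("laugh")} ${wrapTag("fast", "anyway, moving on.")}`;
```

Here script holds the plain string "That was unexpected [laugh] <fast>anyway, moving on.</fast>", which is what would be submitted as the text to synthesize.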

Additional English voices

We also expose English-focused Orpheus voices:

  • dan (en-GB)
  • emma (en-US)
  • josh (en-US)
  • tara (en-US)

Made with ❤️ by Gianfranco
