⚠️ Internal Alpha - This project is in early development and not ready for production use.
- Install Rust (stable) and required system dependencies for your platform.
- Use the provided scripts in scripts/ to help with local environment setup.
To reduce the chance of CI failures from formatting, this repository includes a small pre-commit hook that runs cargo fmt --all before each commit and blocks the commit if rustfmt makes changes. To enable it for your local clone:
cd <repo-root>
./scripts/install-githooks.sh
This copies the hooks from .githooks/ into .git/hooks/ and makes them executable. You can remove or modify the hook if you want different behavior.
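For orientation, a hook with this kind of behavior could look roughly like the sketch below. It is not the contents of .githooks/, which may be implemented differently. It swaps in rustfmt's check mode (cargo fmt --all -- --check), which reports differences instead of rewriting files, and FMT_CHECK_CMD is an assumed override variable introduced here only so the function can be tried without a Rust toolchain.

```shell
#!/bin/sh
# Hypothetical sketch of a formatting pre-commit hook. The actual hook shipped
# in .githooks/ may be implemented differently.

# FMT_CHECK_CMD is an assumed override point (not a ColdVox setting). By
# default the function asks rustfmt to report, not apply, formatting
# differences across the whole workspace.
run_fmt_hook() {
    cmd="${FMT_CHECK_CMD:-cargo fmt --all -- --check}"
    # Intentionally unquoted so the command string splits into words.
    if $cmd >/dev/null 2>&1; then
        return 0
    fi
    echo "pre-commit: formatting differences found; run 'cargo fmt --all' and re-stage." >&2
    return 1
}

# As the last line of a pre-commit hook one would call:
#   run_fmt_hook || exit 1
```

Git aborts a commit whenever the pre-commit hook exits with a non-zero status, so returning 1 is enough to block it.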
Minimal root README. Full developer & architecture guide: see CLAUDE.md.
ColdVox is a modular Rust workspace providing real‑time audio capture, VAD, STT (Vosk), and cross‑platform text injection.
For Voice Dictation (Recommended):
# Run with default Vosk STT and text injection (model auto-discovered)
cargo run --features text-injection
# With specific microphone device
cargo run --features text-injection -- --device "HyperX QuadCast"
# TUI Dashboard with controls
cargo run --bin tui_dashboard --features tui
Other Usage:
# VAD-only mode (no speech recognition)
cargo run
# Test microphone setup
cargo run --bin mic_probe -- list-devices
Note on Defaults: Vosk STT is now the default feature (enabled automatically), so the app and tests use real speech recognition instead of falling back to the mock plugin, which skips transcription. To override this for testing, pass --stt-preferred mock or set the environment variable COLDVOX_STT_PREFERRED=mock. For other STT backends (e.g., Whisper), enable their features and set the preferred backend accordingly.
- Small Model (~40MB, included): Located at models/vosk-model-small-en-us-0.15/
- Auto-Discovery: Model automatically found when running from project root
- Manual Path: Set VOSK_MODEL_PATH for custom locations if needed
- Verification: sha256sum -c models/vosk-model-small-en-us-0.15/SHA256SUMS
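The model lookup described above can be approximated with a small preflight check before launching the app. This is a sketch only: resolve_vosk_model and check_vosk_model are hypothetical helper names, and the rule that VOSK_MODEL_PATH takes priority over the bundled model is an assumption drawn from the list above, not ColdVox's actual discovery code.

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of ColdVox itself.

# Bundled small model, relative to the project root (path from this README).
DEFAULT_VOSK_MODEL="models/vosk-model-small-en-us-0.15"

# Print the model directory a run would be expected to use:
# VOSK_MODEL_PATH when set and non-empty, otherwise the bundled small model.
resolve_vosk_model() {
    printf '%s\n' "${VOSK_MODEL_PATH:-$DEFAULT_VOSK_MODEL}"
}

# Succeed only if the resolved path is an existing directory.
check_vosk_model() {
    model_dir="$(resolve_vosk_model)"
    if [ -d "$model_dir" ]; then
        echo "Vosk model found: $model_dir"
    else
        echo "Vosk model missing: $model_dir" >&2
        return 1
    fi
}
```

Running check_vosk_model from the project root before cargo run gives a quick signal when the model directory is missing. It does not replace the sha256sum verification, which checks file contents rather than presence.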
- Audio Capture → VAD → STT → Text Injection
- Push-to-Talk: Hold Super+Ctrl, speak, release (hotkey mode)
- Voice Activation: Automatically detects speech and transcribes (VAD mode)
More detail: See CLAUDE.md for the full developer guide.
Some end‑to‑end tests exercise real injection and STT. The plan is to gate them behind an environment variable that you set locally:
export COLDVOX_SLOW_TESTS=1
cargo test -- --ignored
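Until that gating exists in the test suite itself, a wrapper script could apply it from the outside. The following is a minimal sketch under that assumption; slow_test_command is a hypothetical helper name and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical wrapper; the COLDVOX_SLOW_TESTS gate is described as planned,
# so this only illustrates how the variable could select a test command.

# Print the cargo invocation that matches the current setting.
slow_test_command() {
    if [ "${COLDVOX_SLOW_TESTS:-0}" = "1" ]; then
        # --ignored runs only the tests marked #[ignore].
        echo "cargo test -- --ignored"
    else
        echo "cargo test"
    fi
}

# Example use in a script (left commented out so sourcing has no side effects):
#   $(slow_test_command)
```

Note that --ignored runs only the ignored tests. Cargo's test harness also accepts --include-ignored to run them together with the regular suite.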
Headless behavior notes: see docs/text_injection_headless.md.
Dual-licensed under MIT or Apache-2.0. See LICENSE-MIT and LICENSE-APACHE if present; otherwise see the crate-level manifests.