🧠 Jarvis – A Modular Desktop Voice Assistant (Python)

Jarvis is an extensible desktop voice assistant built in Python. The project started as a hobby in Class 8, evolved gradually over the years, and was showcased at a school exhibition. Today it handles voice commands, web automation, translation, media control, smart-home triggers, and more.

Project Overview

Jarvis is a voice-driven personal assistant built with modular Python scripts. It listens to commands through speech recognition and performs actions such as:

Searching Wikipedia
Controlling system apps
Managing Spotify music
Opening games & software
Translating speech
Reading news
Opening the camera
Setting alarms
Automating smart bulb actions
Keyboard & volume automation

…and much more. The architecture is split into multiple small modules, making the code more maintainable and scalable.

Features

🎤 Voice Interaction

Hotword-less continuous listening
Speech recognition (Google)
High-quality offline TTS using pyttsx3

🌐 Internet & Web Tasks

Wikipedia search
Google queries & website opening
Live weather extraction from Google
News headlines via NewsAPI

🖥 System & Application Automation

Open Chrome, CMD, VS Code, Steam, Photoshop, Audacity, Premiere Pro, Minecraft, Fall Guys
PC volume control using pynput keyboard
YouTube controls (pause/play/mute) using pyautogui

🎵 Music Automation

Spotify search
Auto-click based playback/like/add-to-playlist

📸 Camera Access

Live webcam feed using OpenCV

⏰ Alarm System

Dedicated alarm.py script
Config saved through Alarmtext.txt
Auto music playback when time matches

💡 Smart Home Automation

Trigger smart bulb automation via keyboard search + UI navigation
Color change, brightness interactions

🌍 Translation Module

Real-time speech translation using googletrans + gTTS
Auto-generated voice output

😂 Entertainment

Joke generator using pyjokes

Modules Explained

Each module in the codebase is described below:

1️⃣ jarvis.py (Main Module)

The core engine. Handles:
Voice input/output
Command parsing
All high-level task routing
Wikipedia search
App automation
Smart device interactions
Temperature extraction
Camera access
Email sending
Spotify automation
Translating via translator module
News reading (via NewsRead module)
Calling alarm system
This is the brain of your assistant.

2️⃣ alarm.py

Handles alarm scheduling and ringing:
Reads time from Alarmtext.txt
Compares current PC time with target
Plays music (music.mp3)
Auto-reset alarm text file
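
The read, compare, ring, and reset steps above could be sketched roughly as follows. This is an illustration, not the contents of alarm.py: the "HH:MM" file format, the function names, and the `ring` callback (which would play music.mp3) are all assumptions.

```python
from datetime import datetime
from pathlib import Path
from typing import Callable, Optional
import time


def read_alarm_time(path: Path) -> Optional[str]:
    """Return the stored alarm time as 'HH:MM', or None if unset or invalid.

    The 'HH:MM' format is an assumption about Alarmtext.txt.
    """
    if not path.exists():
        return None
    text = path.read_text().strip()
    try:
        return datetime.strptime(text, "%H:%M").strftime("%H:%M")
    except ValueError:
        return None


def check_alarm(path: Path, now: datetime, ring: Callable[[], None]) -> bool:
    """Ring and clear the file if the stored time matches `now`."""
    target = read_alarm_time(path)
    if target is None or target != now.strftime("%H:%M"):
        return False
    ring()               # hypothetical callback, e.g. play music.mp3
    path.write_text("")  # reset so the alarm does not fire again
    return True


def run_alarm(path: Path, ring: Callable[[], None], poll_seconds: float = 10.0) -> None:
    """Poll the file until the alarm fires once."""
    while not check_alarm(path, datetime.now(), ring):
        time.sleep(poll_seconds)
```

Passing the current time and the ring action as arguments keeps the comparison logic testable without waiting for a real clock or playing audio.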

3️⃣ click coordinate.py

A utility tool:
Prints current mouse coordinates
Helps you record pyautogui click positions for automation

4️⃣ greetme.py

Provides a clean greeting function:
Detects morning/afternoon/evening
Speaks greeting messages
Reusable inside other modules
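
A minimal sketch of such a greeting helper is shown below. The hour boundaries (12 and 18), the function names, and the greeting text are assumptions for illustration; `speak` stands in for whatever TTS function the assistant uses.

```python
from datetime import datetime


def greeting_for(hour: int) -> str:
    """Return a greeting for an hour in 0-23 (the boundaries are assumed)."""
    if 0 <= hour < 12:
        return "Good morning"
    if 12 <= hour < 18:
        return "Good afternoon"
    return "Good evening"


def greet_me(speak=print) -> None:
    """Greet the user through any callable that accepts text."""
    speak(f"{greeting_for(datetime.now().hour)}! How can I help you?")
```

Because `greet_me` only needs a callable, other modules can reuse it with `print` during testing and with the real speech function at runtime.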

5️⃣ keyboard.py

Custom keyboard controller using pynput:
Volume up/down functions
Smooth press-release loops
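
The press-release loop could look something like the sketch below. The function names, repeat count, and delay are assumptions rather than the contents of keyboard.py; `Key.media_volume_up` and `Key.media_volume_down` are the media keys that pynput exposes.

```python
import time


def tap_key(controller, key, times: int = 5, delay: float = 0.1) -> None:
    """Press and release `key` `times` times, pausing after each tap."""
    for _ in range(times):
        controller.press(key)
        controller.release(key)
        time.sleep(delay)


def volume_up(times: int = 5) -> None:
    # Imported lazily so tap_key can be exercised without pynput installed.
    from pynput.keyboard import Controller, Key
    tap_key(Controller(), Key.media_volume_up, times)


def volume_down(times: int = 5) -> None:
    from pynput.keyboard import Controller, Key
    tap_key(Controller(), Key.media_volume_down, times)
```

Each tap changes the system volume by one step, so `times` controls how far the volume moves.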

6️⃣ NewsRead.py

News reading engine:
Fetches category-wise headlines
Uses NewsAPI
Reads each headline aloud
Offers "continue/stop" interactive CLI
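
As a rough sketch of this flow, the example below builds a request for NewsAPI's top-headlines endpoint, extracts the titles, and reads them one at a time. The function names, the default country code "in", and the prompt wording are assumptions, and the standard library is used here in place of whatever HTTP client NewsRead.py uses.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

NEWSAPI_URL = "https://newsapi.org/v2/top-headlines"


def build_url(category: str, api_key: str, country: str = "in") -> str:
    """Build the request URL for one news category (country is assumed)."""
    query = urlencode({"country": country, "category": category, "apiKey": api_key})
    return f"{NEWSAPI_URL}?{query}"


def extract_titles(payload: dict) -> list:
    """Pull the non-empty article titles out of a NewsAPI response."""
    return [a["title"] for a in payload.get("articles", []) if a.get("title")]


def fetch_headlines(category: str, api_key: str, country: str = "in") -> list:
    """Download and parse the headlines for one category."""
    with urlopen(build_url(category, api_key, country), timeout=10) as resp:
        return extract_titles(json.load(resp))


def read_headlines(titles, speak=print, ask=input) -> None:
    """Read each headline, stopping when the user types 'stop'."""
    for title in titles:
        speak(title)
        if ask("Press Enter to continue or type 'stop': ").strip().lower() == "stop":
            break
```

A call such as `read_headlines(fetch_headlines("sports", "YOUR_API_KEY"))` would tie the pieces together, with `speak` replaced by the assistant's TTS function.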

7️⃣ Translator.py

Full-feature speech → translation → speech system:
Listens via microphone
Uses googletrans to translate any input
Uses gTTS to create temporary output audio
Plays the file, then deletes it
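
The translate, synthesize, play, and delete sequence might be sketched as follows. The function names and the way the steps are passed in as callables are assumptions made for illustration; the three wrappers at the bottom show one plausible way to back them with googletrans, gTTS, and playsound.

```python
import os
import tempfile


def speak_translated(text, dest, translate, synthesize, play):
    """Translate `text`, voice it through a temporary MP3, then delete the file."""
    translated = translate(text, dest)
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # only the path is needed; the synthesizer writes the file
    try:
        synthesize(translated, dest, path)
        play(path)
    finally:
        os.remove(path)  # clean up even if synthesis or playback fails
    return translated


def google_translate(text, dest):
    # Lazy import; Translator.translate is synchronous in googletrans 3.1.0a0.
    from googletrans import Translator
    return Translator().translate(text, dest=dest).text


def gtts_synthesize(text, lang, path):
    from gtts import gTTS
    gTTS(text=text, lang=lang).save(path)


def play_file(path):
    from playsound import playsound
    playsound(path)
```

With the libraries installed, `speak_translated("good morning", "es", google_translate, gtts_synthesize, play_file)` would run the full pipeline.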

8️⃣ Additional Files

Alarmtext.txt → stores temporary alarm time
music.mp3 → plays when alarm triggers
settings.json (VS Code) → personal environment setup
__pycache__/ → Python cache files (ignored in README)

Folder Structure

Jarvis/
│
├── jarvis.py
├── alarm.py
├── click coordinate.py
├── greetme.py
├── keyboard.py
├── NewsRead.py
├── Translator.py
│
├── Alarmtext.txt
├── music.mp3
├── settings.json (if present)
│
├── __pycache__/
└── README.md (generated)

Installation

🔧 Requirements

Install the required libraries:

 pip install pyttsx3 speechrecognition wikipedia pyautogui opencv-python
 pip install requests beautifulsoup4 pyjokes selenium googletrans==3.1.0a0
 pip install gTTS playsound pynput keyboard
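
Several of these packages are imported under a different name than the one passed to pip. As a convenience, a small check like the sketch below can report which ones are still missing. The script and its function name are hypothetical and not part of the project.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements
IMPORT_NAMES = {
    "pyttsx3": "pyttsx3",
    "speechrecognition": "speech_recognition",
    "wikipedia": "wikipedia",
    "pyautogui": "pyautogui",
    "opencv-python": "cv2",
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "pyjokes": "pyjokes",
    "selenium": "selenium",
    "googletrans": "googletrans",
    "gTTS": "gtts",
    "playsound": "playsound",
    "pynput": "pynput",
    # Note: when run from the project folder, the local keyboard.py
    # also satisfies this lookup.
    "keyboard": "keyboard",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip names whose import module cannot be found."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing: " + ", ".join(missing) if missing else "All dependencies found.")
```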

🔧 Additional Setup

Chrome and app paths: Update the Chrome and application paths inside jarvis.py to match your system
NewsAPI key: Replace with your own API key inside NewsRead.py
Alarm music: Ensure music.mp3 exists in project root
Microphone required

How It Works

Jarvis boots & greets the user
Begins listening continuously
Speech is converted to text
Query is matched with command blocks
Corresponding module is triggered
Module performs action (web automation, opening apps, etc.)
Jarvis confirms completion & waits for next instruction

The system is based on simple conditional command parsing, making it easy to add new commands.
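
The keyword matching described above can be sketched as a small lookup table. This is an assumption about layout, not the structure of jarvis.py, which uses conditional blocks; the handler names and reply strings are hypothetical placeholders for the real actions.

```python
def search_wikipedia(query: str) -> str:
    """Placeholder handler: strip the trigger word and report the topic."""
    topic = query.replace("wikipedia", "", 1).strip()
    return f"Searching Wikipedia for {topic}."


def open_chrome(query: str) -> str:
    return "Opening Chrome."


def tell_joke(query: str) -> str:
    return "Here is a joke."


# Checked in order; the first keyword found in the query wins.
COMMANDS = [
    ("wikipedia", search_wikipedia),
    ("open chrome", open_chrome),
    ("joke", tell_joke),
]


def handle(query: str) -> str:
    """Route a recognized phrase to the first handler whose keyword matches."""
    query = query.lower().strip()
    for keyword, action in COMMANDS:
        if keyword in query:
            return action(query)
    return "Sorry, I did not understand that."
```

Adding a command then means writing one handler and appending one `(keyword, handler)` pair, which mirrors adding another conditional block in the original design.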

Usage Guide

Speak commands such as:

🌐 Internet

“Wikipedia Elon Musk”
“Search Instagram”
“Open Chrome”

🖥 Apps

“Open Command Prompt”
“Open VS Code”
“Open Steam”
“Open Photoshop”

🎵 Music

“Play music”
“Like this song”
“Add this song to the playlist”

📸 Hardware

“Open camera”

⏰ Alarm

“Set an alarm”

😂 Fun

“Tell me a joke”

Future Improvements

🔮 AI/ML Enhancements

Add an NLP model (transformer-based) to understand natural language beyond keywords
Build a context-aware conversation engine
Train a lightweight intent classifier for smarter routing
Add embeddings to remember user preferences
Integrate offline models (e.g., GPT4All or LLaMA for language understanding, Whisper for speech recognition)

🌐 System & Feature Enhancements

Add wake-word engine (“Hey Jarvis”)
Add GUI dashboard
Create plugin system for custom commands
Add robust error-handling & logs
Cloud sync for preferences and history

Tech Stack

Programming Language:

Python 3.x

Core Libraries:

pyttsx3 (Offline TTS)
speech_recognition (Voice input)
wikipedia
opencv-python (Camera module)
pyautogui (Automation)
requests + BeautifulSoup (Web scraping)
pynput (Keyboard/volume control)
googletrans + gTTS (Translation)
selenium (Browser automation)
NewsAPI (News)

Concepts Used:

Modular Python design
Rule-based command engine
Web automation
Multimedia automation
Basic natural language processing (keyword parsing)
Multi-threaded interaction (via separate alarm module)
System-level automation

Credits

Developer: Darsh Yadav
Project started in Class 8, gradually improved over the years
Presented at a school exhibition
Powered entirely by Python and open-source libraries
