darshyadav12/Jarvis

🧠Jarvis – A Modular Desktop Voice Assistant (Python)

Jarvis is an extensible desktop voice assistant built in Python. The project started as a hobby in Class 8, evolved gradually over the years, and was showcased at a school exhibition. Today it handles voice commands, web automation, translation, media control, smart-home triggers, and more.

Project Overview

Jarvis is a voice-driven personal assistant built with modular Python scripts. It listens to commands through speech recognition and performs actions such as:

  • Searching Wikipedia
  • Controlling system apps
  • Managing Spotify music
  • Opening games & software
  • Translating speech
  • Reading news
  • Opening the camera
  • Setting alarms
  • Automating smart bulb actions
  • Keyboard & volume automation

…and much more. The architecture is split into multiple small modules, making the code more maintainable and scalable.

Features

🎤 Voice Interaction

  • Hotword-less continuous listening
  • Speech recognition (Google)
  • High-quality offline TTS using pyttsx3

🌐 Internet & Web Tasks

  • Wikipedia search
  • Google queries & website opening
  • Live weather extraction from Google
  • News headlines via NewsAPI

🖥 System & Application Automation

  • Open Chrome, CMD, VS Code, Steam, Photoshop, Audacity, Premiere Pro, Minecraft, Fall Guys
  • PC volume control using pynput keyboard
  • YouTube controls (pause/play/mute) using pyautogui

🎵 Music Automation

  • Spotify search
  • Auto-click based playback/like/add-to-playlist

📸 Camera Access

  • Live webcam feed using OpenCV

⏰ Alarm System

  • Dedicated alarm.py script
  • Config saved through Alarmtext.txt
  • Auto music playback when time matches

💡 Smart Home Automation

  • Trigger smart bulb automation via keyboard search + UI navigation
  • Color change, brightness interactions

🌍 Translation Module

  • Real-time speech translation using googletrans + gTTS
  • Auto-generated voice output

😂 Entertainment

  • Joke generator using pyjokes

Modules Explained

Each module in the codebase is described below:

1️⃣ jarvis.py (Main Module)

  • The core engine and brain of the assistant. Handles:
  • Voice input/output
  • Command parsing
  • All high-level task routing
  • Wikipedia search
  • App automation
  • Smart device interactions
  • Temperature extraction
  • Camera access
  • Email sending
  • Spotify automation
  • Translation (via the Translator module)
  • News reading (via the NewsRead module)
  • Calling the alarm system

2️⃣ alarm.py

  • Handles alarm scheduling and ringing:
  • Reads time from Alarmtext.txt
  • Compares current PC time with target
  • Plays music (music.mp3)
  • Auto-reset alarm text file
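
The alarm steps above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of alarm.py: it assumes Alarmtext.txt holds a single 24-hour "HH:MM" value, and the function names read_alarm_time, is_due, and reset_alarm are hypothetical. Playing music.mp3 is left out so the logic stays easy to test.

```python
from datetime import datetime
from pathlib import Path

ALARM_FILE = Path("Alarmtext.txt")  # file name from this README; its format is assumed


def read_alarm_time(path=ALARM_FILE):
    """Return the stored alarm as "HH:MM", or None if the file is missing, empty, or invalid."""
    if not path.exists():
        return None
    text = path.read_text().strip()
    try:
        # Parsing and re-formatting normalizes values such as "7:05" to "07:05".
        return datetime.strptime(text, "%H:%M").strftime("%H:%M")
    except ValueError:
        return None


def is_due(alarm, now):
    """True when the current clock time (to the minute) matches the alarm."""
    return alarm == now.strftime("%H:%M")


def reset_alarm(path=ALARM_FILE):
    """Clear the alarm file so the alarm does not ring again."""
    path.write_text("")
```

A polling loop would call read_alarm_time() once, check is_due(alarm, datetime.now()) every few seconds, then play the music file and call reset_alarm() when it matches.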

3️⃣ click coordinate.py

  • A utility tool:
  • Prints current mouse coordinates
  • Helps you record pyautogui click positions for automation
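
A coordinate printer of this kind might look like the sketch below. It is an assumption about how such a tool could be written, not a copy of click coordinate.py; format_position and watch_mouse are hypothetical names. pyautogui is imported inside the function so the formatting helper works even where no display is available.

```python
import time


def format_position(x, y):
    """Format a coordinate pair in fixed-width columns for easy reading."""
    return f"X: {x:>4}  Y: {y:>4}"


def watch_mouse(interval=0.5):
    """Print the mouse position repeatedly until Ctrl+C is pressed."""
    import pyautogui  # imported lazily; needs a desktop session

    try:
        while True:
            x, y = pyautogui.position()
            print(format_position(x, y))
            time.sleep(interval)
    except KeyboardInterrupt:
        print("Stopped.")
```

Hover over the button you want to automate, note the printed values, and pass them to pyautogui.click(x, y) in the automation code.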

4️⃣ greetme.py

  • Provides a clean greeting function:
  • Detects morning/afternoon/evening
  • Speaks greeting messages
  • Reusable inside other modules
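
A time-of-day greeting can be sketched as below. The cut-off hours (12 and 18) and the names greeting_for_hour and greet are assumptions for illustration; the real greetme.py may use different boundaries and would pass the pyttsx3 speak function instead of print.

```python
from datetime import datetime


def greeting_for_hour(hour):
    """Map an hour (0-23) to a greeting. The boundaries are assumed, not taken from greetme.py."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if hour < 12:
        return "Good morning"
    if hour < 18:
        return "Good afternoon"
    return "Good evening"


def greet(speak=print, now=None):
    """Pick the greeting for the current time and hand it to a speak callback."""
    current = now or datetime.now()
    message = greeting_for_hour(current.hour)
    speak(message)
    return message
```

Accepting the speak callback as a parameter is what makes the function reusable from other modules.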

5️⃣ keyboard.py

  • Custom keyboard controller using pynput:
  • Volume up/down functions
  • Smooth press-release loops
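
The press-release loop can be sketched with pynput's media keys as follows. This is an illustrative version under assumptions: tap_repeatedly, volume_up, and volume_down are hypothetical names, and the step count and delay are arbitrary. pynput is imported inside the volume functions so the loop itself can be exercised without the library.

```python
import time


def tap_repeatedly(controller, key, times, delay=0.1):
    """Press and release `key` `times` times on a pynput-style controller."""
    for _ in range(times):
        controller.press(key)
        controller.release(key)
        time.sleep(delay)  # short pause so the OS registers each tap


def volume_up(steps=5):
    """Raise the system volume by tapping the media volume-up key."""
    from pynput.keyboard import Controller, Key  # lazy import; needs a desktop session

    tap_repeatedly(Controller(), Key.media_volume_up, steps)


def volume_down(steps=5):
    """Lower the system volume by tapping the media volume-down key."""
    from pynput.keyboard import Controller, Key

    tap_repeatedly(Controller(), Key.media_volume_down, steps)
```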

6️⃣ NewsRead.py

  • News reading engine:
  • Fetches category-wise headlines
  • Uses NewsAPI
  • Reads each headline aloud
  • Offers "continue/stop" interactive CLI
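
Fetching category-wise headlines from NewsAPI could look roughly like this. The sketch uses NewsAPI's top-headlines endpoint; the default country, the function names, and the use of requests here are assumptions, and NewsRead.py may be organized differently. Replace the API key argument with your own key.

```python
from urllib.parse import urlencode

NEWSAPI_URL = "https://newsapi.org/v2/top-headlines"


def build_headlines_url(category, api_key, country="us"):
    """Build the request URL for one news category (country default is an assumption)."""
    query = urlencode({"country": country, "category": category, "apiKey": api_key})
    return f"{NEWSAPI_URL}?{query}"


def extract_titles(payload):
    """Pull headline titles out of a decoded NewsAPI JSON response."""
    if payload.get("status") != "ok":
        return []
    return [a["title"] for a in payload.get("articles", []) if a.get("title")]


def fetch_headlines(category, api_key, country="us"):
    """Download and return the headline titles for a category."""
    import requests  # imported lazily so the helpers above work offline

    response = requests.get(build_headlines_url(category, api_key, country), timeout=10)
    response.raise_for_status()
    return extract_titles(response.json())
```

Each returned title can then be passed to the speak function, with the "continue/stop" prompt shown between headlines.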

7️⃣ Translator.py

  • Full-feature speech → translation → speech system:
  • Listens via microphone
  • Uses googletrans to translate any input
  • Uses gTTS to create temporary output audio
  • Plays the file, then deletes it
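
The translate, synthesize, play, delete sequence can be sketched as below. This is one possible arrangement, not the code in Translator.py: speak_translation and the two adapter functions are hypothetical names, and the adapters assume the synchronous translate() call of the googletrans release pinned in the installation section (3.1.0a0). The temporary file is removed in a finally block so it is cleaned up even if playback fails.

```python
import os
import tempfile


def speak_translation(text, dest, translate, synthesize, play):
    """Translate `text`, render it to a temporary MP3, play it, then delete the file."""
    translated = translate(text, dest)
    fd, path = tempfile.mkstemp(suffix=".mp3")
    os.close(fd)  # only the path is needed; the synthesizer reopens the file
    try:
        synthesize(translated, dest, path)
        play(path)
    finally:
        os.remove(path)
    return translated


def googletrans_translate(text, dest):
    """Adapter around googletrans (assumes the synchronous 3.1.0a0 API)."""
    from googletrans import Translator

    return Translator().translate(text, dest=dest).text


def gtts_synthesize(text, lang, path):
    """Adapter around gTTS that writes speech audio to `path`."""
    from gtts import gTTS

    gTTS(text=text, lang=lang).save(path)
```

With the real libraries installed, a call might look like speak_translation("good morning", "hi", googletrans_translate, gtts_synthesize, playsound), where playsound comes from the playsound package.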

8️⃣ Additional Files

  • Alarmtext.txt → stores temporary alarm time
  • music.mp3 → plays when alarm triggers
  • settings.json (VS Code) → personal environment setup
  • __pycache__/ → Python cache files (ignored in README)

Folder Structure

Jarvis/
│
├── jarvis.py
├── alarm.py
├── click coordinate.py
├── greetme.py
├── keyboard.py
├── NewsRead.py
├── Translator.py
│
├── Alarmtext.txt
├── music.mp3
├── settings.json (if present)
│
├── __pycache__/
└── README.md (generated)

Installation

🔧 Requirements

Install the required libraries:

 pip install pyttsx3 speechrecognition wikipedia pyautogui opencv-python
 pip install requests beautifulsoup4 pyjokes selenium googletrans==3.1.0a0
 pip install gTTS playsound pynput keyboard
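
Several of these packages are imported under a different name than the one passed to pip (for example, opencv-python is imported as cv2). The following optional check is a sketch, not part of the repository, and missing_packages is a hypothetical helper; it reports which packages from the commands above cannot be found in the current environment.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements
IMPORT_NAMES = {
    "pyttsx3": "pyttsx3",
    "speechrecognition": "speech_recognition",
    "wikipedia": "wikipedia",
    "pyautogui": "pyautogui",
    "opencv-python": "cv2",
    "requests": "requests",
    "beautifulsoup4": "bs4",
    "pyjokes": "pyjokes",
    "selenium": "selenium",
    "googletrans": "googletrans",
    "gTTS": "gtts",
    "playsound": "playsound",
    "pynput": "pynput",
    "keyboard": "keyboard",
}


def missing_packages(import_names=IMPORT_NAMES):
    """Return the pip names whose import module cannot be located."""
    return [pkg for pkg, module in import_names.items() if find_spec(module) is None]
```

Calling print(missing_packages()) lists anything still to be installed. Note that when this runs from the project root, the local keyboard.py is found under the name keyboard, so that entry cannot confirm the pip package is present.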

🔧 Additional Setup

  • Chrome and app paths: Update the Chrome and application paths inside jarvis.py to match your system
  • NewsAPI key: Replace the placeholder with your own API key inside NewsRead.py
  • Alarm music: Ensure music.mp3 exists in the project root
  • Microphone: A working microphone is required

How It Works

  • Jarvis boots & greets the user
  • Begins listening continuously
  • Speech is converted to text
  • Query is matched with command blocks
  • Corresponding module is triggered
  • Module performs action (web automation, opening apps, etc.)
  • Jarvis confirms completion & waits for next instruction

The system is based on simple conditional command parsing, making it easy to add new commands.
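
The keyword matching described above can be sketched as a small routing table. This is an illustrative simplification, not the code in jarvis.py: the handler functions and the COMMANDS table are hypothetical, and the real handlers would speak, open apps, or call other modules rather than return strings.

```python
# Hypothetical handlers; the real assistant would perform the action here.
def search_wikipedia(query):
    topic = query.replace("wikipedia", "", 1).strip()
    return f"Searching Wikipedia for {topic}"


def open_chrome(query):
    return "Opening Chrome"


def tell_joke(query):
    return "Telling a joke"


# Ordered (keyword, handler) pairs: the first keyword found in the query wins.
COMMANDS = [
    ("wikipedia", search_wikipedia),
    ("open chrome", open_chrome),
    ("joke", tell_joke),
]


def route(query):
    """Return the matching handler's result, or None when no keyword matches."""
    normalized = query.lower().strip()
    for keyword, handler in COMMANDS:
        if keyword in normalized:
            return handler(normalized)
    return None
```

Adding a command then means writing one handler and appending one (keyword, handler) pair, which is the same idea as adding another conditional block.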

Usage Guide

Speak commands such as:

🌐 Internet

  • “Wikipedia Elon Musk”
  • “Search Instagram”
  • “Open Chrome”

🖥 Apps

  • “Open Command Prompt”
  • “Open VS Code”
  • “Open Steam”
  • “Open Photoshop”

🎵 Music

  • “Play music”
  • “Like this song”
  • “Add this song to the playlist”

📸 Hardware

  • “Open camera”

⏰ Alarm

  • “Set an alarm”

😂 Fun

  • “Tell me a joke”

Future Improvements

🔮 AI/ML Enhancements

  • Add an NLP model (transformer-based) to understand natural language beyond keywords
  • Build a context-aware conversation engine
  • Train a lightweight intent classifier for smarter routing
  • Add embeddings to remember user preferences
  • Integrate offline models (e.g., GPT4All or LLaMA for language, Whisper for speech recognition)

🌐 System & Feature Enhancements

  • Add wake-word engine (“Hey Jarvis”)
  • Add GUI dashboard
  • Create plugin system for custom commands
  • Add robust error-handling & logs
  • Cloud sync for preferences and history

Tech Stack

Programming Language:

  • Python 3.x

Core Libraries:

  • pyttsx3 (Offline TTS)
  • speech_recognition (Voice input)
  • wikipedia
  • opencv-python (Camera module)
  • pyautogui (Automation)
  • requests + BeautifulSoup (Web scraping)
  • pynput (Keyboard/volume control)
  • googletrans + gTTS (Translation)
  • selenium (Browser automation)
  • NewsAPI (News)

Concepts Used:

  • Modular Python design
  • Rule-based command engine
  • Web automation
  • Multimedia automation
  • Basic natural language processing (keyword parsing)
  • Concurrent operation (the alarm runs as a separate script)
  • System-level automation

Credits

  • Developer: Darsh Yadav
  • Project started in Class 8, gradually improved over the years
  • Presented at a school exhibition
  • Powered entirely by Python and open-source libraries

About

Jarvis is a locally-run, modular, voice-controlled desktop assistant built entirely with Python.
