An intelligent system that automatically generates real-time football commentary by analyzing match footage using computer vision and natural language processing.
- Overview
- Features
- System Architecture
- Installation
- Usage
- Project Structure
- Technical Implementation
- Roadmap
- Contributing
- Team
Sanjaya Uwacha is an AI-driven football commentary generation system that transforms raw match footage into engaging, context-aware commentary. The system leverages state-of-the-art computer vision models for player and ball detection, tracking algorithms for possession analysis, and natural language generation for creating dynamic commentary.
- Real-time Player Detection & Tracking: Identifies and tracks all players, referees, and the ball
- Event Analysis: Determines which events have occurred during the football match and generates commentary
- Automated Commentary Generation: Creates natural language commentary based on detected events
- ✅ Player & Ball Detection: YOLOv8-based object detection using Roboflow API
- ✅ Multi-Object Tracking: ByteTrack algorithm for consistent player identification
- ✅ Player Identification: Individual player name mapping and labeling
- ✅ Possession Detection: Distance-based algorithm to determine ball control
- ✅ Basic Commentary: Text-to-speech commentary for possession changes
- ✅ Video Annotation: Visual overlay of detections, tracking IDs, and possession indicators
- 🚧 Advanced Event Detection: Goals, passes, fouls, corners, throw-ins
- 🚧 Team Classification: Automatic team identification using color analysis
- 🚧 Contextual Commentary: LLM-powered natural commentary generation
- 🚧 Action Classification: Detailed activity recognition (shooting, dribbling, tackling)
- 🚧 Multi-language Support: Commentary in English, Hindi, and Nepali
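The planned team classification could, for example, cluster each player's average jersey color into two groups. The sketch below is a minimal, deterministic two-means clustering in NumPy; the function name and the idea of clustering per-player mean RGB values are illustrative assumptions (a production version might instead run scikit-learn's KMeans over pixels cropped from each bounding box).

```python
import numpy as np

def assign_teams(jersey_colors, n_iter=10):
    """Split players into two teams by clustering mean jersey colors.

    jersey_colors: (N, 3) list of average RGB colors, one per player crop.
    Initialized from the two most dissimilar colors so the result is
    deterministic; a hypothetical helper, not the project's implementation.
    """
    colors = np.asarray(jersey_colors, dtype=float)
    # start the two centers at the most distant pair of colors
    dists = np.linalg.norm(colors[:, None] - colors[None], axis=2)
    i, j = np.unravel_index(dists.argmax(), dists.shape)
    centers = colors[[i, j]]
    for _ in range(n_iter):
        # assign each player to the nearest center, then recompute centers
        labels = np.linalg.norm(colors[:, None] - centers[None], axis=2).argmin(axis=1)
        for k in (0, 1):
            if (labels == k).any():
                centers[k] = colors[labels == k].mean(axis=0)
    return labels
```

With clearly separated kit colors (say, red vs. blue), the two clusters converge in a few iterations; ambiguous cases (goalkeepers, referees) would need extra handling.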
```
┌──────────────────────────────────────────────────────────────┐
│                      INPUT VIDEO STREAM                      │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                    COMPUTER VISION MODULE                    │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│   │    YOLOv8    │──▶ │  ByteTrack   │──▶ │    Event     │   │
│   │  Detection   │    │   Tracking   │    │   Analysis   │   │
│   └──────────────┘    └──────────────┘    └──────────────┘   │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                    EVENT DETECTION MODULE                    │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│   │   Temporal   │──▶ │    Action    │──▶ │    Event     │   │
│   │   Analysis   │    │ Recognition  │    │Classification│   │
│   └──────────────┘    └──────────────┘    └──────────────┘   │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│            NATURAL LANGUAGE GENERATION MODULE                │
│   ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│   │     LLM      │──▶ │   Context    │──▶ │     TTS      │   │
│   │              │    │   Builder    │    │  Synthesis   │   │
│   └──────────────┘    └──────────────┘    └──────────────┘   │
└──────────────────────────────┬───────────────────────────────┘
                               │
                               ▼
┌──────────────────────────────────────────────────────────────┐
│                 AUDIO-VIDEO SYNCHRONIZATION                  │
│                    & OUTPUT GENERATION                       │
└──────────────────────────────────────────────────────────────┘
```
- Python 3.8 or higher
- CUDA-compatible GPU (recommended for real-time processing)
- FFmpeg (for video processing)
- Clone the repository

```bash
git clone https://github.com/fuseai-fellowship/Football-Commentary-Generation.git
cd Football-Commentary-Generation
```

- Install uv (if not already installed)

```bash
# On macOS and Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# On Windows
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or using pip
pip install uv
```

- Install project dependencies

```bash
# Install all dependencies, including dev dependencies
uv sync

# Or install only production dependencies
uv sync --no-dev
```

- Configure environment variables

```bash
cp .env.example .env
# Edit .env and add your Roboflow API key
ROBOFLOW_API_KEY=your_api_key_here
```

- Install FFmpeg (if not already installed)

```bash
# Ubuntu/Debian
sudo apt install ffmpeg

# macOS
brew install ffmpeg

# Windows: download from https://ffmpeg.org/download.html
```

If you prefer using pip instead of uv:

```bash
# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies using pyproject.toml
pip install -e .

# For development dependencies
pip install -e ".[dev]"
```

To run the notebooks:

```bash
# Navigate to notebooks directory
cd notebooks

# Run basic detection and tracking
jupyter notebook first_step.ipynb

# Run possession detection with commentary
jupyter notebook possession_with_commentary.ipynb
```

```
Football-Commentary-Generation/
│
├── notebooks/
│   ├── first_step.ipynb                     # Basic detection & tracking
│   ├── possession_with_commentary.ipynb     # Possession + commentary
│   ├── tracking_experiments/                # Tracker comparison experiments
│   │   ├── 01_deepsort_experiment.ipynb
│   │   ├── 02_sort_experiment.ipynb
│   │   ├── 03_comparison_analysis.ipynb
│   │   └── tracker_utils.py
│   └── play.mp4                             # Sample input video
│
├── src/
│   ├── detection/
│   │   ├── player_detector.py               # Player detection module
│   │   └── ball_detector.py                 # Ball detection module
│   ├── tracking/
│   │   └── tracker.py                       # Multi-object tracking
│   ├── possession/
│   │   └── possession_analyzer.py           # Possession detection
│   └── commentary/
│       ├── event_detector.py                # Event detection
│       └── text_generator.py                # NLG module
│
├── data/
│   ├── raw/                                 # Raw video inputs
│   ├── processed/                           # Processed outputs
│   └── annotations/                         # Training annotations
│
├── results/
│   ├── videos/                              # Output videos
│   ├── metrics/                             # Performance metrics
│   └── comparison_report.md                 # Tracker comparison report
│
├── .gitignore                               # Git ignore rules
├── requirements.txt                         # Python dependencies
├── .env.example                             # Environment variables template
├── readme.md                                # This file
└── LICENSE                                  # MIT License
```
Model: YOLOv8 (via Roboflow API)
- Classes: Players (Team A, Team B, Goalkeepers), Referees (Main, Assistant), Ball
- Input Resolution: 640×640 pixels
- Confidence Threshold: 0.3
- NMS Threshold: 0.5
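The confidence (0.3) and NMS (0.5) thresholds above govern how raw detections are filtered. The Roboflow API applies its own filtering server-side; the class-agnostic NumPy sketch below only illustrates what the two thresholds do when applied locally to boxes and scores.

```python
import numpy as np

def nms(boxes, scores, conf_thresh=0.3, iou_thresh=0.5):
    """Drop low-confidence boxes, then suppress heavily overlapping ones.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,) confidences.
    Thresholds mirror the configuration above; illustrative only.
    """
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    keep_conf = scores >= conf_thresh
    boxes, scores = boxes[keep_conf], scores[keep_conf]
    order = scores.argsort()[::-1]          # highest confidence first
    kept = []
    while order.size:
        i = order[0]
        kept.append(int(i))
        # IoU of the top-scoring box against the remaining candidates
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou <= iou_thresh]    # keep only weak overlaps
    return boxes[kept], scores[kept]
```

A real deployment would run this per class (so a player box never suppresses the ball box).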
Tracking Algorithm: ByteTrack
- Features: Robust multi-object tracking with occlusion handling
- Frame-to-frame association using Kalman filtering
- Handles: Player identity consistency across frames
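Full ByteTrack additionally uses Kalman-filter motion prediction and a second association pass over low-confidence detections; the core idea of carrying identities frame-to-frame by matching each track to its best-overlapping detection can be sketched (greatly simplified) as greedy IoU matching:

```python
import numpy as np

def iou_matrix(a, b):
    """Pairwise IoU between two sets of [x1, y1, x2, y2] boxes."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    x1 = np.maximum(a[:, None, 0], b[None, :, 0])
    y1 = np.maximum(a[:, None, 1], b[None, :, 1])
    x2 = np.minimum(a[:, None, 2], b[None, :, 2])
    y2 = np.minimum(a[:, None, 3], b[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)

def associate(track_boxes, det_boxes, iou_thresh=0.3):
    """Greedily match existing tracks to new detections by IoU.

    Returns {track_index: detection_index}; unmatched tracks/detections
    would become lost tracks / new tracks in a full tracker.
    """
    iou = iou_matrix(track_boxes, det_boxes)
    matches = {}
    while iou.size and iou.max() > iou_thresh:
        t, d = np.unravel_index(iou.argmax(), iou.shape)
        matches[int(t)] = int(d)
        iou[t, :] = 0    # each track matched at most once
        iou[:, d] = 0    # each detection matched at most once
    return matches
```

Production trackers replace the greedy loop with Hungarian assignment and match against Kalman-predicted box positions rather than last-seen ones.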
Algorithm: Distance-based proximity detection

```python
import numpy as np

def detect_possession(player_centers, ball_center, threshold=100):
    """Return the index of the player nearest the ball, or None if no
    player is within `threshold` pixels."""
    distances = np.linalg.norm(
        np.asarray(player_centers, dtype=float) - np.asarray(ball_center, dtype=float),
        axis=1,
    )
    return int(distances.argmin()) if distances.min() < threshold else None
```

Parameters:
- Proximity threshold: 100 pixels (adjustable)
- Update frequency: per frame (30 FPS)
Current: Text-to-Speech (gTTS)
- Generates audio for player names on possession change
- Synchronizes with video timestamp using FFmpeg
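One way to perform that FFmpeg synchronization is to delay the commentary clip (e.g. an MP3 saved by gTTS) to the event's timestamp and mix it over the match audio while copying the video stream. The helper below only builds the command line and is an illustrative sketch, not the project's actual invocation; the `adelay`/`amix` filter graph assumes the input video already has an audio track.

```python
def overlay_command(video_path, commentary_path, start_ms, output_path):
    """Build an FFmpeg command mixing a commentary clip over match audio.

    start_ms: when (in milliseconds) the commentary should begin.
    Hypothetical helper: adelay shifts the clip, amix blends the two
    audio streams, and -c:v copy leaves the video untouched.
    """
    graph = (f"[1:a]adelay={start_ms}|{start_ms}[c];"
             f"[0:a][c]amix=inputs=2:duration=first[a]")
    return ["ffmpeg", "-y",
            "-i", video_path,
            "-i", commentary_path,
            "-filter_complex", graph,
            "-map", "0:v", "-map", "[a]",
            "-c:v", "copy",
            output_path]
```

The list form can be passed straight to `subprocess.run`; for silent source footage, an `anullsrc` input would replace `[0:a]`.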
Future: LLM-based generation (Gemini/GPT-4)
- Context-aware commentary
- Multiple commentary styles (analytical, entertaining)
- Real-time event descriptions
1. Frame Extraction (30 FPS)
2. Object Detection (YOLOv8)
3. Tracking Update (ByteTrack)
4. Possession Analysis
5. Event Detection [Future]
6. Commentary Generation
7. Audio Overlay (FFmpeg)
8. Output Video Creation

Performance optimizations:
- GPU Acceleration: CUDA-enabled inference using ONNX Runtime
- Batch Processing: Frame batching for efficient GPU utilization
- Model Optimization: Quantization and pruning for faster inference
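The eight pipeline stages above can be condensed into a single per-frame loop. In this sketch the three callables are stand-ins for the project's YOLOv8, ByteTrack, and possession modules (their names and signatures are assumptions for illustration):

```python
def process_match(frames, detect, update_tracks, possession):
    """Skeleton of the per-frame processing pipeline.

    `detect`, `update_tracks`, and `possession` are injected stand-ins
    for the real detection, tracking, and possession-analysis modules.
    Returns a list of (frame_index, event_type, payload) events.
    """
    events, last_holder = [], None
    for t, frame in enumerate(frames):          # 1. frame extraction
        detections = detect(frame)              # 2. object detection
        tracks = update_tracks(detections)      # 3. tracking update
        holder = possession(tracks)             # 4. possession analysis
        if holder is not None and holder != last_holder:
            # 5-6. event detection feeds commentary generation
            events.append((t, "possession_change", holder))
            last_holder = holder
    return events  # 7-8. each event then drives TTS + FFmpeg audio overlay
```

Debouncing possession changes this way (only emitting on a new holder) keeps the commentary from firing on every frame at 30 FPS.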
- Player detection using YOLOv8
- Ball detection and tracking
- ByteTrack integration
- Player identification system
- Basic possession detection
- Text-to-speech commentary
- Team classification using color clustering
- Advanced event detection (passes, shots, fouls)
- Temporal action recognition
- Tracker comparison experiments (ByteTrack vs DeepSORT vs SORT)
- Enhanced possession accuracy
- Contextual commentary generation
- LLM integration (Gemini/GPT-4)
- Context-aware commentary
- Multi-language support (Hindi, Nepali)
- Real-time streaming capability
- Performance optimization
- Model quantization for edge devices
- REST API development
- Web-based interface
- Cloud deployment
- Mobile application
TBC later
As part of our research phase, we're conducting comprehensive experiments to compare different tracking algorithms:
| Algorithm | Pros | Cons | Use Case |
|---|---|---|---|
| ByteTrack | Fast, handles occlusions well | Requires tuning | Real-time tracking |
| DeepSORT | High accuracy, uses appearance | Slower, GPU intensive | Offline processing |
| SORT | Very fast, simple | Lower accuracy | Quick prototyping |
Experiment Results: Coming soon in results/comparison_report.md
We welcome contributions! Please follow these guidelines:
- Fork the repository
- Create a feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
- Follow PEP 8 style guide for Python code
- Add docstrings to all functions and classes
- Write unit tests for new features
- Update documentation as needed
FuseAI Fellowship - Football Commentary Generation Team
- Bijay Shrestha
- Sudip Shrestha
Mentor: Sushil Dyopla
Program: FuseAI Fellowship 2024
This project is licensed under the MIT License - see the LICENSE file for details.
- Roboflow for object detection infrastructure
- Supervision for computer vision utilities
- ByteTrack for multi-object tracking
- gTTS for text-to-speech synthesis
- FFmpeg for audio-video processing
- FuseAI Fellowship for mentorship and support
- Zhang, Y., et al. (2022). "ByteTrack: Multi-Object Tracking by Associating Every Detection Box"
- Wojke, N., et al. (2017). "Simple Online and Realtime Tracking with a Deep Association Metric"
- Redmon, J., et al. (2016). "You Only Look Once: Unified, Real-Time Object Detection"
For questions, suggestions, or collaboration opportunities:
- Email: sudipshrestha2051219@gmail.com, bijay17khadka@gmail.com
- Project Repository: GitHub
- Issues: GitHub Issues
If you find this project useful, please consider:
- ⭐ Starring the repository
- 🐛 Reporting bugs and issues
- 💡 Suggesting new features
- 📖 Improving documentation
- 🤝 Contributing code
Note: This project is under active development as part of the FuseAI Fellowship program. Features and documentation are continuously being updated. Star ⭐ the repo to stay updated with the latest developments!
Last Updated: October 2024