AICamera is a high-performance, real-time object detection and tracking system built as part of a larger Computer Vision pipeline during a project at ExaWizards Inc. It focuses on accurate person detection and tracking, optimized for deployment on NVIDIA GPUs using TensorRT. The system leverages the speed and accuracy of YOLOv8 for detection and combines it with DeepSORT for robust multi-object tracking.
This submodule is designed as the core engine for downstream applications such as real-time surveillance, retail analytics, and smart camera systems, with an emphasis on efficiency, modularity, and real-world deployability.
- 🔍 Detection: YOLOv8 — state-of-the-art real-time object detection model by Ultralytics.
- 🧭 Tracking: DeepSORT — combines motion (Kalman filter) and appearance (ReID) features for reliable tracking.
- ⚡ Acceleration: NVIDIA TensorRT — significantly reduces inference time by optimizing ONNX models to run on NVIDIA GPUs.
- Input Capture: Reads video frames from a file or webcam using OpenCV.
- Preprocessing: Resizes and normalizes images for inference.
- Detection (YOLOv8 + TensorRT): Outputs class labels, bounding boxes, and confidence scores.
- Tracking (DeepSORT + TensorRT ReID):
  - ReID crops are extracted from detected persons.
  - A TensorRT-optimized ReID model generates embeddings.
  - Kalman filter + cosine distance + IoU matching preserve identities across frames.
- Visualization: Annotated video with object IDs and bounding boxes.
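
The identity-matching step in the tracking stage can be illustrated with a minimal sketch. This is not the project's code: it uses plain Python and a simple greedy assignment in place of DeepSORT's Hungarian-style matching, and the names `cosine_distance`, `iou`, `greedy_match`, and the `max_dist` value are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm > 0 else 1.0


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, max_dist=0.2):
    """Greedily pair tracks and detections in order of increasing
    appearance (cosine) distance, ignoring pairs above max_dist.

    tracks / detections: lists of dicts with an "embedding" key.
    Returns a list of (track_index, detection_index) pairs.
    """
    candidates = []
    for t, trk in enumerate(tracks):
        for d, det in enumerate(detections):
            dist = cosine_distance(trk["embedding"], det["embedding"])
            if dist <= max_dist:
                candidates.append((dist, t, d))

    matches, used_tracks, used_dets = [], set(), set()
    for _, t, d in sorted(candidates):
        if t not in used_tracks and d not in used_dets:
            matches.append((t, d))
            used_tracks.add(t)
            used_dets.add(d)
    return matches
```

In DeepSORT-style trackers, IoU between predicted track boxes and detections typically acts as a fallback for tracks that the appearance step leaves unmatched; the `iou` helper above could be plugged into the same greedy loop with `1 - iou(...)` as the cost.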
- 🚶‍♂️ Focused on Person Tracking
- ⚡ Blazing Fast due to TensorRT acceleration
- 🔁 Modular Pipeline with clean interfaces for detection, tracking, and I/O
- 🧪 Easily Configurable: Adjust thresholds, engine paths, and target classes
- 🖥️ Supports Webcam & Video Input
- 📂 Out-of-the-box Setup: Includes helper scripts for model downloading and engine conversion
- OS: Ubuntu 22.04 LTS (or compatible)
- GPU: NVIDIA CUDA-enabled GPU (Compute Capability ≥ 6.1)
- NVIDIA Stack:
  - NVIDIA Driver (latest)
  - CUDA Toolkit ≥ 12.1
  - cuDNN ≥ 9.0
  - TensorRT ≥ 10.x (ensure `trtexec` is available)
- Python: 3.10.x
git clone https://github.com/abdur75648/AI-Camera.git
cd AI-Camera
Make sure the NVIDIA driver, CUDA, cuDNN, and TensorRT are installed. To check that the driver can see your GPU, run:
nvidia-smi
This should show your GPU information.
Follow NVIDIA's official documentation for installing the driver, CUDA, cuDNN, and TensorRT, if not already installed.
Then confirm that the CUDA compiler and trtexec are available on your PATH:
nvcc --version
trtexec --help
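
The same checks can be scripted. The sketch below is an optional helper, not part of the repository; the function names `find_tools` and `python_version_ok` are hypothetical. It only verifies that the command-line tools are on your PATH and that the interpreter is Python 3.10, and does not validate the CUDA, cuDNN, or TensorRT versions.

```python
import shutil
import sys

# Command-line tools mentioned in the setup steps.
REQUIRED_TOOLS = ("nvidia-smi", "nvcc", "trtexec")


def find_tools(tools=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


def python_version_ok(version=None, expected=(3, 10)):
    """Return True if the major.minor version matches `expected`."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(expected)


if __name__ == "__main__":
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'NOT FOUND'}")
    print("Python 3.10.x:", python_version_ok())
```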
pip install --upgrade pip
pip install -r requirements.txt
bash scripts/download_models.sh
bash scripts/export_trt_engines.sh
This creates:
models/detection/yolov8n.engine
models/reid/deepsort_reid.engine
Run on a video file:
python3 -m src.aicamera_tracker --input sample_input/sample_video.mp4 --show_display
Run on webcam:
python3 -m src.aicamera_tracker --webcam_id 0 --output_filename outputs/webcam_run.mp4 --show_display
Run with custom confidence threshold:
python3 -m src.aicamera_tracker --input video.mp4 --conf_thresh 0.4
Argument | Description |
---|---|
`--input` | Path to input video file |
`--webcam_id` | Webcam ID (default: 0) |
`--output_dir` | Output directory (default: `outputs/`) |
`--output_filename` | Output filename (auto-generated if not set) |
`--show_display` | Show live OpenCV display |
`--no_save` | Skip saving output video |
`--yolo_engine` | Path to YOLOv8 TensorRT engine |
`--reid_engine` | Path to ReID TensorRT engine |
`--conf_thresh` | Confidence threshold for detections |
`--device` | Inference device (default: `cuda:0`) |
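
For orientation, the arguments in the table above could be declared with `argparse` roughly as follows. This is a sketch, not the actual parser in `src/aicamera_tracker.py`: the function name `build_parser`, the argument types, and every default not listed in the table (shown here as `None`) are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented command-line arguments."""
    parser = argparse.ArgumentParser(prog="aicamera_tracker")
    parser.add_argument("--input", default=None,
                        help="Path to input video file")
    parser.add_argument("--webcam_id", type=int, default=0,
                        help="Webcam ID")
    parser.add_argument("--output_dir", default="outputs/",
                        help="Output directory")
    parser.add_argument("--output_filename", default=None,
                        help="Output filename (auto-generated if not set)")
    parser.add_argument("--show_display", action="store_true",
                        help="Show live OpenCV display")
    parser.add_argument("--no_save", action="store_true",
                        help="Skip saving output video")
    # Defaults for the engine paths and threshold are not documented
    # in the table, so None is used as a placeholder.
    parser.add_argument("--yolo_engine", default=None,
                        help="Path to YOLOv8 TensorRT engine")
    parser.add_argument("--reid_engine", default=None,
                        help="Path to ReID TensorRT engine")
    parser.add_argument("--conf_thresh", type=float, default=None,
                        help="Confidence threshold for detections")
    parser.add_argument("--device", default="cuda:0",
                        help="Inference device")
    return parser
```

Boolean switches such as `--show_display` and `--no_save` are modeled with `store_true`, so they default to off and take no value on the command line.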
Component | Raw Engine Speed (GTX 1660Ti) | Notes |
---|---|---|
YOLOv8n (TRT) | ~400+ FPS | Highly optimized inference |
ReID (TRT) | ~600+ FPS | Fast identity embedding |
End-to-End Pipeline | ~30 FPS | Varies with resolution and number of objects |
⚠️ Enabling `--show_display` may reduce FPS due to rendering overhead. Disable it for benchmarking.
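
If you want to measure throughput yourself, a minimal timing harness along these lines may help. It is a sketch under stated assumptions: `FpsMeter`, `benchmark`, and the `process_frame` callable are hypothetical names, and the project may measure FPS differently. Only the time spent inside `process_frame` is counted, so display and disk I/O outside that call are excluded.

```python
import time


class FpsMeter:
    """Accumulates per-frame processing time and reports average FPS."""

    def __init__(self):
        self.frames = 0
        self.elapsed = 0.0

    def add(self, seconds):
        """Record the processing time of one frame."""
        self.frames += 1
        self.elapsed += seconds

    @property
    def fps(self):
        """Average frames per second over all recorded frames."""
        return self.frames / self.elapsed if self.elapsed > 0 else 0.0


def benchmark(process_frame, frames, clock=time.perf_counter):
    """Time `process_frame` on every frame and return the filled meter."""
    meter = FpsMeter()
    for frame in frames:
        start = clock()
        process_frame(frame)
        meter.add(clock() - start)
    return meter
```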
Update `src/config.py` to modify:
- Paths: `YOLO_ENGINE_PATH`, `REID_ENGINE_PATH`
- YOLO Settings: `YOLO_INPUT_SHAPE`, `YOLO_CONF_THRESHOLD`
- DeepSORT Settings: `DEEPSORT_MAX_DIST`, `DEEPSORT_MAX_AGE`, etc.
- Classes to Track: `CLASSES_TO_TRACK = {'person'}`
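
As an illustration of how these settings fit together, a configuration module might look like the sketch below. Only the setting names, the engine paths, and `CLASSES_TO_TRACK` come from this README; every numeric value is a placeholder assumption, and the `filter_detections` helper is hypothetical rather than part of the project.

```python
# Hypothetical sketch of a config module; check src/config.py for real values.

# Engine paths as produced by the engine export step.
YOLO_ENGINE_PATH = "models/detection/yolov8n.engine"
REID_ENGINE_PATH = "models/reid/deepsort_reid.engine"

# Placeholder values (assumptions, not the project's defaults).
YOLO_INPUT_SHAPE = (640, 640)
YOLO_CONF_THRESHOLD = 0.4
DEEPSORT_MAX_DIST = 0.2
DEEPSORT_MAX_AGE = 30

CLASSES_TO_TRACK = {"person"}


def filter_detections(detections, classes=CLASSES_TO_TRACK,
                      conf=YOLO_CONF_THRESHOLD):
    """Keep detections whose label is tracked and whose score meets `conf`.

    detections: list of dicts with "label" and "score" keys.
    """
    return [d for d in detections
            if d["label"] in classes and d["score"] >= conf]
```

Adding another label to `CLASSES_TO_TRACK` in this sketch would let detections of that class pass through to the tracker without touching the filtering logic.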
AICamera/
├── assets/ # Demo assets
├── models/ # Pretrained models (.onnx, .engine)
│ ├── detection/
│ └── reid/
├── scripts/ # Scripts for model setup
├── sample_input/ # Sample videos/images
├── src/
│ ├── aicamera_tracker.py
│ ├── config.py
│ ├── detector/ # YOLOv8 wrapper
│ ├── tracker/ # DeepSORT tracker
│ │ └── core/ # Core DeepSORT logic
│ ├── utils/ # Helper utilities
│ └── trt_utils/ # TensorRT engine handling
└── requirements.txt
- Support for other YOLOv8 sizes (s, m, l, x)
- Integration with other tracking algorithms (e.g., ByteTrack, OC-SORT)
- Smarter gallery management in ReID
- Asynchronous pipeline for faster I/O
- MOT evaluation metrics (MOTA, MOTP)
- Batch-mode frame processing
Licensed under the MIT License. You're free to use, modify, and distribute with attribution.