
This project implements a text-based image search system. Users enter a descriptive query, and the system returns the images that most closely match that description.


r-butl/TextBasedImageSearch


🖼️ Text-Based Image Search

This project implements a machine learning system that allows users to retrieve relevant images based on natural language text queries. The system aligns text and image feature embeddings using a trained neural network and retrieves the closest-matching images based on cosine similarity.

🔍 Overview

Given a text query (e.g., "a cat sitting on a couch"), the system:

  1. Encodes the query using a pretrained text embedding model (MiniLM-L6-v2).
  2. Uses a trained feedforward neural network to map the text embedding into image embedding space.
  3. Compares the predicted image embedding against a database of image embeddings (extracted using DINOv2).
  4. Returns the most semantically relevant images using cosine similarity.
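
The retrieval step (steps 3 and 4) can be sketched in plain Python as follows. This is a minimal illustration, not the repository's code: the function names `cosine_similarity` and `top_k_images`, the toy three-dimensional vectors, and the file names are all assumptions made for the example. In the real system the query vector would come from the trained network and the database vectors from the image embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)

def top_k_images(predicted_embedding, image_db, k=3):
    """Rank every image in the database by similarity to the predicted embedding."""
    scored = [
        (name, cosine_similarity(predicted_embedding, emb))
        for name, emb in image_db.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy database: image name -> precomputed image embedding.
toy_db = {
    "cat_on_couch.jpg": [0.9, 0.1, 0.0],
    "dog_in_park.jpg": [0.1, 0.9, 0.2],
    "city_skyline.jpg": [0.0, 0.2, 0.9],
}

# Stand-in for the network's output for the query "a cat sitting on a couch".
query_vec = [0.8, 0.2, 0.1]
results = top_k_images(query_vec, toy_db, k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

A linear scan like this is fine for a small database; for larger collections, normalizing the embeddings once and using a matrix product or an approximate nearest-neighbor index would likely be preferable.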

🧠 Model Architecture

  • Text Embedding: MiniLM-L6-v2
  • Image Embedding: DINOv2
  • Neural Network: 5-layer feedforward network
  • Loss Function: Cosine Similarity Loss
  • Optimization: Adam with a ReduceLROnPlateau learning-rate scheduler
  • Training Dataset: TextCaps (28k images with 140k captions)
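
The README does not spell out the exact form of the cosine similarity loss. One common definition, assumed here, is `1 - cos(predicted, target)`, averaged over a batch, so that a perfectly aligned prediction gives a loss of 0. The sketch below uses plain Python lists for clarity; the names `cosine_loss` and `batch_cosine_loss` are hypothetical, and the actual training code presumably computes this on tensors.

```python
import math

def cosine_loss(predicted, target, eps=1e-12):
    """Assumed loss form: 1 - cosine similarity between one prediction and its target."""
    dot = sum(p * t for p, t in zip(predicted, target))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_t = math.sqrt(sum(t * t for t in target))
    # eps guards against division by zero for all-zero vectors
    return 1.0 - dot / max(norm_p * norm_t, eps)

def batch_cosine_loss(predictions, targets):
    """Mean cosine loss over a batch of (predicted, target) embedding pairs."""
    losses = [cosine_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

# Toy 2-D examples: aligned, orthogonal, and opposite vectors.
aligned = cosine_loss([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_loss([1.0, 0.0], [0.0, 3.0])
opposite = cosine_loss([1.0, 0.0], [-1.0, 0.0])
print(aligned, orthogonal, opposite)
```

Because the loss depends only on direction, not magnitude, it matches the retrieval criterion, which also ranks images by cosine similarity.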

📊 Evaluation Metrics

  • MRR (Mean Reciprocal Rank)
  • Precision / Recall / Jaccard Similarity (evaluated on both fine-grained and coarse-grained class sets)
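
As a rough illustration of these metrics, the sketch below computes MRR over ranked result lists and precision, recall, and Jaccard similarity over sets of class labels. How the project decides which classes count as relevant is not described in this README, so the set-based comparison, the function names, and the toy labels are assumptions.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per query (0 if none is found)."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
    return total / len(ranked_lists)

def precision_recall_jaccard(retrieved, relevant):
    """Set-based precision, recall, and Jaccard similarity for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    union = len(retrieved | relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    jaccard = hits / union if union else 0.0
    return precision, recall, jaccard

# Two toy queries: the first relevant result appears at rank 2 and at rank 1.
mrr = mean_reciprocal_rank(
    [["a", "b", "c"], ["x", "y", "z"]],
    [{"b"}, {"x"}],
)

# Toy class labels for a single query.
precision, recall, jaccard = precision_recall_jaccard(
    ["cat", "dog", "car"],
    ["cat", "dog", "bird", "fish"],
)
print(mrr, precision, recall, jaccard)
```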

📈 The system achieved a recall of 0.84 on the coarse-grained class set, which suggests it retrieves semantically relevant images reliably.


🚀 Getting Started

🔎 Hyperparameter Search

Run the hyperparameter search to find the best layer sizes and training configuration:

python hyperparameter_search.py

🏋️‍♂️ Train the Model

Train using the best configuration:

python train.py

🧪 Test the Model

Evaluate using cosine similarity & MRR:

python test.py

👨‍💻 Contributors

  • Lucas Butler
  • Boxi Chen
  • Anthony Pecoraro
  • Hayat White
