AdityaEXP/ComicFinder

🧠 ComicFinder

ComicFinder is an AI-powered, content-based recommendation system built with Python and OpenAI embeddings. It helps users discover semantically similar manga, manhwa, manhua, and webtoons from natural-language descriptions, genres, or titles, which makes it well suited to fans who want personalized recommendations beyond keyword search.

ComicFinder Preview: Streamlit interface for manga recommendation


💻 Live Demo of ComicFinder

https://comicfinder.streamlit.app/


🚀 Features of ComicFinder

  • 🔍 Recommends similar manga, manhwa, manhua, and webtoons based on descriptions or titles
  • 📦 Uses the precomputed clean_embeddings.npy file for fast results
  • 🧠 Generates embeddings with the OpenAI Embeddings API
  • ⚡ Fast cosine similarity search for real-time recommendations
  • 🖥️ Clean Streamlit-based frontend
  • 📁 Organized data and scripts for easy retraining or extension
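
The cosine similarity search mentioned above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the project's actual code: the function names, toy titles, and 3-dimensional vectors are all assumptions, and the real app presumably loads clean_embeddings.npy with NumPy and compares much higher-dimensional vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    embeddings: Sequence[Sequence[float]],
    titles: Sequence[str],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Score every stored embedding against the query and return the k best matches."""
    scored = [
        (title, cosine_similarity(query, emb))
        for title, emb in zip(titles, embeddings)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data standing in for real titles and embeddings.
toy_titles = ["Title A", "Title B", "Title C"]
toy_embeddings = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
query_vector = [1.0, 0.0, 0.0]

results = top_k_similar(query_vector, toy_embeddings, toy_titles, k=2)
for title, score in results:
    print(f"{title}: {score:.3f}")
```

In a real query, the query vector would come from embedding the user's text with the same model that produced the stored embeddings, so that the two are comparable.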

📁 Project Structure

comic-recommender/
├── app.py                       # Main application script
├── data/
│   ├── data.csv                 # Original manhwa dataset
│   ├── clean_data.csv           # Cleaned and preprocessed data
│   └── clean_embeddings.npy     # (Ignored by Git; must be downloaded separately)
├── scripts/
│   ├── clean_dataset.py         # Data cleaning script
│   ├── generate_embeddings.py   # Embedding generation
│   └── recommend.py             # Similarity-based recommendations (CLI version)
├── .env                         # Stores API keys
├── requirements.txt             # Python dependencies
└── README.md                    # You're here!

🔧 How to Install and Run ComicFinder Locally

git clone https://github.com/AdityaEXP/ComicFinder.git
cd ComicFinder

# Optional: Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

pip install -r requirements.txt
streamlit run app.py

📌 Example Use Cases

  • Find romance manhwa similar to What's Wrong with Secretary Kim?
  • Get fantasy webtoon recommendations with strong male leads
  • Discover hidden manga gems with character development arcs
  • Replace genre filters with AI-powered natural language queries

📥 Download Embedding File

Because clean_embeddings.npy is large, it is not included in this repo.
📦 Download clean_embeddings.npy and place it in the data/ folder.
Alternatively, generate clean_embeddings.npy yourself with your own OpenAI API key; this costs around $0.02 per generation.


🔐 Environment Variables

Create a .env file containing your OpenAI API key:

OPENAIKEY=sk-xxxxxx
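
As a rough sketch of how a script might pick up this key, the snippet below reads OPENAIKEY from the process environment and falls back to parsing the .env file. It uses only the standard library for illustration; the helper names `read_env_file` and `get_openai_key` are hypothetical, and the project itself may load the file differently (for example with a dotenv package).

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_openai_key(path: str = ".env") -> str:
    """Prefer a real environment variable, then fall back to the .env file."""
    key = os.environ.get("OPENAIKEY") or read_env_file(path).get("OPENAIKEY", "")
    if not key:
        raise RuntimeError("OPENAIKEY is not set; add it to .env or export it.")
    return key
```

Keeping .env out of version control (for example through .gitignore) avoids publishing the key by accident.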

📜 License

MIT: free to use, modify, and distribute.


🤝 Author

Aditya 🛠️ AI + Python + Web3 Enthusiast


🔮 Future Plans

  1. Replace the current cosine similarity search with FAISS for faster searches
  2. Add anime and web series datasets
  3. Build an automated pipeline that scrapes data from web pages or APIs and updates the dataset periodically
  4. Improve search quality by building richer embeddings from more data

📚 Dataset Source and Preprocessing

This project uses data inspired by or adapted from the following Kaggle dataset:

📊 Kaggle - Manhwa and Webtoon Dataset
Credit to Victor Soeiro for compiling and sharing this dataset.
