Spam Detector App

A machine learning project that detects spam messages using Natural Language Processing (NLP): TF-IDF vectorization, SMOTE for class-imbalance handling, and a Logistic Regression classifier, all wrapped in a Streamlit web app.

Model Accuracy: ~99% on test set

Project Structure


project-root/
│
├── data/
│   ├── spam.csv                # Original dataset
│   ├── finalmodel.pkl          # Trained ML model
│   ├── vectorizer.pkl          # Saved TF-IDF vectorizer
│   ├── feature.pkl           
│   └── label.pkl         
│
├── notebook/
│   └── eda.ipynb               # Exploratory Data Analysis
│
├── src/
│   ├── preprocessing.ipynb     # Preprocessing pipeline
│   └── training.ipynb          # Model training, tuning, evaluation
│
├── app.py                      # Streamlit app
└── README.md                   # You are here!

Features

Cleans and lemmatizes text
TF-IDF vectorization
Text length as an additional feature
SMOTE for handling imbalanced classes
Classification with Logistic Regression
Multinomial Naive Bayes also tested for comparison
Trained artifacts saved with joblib for reuse
Streamlit app for user interaction

Tech Stack

Python
Pandas, NumPy, Matplotlib, Seaborn
scikit-learn, imblearn, nltk
Streamlit
joblib

Performance

| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| Logistic Regression | 99% | 0.99 | 0.99 | 0.99 |
| MultinomialNB | 96% | 0.93 | 1.00 | 0.96 |
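
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall values; the helper name `f1_from` is hypothetical, and the table does not say whether the figures are per-class or averaged, so this only confirms that the reported numbers agree with the formula.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the table above.
reported = {
    "Logistic Regression": (0.99, 0.99),
    "MultinomialNB": (0.93, 1.00),
}

for name, (precision, recall) in reported.items():
    print(f"{name}: F1 = {f1_from(precision, recall):.2f}")
# Logistic Regression: F1 = 0.99
# MultinomialNB: F1 = 0.96
```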

How to Run

Clone the repo

git clone https://github.com/sarfraspc/spam-detector.git

Install requirements
```
pip install -r requirements.txt
```
Run the Streamlit app
```
streamlit run app.py
```

Dataset

Source: spam.csv
5,572 messages, each labeled as ham or spam
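
Because SMOTE is used to handle class imbalance, it can help to look at the label distribution before training. The sketch below shows one way to compute it with the standard library only. The column name `label` and the sample rows are made up for illustration; the actual column names in `spam.csv` are not stated here and may differ.

```python
import csv
import io
from collections import Counter


def class_balance(rows, label_column="label"):
    """Return the fraction of rows per label value."""
    counts = Counter(row[label_column] for row in rows)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Made-up sample standing in for the real file (hypothetical column names).
sample = io.StringIO(
    "label,text\n"
    "ham,See you soon\n"
    "ham,On my way\n"
    "ham,Ok\n"
    "spam,Free prize now\n"
)

print(class_balance(csv.DictReader(sample)))
# {'ham': 0.75, 'spam': 0.25}
```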

Author

Sarfras (LinkedIn)

License

This project is open-source and available under the MIT License.
