Spam Classifier

A spam classifier built using a TF-IDF vectorizer and tested on multiple split parameters.
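A minimal sketch of that setup, assuming "split parameters" refers to the `test_size` passed to scikit-learn's `train_test_split`. The toy messages, labels, and the two split sizes are hypothetical placeholders, not the notebook's actual dataset or settings.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical toy corpus; the real dataset is not shown in this README.
texts = [
    "win a free prize now",
    "claim your free cash reward",
    "urgent offer click the link",
    "free entry to win money",
    "are we still meeting for lunch",
    "see you at the office tomorrow",
    "can you call me after class",
    "thanks for the help yesterday",
]
labels = ["spam", "spam", "spam", "spam", "ham", "ham", "ham", "ham"]

split_shapes = {}
for test_size in (0.25, 0.5):  # assumed split sizes, for illustration only
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=42, stratify=labels
    )
    # Fit the vectorizer on the training messages only, then reuse its
    # vocabulary and IDF weights to transform the held-out messages.
    vectorizer = TfidfVectorizer()
    X_train_vec = vectorizer.fit_transform(X_train)
    X_test_vec = vectorizer.transform(X_test)
    split_shapes[test_size] = (X_train_vec.shape[0], X_test_vec.shape[0])

print(split_shapes)
```

Fitting the vectorizer inside the loop keeps test messages from influencing the vocabulary for any given split.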

Two models are compared for performance: Logistic Regression and Random Forest Classifier.
Both models are evaluated with classification reports, which show per-class precision, recall, and F1-score.
Classification behavior is also analyzed using feature importances from both models, showing how aggressive or conservative each model is in classifying certain words as spam.
Each model was run twice: once with stop words and once without.

Logistic Regression Results:

Class	Precision	Recall	F1-Score	Support
ham	0.97	1.00	0.99	958
spam	0.98	0.83	0.90	157

Class	Precision	Recall	F1-Score	Support
ham	0.98	1.00	0.99	958
spam	1.00	0.85	0.92	157
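As a sanity check on the tables above, F1 is the harmonic mean of precision and recall. The short sketch below recomputes the spam-row F1 from the rounded precision and recall values shown; the `f1_score_from` helper is a hypothetical name. Because the inputs are already rounded to two decimals, a recomputed value can differ from a reported one in the last digit.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Spam rows from the two tables above (rounded inputs).
first_spam_f1 = round(f1_score_from(0.98, 0.83), 2)
second_spam_f1 = round(f1_score_from(1.00, 0.85), 2)
print(first_spam_f1, second_spam_f1)
```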
