Cat vs Dog Image Classification with SVM
This project classifies images of cats and dogs using a Support Vector Machine (SVM). The goal is to build a simple yet effective image classification model, preprocess the dataset for consistency, and evaluate performance with accuracy, precision, recall, and F1-score. It demonstrates how classical machine learning techniques can be applied to image classification.
The dataset used for this project is the Kaggle Cats vs Dogs Dataset, which contains 25,000 labeled images of cats and dogs.
It can be downloaded from Kaggle under the name Kaggle Cats vs Dogs Dataset.
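As a rough illustration of how labels might be recovered from the downloaded files, here is a minimal sketch. It assumes the extracted training folder uses file names of the form cat.<n>.jpg and dog.<n>.jpg, which this README does not state. The LABELS mapping (cat = 0, dog = 1) and both helper functions are hypothetical and may not match the notebook.

```python
from pathlib import Path

# Hypothetical label encoding; the notebook may use a different mapping.
LABELS = {"cat": 0, "dog": 1}


def label_from_filename(path):
    """Return the class label encoded in a name like 'cat.123.jpg'."""
    prefix = Path(path).name.split(".")[0].lower()
    if prefix not in LABELS:
        raise ValueError(f"Unrecognised file name: {path}")
    return LABELS[prefix]


def list_labeled_images(train_path):
    """Pair every .jpg file in train_path with its label, sorted by name."""
    files = sorted(Path(train_path).glob("*.jpg"))
    return [(f, label_from_filename(f)) for f in files]
```

Calling list_labeled_images("train/") would then yield (path, label) pairs that a preprocessing step can iterate over.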
The following libraries and tools are required to run this project:
- Python 3.7 or above
- NumPy
- OpenCV
- scikit-learn
- matplotlib
- seaborn
To install the dependencies, use:
pip install -r requirements.txt
Follow these steps to run the project locally:
- Clone the repository and change into its directory:
  git clone https://github.com/<your-username>/SCT_ML_3.git
  cd SCT_ML_3
- Install the required Python libraries:
  pip install -r requirements.txt
- Download the Kaggle dataset:
  - Go to Kaggle Cats vs Dogs Dataset.
  - Download the dataset and extract it to a folder (e.g., train/).
- Update the dataset path in the Cats_Dogs_Classification.ipynb file: modify the train_path variable to point to your dataset folder.
- Run the Jupyter Notebook:
  jupyter notebook Cats_Dogs_Classification.ipynb
- Follow the steps in the notebook to preprocess the dataset, train the SVM model, and evaluate its performance.
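The preprocess, train, and evaluate steps could look roughly like the sketch below. It is not the notebook's code. The 64x64 grayscale resizing, the RBF kernel, the feature scaling, the 80/20 split, and the function names are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

IMG_SIZE = (64, 64)  # assumed target size; the notebook may use another


def load_image_as_vector(path, size=IMG_SIZE):
    """Read an image as grayscale, resize it, and flatten it to a 1-D vector."""
    import cv2  # imported lazily so the training code runs without OpenCV

    img = cv2.imread(str(path), cv2.IMREAD_GRAYSCALE)
    if img is None:
        raise ValueError(f"Could not read image: {path}")
    img = cv2.resize(img, size)
    # Scale pixel values to [0, 1] so all images share one range.
    return img.astype(np.float32).ravel() / 255.0


def train_and_evaluate(X, y, test_size=0.2, seed=42):
    """Split the data, fit a scaled RBF SVM, and return (model, test accuracy)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Hyperparameters are illustrative defaults, not the notebook's settings.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))
```

With a list of image paths and labels, X would be built as np.stack([load_image_as_vector(p) for p in paths]) before calling train_and_evaluate(X, y).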
The model achieved 57.20% accuracy on the test dataset.
              precision    recall  f1-score   support
           0       0.55      0.63      0.59       971
           1       0.60      0.52      0.55      1029
    accuracy                           0.57      2000
   macro avg       0.57      0.57      0.57      2000
weighted avg       0.58      0.57      0.57      2000
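The figures in the report are linked by simple arithmetic, which the sketch below reproduces. F1 is the harmonic mean of precision and recall, and the weighted averages weight each class by its support. The inputs are the already-rounded values from the report, so a recomputed figure can differ from the reported one in the last digit. The function names are illustrative.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(values, supports):
    """Average per-class values, weighting each class by its sample count."""
    total = sum(supports)
    return sum(v * s for v, s in zip(values, supports)) / total


# Rounded per-class values copied from the report (class 0, class 1).
precision = [0.55, 0.60]
recall = [0.63, 0.52]
support = [971, 1029]

f1_class0 = f1_from(precision[0], recall[0])
weighted_precision = weighted_average(precision, support)
weighted_recall = weighted_average(recall, support)

print(round(f1_class0, 2), round(weighted_precision, 2), round(weighted_recall, 2))
# prints: 0.59 0.58 0.57
```

The support-weighted recall equals the overall accuracy, which is why both appear as 0.57 in the report.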
The confusion matrix shows how many cat and dog images the model classified correctly and how many it confused with the other class.
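For readers unfamiliar with the layout, here is a minimal sketch of how a two-class confusion matrix is tallied. It uses toy labels rather than the project's predictions, and the function names are illustrative. Rows index the true class and columns the predicted class, the same convention scikit-learn uses.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Count (true, predicted) pairs for labels 0 and 1."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy_from_matrix(matrix):
    """Correct predictions sit on the diagonal."""
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


# Toy data for illustration only, not results from this project.
y_true = [0, 0, 0, 1, 1, 1]
y_pred = [0, 1, 0, 1, 1, 0]

cm = confusion_matrix_2x2(y_true, y_pred)
print(cm)  # prints: [[2, 1], [1, 2]]
```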
This project demonstrates the use of Support Vector Machines (SVMs) for image classification tasks. The achieved accuracy of 57.20% is modest, only slightly above the 50% expected from random guessing on a roughly balanced two-class test set. It showcases the challenges of applying classical machine learning techniques to image data and highlights the importance of feature extraction and dataset preprocessing.
Future improvements may include feature engineering, hyperparameter tuning, or using deep learning models like CNNs for better performance.
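As one possible starting point for hyperparameter tuning, the sketch below runs a small grid search over an SVM pipeline with scikit-learn's GridSearchCV. The parameter grid, the tune_svm helper, and the synthetic demo data are assumptions for illustration. A real run would pass the image feature matrix instead.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def tune_svm(X, y, cv=3):
    """Cross-validated grid search over C and gamma for a scaled RBF SVM."""
    pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC())])
    # Illustrative grid; sensible ranges depend on the features used.
    param_grid = {
        "svm__C": [0.1, 1, 10],
        "svm__gamma": ["scale", 0.01],
        "svm__kernel": ["rbf"],
    }
    search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
    search.fit(X, y)
    return search


# Synthetic stand-in data so the sketch runs without the image dataset.
X_demo, y_demo = make_classification(n_samples=200, n_features=20, random_state=0)
search = tune_svm(X_demo, y_demo)
print(search.best_params_, round(search.best_score_, 3))
```

Grid search on flattened pixels is expensive for thousands of images, so tuning on a subset or on compact engineered features is a common compromise.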