This repository performs image classification on a subset of the Caltech 101 dataset. The data folder contains .jpeg images from seven classes of the original dataset: scorpion, seahorse, soccer ball, starfish, stegosaurus, stop sign and sunflower. Data pre-processing, model definition, training and evaluation are all contained in the imclass.py file. The model and the data subset were chosen to fit the computing and memory constraints of my laptop.
The data folder contains separate training and validation folders, each with one sub-folder per class label. There are approximately 60 training images and 20 validation images for each label. Because the number of training images is small, the effect of data augmentation and dropout regularization is assessed. A relatively small custom CNN is used for classification. Alternatively, one may use a pre-trained Inception v3 CNN model and re-train it on the current task, which provides higher classification accuracy. The script outputs graphs of the training and validation loss and accuracy.
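
To check that a local copy of the data matches the layout described above (one sub-folder per class inside each split folder), a small standalone helper along these lines can be used. It is only a sketch and is not part of imclass.py: the function name count_images_per_class, the folder names data/train and data/validation, and the accepted file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed extensions; the README only mentions .jpeg files.
IMAGE_SUFFIXES = {".jpeg", ".jpg"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split folder.

    Each immediate sub-folder of split_dir is treated as one class label.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore stray files next to the class folders
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    # Hypothetical folder names; adjust them to the actual names in the data folder.
    for split in ("data/train", "data/validation"):
        if Path(split).is_dir():
            print(split, count_images_per_class(split))
```

With the dataset in place, each split should report seven classes, with roughly 60 images per class for training and roughly 20 for validation.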