ectorr01/MNIST-Dataset-Analysis

MNIST Dataset Analysis

This project performs an exploratory data analysis (EDA) of the MNIST dataset using TensorFlow together with tensorflow-datasets, matplotlib, seaborn, and pandas.

🔍 Overview

The MNIST dataset contains 70,000 grayscale images of handwritten digits (0–9), divided into:

  • Training set: 60,000 images
  • Test set: 10,000 images

Each image is 28×28 pixels.

The script analyzes both the training and test sets by:

  • Plotting class distributions
  • Displaying example images
  • Checking whether the classes are balanced
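
As an illustration of the balance check, here is a minimal sketch; it is not taken from dataset_analysis.py. The helper names (`class_counts`, `imbalance_ratio`, `load_mnist_labels`) are hypothetical, and using the max/min count ratio as the balance measure is an assumption. The two counting helpers use only the standard library; the loader assumes `tensorflow-datasets` is installed and imports it lazily.

```python
from collections import Counter


def class_counts(labels, num_classes=10):
    """Return a list where index d holds the number of samples labelled d."""
    counts = Counter(int(label) for label in labels)
    return [counts.get(d, 0) for d in range(num_classes)]


def imbalance_ratio(counts):
    """Largest class count divided by the smallest (1.0 means perfectly balanced)."""
    if not counts or min(counts) == 0:
        raise ValueError("every class needs at least one sample")
    return max(counts) / min(counts)


def load_mnist_labels(split="train"):
    """Yield integer labels from TensorFlow Datasets (downloads MNIST on first use)."""
    # Imported lazily so the helpers above stay dependency-free.
    import tensorflow_datasets as tfds

    dataset = tfds.load("mnist", split=split, as_supervised=True)
    for _image, label in tfds.as_numpy(dataset):
        yield int(label)
```

For example, `imbalance_ratio(class_counts(load_mnist_labels("train")))` would give a single number for the training split; what threshold counts as "balanced" is a judgment call.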

🧰 Requirements

Install the required libraries with pip:

pip install tensorflow tensorflow-datasets matplotlib seaborn pandas numpy

📁 Files

  • dataset_analysis.py: Main script to load and analyze the MNIST dataset.
  • README.md: This file.

📈 Visualizations

Class Distribution Bar Chart

A bar plot showing how many samples are present for each digit class.
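
A bar chart like this could be drawn roughly as follows. This is a sketch under assumptions, not the script's actual plotting code: `class_shares` and `plot_class_distribution` are hypothetical names, the percentage annotations are an addition for illustration, and `Axes.bar_label` requires matplotlib 3.4 or newer.

```python
def class_shares(counts):
    """Return each class's share of the total, in percent."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one sample")
    return [100.0 * c / total for c in counts]


def plot_class_distribution(counts, title="Class distribution"):
    """Draw one bar per digit class, annotated with its percentage share."""
    # Imported lazily so class_shares works without matplotlib installed.
    import matplotlib.pyplot as plt

    digits = list(range(len(counts)))
    fig, ax = plt.subplots(figsize=(8, 4))
    bars = ax.bar(digits, counts)
    ax.bar_label(bars, labels=[f"{s:.1f}%" for s in class_shares(counts)])
    ax.set_xticks(digits)
    ax.set_xlabel("Digit")
    ax.set_ylabel("Number of samples")
    ax.set_title(title)
    return fig
```

Calling `plot_class_distribution(counts)` followed by `matplotlib.pyplot.show()` would display the figure.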

Sample Image Grid

A 3x3 grid displaying sample images from the dataset along with their corresponding labels.
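
One way to build such a grid is sketched below. The function names are hypothetical, and the sketch assumes each sample is an `(image, label)` pair whose image is an array of shape 28×28 or 28×28×1, which is the form TensorFlow Datasets yields with `as_supervised=True`.

```python
from itertools import islice


def take_samples(examples, n=9):
    """Return the first n (image, label) pairs from any iterable."""
    return list(islice(examples, n))


def plot_sample_grid(samples, rows=3, cols=3):
    """Show up to rows*cols images in a grid, each titled with its label."""
    # Imported lazily so take_samples has no third-party dependencies.
    import matplotlib.pyplot as plt
    import numpy as np

    # squeeze=False keeps axes two-dimensional even for a 1x1 grid.
    fig, axes = plt.subplots(rows, cols, figsize=(6, 6), squeeze=False)
    for ax in axes.flat:
        ax.axis("off")  # hide ticks, including on cells left empty
    for ax, (image, label) in zip(axes.flat, samples):
        ax.imshow(np.squeeze(image), cmap="gray")  # drop a trailing channel axis
        ax.set_title(f"Label: {int(label)}")
    fig.tight_layout()
    return fig
```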

Class Distribution Heatmap

A heatmap summarizing the count of images per class in a compact format.
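
The layout of the heatmap is not specified here, so the following sketch assumes one row per split and one column per digit. `counts_table` and `plot_count_heatmap` are hypothetical names; the plotting part relies on `seaborn.heatmap` and a pandas `DataFrame`.

```python
def counts_table(counts_by_split, num_classes=10):
    """Turn {"train": [...], "test": [...]} into (row_names, rows)."""
    names = list(counts_by_split)
    rows = []
    for name in names:
        counts = list(counts_by_split[name])
        if len(counts) != num_classes:
            raise ValueError(
                f"{name!r} has {len(counts)} classes, expected {num_classes}"
            )
        rows.append(counts)
    return names, rows


def plot_count_heatmap(counts_by_split, num_classes=10):
    """Draw an annotated heatmap: rows are splits, columns are digits."""
    # Imported lazily so counts_table works without the plotting stack.
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    names, rows = counts_table(counts_by_split, num_classes)
    frame = pd.DataFrame(rows, index=names, columns=range(num_classes))
    fig, ax = plt.subplots(figsize=(10, 1 + len(names)))
    sns.heatmap(frame, annot=True, fmt="d", cmap="Blues", cbar=False, ax=ax)
    ax.set_xlabel("Digit")
    ax.set_ylabel("Split")
    return fig
```

Putting both splits in one figure makes it easy to compare them cell by cell, although the colour scale is then dominated by the larger training split.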

🚀 Usage

To run the analysis:

python dataset_analysis.py

The script will display plots and print statistics about the dataset.

📦 Next Steps

In future commits, we will add:

  • A deep learning model built with Keras
  • Model training and evaluation
  • Model inference and saving/loading functionality

Stay tuned!

About

A simple yet complete EDA of the MNIST dataset using TensorFlow and TensorFlow Datasets.
