
MLPerf Inference Dashboard

A comprehensive performance analysis dashboard for MLPerf Inference benchmark results.

Features

Multi-Version Support

  • Compare MLPerf v5.0 and v5.1 submissions

Benchmark Comparisons

  • Interactive bar charts for performance comparison across systems
  • Support for multiple models: DeepSeek-R1, Llama 3.1 8B, Llama 2 70B, and more
  • Filter by organizations, accelerators, scenarios (Offline/Server)

Normalized Result Analysis

  • Per-GPU and per-8-GPU-node normalization options
  • Performance benefit calculation vs. global baseline
  • Baseline system information displayed for each chart
  • Handles systems with varying accelerator counts
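The normalization above can be sketched as follows. This is a hypothetical illustration rather than the dashboard's actual code; the column names `result` and `accelerator_count` are assumed stand-ins for whatever the submission CSVs use:

```python
import pandas as pd


def normalize_results(df: pd.DataFrame, node_size: int = 8) -> pd.DataFrame:
    """Add per-GPU and per-8-GPU-node normalized columns to benchmark results.

    Assumes `result` holds raw throughput (e.g. samples/s) and
    `accelerator_count` the number of GPUs in each system (hypothetical names).
    """
    out = df.copy()
    # Per-GPU: divide raw throughput by accelerator count
    out["per_gpu"] = out["result"] / out["accelerator_count"]
    # Per-node: scale per-GPU throughput up to an 8-GPU node
    out["per_node"] = out["per_gpu"] * node_size
    # Benefit vs. global baseline: ratio against the slowest per-GPU system
    baseline = out["per_gpu"].min()
    out["benefit_vs_baseline"] = out["per_gpu"] / baseline
    return out
```

Normalizing this way lets systems with different accelerator counts (4, 8, 72, ...) be compared on one chart.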

Dataset Representation

  • Lightweight CSV-based dataset summaries
  • Token length distribution histograms with statistics
  • Visual representation of input/output token patterns
  • Median and max value annotations
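The median/max annotations can be computed directly from a summary CSV. A minimal sketch, assuming the `input_length`/`output_length` columns listed under Data Requirements below:

```python
import pandas as pd


def token_stats(df: pd.DataFrame, column: str) -> dict:
    """Median and max token counts used to annotate a length histogram."""
    s = df[column]
    return {"median": float(s.median()), "max": int(s.max())}
```

For example, `token_stats(summary_df, "input_length")` would yield the two values annotated on the input-token histogram.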

Offline vs Server Comparison

  • Performance degradation analysis between scenarios
  • Side-by-side metric comparison
  • Detailed per-system breakdown
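The degradation metric is straightforward: how much throughput a system loses when moving from the Offline (batch) scenario to the latency-constrained Server scenario. A sketch (hypothetical helper, not the dashboard's API):

```python
def server_degradation_pct(offline_tput: float, server_tput: float) -> float:
    """Percentage of Offline throughput lost in the Server scenario."""
    return (offline_tput - server_tput) / offline_tput * 100.0
```

A system doing 1000 tokens/s Offline but 800 tokens/s in Server shows a 20% degradation.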

Cross-Version Analysis

  • Track system performance evolution across MLPerf versions
  • Automatic identification of multi-version systems
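Identifying multi-version systems amounts to intersecting the system names that appear in both submission CSVs. A minimal sketch under that assumption:

```python
def multi_version_systems(v50_systems, v51_systems):
    """Systems that submitted to both MLPerf v5.0 and v5.1."""
    return sorted(set(v50_systems) & set(v51_systems))
```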

Directory Structure

mlperf-dashboard/
├── app.py                          # Main application entry point
├── mlperf_datacenter.py            # MLPerf dashboard module
├── dashboard_styles.py             # CSS styling
├── requirements.txt                # Python dependencies
├── pyproject.toml                  # Project metadata
├── Makefile                        # Development commands
├── mlperf-data/                    # MLPerf data files
│   ├── mlperf-5.1.csv              # MLPerf v5.1 submission data
│   ├── mlperf-5.0.csv              # MLPerf v5.0 submission data
│   ├── summaries/                  # Dataset summaries (version controlled)
│   │   ├── README.md
│   │   ├── deepseek-r1.csv
│   │   ├── llama3-1-8b-datacenter.csv
│   │   └── llama2-70b-99.csv
│   └── original/                   # Original datasets (NOT version controlled)
│       ├── README.md
│       └── generate_dataset_summaries.py
└── tests/                          # Test suite
    ├── conftest.py
    ├── test_mlperf_datacenter.py
    └── README.md

Quick Start

Local Development

  1. Clone the repository:

    git clone https://github.com/Harshith-umesh/mlperf-dashboard.git
    cd mlperf-dashboard
  2. Set up Python environment:

    python3 -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
    pip3 install -r requirements.txt
  3. Run the dashboard:

    streamlit run app.py
  4. Access: Open http://localhost:8501 in your browser

Development Environment Setup

For a complete development environment with linting, formatting, and code quality tools:

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Available development commands:

  • make format - Auto-format code (Black, Ruff)
  • make lint - Run linting checks
  • make type-check - Run static type checking
  • make test - Run tests with coverage
  • make ci-local - Run all CI checks locally
  • make clean - Clean temporary files

MLPerf Data Management

MLPerf CSV Files

The dashboard includes MLPerf submission data:

  • mlperf-data/mlperf-5.1.csv - v5.1 submissions
  • mlperf-data/mlperf-5.0.csv - v5.0 submissions

These files are version controlled.

Dataset Summaries

Lightweight CSV summaries (40-180 KB vs 1-20 MB originals):

  • mlperf-data/summaries/deepseek-r1.csv
  • mlperf-data/summaries/llama3-1-8b-datacenter.csv
  • mlperf-data/summaries/llama2-70b-99.csv

Managing Original Datasets

Original datasets are stored in mlperf-data/original/ (NOT version controlled).

To download datasets:

Visit the MLCommons Inference Benchmark Data Download page.

Example:

cd mlperf-data/original/
bash <(curl -s https://raw.githubusercontent.com/mlcommons/r2-downloader/refs/heads/main/mlc-r2-downloader.sh) -d ./ https://inference.mlcommons-storage.org/metadata/deepseek-r1-datasets-fp8-eval.uri

To generate summaries:

cd /path/to/mlperf-dashboard
python mlperf-data/original/generate_dataset_summaries.py

See mlperf-data/original/README.md for detailed instructions.

Testing

Run all tests:

pytest tests/

Run with coverage:

pytest tests/ --cov=. --cov-report=html

Quick test:

make test

Configuration

Environment Variables

  • STREAMLIT_SERVER_HEADLESS=true - Headless mode for production
  • STREAMLIT_SERVER_PORT=8501 - Server port
  • STREAMLIT_SERVER_ADDRESS=0.0.0.0 - Listen address
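For example, a headless production launch might look like this (a config sketch using the variables above with their default values):

```shell
# Run the dashboard headless, listening on all interfaces on port 8501
STREAMLIT_SERVER_HEADLESS=true \
STREAMLIT_SERVER_PORT=8501 \
STREAMLIT_SERVER_ADDRESS=0.0.0.0 \
streamlit run app.py
```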

Data Requirements

  • CSV files must include columns for model, scenario, organization, accelerator, and metrics
  • Dataset summaries require input_length and output_length columns
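A loader can enforce the summary-column requirement up front. This is a hypothetical validation helper, not part of the dashboard's modules:

```python
import pandas as pd

# Columns the dataset summaries must provide (per Data Requirements above)
REQUIRED_SUMMARY_COLUMNS = {"input_length", "output_length"}


def validate_summary(path: str) -> pd.DataFrame:
    """Load a summary CSV and fail fast if required columns are missing."""
    df = pd.read_csv(path)
    missing = REQUIRED_SUMMARY_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"{path} is missing columns: {sorted(missing)}")
    return df
```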

Contributing

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature-name
  3. Set up development environment: pip install -e ".[dev]"
  4. Install pre-commit hooks: pre-commit install
  5. Make changes and test: pytest tests/
  6. Run code quality checks: make ci-local
  7. Submit a pull request

Key Metrics Analyzed

  • Performance: Samples/s, Tokens/s, Queries/s
  • Normalization: Per-GPU, Per-8-GPU-Node
  • Scenarios: Offline (batch), Server (online)
  • Systems: Multi-vendor, multi-accelerator comparison
  • Dataset Statistics: Token length distributions

License

Apache-2.0 License

Note: This dashboard displays MLPerf Inference benchmark results for analysis and comparison purposes.
