Machine Learning Benchmarks


Scikit-learn_bench is a benchmark tool for libraries and frameworks implementing Scikit-learn-like APIs and other workloads.

Benefits:

  • Full control of the benchmark suite through the CLI
  • Flexible and powerful benchmark config structure
  • Support for advanced profiling tools, such as Intel(R) VTune* Profiler
  • Automated benchmark report generation


🔧 Create a Python Environment

How to create a Python environment with the required frameworks:

  • sklearn, sklearnex, and gradient boosting frameworks:

    ```bash
    # with pip
    pip install -r envs/requirements-sklearn.txt
    # or with conda
    conda env create -n sklearn -f envs/conda-env-sklearn.yml
    ```

  • RAPIDS:

    ```bash
    conda env create -n rapids --solver=libmamba -f envs/conda-env-rapids.yml
    ```
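After creating an environment, it can help to confirm that the required frameworks are importable before launching benchmarks. Below is a minimal, illustrative check (the module names are examples; adjust them to the frameworks you actually installed — this is not part of sklbench itself):

```python
import importlib.util

def check_environment(modules):
    # Report which of the listed modules are importable in the
    # currently active environment, without importing them.
    return {m: importlib.util.find_spec(m) is not None for m in modules}

# Example module names; substitute the frameworks you installed
print(check_environment(["sklearn", "sklearnex", "xgboost"]))
```

Running this inside the activated environment shows at a glance whether a missing package would cause benchmark cases to fail later.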

🚀 How To Use Scikit-learn_bench

Benchmarks Runner

How to run sklearnex benchmarks on CPU using the sklbench module and regular scope of benchmarking cases:

```bash
python -m sklbench --configs configs/regular \
    --filters algorithm:library=sklearnex algorithm:device=cpu \
    --environment-name ENV_NAME --result-file result_sklearnex_cpu_regular.json
# Same command with shorter argument aliases for typing convenience
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json
```

By default, the output is a file with JSON-formatted results of the benchmarking cases. To also generate a human-readable report, use the following command:

```bash
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json \
    --report --report-file result-sklearnex-cpu-regular.xlsx
```

To prefetch datasets before the run and get more verbose output, use the --prefetch-datasets and -l INFO arguments:

```bash
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json \
    --report --report-file report-sklearnex-cpu-regular.xlsx \
    --prefetch-datasets -l INFO
```

To measure only a few selected algorithms, extend the filter (-f) argument:

```bash
# ...
  -f algorithm:library=sklearnex algorithm:device=cpu algorithm:estimator=PCA,KMeans
# ...
```
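For illustration, the scope:key=value1,value2 filter syntax shown above can be read as a mapping from scoped keys to lists of accepted values. The following is a hypothetical parser sketching that reading, not sklbench's actual implementation:

```python
def parse_filters(tokens):
    # Illustrative parser for 'scope:key=value1,value2' filter tokens.
    # Each token maps a scoped key to a list of accepted values.
    filters = {}
    for token in tokens:
        scoped_key, _, values = token.partition("=")
        filters[scoped_key] = values.split(",")
    return filters

print(parse_filters([
    "algorithm:library=sklearnex",
    "algorithm:estimator=PCA,KMeans",
]))
# → {'algorithm:library': ['sklearnex'], 'algorithm:estimator': ['PCA', 'KMeans']}
```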

For a description of all benchmarks runner arguments, refer to the documentation.

Report Generator

To combine raw result files gathered from different environments, call the report generator:

```bash
python -m sklbench.report \
    --result-files result_1.json result_2.json \
    --report-file report_example.xlsx
```
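The report generator consumes raw JSON result files. As an illustration of the combining step, the sketch below merges several JSON files into one list; it treats each file as an opaque JSON document and does not assume the actual sklbench result schema:

```python
import json
import tempfile

def combine_results(paths):
    # Load each raw JSON result file into a single list.
    # Files are treated as opaque JSON documents.
    merged = []
    for path in paths:
        with open(path) as f:
            merged.append(json.load(f))
    return merged

# Demo with two throwaway files standing in for result_1.json, result_2.json
demo_paths = []
for payload in ({"run": 1}, {"run": 2}):
    tmp = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
    json.dump(payload, tmp)
    tmp.close()
    demo_paths.append(tmp.name)

print(combine_results(demo_paths))  # → [{'run': 1}, {'run': 2}]
```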

For a description of all report generator arguments, refer to the documentation.

Scikit-learn_bench High-Level Workflow

```mermaid
flowchart TB
    A[User] -- High-level arguments --> B[Benchmarks runner]
    B -- Generated benchmarking cases --> C["Benchmarks collection"]
    C -- Raw JSON-formatted results --> D[Report generator]
    D -- Human-readable report --> A

    classDef userStyle fill:#44b,color:white,stroke-width:2px,stroke:white;
    class A userStyle
```

📚 Benchmark Types

Scikit-learn_bench supports the following types of benchmarks:

  • Scikit-learn estimator - Measures performance and quality metrics of a scikit-learn-like estimator.
  • Function - Measures performance metrics of a specified function.
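As a rough illustration of the Function benchmark type, the sketch below times a callable over several runs and keeps the best wall time. This mirrors the general idea of function benchmarking; it is not sklbench's measurement code:

```python
import time

def bench_function(fn, *args, n_runs=5):
    # Time a callable over n_runs and return the best wall time,
    # which reduces noise from transient system load.
    best = float("inf")
    for _ in range(n_runs):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

elapsed = bench_function(sum, range(100_000))
print(f"best of 5 runs: {elapsed:.6f} s")
```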

📑 Documentation

Scikit-learn_bench: