Scikit-learn_bench is a benchmark tool for libraries and frameworks implementing Scikit-learn-like APIs and other workloads.
Benefits:
- Full control of the benchmark suite through the CLI
- Flexible and powerful benchmark config structure
- Compatible with advanced profiling tools, such as Intel(R) VTune* Profiler
- Automated benchmark report generation
To create a usable Python environment with the required frameworks, use one of the following commands:

- sklearn, sklearnex, and gradient boosting frameworks:

  ```bash
  # with pip
  pip install -r envs/requirements-sklearn.txt
  # or with conda
  conda env create -n sklearn -f envs/conda-env-sklearn.yml
  ```

- RAPIDS:

  ```bash
  conda env create -n rapids --solver=libmamba -f envs/conda-env-rapids.yml
  ```
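To verify that the environment works before benchmarking, a quick sanity check (the commands below are illustrative and assume the conda `sklearn` environment created above; they are not part of sklbench):

```bash
# Activate the environment created above and confirm that the
# key packages are importable
conda activate sklearn
python -c "import sklearn, sklearnex; print(sklearn.__version__)"
```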
To run sklearnex benchmarks on CPU with the `sklbench` module and the regular scope of benchmarking cases:

```bash
python -m sklbench --configs configs/regular \
    --filters algorithm:library=sklearnex algorithm:device=cpu \
    --environment-name ENV_NAME --result-file result_sklearnex_cpu_regular.json
# Same command with shorter argument aliases for typing convenience
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json
```
The default output is a file with JSON-formatted results of the benchmarking cases. To also generate a human-readable report, use the following command:
```bash
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json \
    --report --report-file report-sklearnex-cpu-regular.xlsx
```
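To take a quick look at the raw JSON results without generating a report, you can pretty-print the file with Python's standard library (a convenience command, not part of sklbench; the result schema is defined by the runner):

```bash
# Pretty-print the raw results and show only the first lines
python -m json.tool result_sklearnex_cpu_regular.json | head -n 40
```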
To download the datasets in advance rather than during measurements, and to get more verbose output, use the `--prefetch-datasets` and `-l INFO` arguments:
```bash
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json \
    --report --report-file report-sklearnex-cpu-regular.xlsx \
    --prefetch-datasets -l INFO
```
To measure only a few selected algorithms, extend the filter (`-f`) argument:

```bash
# ...
-f algorithm:library=sklearnex algorithm:device=cpu algorithm:estimator=PCA,KMeans
# ...
```
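For example, the full command from above with the extended filter would look like this:

```bash
python -m sklbench -c configs/regular \
    -f algorithm:library=sklearnex algorithm:device=cpu algorithm:estimator=PCA,KMeans \
    -e ENV_NAME -r result_sklearnex_cpu_regular.json
```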
For a description of all benchmarks runner arguments, refer to the documentation.
To combine raw result files gathered from different environments, call the report generator:
```bash
python -m sklbench.report \
    --result-files result_1.json result_2.json \
    --report-file report_example.xlsx
```
For a description of all report generator arguments, refer to the documentation.
The overall pipeline from user input to the final report:

```mermaid
flowchart TB
    A[User] -- High-level arguments --> B[Benchmarks runner]
    B -- Generated benchmarking cases --> C["Benchmarks collection"]
    C -- Raw JSON-formatted results --> D[Report generator]
    D -- Human-readable report --> A

    classDef userStyle fill:#44b,color:white,stroke-width:2px,stroke:white;
    class A userStyle
```
Scikit-learn_bench supports the following types of benchmarks:
- Scikit-learn estimator - Measures performance and quality metrics of a scikit-learn-like estimator.
- Function - Measures performance metrics of a specified function.