- 🔎 Overview
- ⚙️ Get Started
- 📊 Evaluation
- 📫 Contact
- ⭐ Acknowledgements
However, exact Shapley values are intractable to compute (NP-hard in general), which has prompted the development of numerous approximation techniques.
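For reference, the exact Shapley value of a feature $i$ is a weighted average of its marginal contribution over every subset of the remaining features:

$$
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!}\,\bigl(v(S \cup \{i\}) - v(S)\bigr)
$$

where $N$ is the full feature set and $v(S)$ is the model's value when only the features in $S$ are known. The sum ranges over $2^{|N|-1}$ subsets, which is why exact computation does not scale, and how $v(S)$ treats the missing features is precisely the replacement-strategy question studied in this benchmark.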
The abundance of such methods introduces a new challenge: which technique should practitioners trust? This work addresses that question through a systematic, large-scale evaluation of 17 Shapley value approximation algorithms across:
- 💯 100 tabular datasets from diverse domains
- 🧠 6 model architectures
We analyze two core aspects:
- Replacement Strategies for handling missing features
- Tractable Estimation Strategies to approximate Shapley values efficiently
Our results reveal critical trade-offs among accuracy, compute time, and robustness. This benchmark provides a foundation for selecting the right method for a given setting and encourages further research in interpretable machine learning.
You will need:
- git
- conda (Anaconda or Miniconda)
Step 1: Clone this repository using git and change into its root directory.

```bash
git clone https://github.com/TheDatumOrg/ShapleyValuesEval.git
cd ShapleyValuesEval
```
Step 2: Create and activate a conda environment named `shapeval`.

```bash
conda env create --file environment.yml
conda activate shapeval
```
We decompose the evaluation into two principal dimensions:
**Replacement Strategies.** These strategies define how missing features are handled within the Shapley framework. We compare 8 different strategies, using an exhaustive-sampling baseline as the accuracy reference, enabling deep insights into their strengths and limitations.
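As an illustration (a minimal NumPy sketch, not code from this repository), the marginal ("interventional") replacement strategy fills the features outside a coalition $S$ with values drawn from a background dataset and estimates $v(S)$ as the average model prediction over the imputed samples:

```python
import numpy as np

def marginal_value(model_predict, x, subset, background):
    """Estimate v(S) for instance x by replacing features outside `subset`
    with rows from `background` (marginal/empirical replacement)."""
    imputed = background.copy()            # one imputed sample per background row
    imputed[:, subset] = x[subset]         # keep the coalition's features fixed at x
    return model_predict(imputed).mean()   # average prediction = estimated v(S)

# Toy usage with a hypothetical model f(x) = sum of features
rng = np.random.default_rng(0)
background = rng.normal(size=(100, 4))     # background sample used for imputation
x = np.array([1.0, 2.0, 3.0, 4.0])
v_S = marginal_value(lambda X: X.sum(axis=1), x, subset=[0, 2], background=background)
print(v_S)
```

Conditional strategies differ only in how the imputed rows are generated: they are drawn (or modeled) conditionally on the observed features in $S$ rather than independently of them.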
**Tractable Estimation Strategies.** To address the computational burden, we evaluate 17 different estimation methods that trade off fidelity for efficiency.
Each method is benchmarked using 100 datasets and 6 types of predictive models. The evaluation pipeline includes accuracy metrics, compute-time analysis, and critical difference diagrams to highlight statistically significant performance differences.
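For context, several of the estimators in the table below are also available in the open-source `shap` package. A minimal KernelSHAP sketch (independent of this repository's pipeline) looks like this:

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Fit a simple model on synthetic data
X, y = make_regression(n_samples=200, n_features=5, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

background = shap.sample(X, 50)                            # subsample background data to keep runtime low
explainer = shap.KernelExplainer(model.predict, background)
shap_values = explainer.shap_values(X[:5], nsamples=200)   # weighted-least-squares approximation
print(shap_values.shape)                                   # one attribution per instance and feature
```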
| Approach | ⚙️ Estimation Strategy | 🧩 Replacement Strategy |
|---|---|---|
| Exhaustive Sampling | Exact (all potential feature subsets) | Conditional Distribution: Separate Models |
| Interactions-based Method for Explanation | Random Order | Marginal Distribution: Empirical |
| Conditional Expectations Shapley | Random Order | Conditional Distribution: Empirical |
| Shapley Cohort Refinement | Random Order | Conditional Distribution: Empirical |
| Multilinear Sampling | Multilinear Extension | Marginal Distribution: Empirical |
| KernelSHAP | Weighted Least Squares | Marginal Distribution: Empirical |
| Parametric KernelSHAP | Weighted Least Squares | Conditional Distribution: Parametric Assumption (Gaussian/Copula) |
| Non-Parametric KernelSHAP | Weighted Least Squares | Conditional Distribution: Empirical |
| SGD-Shapley | Weighted Least Squares | Predetermined Baseline: Mean |
| FastSHAP | Weighted Least Squares | Conditional Distribution: Surrogate Model |
| Independent LinearSHAP | Linear | Marginal Distribution: Empirical |
| Correlated LinearSHAP | Linear | Conditional Distribution: Parametric Assumption (Gaussian) |
| Tree Interventional | Tree | Marginal Distribution: Empirical |
| Tree Path-dependent | Tree | Conditional Distribution: Empirical |
| DeepLIFT | Deep | Predetermined Baseline: All-zeros |
| DeepSHAP | Deep | Marginal Distribution: Empirical |
| DASP | Deep | Predetermined Baseline: Mean |
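To make the Replacement Strategy column concrete, the two Tree rows correspond to the `feature_perturbation` option of `shap.TreeExplainer`. The sketch below uses the open-source `shap` package for illustration, not this repository's pipeline:

```python
import shap
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=300, n_features=6, random_state=0)
model = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, y)

# Tree Interventional: marginal (empirical) replacement against a background dataset
interventional = shap.TreeExplainer(model, data=X[:100], feature_perturbation="interventional")

# Tree Path-dependent: conditional replacement derived from the trees' own split statistics
path_dependent = shap.TreeExplainer(model, feature_perturbation="tree_path_dependent")

print(interventional.shap_values(X[:3]))
print(path_dependent.shap_values(X[:3]))
```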
If you have any questions, suggestions, or ideas for improvement, feel free to:
- Open an issue on this repository
- Start a discussion
- Reach out via email: Suchit Gupte ([email protected]), John Paparrizos ([email protected])
We welcome contributions and collaboration!
This project is part of a broader initiative to standardize and democratize interpretability research in machine learning. We thank the research community for their foundational work in Shapley value approximations and their continued efforts in explainable AI. This benchmark builds on their insights and aims to further the goal of trustworthy and transparent machine learning models.