Python comparison scripts for test suite output (diff_test_runs)#2822

Open
ppenenko wants to merge 1 commit into AcademySoftwareFoundation:main from
autodesk-forks:ppenenko/diff_test_runs_scripts

Conversation

@ppenenko
Contributor

Summary

A new Python package under python/MaterialXTest/diff_test_runs/ provides three scripts for comparing pairs of MaterialX test suite output directories (baseline vs. optimized). The trace comparison script (diff_traces.py) consumes the Perfetto traces and GPU track events produced by the tracing instrumentation from #2742.

Scripts:

  • diff_images.py — Perceptual image comparison using NVIDIA FLIP, with HTML side-by-side reports. The PR includes an example diff between two test runs with 1024 and 1 environment samples, respectively. [screenshot]
  • diff_traces.py — Perfetto trace comparison with per-material CPU slice and GPU render-time analysis, multiple --slice filters, a --warmup-frames burn-in period, and inline SVG charts in HTML reports. [screenshot]
  • diff_shaders.py — Offline shader analysis: line count, SPIR-V binary size, glslangValidator compile time, and spirv-opt optimization time; auto-discovers Vulkan SDK tools in PATH. [screenshot]
  • _report.py — Shared utilities for comparison tables, SVG chart generation, and HTML report building.
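
The PATH-based tool discovery that diff_shaders.py relies on can be sketched as follows. This is an illustrative snippet, not the script's actual code; the helper name `find_vulkan_tool` is hypothetical.

```python
import shutil

def find_vulkan_tool(name):
    """Locate a Vulkan SDK executable (e.g. glslangValidator, spirv-opt) on PATH.

    Illustrative sketch only; the real diff_shaders.py may differ.
    Returns the absolute path as a string, or None if the tool is not found.
    """
    return shutil.which(name)

for tool in ("glslangValidator", "spirv-opt"):
    path = find_vulkan_tool(tool)
    print(f"{tool}: {path or 'not found (install the Vulkan SDK)'}")
```

Using `shutil.which` keeps the lookup portable across platforms, since it honors the OS-specific executable extensions and PATH semantics.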

Key features:

  • HTML reports with interactive, searchable inline SVG bar charts
  • Sorted comparison tables showing per-material deltas (absolute and percentage)
  • Summary statistics (mean, median, best/worst) for quick assessment
  • Configurable filtering (--min-delta-ms, --min-delta-pct) to focus on significant changes
  • Warm-up frame support (--warmup-frames) to discard initial GPU frames for more stable averages

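The warm-up discarding and per-material deltas listed above can be sketched like this. The helpers below are hypothetical illustrations (not the actual _report.py API), assuming frame timings in milliseconds:

```python
import statistics

def average_after_warmup(frame_times_ms, warmup_frames=0):
    # Discard the first N GPU frames (driver/pipeline warm-up), average the rest.
    stable = frame_times_ms[warmup_frames:]
    if not stable:
        raise ValueError("warmup_frames discards every sample")
    return statistics.mean(stable)

def delta(baseline_ms, candidate_ms):
    # Absolute and percentage change, as shown in the comparison tables.
    abs_ms = candidate_ms - baseline_ms
    pct = abs_ms / baseline_ms * 100.0
    return abs_ms, pct

# First frame includes one-time pipeline warm-up cost, so it is skipped.
baseline = average_after_warmup([9.8, 5.2, 5.0, 5.4, 5.0], warmup_frames=1)   # 5.15
candidate = average_after_warmup([8.9, 4.1, 3.9, 4.3, 3.9], warmup_frames=1)  # 4.05
abs_ms, pct = delta(baseline, candidate)
print(f"{abs_ms:+.2f} ms ({pct:+.1f}%)")
```

Filtering flags such as --min-delta-ms and --min-delta-pct would then simply drop rows whose `abs_ms` or `pct` falls below the threshold.
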
Motivation

This tooling was developed to measure and validate shader generation optimizations (early lobe pruning, dead code elimination) in a companion branch. The infrastructure is generally useful for:

  • Regression testing: Detect visual regressions from codegen changes via FLIP
  • Performance profiling: Compare shader generation time, compilation time, and GPU render time across branches
  • Shader quality analysis: Quantify code size and SPIR-V improvements from optimizations

Test plan

  • Verified all three comparison scripts produce correct HTML reports when comparing two test suite runs
  • Verified diff_traces.py correctly processes Perfetto traces with CPU slices and GPU async track events
  • Verified diff_images.py produces FLIP-based perceptual difference reports
  • Verified diff_shaders.py compiles dumped GLSL to SPIR-V and reports LOC/size/timing deltas

Commit message:

New Python package under python/MaterialXTest/diff_test_runs/ with
three scripts for comparing pairs of test suite output directories:

- diff_images.py: Perceptual image comparison using NVIDIA FLIP,
  with HTML side-by-side reports
- diff_traces.py: Perfetto trace comparison with per-material CPU
  slice and GPU render-time analysis, inline SVG charts
- diff_shaders.py: Offline shader analysis (line count, SPIR-V
  binary size, compile time) using Vulkan SDK tools
- _report.py: Shared utilities for comparison tables, SVG chart
  generation, and HTML report building

Assisted-by: Claude (Anthropic) via Cursor IDE
Signed-off-by: Pavlo Penenko <pavlo.penenko@autodesk.com>