Python comparison scripts for test suite output (diff_test_runs)#2822

Open
ppenenko wants to merge 1 commit into AcademySoftwareFoundation:main from
autodesk-forks:ppenenko/diff_test_runs_scripts

Conversation

@ppenenko
Contributor

Summary

A new Python package under python/MaterialXTest/diff_test_runs/ provides three scripts for comparing pairs of MaterialX test suite output directories (baseline vs. optimized). The trace comparison script (diff_traces.py) consumes the Perfetto traces and GPU track events produced by the tracing instrumentation from #2742.

Scripts:

  • diff_images.py — Perceptual image comparison using NVIDIA FLIP, with HTML side-by-side reports. The PR includes an example diff between two test runs with 1024 and 1 environment samples, respectively. [screenshot]
  • diff_traces.py — Perfetto trace comparison with per-material CPU slice and GPU render-time analysis, multiple --slice filters, a --warmup-frames burn-in period, and inline SVG charts in HTML reports. [screenshot]
  • diff_shaders.py — Offline shader analysis: line count, SPIR-V binary size, glslangValidator compile time, and spirv-opt optimization time; auto-discovers Vulkan SDK tools in PATH. [screenshot]
  • _report.py — Shared utilities for comparison tables, SVG chart generation, and HTML report building.
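
The PATH-based tool discovery that diff_shaders.py relies on can be sketched as follows. This is an illustrative snippet, not the script's actual code; the helper name `find_vulkan_tool` is hypothetical.

```python
import shutil

def find_vulkan_tool(name):
    """Locate a Vulkan SDK executable (e.g. glslangValidator, spirv-opt) on PATH.

    Illustrative sketch only; the real diff_shaders.py may differ.
    Returns the absolute path as a string, or None if the tool is not found.
    """
    return shutil.which(name)

for tool in ("glslangValidator", "spirv-opt"):
    path = find_vulkan_tool(tool)
    print(f"{tool}: {path or 'not found (install the Vulkan SDK)'}")
```

Using `shutil.which` keeps the lookup portable across platforms, since it honors the OS-specific executable extensions and PATH semantics.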

Key features:

  • HTML reports with interactive, searchable inline SVG bar charts
  • Sorted comparison tables showing per-material deltas (absolute and percentage)
  • Summary statistics (mean, median, best/worst) for quick assessment
  • Configurable filtering (--min-delta-ms, --min-delta-pct) to focus on significant changes
  • Warm-up frame support (--warmup-frames) to discard initial GPU frames for more stable averages

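The warm-up discarding and per-material deltas listed above can be sketched like this. The helpers below are hypothetical illustrations (not the actual _report.py API), assuming frame timings in milliseconds:

```python
import statistics

def average_after_warmup(frame_times_ms, warmup_frames=0):
    # Discard the first N GPU frames (driver/pipeline warm-up), average the rest.
    stable = frame_times_ms[warmup_frames:]
    if not stable:
        raise ValueError("warmup_frames discards every sample")
    return statistics.mean(stable)

def delta(baseline_ms, candidate_ms):
    # Absolute and percentage change, as shown in the comparison tables.
    abs_ms = candidate_ms - baseline_ms
    pct = abs_ms / baseline_ms * 100.0
    return abs_ms, pct

# First frame includes one-time pipeline warm-up cost, so it is skipped.
baseline = average_after_warmup([9.8, 5.2, 5.0, 5.4, 5.0], warmup_frames=1)   # 5.15
candidate = average_after_warmup([8.9, 4.1, 3.9, 4.3, 3.9], warmup_frames=1)  # 4.05
abs_ms, pct = delta(baseline, candidate)
print(f"{abs_ms:+.2f} ms ({pct:+.1f}%)")
```

Filtering flags such as --min-delta-ms and --min-delta-pct would then simply drop rows whose `abs_ms` or `pct` falls below the threshold.
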
Motivation

This tooling was developed to measure and validate shader generation optimizations (early lobe pruning, dead code elimination) in a companion branch. The infrastructure is generally useful for:

  • Regression testing: Detect visual regressions from codegen changes via FLIP
  • Performance profiling: Compare shader generation time, compilation time, and GPU render time across branches
  • Shader quality analysis: Quantify code size and SPIR-V improvements from optimizations

Test plan

  • Verified all three comparison scripts produce correct HTML reports when comparing two test suite runs
  • Verified diff_traces.py correctly processes Perfetto traces with CPU slices and GPU async track events
  • Verified diff_images.py produces FLIP-based perceptual difference reports
  • Verified diff_shaders.py compiles dumped GLSL to SPIR-V and reports LOC/size/timing deltas

Commit message:

New Python package under python/MaterialXTest/diff_test_runs/ with
three scripts for comparing pairs of test suite output directories:

- diff_images.py: Perceptual image comparison using NVIDIA FLIP,
  with HTML side-by-side reports
- diff_traces.py: Perfetto trace comparison with per-material CPU
  slice and GPU render-time analysis, inline SVG charts
- diff_shaders.py: Offline shader analysis (line count, SPIR-V
  binary size, compile time) using Vulkan SDK tools
- _report.py: Shared utilities for comparison tables, SVG chart
  generation, and HTML report building

Assisted-by: Claude (Anthropic) via Cursor IDE
Signed-off-by: Pavlo Penenko <pavlo.penenko@autodesk.com>