[TEST] test old UMF perf #17635
base: unify-benchmark-ci
00e26d8 to d38ee71
Compute Benchmarks level_zero_v2 run (with params: ):
Benchmarks level_zero_v2 run (): Failures
Summary (Emphasized values are the best results)
Performance change in benchmark groups
Compute Benchmarks
Relative perf in group SubmitKernel (2)
Relative perf in group SubmitKernel With Completion (2)
Relative perf in group SinKernelGraph 5 (2)
Relative perf in group SinKernelGraph 100 (2)
Relative perf in group EmptyKernel 1000 256 (1)
Relative perf in group KernelSwitch 8 200 (1)
Details
Benchmark details - environment, command...
api_overhead_benchmark_l0 SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
ulls_benchmark_l0 EmptyKernel wgc:1000, wgs:256
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=EmptyKernel --csv --noHeaders --iterations=10000 --wgs=256 --wgc=256
ulls_benchmark_l0 KernelSwitch count 8 kernelTime 200
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=KernelSwitch --csv --noHeaders --iterations=1000 --count=8 --kernelTime=200 --barrier=0 --hostVisible=0 --ioq=1 --ctrBasedEvents=1
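The four SubmitKernel entries in the listing above differ only in `--Ioq` and `--MeasureCompletion`. A minimal Python sketch of how such a command matrix could be generated follows. The helper name `submit_kernel_commands` and the `BIN_DIR` constant are hypothetical; only the binary path and flags are taken from this run, and the reading of `--Ioq=1` as "in order" is inferred from the benchmark labels.

```python
from itertools import product

# Path copied from the benchmark listing; adjust for a different work directory.
BIN_DIR = "/home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin"


def submit_kernel_commands(binary="api_overhead_benchmark_l0"):
    """Return {label: argv} for the four SubmitKernel variants (hypothetical helper)."""
    commands = {}
    # Ioq: 0 = out of order, 1 = in order (inferred from the labels).
    # MeasureCompletion: 0 = off, 1 = on.
    for ioq, completion in product((0, 1), (0, 1)):
        label = "SubmitKernel " + ("in order" if ioq else "out of order")
        if completion:
            label += " with measure completion"
        commands[label] = [
            f"{BIN_DIR}/{binary}",
            "--test=SubmitKernel",
            "--csv",
            "--noHeaders",
            f"--Ioq={ioq}",
            "--DiscardEvents=0",
            f"--MeasureCompletion={completion}",
            "--iterations=100000",
            "--Profiling=0",
            "--NumKernels=10",
            "--KernelExecTime=1",
        ]
    return commands


if __name__ == "__main__":
    for label, argv in submit_kernel_commands().items():
        print(label)
        print("  " + " ".join(argv))
```

Each value is an argv list, so it could be handed to `subprocess.run` directly without shell quoting.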
d38ee71 to 6c126a5
Compute Benchmarks level_zero_v2 run (with params: ):
Benchmarks level_zero_v2 run (): Failures
Summary (Emphasized values are the best results)
Performance change in benchmark groups
Compute Benchmarks
Relative perf in group SubmitKernel (2)
Relative perf in group SubmitKernel With Completion (2)
Relative perf in group SinKernelGraph 5 (2)
Relative perf in group SinKernelGraph 100 (2)
Relative perf in group EmptyKernel 1000 256 (1)
Relative perf in group KernelSwitch 8 200 (1)
f141db2 to a42a419
Compute Benchmarks level_zero_v2 run (with params: ):
Benchmarks level_zero_v2 run (): Failures
Summary (Emphasized values are the best results)
Performance change in benchmark groups
Compute Benchmarks
Relative perf in group SubmitKernel (2)
Relative perf in group SubmitKernel With Completion (2)
Relative perf in group SinKernelGraph 5 (2)
Relative perf in group SinKernelGraph 100 (2)
Relative perf in group EmptyKernel 1000 256 (1)
Relative perf in group KernelSwitch 8 200 (1)
Compute Benchmarks level_zero run (with params: ):
Benchmarks level_zero run (): Failures
Summary (Emphasized values are the best results)
Performance change in benchmark groups
Compute Benchmarks
Relative perf in group SubmitKernel Out Of Order (3)
Relative perf in group SubmitKernel Out Of Order With Completion (3)
Relative perf in group SubmitKernel In Order (3)
Relative perf in group SubmitKernel In Order With Completion (3)
Relative perf in group SubmitKernel Out Of Order CPU count (1)
Relative perf in group SubmitKernel Out Of Order With Completion CPU count (1)
Relative perf in group SubmitKernel In Order CPU count (1)
Relative perf in group SubmitKernel In Order With Completion CPU count (1)
Relative perf in group SinKernelGraph 5 (5)
Relative perf in group SinKernelGraph 100 (5)
Relative perf in group EmptyKernel 1000 256 (2)
Relative perf in group KernelSwitch 8 200 (2)
Relative perf in group SubmitGraph 4 (4)
Relative perf in group SubmitGraph 10 (4)
Relative perf in group SubmitGraph 32 (4)
Relative perf in group Other (10)
Relative perf in group UsmMemoryAllocation Device 4096 Both (1)
Relative perf in group UsmMemoryAllocation Device 4194304 Both (1)
Relative perf in group UsmBatchMemoryAllocation Device 256 4096 Both (1)
Relative perf in group UsmBatchMemoryAllocation Device 32 4194304 Both (1)
Relative perf in group UsmRandomMemoryAllocation Device 256 4096 33554432 LogUniform (1)
Velocity Bench
Relative perf in group Other (8)
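Each summary line follows the pattern `Relative perf in group <name> (<count>)`. As a rough sketch, such lines could be parsed into (group, count) pairs as below. The function name `parse_group_lines` is hypothetical, and the report does not say what the parenthesised number means; the code only extracts it as an integer.

```python
import re

# Matches "Relative perf in group <name> (<count>)" anywhere in a line, so a
# suite title fused in front of it (as in the scraped page) does not matter.
GROUP_LINE = re.compile(
    r"Relative perf in group (?P<name>.+?) \((?P<count>\d+)\)\s*$"
)


def parse_group_lines(text):
    """Return a list of (group_name, count) tuples found in `text`."""
    groups = []
    for line in text.splitlines():
        match = GROUP_LINE.search(line)
        if match:
            groups.append((match.group("name"), int(match.group("count"))))
    return groups


if __name__ == "__main__":
    sample = (
        "Relative perf in group SinKernelGraph 100 (5)\n"
        "Velocity BenchRelative perf in group Other (8)\n"
    )
    for name, count in parse_group_lines(sample):
        print(f"{name}: {count}")
```

The count is anchored to the end of the line, so group names that themselves end in numbers (for example `SinKernelGraph 100`) are kept intact.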
Details
Benchmark details - environment, command...
api_overhead_benchmark_sycl SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order with measure completion CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order with measure completion CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
graph_api_benchmark_sycl SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_sycl SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:1, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:1, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
ulls_benchmark_sycl EmptyKernel wgc:1000, wgs:256
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_sycl --test=EmptyKernel --csv --noHeaders --iterations=10000 --wgs=256 --wgc=256
ulls_benchmark_sycl KernelSwitch count 8 kernelTime 200
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_sycl --test=KernelSwitch --csv --noHeaders --iterations=1000 --count=8 --kernelTime=200 --barrier=0 --hostVisible=0 --ioq=1 --ctrBasedEvents=1
ulls_benchmark_l0 EmptyKernel wgc:1000, wgs:256
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=EmptyKernel --csv --noHeaders --iterations=10000 --wgs=256 --wgc=256
ulls_benchmark_l0 KernelSwitch count 8 kernelTime 200
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=KernelSwitch --csv --noHeaders --iterations=1000 --count=8 --kernelTime=200 --barrier=0 --hostVisible=0 --ioq=1 --ctrBasedEvents=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros --multiplier=1 --vectorSize=1
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024
miscellaneous_benchmark_sycl VectorSum
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:1 dstUSM:1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=1 --DstUSM=1
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:0 dstUSM:1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=0 --DstUSM=1
multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:4, allocSize:1024 srcUSM:0 dstUSM:1 without events
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=0 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=4 --NumOpsPerThread=4096 --iterations=10 --SrcUSM=0 --DstUSM=1
api_overhead_benchmark_ur UsmMemoryAllocation usmMemoryPlacement:Device size:4096 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmMemoryAllocation --csv --noHeaders --type=Device --size=4096 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmMemoryAllocation usmMemoryPlacement:Device size:4194304 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmMemoryAllocation --csv --noHeaders --type=Device --size=4194304 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmBatchMemoryAllocation usmMemoryPlacement:Device allocationCount:256 size:4096 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmBatchMemoryAllocation --csv --noHeaders --type=Device --allocationCount=256 --size=4096 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmBatchMemoryAllocation usmMemoryPlacement:Device allocationCount:32 size:4194304 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmBatchMemoryAllocation --csv --noHeaders --type=Device --allocationCount=32 --size=4194304 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmRandomMemoryAllocation usmMemoryPlacement:Device operationCount:256 minSize:4096 maxSize:33554432 sizeDistribution:LogUniform
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmRandomMemoryAllocation --csv --noHeaders --type=Device --operationCount=256 --minSize=4096 --maxSize=33554432 --sizeDistribution=LogUniform --iterations=1000
Velocity-Bench Hashtable
Command: /home/test-user/llvm_bench_workdir/hashtable/hashtable_sycl --no-verify
Velocity-Bench Bitcracker
Command: /home/test-user/llvm_bench_workdir/bitcracker/bitcracker -f /home/test-user/llvm_bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/llvm_bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000
Velocity-Bench CudaSift
Command: /home/test-user/llvm_bench_workdir/cudaSift/cudaSift
Velocity-Bench QuickSilver
Command: /home/test-user/llvm_bench_workdir/QuickSilver/qs -i /home/test-user/llvm_bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
Environment Variables: QS_DEVICE=GPU
Velocity-Bench Sobel Filter
Command: /home/test-user/llvm_bench_workdir/sobel_filter/sobel_filter -i /home/test-user/llvm_bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5
Environment Variables: OPENCV_IO_MAX_IMAGE_PIXELS=1677721600
Velocity-Bench dl-cifar
Command: /home/test-user/llvm_bench_workdir/dl-cifar/dl-cifar_sycl
Velocity-Bench dl-mnist
Command: /home/test-user/llvm_bench_workdir/dl-mnist/dl-mnist-sycl -conv_algo ONEDNN_AUTO
Environment Variables: NEOReadDebugKeys=1
Velocity-Bench svm
Command: /home/test-user/llvm_bench_workdir/svm/svm_sycl /home/test-user/llvm_bench_workdir/velocity-bench-repo/svm/SYCL/a9a /home/test-user/llvm_bench_workdir/velocity-bench-repo/svm/SYCL/a.m
Compute Benchmarks level_zero run (with params: ):
Benchmarks level_zero run ():
Failures
Summary
(Emphasized values are the best results)
Performance change in benchmark groups
Compute Benchmarks
Relative perf in group SubmitKernel Out Of Order (3)
Relative perf in group SubmitKernel Out Of Order With Completion (3)
Relative perf in group SubmitKernel In Order (3)
Relative perf in group SubmitKernel In Order With Completion (3)
Relative perf in group SubmitKernel Out Of Order CPU count (1)
Relative perf in group SubmitKernel Out Of Order With Completion CPU count (1)
Relative perf in group SubmitKernel In Order CPU count (1)
Relative perf in group SubmitKernel In Order With Completion CPU count (1)
Relative perf in group SinKernelGraph 5 (5)
Relative perf in group SinKernelGraph 100 (5)
Relative perf in group EmptyKernel 1000 256 (2)
Relative perf in group KernelSwitch 8 200 (2)
Relative perf in group SubmitGraph 4 (4)
Relative perf in group SubmitGraph 10 (4)
Relative perf in group SubmitGraph 32 (4)
Relative perf in group Other (10)
Relative perf in group UsmMemoryAllocation Device 4096 Both (1)
Relative perf in group UsmMemoryAllocation Device 4194304 Both (1)
Relative perf in group UsmBatchMemoryAllocation Device 256 4096 Both (1)
Relative perf in group UsmBatchMemoryAllocation Device 32 4194304 Both (1)
Relative perf in group UsmRandomMemoryAllocation Device 256 4096 33554432 LogUniform (1)
Velocity Bench
Relative perf in group Other (8)
Details
Benchmark details - environment, command...
api_overhead_benchmark_sycl SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_sycl SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_l0 SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_l0 --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order with measure completion CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel out of order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=0 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=0 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order with measure completion CPU count
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
api_overhead_benchmark_ur SubmitKernel in order with measure completion
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=SubmitKernel --csv --noHeaders --Ioq=1 --DiscardEvents=0 --MeasureCompletion=1 --iterations=100000 --Profiling=0 --NumKernels=10 --KernelExecTime=1
graph_api_benchmark_sycl SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_sycl SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_l0 SinKernelGraph graphs:1, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_l0 --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:0, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:0, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=0 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:1, numKernels:5
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=5 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
graph_api_benchmark_ur SinKernelGraph graphs:1, numKernels:100
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_ur --test=SinKernelGraph --csv --noHeaders --iterations=10000 --numKernels=100 --withGraphs=1 --withCopyOffload=1 --immediateAppendCmdList=0
ulls_benchmark_sycl EmptyKernel wgc:1000, wgs:256
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_sycl --test=EmptyKernel --csv --noHeaders --iterations=10000 --wgs=256 --wgc=256
ulls_benchmark_sycl KernelSwitch count 8 kernelTime 200
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_sycl --test=KernelSwitch --csv --noHeaders --iterations=1000 --count=8 --kernelTime=200 --barrier=0 --hostVisible=0 --ioq=1 --ctrBasedEvents=1
ulls_benchmark_l0 EmptyKernel wgc:1000, wgs:256
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=EmptyKernel --csv --noHeaders --iterations=10000 --wgs=256 --wgc=256
ulls_benchmark_l0 KernelSwitch count 8 kernelTime 200
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/ulls_benchmark_l0 --test=KernelSwitch --csv --noHeaders --iterations=1000 --count=8 --kernelTime=200 --barrier=0 --hostVisible=0 --ioq=1 --ctrBasedEvents=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 0 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=0 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 0 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=1 --InOrderQueue=0 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:4 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=4 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:10 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=10 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 1 measureCompletion 0
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=0 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
graph_api_benchmark_sycl SubmitGraph numKernels:32 ioq 1 measureCompletion 1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/graph_api_benchmark_sycl --test=SubmitGraph --csv --noHeaders --iterations=10000 --NumKernels=32 --MeasureCompletionTime=1 --InOrderQueue=1 --Profiling=0 --KernelExecutionTime=1
memory_benchmark_sycl QueueInOrderMemcpy from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Device --destinationPlacement=Device --size=1024 --count=100
memory_benchmark_sycl QueueInOrderMemcpy from Host to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueInOrderMemcpy --csv --noHeaders --iterations=10000 --IsCopyOnly=0 --sourcePlacement=Host --destinationPlacement=Device --size=1024 --count=100
memory_benchmark_sycl QueueMemcpy from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=QueueMemcpy --csv --noHeaders --iterations=10000 --sourcePlacement=Device --destinationPlacement=Device --size=1024
memory_benchmark_sycl StreamMemory, placement Device, type Triad, size 10240
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/memory_benchmark_sycl --test=StreamMemory --csv --noHeaders --iterations=10000 --type=Triad --size=10240 --memoryPlacement=Device --useEvents=0 --contents=Zeros --multiplier=1 --vectorSize=1
api_overhead_benchmark_sycl ExecImmediateCopyQueue out of order from Device to Device, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=0 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Device --dst=Device --size=1024
api_overhead_benchmark_sycl ExecImmediateCopyQueue in order from Device to Host, size 1024
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_sycl --test=ExecImmediateCopyQueue --csv --noHeaders --iterations=100000 --ioq=1 --IsCopyOnly=1 --MeasureCompletionTime=0 --src=Host --dst=Host --size=1024
miscellaneous_benchmark_sycl VectorSum
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/miscellaneous_benchmark_sycl --test=VectorSum --csv --noHeaders --iterations=1000 --numberOfElementsX=512 --numberOfElementsY=256 --numberOfElementsZ=256
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:1 dstUSM:1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=1 --DstUSM=1
multithread_benchmark_ur MemcpyExecute opsPerThread:400, numThreads:1, allocSize:102400 srcUSM:0 dstUSM:1
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=1 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=102400 --NumThreads=1 --NumOpsPerThread=400 --iterations=10 --SrcUSM=0 --DstUSM=1
multithread_benchmark_ur MemcpyExecute opsPerThread:4096, numThreads:4, allocSize:1024 srcUSM:0 dstUSM:1 without events
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/multithread_benchmark_ur --test=MemcpyExecute --csv --noHeaders --Ioq=1 --UseEvents=0 --MeasureCompletion=1 --UseQueuePerThread=1 --AllocSize=1024 --NumThreads=4 --NumOpsPerThread=4096 --iterations=10 --SrcUSM=0 --DstUSM=1
api_overhead_benchmark_ur UsmMemoryAllocation usmMemoryPlacement:Device size:4096 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmMemoryAllocation --csv --noHeaders --type=Device --size=4096 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmMemoryAllocation usmMemoryPlacement:Device size:4194304 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmMemoryAllocation --csv --noHeaders --type=Device --size=4194304 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmBatchMemoryAllocation usmMemoryPlacement:Device allocationCount:256 size:4096 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmBatchMemoryAllocation --csv --noHeaders --type=Device --allocationCount=256 --size=4096 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmBatchMemoryAllocation usmMemoryPlacement:Device allocationCount:32 size:4194304 measureMode:Both
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmBatchMemoryAllocation --csv --noHeaders --type=Device --allocationCount=32 --size=4194304 --measureMode=Both --iterations=1000
api_overhead_benchmark_ur UsmRandomMemoryAllocation usmMemoryPlacement:Device operationCount:256 minSize:4096 maxSize:33554432 sizeDistribution:LogUniform
Command: /home/test-user/llvm_bench_workdir/compute-benchmarks-build/bin/api_overhead_benchmark_ur --test=UsmRandomMemoryAllocation --csv --noHeaders --type=Device --operationCount=256 --minSize=4096 --maxSize=33554432 --sizeDistribution=LogUniform --iterations=1000
Velocity-Bench Hashtable
Command: /home/test-user/llvm_bench_workdir/hashtable/hashtable_sycl --no-verify
Velocity-Bench Bitcracker
Command: /home/test-user/llvm_bench_workdir/bitcracker/bitcracker -f /home/test-user/llvm_bench_workdir/velocity-bench-repo/bitcracker/hash_pass/img_win8_user_hash.txt -d /home/test-user/llvm_bench_workdir/velocity-bench-repo/bitcracker/hash_pass/user_passwords_60000.txt -b 60000
Velocity-Bench CudaSift
Command: /home/test-user/llvm_bench_workdir/cudaSift/cudaSift
Velocity-Bench QuickSilver
Command: /home/test-user/llvm_bench_workdir/QuickSilver/qs -i /home/test-user/llvm_bench_workdir/velocity-bench-repo/QuickSilver/Examples/AllScattering/scatteringOnly.inp
Environment Variables: QS_DEVICE=GPU
Velocity-Bench Sobel Filter
Command: /home/test-user/llvm_bench_workdir/sobel_filter/sobel_filter -i /home/test-user/llvm_bench_workdir/data/sobel_filter/sobel_filter_data/silverfalls_32Kx32K.png -n 5
Environment Variables: OPENCV_IO_MAX_IMAGE_PIXELS=1677721600
Velocity-Bench dl-cifar
Command: /home/test-user/llvm_bench_workdir/dl-cifar/dl-cifar_sycl
Velocity-Bench dl-mnist
Command: /home/test-user/llvm_bench_workdir/dl-mnist/dl-mnist-sycl -conv_algo ONEDNN_AUTO
Environment Variables: NEOReadDebugKeys=1
Velocity-Bench svm
Command: /home/test-user/llvm_bench_workdir/svm/svm_sycl /home/test-user/llvm_bench_workdir/velocity-bench-repo/svm/SYCL/a9a /home/test-user/llvm_bench_workdir/velocity-bench-repo/svm/SYCL/a.m
test old UMF perf