CUDA.jl Benchmarks
| Benchmark suite | Current: 5dd9f87 | Previous: 340127b | Ratio |
|---|---|---|---|
| array/accumulate/Float32/1d | 100754 ns | 101458 ns | 0.99 |
| array/accumulate/Float32/dims=1 | 76473 ns | 77354.5 ns | 0.99 |
| array/accumulate/Float32/dims=1L | 1585449 ns | 1586530.5 ns | 1.00 |
| array/accumulate/Float32/dims=2 | 143371 ns | 144422 ns | 0.99 |
| array/accumulate/Float32/dims=2L | 657279 ns | 658433.5 ns | 1.00 |
| array/accumulate/Int64/1d | 118263 ns | 118510 ns | 1.00 |
| array/accumulate/Int64/dims=1 | 79581 ns | 79806 ns | 1.00 |
| array/accumulate/Int64/dims=1L | 1706091 ns | 1705339 ns | 1.00 |
| array/accumulate/Int64/dims=2 | 156925 ns | 156169.5 ns | 1.00 |
| array/accumulate/Int64/dims=2L | 961168 ns | 961890 ns | 1.00 |
| array/broadcast | 20513 ns | 20634 ns | 0.99 |
| array/construct | 1274.6 ns | 1265.4 ns | 1.01 |
| array/copy | 18269 ns | 18079 ns | 1.01 |
| array/copyto!/cpu_to_gpu | 214123 ns | 216989 ns | 0.99 |
| array/copyto!/gpu_to_cpu | 280436 ns | 284915 ns | 0.98 |
| array/copyto!/gpu_to_gpu | 10977 ns | 10779 ns | 1.02 |
| array/iteration/findall/bool | 134196 ns | 134702.5 ns | 1.00 |
| array/iteration/findall/int | 148851 ns | 150167 ns | 0.99 |
| array/iteration/findfirst/bool | 80843 ns | 81791 ns | 0.99 |
| array/iteration/findfirst/int | 83434 ns | 83856 ns | 0.99 |
| array/iteration/findmin/1d | 86703.5 ns | 86603.5 ns | 1.00 |
| array/iteration/findmin/2d | 116663.5 ns | 117335 ns | 0.99 |
| array/iteration/logical | 195739 ns | 198412.5 ns | 0.99 |
| array/iteration/scalar | 67797 ns | 69274.5 ns | 0.98 |
| array/permutedims/2d | 51765 ns | 52565 ns | 0.98 |
| array/permutedims/3d | 52061 ns | 52656 ns | 0.99 |
| array/permutedims/4d | 51013 ns | 51433.5 ns | 0.99 |
| array/random/rand/Float32 | 12888 ns | 13120 ns | 0.98 |
| array/random/rand/Int64 | 24944 ns | 24957 ns | 1.00 |
| array/random/rand!/Float32 | 8401.666666666666 ns | 9858.333333333334 ns | 0.85 |
| array/random/rand!/Int64 | 21701 ns | 21835 ns | 0.99 |
| array/random/randn/Float32 | 36602 ns | 43515.5 ns | 0.84 |
| array/random/randn!/Float32 | 30844 ns | 30949.5 ns | 1.00 |
| array/reductions/mapreduce/Float32/1d | 34290.5 ns | 34651 ns | 0.99 |
| array/reductions/mapreduce/Float32/dims=1 | 49209.5 ns | 44540.5 ns | 1.10 |
| array/reductions/mapreduce/Float32/dims=1L | 51219 ns | 51138 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2 | 56416 ns | 56567 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 69240 ns | 69274 ns | 1.00 |
| array/reductions/mapreduce/Int64/1d | 42241 ns | 43548.5 ns | 0.97 |
| array/reductions/mapreduce/Int64/dims=1 | 50891.5 ns | 43771.5 ns | 1.16 |
| array/reductions/mapreduce/Int64/dims=1L | 87167 ns | 87029 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 59275 ns | 59317 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2L | 84506 ns | 84771 ns | 1.00 |
| array/reductions/reduce/Float32/1d | 34558 ns | 35127 ns | 0.98 |
| array/reductions/reduce/Float32/dims=1 | 39512 ns | 42032 ns | 0.94 |
| array/reductions/reduce/Float32/dims=1L | 51243 ns | 51445 ns | 1.00 |
| array/reductions/reduce/Float32/dims=2 | 56338 ns | 57148 ns | 0.99 |
| array/reductions/reduce/Float32/dims=2L | 69769 ns | 70183 ns | 0.99 |
| array/reductions/reduce/Int64/1d | 42569 ns | 43251 ns | 0.98 |
| array/reductions/reduce/Int64/dims=1 | 41961 ns | 51781 ns | 0.81 |
| array/reductions/reduce/Int64/dims=1L | 86985 ns | 87117 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2 | 59408 ns | 59507 ns | 1.00 |
| array/reductions/reduce/Int64/dims=2L | 84411 ns | 84592 ns | 1.00 |
| array/reverse/1d | 17512 ns | 17712 ns | 0.99 |
| array/reverse/1dL | 68271 ns | 68269 ns | 1.00 |
| array/reverse/1dL_inplace | 65675 ns | 65689 ns | 1.00 |
| array/reverse/1d_inplace | 10248 ns | 10212.666666666666 ns | 1.00 |
| array/reverse/2d | 20802 ns | 20819 ns | 1.00 |
| array/reverse/2dL | 72786 ns | 72976 ns | 1.00 |
| array/reverse/2dL_inplace | 65736 ns | 65812 ns | 1.00 |
| array/reverse/2d_inplace | 10221 ns | 10867 ns | 0.94 |
| array/sorting/1d | 2734313.5 ns | 2735648 ns | 1.00 |
| array/sorting/2d | 1067889 ns | 1069639 ns | 1.00 |
| array/sorting/by | 3303732 ns | 3305008 ns | 1.00 |
| cuda/synchronization/context/auto | 1192.3 ns | 1149 ns | 1.04 |
| cuda/synchronization/context/blocking | 968.4827586206897 ns | 961.56 ns | 1.01 |
| cuda/synchronization/context/nonblocking | 7872.1 ns | 6843.4 ns | 1.15 |
| cuda/synchronization/stream/auto | 997.3 ns | 1040.15 ns | 0.96 |
| cuda/synchronization/stream/blocking | 827.7010309278351 ns | 834.2666666666667 ns | 0.99 |
| cuda/synchronization/stream/nonblocking | 7482 ns | 7317.4 ns | 1.02 |
| integration/byval/reference | 143800.5 ns | 143691 ns | 1.00 |
| integration/byval/slices=1 | 145773 ns | 145548 ns | 1.00 |
| integration/byval/slices=2 | 284740 ns | 284456 ns | 1.00 |
| integration/byval/slices=3 | 423037 ns | 423048 ns | 1.00 |
| integration/cudadevrt | 102329 ns | 102314 ns | 1.00 |
| integration/volumerhs | 23439655.5 ns | 23431006.5 ns | 1.00 |
| kernel/indexing | 13295 ns | 13281.5 ns | 1.00 |
| kernel/indexing_checked | 14021 ns | 13966 ns | 1.00 |
| kernel/launch | 2028.9 ns | 2168.222222222222 ns | 0.94 |
| kernel/occupancy | 667.2967741935483 ns | 669.85625 ns | 1.00 |
| kernel/rand | 15019 ns | 14425 ns | 1.04 |
| latency/import | 3825173202 ns | 3813842835.5 ns | 1.00 |
| latency/precompile | 4577693352.5 ns | 4579317015 ns | 1.00 |
| latency/ttfp | 4418922320.5 ns | 4395787908 ns | 1.01 |

This comment was automatically generated by workflow using github-action-benchmark.
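
For anyone reading the numbers: the Ratio column is consistent with current time divided by previous time, rounded to two decimals, so values below 1.00 mean the current commit ran faster. Here is a minimal Python sketch that recomputes three rows from the table. The helper names and the 1.10 cutoff are hypothetical and chosen for illustration only; they are not taken from this PR or from the benchmark workflow's configuration.

```python
# Timings in nanoseconds, copied from three rows of the table: (current, previous).
ROWS = {
    "array/random/rand!/Float32": (8401.666666666666, 9858.333333333334),
    "array/reductions/mapreduce/Int64/dims=1": (50891.5, 43771.5),
    "array/reductions/reduce/Int64/dims=1": (41961.0, 51781.0),
}

# Hypothetical cutoff for illustration; not a setting taken from this PR.
ALERT_THRESHOLD = 1.10


def ratio(current_ns, previous_ns):
    """Current time over previous time, rounded to two decimals like the table."""
    return round(current_ns / previous_ns, 2)


def classify(r, threshold=ALERT_THRESHOLD):
    """Label a ratio: above the threshold is slower, below its inverse is faster."""
    if r > threshold:
        return "slower"
    if r < 1 / threshold:
        return "faster"
    return "roughly unchanged"


for name, (cur, prev) in ROWS.items():
    r = ratio(cur, prev)
    print(f"{name}: {r:.2f} ({classify(r)})")
```

Single-run differences of a few percent on microsecond-scale benchmarks are often noise, so a cutoff like this is only a first filter before rerunning the affected benchmarks.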
a2f9500 to 35efee6
Looks like it's time to tag a release of GPUArrays?
gdalle suggested changes Apr 11, 2026
Co-authored-by: Guillaume Dalle <22795598+gdalle@users.noreply.github.com>
Needs JuliaGPU/GPUArrays.jl#700 first
cc @gdalle