@mselim00 mselim00 commented Oct 31, 2025

Issue number:
Part of #254
Closes #261

Description of changes:

This adds a package for the NVIDIA 580 driver on the 6.1 kernel. It is essentially identical to #255, except for the kernel and related module changes, which more closely follow the 570 package for 6.1.

Testing done:
Built a BR 1.32 NVIDIA AMI on top of bottlerocket@79b1eceaaf819c19ad906d460824a4325f935205 (which uses the 6.1 kernel) with a 580 package built from this PR. Ran the nccl-tests all-to-all, all-gather, and all-reduce operations without EFA using aws-k8s-tester@7525fba939418a345b8c572b95f78aac6d218fef on a p5.48xlarge:

Log snippet
# All Reduce
        [1,0]<stdout>:           8             2     float     sum      -1   125.87    0.00    0.00       0    96.03    0.00    0.00       0
        [1,0]<stdout>:          16             4     float     sum      -1    92.65    0.00    0.00       0    92.88    0.00    0.00       0
        [1,0]<stdout>:          32             8     float     sum      -1    95.77    0.00    0.00       0    96.16    0.00    0.00       0
        [1,0]<stdout>:          64            16     float     sum      -1    97.59    0.00    0.00       0    97.24    0.00    0.00       0
        [1,0]<stdout>:         128            32     float     sum      -1   139.73    0.00    0.00       0   127.78    0.00    0.00       0
        [1,0]<stdout>:         256            64     float     sum      -1   123.09    0.00    0.00       0   126.81    0.00    0.00       0
        [1,0]<stdout>:         512           128     float     sum      -1   135.27    0.00    0.01       0   122.40    0.00    0.01       0
        [1,0]<stdout>:        1024           256     float     sum      -1   130.04    0.01    0.01       0   135.07    0.01    0.01       0
        [1,0]<stdout>:        2048           512     float     sum      -1   144.88    0.01    0.03       0   150.61    0.01    0.03       0
        [1,0]<stdout>:        4096          1024     float     sum      -1   155.35    0.03    0.05       0   153.18    0.03    0.05       0
        [1,0]<stdout>:        8192          2048     float     sum      -1   177.96    0.05    0.09       0   195.53    0.04    0.08       0
        [1,0]<stdout>:       16384          4096     float     sum      -1   222.93    0.07    0.14       0   244.44    0.07    0.13       0
        [1,0]<stdout>:       32768          8192     float     sum      -1   267.01    0.12    0.23       0   274.58    0.12    0.22       0
        [1,0]<stdout>:       65536         16384     float     sum      -1   381.18    0.17    0.32       0   411.87    0.16    0.30       0
        [1,0]<stdout>:      131072         32768     float     sum      -1   605.18    0.22    0.41       0   612.15    0.21    0.40       0
        [1,0]<stdout>:      262144         65536     float     sum      -1   649.57    0.40    0.76       0   662.79    0.40    0.74       0
        [1,0]<stdout>:      524288        131072     float     sum      -1   808.74    0.65    1.22       0   822.57    0.64    1.20       0
        [1,0]<stdout>:     1048576        262144     float     sum      -1  1254.30    0.84    1.57       0  1265.98    0.83    1.55       0
        [1,0]<stdout>:     2097152        524288     float     sum      -1  2311.79    0.91    1.70       0  2310.56    0.91    1.70       0
        [1,0]<stdout>:     4194304       1048576     float     sum      -1  4275.04    0.98    1.84       0  4225.24    0.99    1.86       0
        [1,0]<stdout>:     8388608       2097152     float     sum      -1  6787.74    1.24    2.32       0  6797.03    1.23    2.31       0
        [1,0]<stdout>:    16777216       4194304     float     sum      -1  13107.6    1.28    2.40       0  13056.9    1.28    2.41       0
        [1,0]<stdout>:    33554432       8388608     float     sum      -1  25181.2    1.33    2.50       0  25191.2    1.33    2.50       0
        [1,0]<stdout>:    67108864      16777216     float     sum      -1  49238.9    1.36    2.56       0  75570.1    0.89    1.67       0
        [1,0]<stdout>:   134217728      33554432     float     sum      -1   111279    1.21    2.26       0  92881.6    1.45    2.71       0
        [1,0]<stdout>:   268435456      67108864     float     sum      -1   199982    1.34    2.52       0   186874    1.44    2.69       0
        [1,0]<stdout>:   536870912     134217728     float     sum      -1   297350    1.81    3.39       0   307430    1.75    3.27       0
        [1,0]<stdout>:  1073741824     268435456     float     sum      -1   592503    1.81    3.40       0   613139    1.75    3.28       0
        [1,0]<stdout>:  2147483648     536870912     float     sum      -1  1197665    1.79    3.36       0  1193590    1.80    3.37       0
        [1,0]<stdout>:  4294967296    1073741824     float     sum      -1  2408944    1.78    3.34       0  2371984    1.81    3.40       0
        [1,0]<stdout>:  8589934592    2147483648     float     sum      -1  4740388    1.81    3.40       0  4753148    1.81    3.39       0
        [1,0]<stdout>: 17179869184    4294967296     float     sum      -1  9378302    1.83    3.43       0  9250284    1.86    3.48       0
        [1,7]<stdout>:multi-node-all-reduce-perf-worker-0:35:111 [7] NCCL INFO comm 0x556515170eb0 rank 7 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,9]<stdout>:multi-node-all-reduce-perf-worker-1:24:118 [1] NCCL INFO comm 0x5596166a63d0 rank 9 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,10]<stdout>:multi-node-all-reduce-perf-worker-1:25:117 [2] NCCL INFO comm 0x55b3b2f645a0 rank 10 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,11]<stdout>:multi-node-all-reduce-perf-worker-1:26:119 [3] NCCL INFO comm 0x55f9b8d36020 rank 11 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,15]<stdout>:multi-node-all-reduce-perf-worker-1:35:115 [7] NCCL INFO comm 0x5625839cad90 rank 15 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,6]<stdout>:multi-node-all-reduce-perf-worker-0:32:113 [6] NCCL INFO comm 0x556017ae9170 rank 6 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,5]<stdout>:multi-node-all-reduce-perf-worker-0:29:116 [5] NCCL INFO comm 0x563f9c7cb160 rank 5 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,4]<stdout>:multi-node-all-reduce-perf-worker-0:26:114 [4] NCCL INFO comm 0x5642ca77f350 rank 4 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,0]<stdout>:multi-node-all-reduce-perf-worker-0:21:118 [0] NCCL INFO comm 0x55dea2f7c970 rank 0 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,0]<stdout>:# Out of bounds values : 0 OK
        [1,0]<stdout>:# Avg bus bandwidth    : 1.34392 
        [1,0]<stdout>:#
        [1,0]<stdout>:# Collective test concluded: all_reduce_perf
        [1,2]<stdout>:multi-node-all-reduce-perf-worker-0:23:112 [2] NCCL INFO comm 0x55e1b9661520 rank 2 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,13]<stdout>:multi-node-all-reduce-perf-worker-1:29:113 [5] NCCL INFO comm 0x56373a1d7340 rank 13 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,14]<stdout>:multi-node-all-reduce-perf-worker-1:32:114 [6] NCCL INFO comm 0x55cd950695b0 rank 14 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,12]<stdout>:multi-node-all-reduce-perf-worker-1:27:112 [4] NCCL INFO comm 0x55cab0f49210 rank 12 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,8]<stdout>:multi-node-all-reduce-perf-worker-1:23:116 [0] NCCL INFO comm 0x55e603a44f60 rank 8 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,1]<stdout>:multi-node-all-reduce-perf-worker-0:22:115 [1] NCCL INFO comm 0x55d6935cb620 rank 1 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,3]<stdout>:multi-node-all-reduce-perf-worker-0:24:117 [3] NCCL INFO comm 0x55b23155df80 rank 3 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,0]<stdout>:

# All Gather
        [1,0]<stdout>:           0             0     float    none      -1     0.73    0.00    0.00       0     0.48    0.00    0.00       0
        [1,0]<stdout>:           0             0     float    none      -1     0.47    0.00    0.00       0     0.50    0.00    0.00       0
        [1,0]<stdout>:           0             0     float    none      -1     0.49    0.00    0.00       0     0.75    0.00    0.00       0
        [1,0]<stdout>:           0             0     float    none      -1     1.09    0.00    0.00       0     0.52    0.00    0.00       0
        [1,0]<stdout>:           0             0     float    none      -1     0.68    0.00    0.00       0     0.51    0.00    0.00       0
        [1,0]<stdout>:         256             4     float    none      -1   181.98    0.00    0.00       0   172.45    0.00    0.00       0
        [1,0]<stdout>:         512             8     float    none      -1   192.88    0.00    0.00       0   182.23    0.00    0.00       0
        [1,0]<stdout>:        1024            16     float    none      -1   153.56    0.01    0.01       0   152.58    0.01    0.01       0
        [1,0]<stdout>:        2048            32     float    none      -1   289.39    0.01    0.01       0   287.07    0.01    0.01       0
        [1,0]<stdout>:        4096            64     float    none      -1   299.63    0.01    0.01       0   304.94    0.01    0.01       0
        [1,0]<stdout>:        8192           128     float    none      -1   312.36    0.03    0.02       0   310.41    0.03    0.02       0
        [1,0]<stdout>:       16384           256     float    none      -1   300.69    0.05    0.05       0   305.11    0.05    0.05       0
        [1,0]<stdout>:       32768           512     float    none      -1   321.34    0.10    0.10       0   365.93    0.09    0.08       0
        [1,0]<stdout>:       65536          1024     float    none      -1   327.13    0.20    0.19       0   356.20    0.18    0.17       0
        [1,0]<stdout>:      131072          2048     float    none      -1  2902.92    0.05    0.04       0  2896.32    0.05    0.04       0
        [1,0]<stdout>:      262144          4096     float    none      -1  5700.32    0.05    0.04       0  5832.77    0.04    0.04       0
        [1,0]<stdout>:      524288          8192     float    none      -1  5899.30    0.09    0.08       0  5750.35    0.09    0.09       0
        [1,0]<stdout>:     1048576         16384     float    none      -1  5844.50    0.18    0.17       0  5848.13    0.18    0.17       0
        [1,0]<stdout>:     2097152         32768     float    none      -1  5985.18    0.35    0.33       0  5981.84    0.35    0.33       0
        [1,0]<stdout>:     4194304         65536     float    none      -1  32369.0    0.13    0.12       0  6985.87    0.60    0.56       0
        [1,0]<stdout>:     8388608        131072     float    none      -1  7942.76    1.06    0.99       0  8094.81    1.04    0.97       0
        [1,0]<stdout>:    16777216        262144     float    none      -1  10950.3    1.53    1.44       0  10924.2    1.54    1.44       0
        [1,0]<stdout>:    33554432        524288     float    none      -1  17809.7    1.88    1.77       0  17682.7    1.90    1.78       0
        [1,0]<stdout>:    67108864       1048576     float    none      -1  27133.4    2.47    2.32       0  27516.1    2.44    2.29       0
        [1,0]<stdout>:   134217728       2097152     float    none      -1  55659.7    2.41    2.26       0  50597.5    2.65    2.49       0
        [1,0]<stdout>:   268435456       4194304     float    none      -1   145696    1.84    1.73       0   155391    1.73    1.62       0
        [1,0]<stdout>:   536870912       8388608     float    none      -1   265424    2.02    1.90       0   237679    2.26    2.12       0
        [1,0]<stdout>:  1073741824      16777216     float    none      -1   403937    2.66    2.49       0   428740    2.50    2.35       0
        [1,0]<stdout>:  2147483648      33554432     float    none      -1   866496    2.48    2.32       0   768240    2.80    2.62       0
        [1,0]<stdout>:  4294967296      67108864     float    none      -1  1525958    2.81    2.64       0  1472038    2.92    2.74       0
        [1,0]<stdout>:  8589934592     134217728     float    none      -1  2890437    2.97    2.79       0  2898870    2.96    2.78       0
        [1,0]<stdout>: 17179869184     268435456     float    none      -1  5812380    2.96    2.77       0  5842777    2.94    2.76       0
        [1,1]<stdout>:multi-node-all-gather-perf-worker-0:22:110 [1] NCCL INFO comm 0x55b1d7f136a0 rank 1 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,3]<stdout>:multi-node-all-gather-perf-worker-0:24:106 [3] NCCL INFO comm 0x558a562c3f50 rank 3 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,2]<stdout>:multi-node-all-gather-perf-worker-0:23:104 [2] NCCL INFO comm 0x55696b7443e0 rank 2 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,10]<stdout>:multi-node-all-gather-perf-worker-1:25:106 [2] NCCL INFO comm 0x5648a0265670 rank 10 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,7]<stdout>:multi-node-all-gather-perf-worker-0:35:107 [7] NCCL INFO comm 0x55d55df4ec90 rank 7 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,6]<stdout>:multi-node-all-gather-perf-worker-0:32:105 [6] NCCL INFO comm 0x563db8679260 rank 6 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,4]<stdout>:multi-node-all-gather-perf-worker-0:26:109 [4] NCCL INFO comm 0x563fadd26340 rank 4 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,5]<stdout>:multi-node-all-gather-perf-worker-0:29:108 [5] NCCL INFO comm 0x55ba492c9270 rank 5 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,11]<stdout>:multi-node-all-gather-perf-worker-1:26:105 [3] NCCL INFO comm 0x5632597b6fd0 rank 11 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,9]<stdout>:multi-node-all-gather-perf-worker-1:24:111 [1] NCCL INFO comm 0x55f312bba370 rank 9 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,12]<stdout>:multi-node-all-gather-perf-worker-1:27:110 [4] NCCL INFO comm 0x556d34d771d0 rank 12 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,13]<stdout>:multi-node-all-gather-perf-worker-1:30:109 [5] NCCL INFO comm 0x55cbffb00420 rank 13 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,15]<stdout>:multi-node-all-gather-perf-worker-1:37:108 [7] NCCL INFO comm 0x55c840aca040 rank 15 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,0]<stdout>:multi-node-all-gather-perf-worker-0:21:111 [0] NCCL INFO comm 0x557b2d9ee900 rank 0 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,14]<stdout>:multi-node-all-gather-perf-worker-1:34:112 [6] NCCL INFO comm 0x55f96ee46500 rank 14 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,0]<stdout>:# Out of bounds values : 0 OK
        [1,0]<stdout>:# Avg bus bandwidth    : 0.84551 
        [1,0]<stdout>:#
        [1,0]<stdout>:# Collective test concluded: all_gather_perf
        [1,8]<stdout>:multi-node-all-gather-perf-worker-1:23:107 [0] NCCL INFO comm 0x561eb47e20f0 rank 8 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,0]<stdout>:

# All to All
        [1,0]<stdout>:           0             0     float    none      -1   213.48    0.00    0.00       0   181.64    0.00    0.00    N/A
        [1,0]<stdout>:           0             0     float    none      -1   203.28    0.00    0.00       0   185.53    0.00    0.00    N/A
        [1,0]<stdout>:           0             0     float    none      -1   197.08    0.00    0.00       0   182.40    0.00    0.00    N/A
        [1,0]<stdout>:           0             0     float    none      -1   181.25    0.00    0.00       0   183.72    0.00    0.00    N/A
        [1,0]<stdout>:           0             0     float    none      -1   185.29    0.00    0.00       0   196.10    0.00    0.00    N/A
        [1,0]<stdout>:         256             4     float    none      -1   243.70    0.00    0.00       0   248.45    0.00    0.00    N/A
        [1,0]<stdout>:         512             8     float    none      -1   217.80    0.00    0.00       0   201.61    0.00    0.00    N/A
        [1,0]<stdout>:        1024            16     float    none      -1   195.19    0.01    0.00       0   232.67    0.00    0.00    N/A
        [1,0]<stdout>:        2048            32     float    none      -1   202.82    0.01    0.01       0   236.30    0.01    0.01    N/A
        [1,0]<stdout>:        4096            64     float    none      -1   285.00    0.01    0.01       0   300.77    0.01    0.01    N/A
        [1,0]<stdout>:        8192           128     float    none      -1   286.71    0.03    0.03       0   265.11    0.03    0.03    N/A
        [1,0]<stdout>:       16384           256     float    none      -1   301.32    0.05    0.05       0   285.28    0.06    0.05    N/A
        [1,0]<stdout>:       32768           512     float    none      -1   295.90    0.11    0.10       0   290.89    0.11    0.11    N/A
        [1,0]<stdout>:       65536          1024     float    none      -1   320.39    0.20    0.19       0   303.64    0.22    0.20    N/A
        [1,0]<stdout>:      131072          2048     float    none      -1   346.70    0.38    0.35       0   351.30    0.37    0.35    N/A
        [1,0]<stdout>:      262144          4096     float    none      -1   389.17    0.67    0.63       0   399.63    0.66    0.61    N/A
        [1,0]<stdout>:      524288          8192     float    none      -1   536.50    0.98    0.92       0   518.04    1.01    0.95    N/A
        [1,0]<stdout>:     1048576         16384     float    none      -1   961.65    1.09    1.02       0   928.24    1.13    1.06    N/A
        [1,0]<stdout>:     2097152         32768     float    none      -1  3728.98    0.56    0.53       0  9941.44    0.21    0.20    N/A
        [1,0]<stdout>:     4194304         65536     float    none      -1  10488.8    0.40    0.37       0  27212.9    0.15    0.14    N/A
        [1,0]<stdout>:     8388608        131072     float    none      -1  38932.7    0.22    0.20       0  37257.3    0.23    0.21    N/A
        [1,0]<stdout>:    16777216        262144     float    none      -1   102473    0.16    0.15       0  47723.4    0.35    0.33    N/A
        [1,0]<stdout>:    33554432        524288     float    none      -1   122636    0.27    0.26       0   105002    0.32    0.30    N/A
        [1,0]<stdout>:    67108864       1048576     float    none      -1   156873    0.43    0.40       0   160631    0.42    0.39    N/A
        [1,0]<stdout>:   134217728       2097152     float    none      -1   183513    0.73    0.69       0   220170    0.61    0.57    N/A
        [1,0]<stdout>:   268435456       4194304     float    none      -1   371463    0.72    0.68       0   360310    0.75    0.70    N/A
        [1,0]<stdout>:   536870912       8388608     float    none      -1   513066    1.05    0.98       0   488900    1.10    1.03    N/A
        [1,0]<stdout>:  1073741824      16777216     float    none      -1  1025070    1.05    0.98       0   924410    1.16    1.09    N/A
        [1,0]<stdout>:  2147483648      33554432     float    none      -1  1632556    1.32    1.23       0  2333292    0.92    0.86    N/A
        [1,0]<stdout>:  4294967296      67108864     float    none      -1  3598784    1.19    1.12       0  3765446    1.14    1.07    N/A
        [1,0]<stdout>:  8589934592     134217728     float    none      -1  9065833    0.95    0.89       0  6584106    1.30    1.22    N/A
        [1,0]<stdout>: 17179869184     268435456     float    none      -1  1.3e+07    1.33    1.24       0  1.3e+07    1.32    1.24    N/A
        [1,6]<stdout>:multi-node-alltoall-perf-worker-0:32:111 [6] NCCL INFO comm 0x55a22c691180 rank 6 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,1]<stdout>:multi-node-alltoall-perf-worker-0:22:112 [1] NCCL INFO comm 0x563ca99229f0 rank 1 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,4]<stdout>:multi-node-alltoall-perf-worker-0:26:110 [4] NCCL INFO comm 0x55f4d2e3a3d0 rank 4 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,14]<stdout>:multi-node-alltoall-perf-worker-1:32:115 [6] NCCL INFO comm 0x55bdbf3d4590 rank 14 nranks 16 cudaDev 6 busId b9000 - Destroy COMPLETE
        [1,8]<stdout>:multi-node-alltoall-perf-worker-1:21:110 [0] NCCL INFO comm 0x563250d8e310 rank 8 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,15]<stdout>:multi-node-alltoall-perf-worker-1:35:114 [7] NCCL INFO comm 0x5595bd9e9c30 rank 15 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,13]<stdout>:multi-node-alltoall-perf-worker-1:29:113 [5] NCCL INFO comm 0x563f14759470 rank 13 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,7]<stdout>:multi-node-alltoall-perf-worker-0:35:114 [7] NCCL INFO comm 0x558289633e20 rank 7 nranks 16 cudaDev 7 busId ca000 - Destroy COMPLETE
        [1,5]<stdout>:multi-node-alltoall-perf-worker-0:29:115 [5] NCCL INFO comm 0x55fa73ff82e0 rank 5 nranks 16 cudaDev 5 busId a8000 - Destroy COMPLETE
        [1,9]<stdout>:multi-node-alltoall-perf-worker-1:22:116 [1] NCCL INFO comm 0x556f04337150 rank 9 nranks 16 cudaDev 1 busId 64000 - Destroy COMPLETE
        [1,10]<stdout>:multi-node-alltoall-perf-worker-1:23:111 [2] NCCL INFO comm 0x55bf0528b410 rank 10 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,11]<stdout>:multi-node-alltoall-perf-worker-1:24:112 [3] NCCL INFO comm 0x55c5a5f09e40 rank 11 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,3]<stdout>:multi-node-alltoall-perf-worker-0:24:117 [3] NCCL INFO comm 0x56217d170fd0 rank 3 nranks 16 cudaDev 3 busId 86000 - Destroy COMPLETE
        [1,2]<stdout>:multi-node-alltoall-perf-worker-0:23:116 [2] NCCL INFO comm 0x55deb7c5f350 rank 2 nranks 16 cudaDev 2 busId 75000 - Destroy COMPLETE
        [1,0]<stdout>:multi-node-alltoall-perf-worker-0:21:113 [0] NCCL INFO comm 0x5616467e99d0 rank 0 nranks 16 cudaDev 0 busId 53000 - Destroy COMPLETE
        [1,12]<stdout>:multi-node-alltoall-perf-worker-1:26:109 [4] NCCL INFO comm 0x5648fe0d6e50 rank 12 nranks 16 cudaDev 4 busId 97000 - Destroy COMPLETE
        [1,0]<stdout>:# Out of bounds values : 0 OK
        [1,0]<stdout>:# Avg bus bandwidth    : 0.403206 
        [1,0]<stdout>:#
        [1,0]<stdout>:# Collective test concluded: alltoall_perf
        [1,0]<stdout>:
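
For anyone reading the tables above: the algbw and busbw columns are related by a fixed per-collective factor that nccl-tests documents, namely 2(n-1)/n for all-reduce and (n-1)/n for all-gather and all-to-all, where n is the number of ranks (16 in these runs). The following is a minimal Python sketch for pulling result rows out of the mpirun-prefixed output and sanity-checking that relationship. The helper names (`parse_row`, `expected_busbw`) are hypothetical and are not part of nccl-tests or this PR, and the column order is an assumption based on the usual nccl-tests layout, since the header row is not included in the snippet.

```python
# Assumed column order (header not shown in the log snippet):
# size count type redop root | time algbw busbw #wrong | time algbw busbw #wrong
#                              (out-of-place)            (in-place)

# Bus-bandwidth scaling factors as documented by nccl-tests.
BUSBW_FACTOR = {
    "all_reduce": lambda n: 2 * (n - 1) / n,
    "all_gather": lambda n: (n - 1) / n,
    "alltoall": lambda n: (n - 1) / n,
}


def parse_row(line):
    """Return the numeric columns of a result row, or None for any other line."""
    _, sep, payload = line.partition("<stdout>:")
    fields = payload.split()
    # Result rows have exactly 13 columns and start with the size in bytes;
    # NCCL INFO lines, '#' summary lines, and blank lines do not.
    if not sep or len(fields) != 13 or not fields[0].isdigit():
        return None
    return {
        "size": int(fields[0]),
        "out_time_us": float(fields[5]),
        "out_algbw": float(fields[6]),
        "out_busbw": float(fields[7]),
        "in_time_us": float(fields[9]),
        "in_algbw": float(fields[10]),
        "in_busbw": float(fields[11]),
    }


def expected_busbw(collective, algbw, nranks):
    """Scale algorithm bandwidth to bus bandwidth for the given collective."""
    return algbw * BUSBW_FACTOR[collective](nranks)


if __name__ == "__main__":
    # Last all-reduce row from the log above.
    sample = (
        "        [1,0]<stdout>: 17179869184    4294967296     float     sum"
        "      -1  9378302    1.83    3.43       0  9250284    1.86    3.48       0"
    )
    row = parse_row(sample)
    print(row["out_busbw"], expected_busbw("all_reduce", row["out_algbw"], 16))
```

Because the log prints algbw and busbw rounded to two decimals, the recomputed value should only be expected to agree with the logged busbw to within about 0.02.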

Terms of contribution:

By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.

@yeazelm yeazelm self-requested a review November 2, 2025 22:31
@arnaldo2792 arnaldo2792 self-requested a review November 3, 2025 21:24
@mselim00 mselim00 marked this pull request as ready for review November 6, 2025 23:38
@arnaldo2792 arnaldo2792 merged commit 6d4bca0 into bottlerocket-os:develop Nov 13, 2025
2 checks passed
@mselim00 mselim00 deleted the 580-61 branch November 13, 2025 23:50

Development

Successfully merging this pull request may close these issues.

Update kmod-6.12-nvidia-r580 COPYING file

5 participants