@@ -11,6 +11,29 @@ AWS-specific parts a compile-time option. When a feature (or entire
release) was only available in one of the two variants, we note that
in the release notes.

+ # v1.9.2-aws release notes
+ This is a bugfix release which requires [Libfabric
+ v1.18.0](https://github.com/ofiwg/libfabric/releases/tag/v1.18.0) or later and
+ supports [NCCL 2.21.5-1](https://github.com/NVIDIA/nccl/releases/tag/v2.21.5-1)
+ while maintaining backward compatibility with older NCCL versions ([NCCL
+ v2.4.8](https://github.com/NVIDIA/nccl/releases/tag/v2.4.8-1) and later).
+
+ Bug Fixes:
+ * Improved tuner model to make better decisions on P5 instances.
+ * Added support, in the RDMA protocol, for truncation when the size in the
+   isend call is greater than the size in the corresponding irecv.
+ * Fixed bug that prevented the tuner from being loaded with NCCL 2.19 and
+   2.20.
+ * Fixed logging statement regarding whether a domain is created per thread or
+   per process.
+ * Updated plugin to not advertise global MR support, to avoid a performance
+   regression in user-registered buffers.
+
+ The plugin has been tested with the following libfabric providers using tests
+ bundled in the source code and the
+ [nccl-tests](https://github.com/NVIDIA/nccl-tests) suite:
+ * efa
+
# v1.9.1-aws release notes
This is a bugfix release which requires [Libfabric
v1.18.0](https://github.com/ofiwg/libfabric/releases/tag/v1.18.0) or later and