gpu-operator/release-notes.rst
20 additions & 4 deletions
@@ -43,11 +43,15 @@ New Features
 
 * Updated software component versions:
 
+  - NVIDIA Driver Manager for Kubernetes v0.9.0
   - NVIDIA Container Toolkit v1.18.0
   - NVIDIA DCGM v4.4.1
-  - NVIDIA DCGM Exporter v4.4.1-4.5.2
-  - Node Feature Discovery v0.18.1
+  - NVIDIA DCGM Exporter v4.4.1-4.6.0
+  - Node Feature Discovery v0.18.2
   - NVIDIA GDS Driver v2.26.6
+  - NVIDIA Kubernetes Device Plugin v0.18.0
+  - NVIDIA MIG Manager for Kubernetes v0.13.0
+  - NVIDIA vGPU Device Manager v0.4.1
 
 * Added support for these NVIDIA Data Center GPU Driver versions:
 
@@ -68,7 +72,7 @@ New Features
   updated to ``true`` since the Operator Lifecycle Manager (OLM) does not mutate custom
   resources on operator upgrades.
 
-* When using virtualization, on GPUs that support MIG, you now have the option to select MIG-backed vGPU instances instead of time-sliced vGPU instances.
+* When using NVIDIA vGPU with KubeVirt / OpenShift Virtualization, on GPUs that support MIG, you now have the option to select MIG-backed vGPU instances instead of time-sliced vGPU instances.
   To select a MIG-backed vGPU profile, label the node with the name of the MIG-backed vGPU profile.
 
 * Added support for NVIDIA HGX B300 and NVIDIA HGX GB300 NVL72.
@@ -110,6 +114,17 @@ New Features
 * ``2g.70gb`` :math:`\times` 1
 * ``3g.139gb`` :math:`\times` 1
 
+Improvements
+------------
+
+* The GPU Operator now configures containerd and CRI-O using drop-in files by default.
+  When installing on MicroK8s, set the ``RUNTIME_CONFIG_SOURCE`` parameter in the ClusterPolicy to ``file=/var/snap/microk8s/current/args/containerd.toml``.
+
+* Hardened the GPU Operator container image by using a distroless base image.
+
+* The Validator for NVIDIA GPU Operator is now included as part of the GPU Operator container image.
+  It is no longer a separate image.
+
 Fixed Issues
 ------------
 
@@ -118,7 +133,8 @@ Fixed Issues
 Known Issues
 ------------
 
-* TBD
+* When using CRI-O as the container runtime, several of the GPU Operator pods may be stuck in the ``RunContainerError`` state during installation or upgrade of the GPU Operator.
+  The pods can remain in this state for several minutes but recover as soon as the container toolkit pod starts running.
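
The microk8s note added under Improvements names the parameter and its value but not where it is set. The fragment below is a minimal sketch of one way to express it, assuming ``RUNTIME_CONFIG_SOURCE`` is passed as an environment variable of the container toolkit through the ``toolkit.env`` list in the Helm values. The diff only says that the parameter lives in the ClusterPolicy, so the field path is an assumption to check against the chart version being deployed.

```yaml
# Hypothetical Helm values fragment for a microk8s install.
# Assumption: RUNTIME_CONFIG_SOURCE is read from the toolkit container's environment.
toolkit:
  env:
    - name: RUNTIME_CONFIG_SOURCE
      # Value copied verbatim from the release note.
      value: "file=/var/snap/microk8s/current/args/containerd.toml"
```

Under the same assumption, the equivalent entry would sit under ``spec.toolkit.env`` when the ClusterPolicy resource is edited directly instead of through Helm values.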