
Commit 738f822

Add 580.65.06 to 25.3.2 (#230)
* Add 580.65.06 to 25.3.2
* added known issue
* reworded known issue
* updated Containerd to support up to 2.1
* added fix for CVE-2025-23266 and CVE-2025-23267 to release notes
* added nouveau driver to Known Issues
* minor edits to comply with NVIDIA Style Guide
* reorder known issues

Signed-off-by: Andrew Chen <[email protected]>
1 parent e4085fc commit 738f822

15 files changed: +146, -125 lines

container-toolkit/arch-overview.md

Lines changed: 1 addition & 1 deletion
@@ -92,7 +92,7 @@ a `prestart` hook into it, and then calls out to the native `runC`, passing it t
 For versions of the NVIDIA Container Runtime from `v1.12.0`, this runtime also performs additional modifications to the OCI runtime spec to inject
 specific devices and mounts not handled by the NVIDIA Container CLI.
 
-It's important to note that this component is not necessarily specific to docker (but it is specific to `runC`).
+It is important to note that this component is not necessarily specific to docker (but it is specific to `runC`).
 
 ### The NVIDIA Container Toolkit CLI
 
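
As a brief illustration of the point above, a minimal sketch of wiring this runtime into Docker and exercising it, assuming the NVIDIA Container Toolkit is installed on the host (the `nvidia-ctk runtime configure` helper and the `--runtime=nvidia` flag are the usual entry points; the same runtime can be configured for containerd or CRI-O, since it only depends on `runC`):

    # Register the NVIDIA runtime with Docker (updates /etc/docker/daemon.json)
    sudo nvidia-ctk runtime configure --runtime=docker
    sudo systemctl restart docker

    # Run a container through the NVIDIA runtime; the OCI spec is modified
    # (prestart hook, devices, mounts) before runC starts the container
    sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi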

gpu-operator/dra-intro-install.rst

Lines changed: 9 additions & 9 deletions
@@ -12,7 +12,7 @@ Introduction
 
 With NVIDIA's DRA Driver for GPUs, your Kubernetes workload can allocate and consume the following two types of resources:
 
-* **GPUs**: for controlled sharing and dynamic reconfiguration of GPUs. A modern replacement for the traditional GPU allocation method (using `NVIDIA's device plugin <https://github.com/NVIDIA/k8s-device-plugin>`_). We are excited about this part of the driver; it is however not yet fully supported (Technology Preview).
+* **GPUs**: for controlled sharing and dynamic reconfiguration of GPUs. A modern replacement for the traditional GPU allocation method (using `NVIDIA's device plugin <https://github.com/NVIDIA/k8s-device-plugin>`_). NVIDIA is excited about this part of the driver; it is however not yet fully supported (Technology Preview).
 * **ComputeDomains**: for robust and secure Multi-Node NVLink (MNNVL) for NVIDIA GB200 and similar systems. Fully supported.
 
 A primer on DRA
@@ -25,7 +25,7 @@ For NVIDIA devices, there are two particularly beneficial characteristics provid
 #. A clean way to allocate **cross-node resources** in Kubernetes (leveraged here for providing NVLink connectivity across pods running on multiple nodes).
 #. Mechanisms to explicitly **share, partition, and reconfigure** devices **on-the-fly** based on user requests (leveraged here for advanced GPU allocation).
 
-To understand and make best use of NVIDIA's DRA Driver for GPUs, we recommend becoming familiar with DRA by working through the `official documentation <https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/>`_.
+To understand and make best use of NVIDIA's DRA Driver for GPUs, NVIDIA recommends becoming familiar with DRA by working through the `official documentation <https://kubernetes.io/docs/concepts/scheduling-eviction/dynamic-resource-allocation/>`_.
 
 
 The twofold nature of this driver
@@ -34,7 +34,7 @@ The twofold nature of this driver
 NVIDIA's DRA Driver for GPUs is comprised of two subsystems that are largely independent of each other: one manages GPUs, and the other one manages ComputeDomains.
 
 Below, you can find instructions for how to install both parts or just one of them.
-Additionally, we have prepared two separate documentation chapters, providing more in-depth information for each of the two subsystems:
+Additionally, NVIDIA has prepared two separate documentation chapters, providing more in-depth information for each of the two subsystems:
 
 - :ref:`Documentation for ComputeDomain (MNNVL) support <dra_docs_compute_domains>`
 - :ref:`Documentation for GPU support <dra_docs_gpus>`
@@ -52,7 +52,7 @@ Prerequisites
 - `CDI <https://github.com/cncf-tags/container-device-interface?tab=readme-ov-file#how-to-configure-cdi>`_ must be enabled in the underlying container runtime (such as containerd or CRI-O).
 - NVIDIA GPU Driver 565 or later.
 
-For the last two items on the list above, as well as for other reasons, we recommend installing NVIDIA's GPU Operator v25.3.0 or later.
+For the last two items on the list above, as well as for other reasons, NVIDIA recommends installing NVIDIA's GPU Operator v25.3.0 or later.
 For detailed instructions, see the official GPU Operator `installation documentation <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#common-chart-customization-options>`__.
 Also note that, in the near future, the preferred method to install NVIDIA's DRA Driver for GPUs will be through the GPU Operator (the DRA driver will then no longer require installation as a separate Helm chart).
 
@@ -65,8 +65,8 @@ Also note that, in the near future, the preferred method to install NVIDIA's DRA
 - Refer to the `docs on installing the GPU Operator with a pre-installed GPU driver <https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/getting-started.html#pre-installed-nvidia-gpu-drivers>`__.
 
 
-Configure and Helm-install the driver
-=====================================
+Configure and install the driver with Helm
+==========================================
 
 #. Add the NVIDIA Helm repository:
 
@@ -103,15 +103,15 @@ All install-time configuration parameters can be listed by running ``helm show v
 .. note::
 
    - A common mode of operation for now is to enable only the ComputeDomain subsystem (to have GPUs allocated using the traditional device plugin). The example above achieves that by setting ``resources.gpus.enabled=false``.
-   - Setting ``nvidiaDriverRoot=/run/nvidia/driver`` above expects a GPU Operator-provided GPU driver. That configuration parameter must be changed in case the GPU driver is installed straight on the host (typically at ``/``, which is the default value for ``nvidiaDriverRoot``).
+   - Setting ``nvidiaDriverRoot=/run/nvidia/driver`` above expects a GPU Operator-provided GPU driver. That configuration parameter must be changed in case the GPU driver is installed straight on the host (typically at ``/``, which is the default value for ``nvidiaDriverRoot``).
 
 
 Validate installation
 =====================
 
 A lot can go wrong, depending on the exact nature of your Kubernetes environment and specific hardware and driver choices as well as configuration options chosen.
-That is why we recommend to perform a set of validation tests to confirm the basic functionality of your setup.
-To that end, we have prepared separate documentation:
+That is why NVIDIA recommends performing a set of validation tests to confirm the basic functionality of your setup.
+To that end, NVIDIA has prepared separate documentation:
 
 - `Testing ComputeDomain allocation <https://github.com/NVIDIA/k8s-dra-driver-gpu/wiki/Validate-setup-for-ComputeDomain-allocation>`_
 - `Testing GPU allocation <https://github.com/NVIDIA/k8s-dra-driver-gpu/wiki/Validate-setup-for-GPU-allocation>`_
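
A minimal sketch of the Helm-based install outlined above, assuming the driver is published as the `nvidia-dra-driver-gpu` chart in the NVIDIA Helm repository (chart, release, and namespace names here are illustrative; the two `--set` flags are the configuration parameters discussed in the note):

    # Add the NVIDIA Helm repository and refresh the local index
    helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
    helm repo update

    # Enable only the ComputeDomain subsystem and point the driver at a
    # GPU Operator-provided kernel driver (chart/release names illustrative)
    helm install nvidia-dra-driver-gpu nvidia/nvidia-dra-driver-gpu \
      --namespace nvidia-dra-driver-gpu --create-namespace \
      --set resources.gpus.enabled=false \
      --set nvidiaDriverRoot=/run/nvidia/driver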

gpu-operator/getting-started.rst

Lines changed: 1 addition & 1 deletion
@@ -277,7 +277,7 @@ To view all the options, run ``helm show values nvidia/gpu-operator``.
      - ``{}``
 
    * - ``psp.enabled``
-     - The GPU operator deploys ``PodSecurityPolicies`` if enabled.
+     - The GPU Operator deploys ``PodSecurityPolicies`` if enabled.
      - ``false``
 
    * - ``sandboxWorkloads.defaultWorkload``
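
A quick sketch of inspecting and overriding the option touched by this hunk, assuming the `nvidia` Helm repository is already configured (release and namespace names are illustrative):

    # List every chart option, including psp.enabled and its default value
    helm show values nvidia/gpu-operator

    # Override the default at install time (release/namespace names illustrative)
    helm install gpu-operator nvidia/gpu-operator \
      --namespace gpu-operator --create-namespace \
      --set psp.enabled=true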

gpu-operator/gpu-operator-kata.rst

Lines changed: 2 additions & 2 deletions
@@ -19,8 +19,8 @@
 ..
   lingo:
 
-  It's "Kata Containers" when referring to the software component.
-  It's "Kata container" when it's a container that uses the Kata Containers runtime.
+  It is "Kata Containers" when referring to the software component.
+  It is "Kata container" when it is a container that uses the Kata Containers runtime.
   Treat our operands as proper nouns and use title case.
 
 #################################

gpu-operator/gpu-operator-kubevirt.rst

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ Given the following node configuration:
 * Node B is configured with the label ``nvidia.com/gpu.workload.config=vm-passthrough`` and configured to run virtual machines with Passthrough GPU.
 * Node C is configured with the label ``nvidia.com/gpu.workload.config=vm-vgpu`` and configured to run virtual machines with vGPU.
 
-The GPU operator will deploy the following software components on each node:
+The GPU Operator will deploy the following software components on each node:
 
 * Node A receives the following software components:
   * ``NVIDIA Datacenter Driver`` - to install the driver
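
For reference, a minimal sketch of applying the workload labels described above with `kubectl` (the node names are illustrative):

    # Mark nodes for the workload type the GPU Operator should provision them for
    kubectl label node node-b nvidia.com/gpu.workload.config=vm-passthrough
    kubectl label node node-c nvidia.com/gpu.workload.config=vm-vgpu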

gpu-operator/gpu-operator-mig.rst

Lines changed: 1 addition & 1 deletion
@@ -102,7 +102,7 @@ Perform the following steps to install the Operator and configure MIG:
 Known Issue: For drivers 570.124.06, 570.133.20, 570.148.08, and 570.158.01,
 GPU workloads cannot be scheduled on nodes that have a mix of MIG slices and full GPUs.
 This manifests as GPU pods getting stuck indefinitely in the ``Pending`` state.
-It's recommended that you downgrade the driver to version 570.86.15 to work around this issue.
+NVIDIA recommends that you downgrade the driver to version 570.86.15 to work around this issue.
 For more detailed information, see GitHub issue https://github.com/NVIDIA/gpu-operator/issues/1361.
 
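
When the driver is managed by the GPU Operator, one way to apply the recommended downgrade is to set the driver version on the ClusterPolicy; a minimal sketch, assuming the default resource name `cluster-policy` (verify the name in your cluster first):

    # Confirm the ClusterPolicy resource name
    kubectl get clusterpolicies.nvidia.com

    # Pin the operator-managed driver to the suggested version (name assumed)
    kubectl patch clusterpolicies.nvidia.com cluster-policy --type merge \
      -p '{"spec":{"driver":{"version":"570.86.15"}}}'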

gpu-operator/install-gpu-operator-air-gapped.rst

Lines changed: 1 addition & 1 deletion
@@ -246,7 +246,7 @@ Sample of ``values.yaml`` for GPU Operator v1.9.0:
 Local Package Repository
 ************************
 
-The ``driver`` container deployed as part of the GPU operator requires certain packages to be available as part of the
+The ``driver`` container deployed as part of the GPU Operator requires certain packages to be available as part of the
 driver installation. In restricted internet access or air-gapped installations, users are required to create a
 local mirror repository for their OS distribution and make the following packages available:
 
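
One common way to expose such a local mirror to the ``driver`` container is a ConfigMap holding the repository definition, referenced through the chart's repo-config option; a rough sketch, where the file name, ConfigMap name, namespace, and the exact ``driver.repoConfig`` key are assumptions to check against ``helm show values nvidia/gpu-operator``:

    # Package a repository definition that points at the local mirror
    # (custom-repo.list and all names below are illustrative)
    kubectl create configmap repo-config -n gpu-operator --from-file=custom-repo.list

    # Reference it when installing or upgrading the chart (option path assumed)
    helm upgrade --install gpu-operator nvidia/gpu-operator \
      --namespace gpu-operator \
      --set driver.repoConfig.configMapName=repo-config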

gpu-operator/install-gpu-operator-outdated-kernels.rst

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ On GPU nodes where the running kernel is not the latest, the ``driver`` containe
 see the following error message: ``Could not resolve Linux kernel version``.
 
 In general, upgrading your system to the latest kernel should fix this issue. But if this is not an option, the following is a
-workaround to successfully deploy the GPU operator when GPU nodes in your cluster may not be running the latest kernel.
+workaround to successfully deploy the GPU Operator when GPU nodes in your cluster may not be running the latest kernel.
 
 Add Archived Package Repositories
 =================================
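
A quick way to check whether a node is in the situation this workaround targets, that is, running a kernel whose header packages the configured repositories no longer carry (shown for Ubuntu; package names differ per distribution):

    # Kernel the node is currently running
    uname -r

    # Check whether matching header packages are still available from the
    # configured repositories (Ubuntu example)
    sudo apt-get update
    apt-cache policy linux-headers-$(uname -r)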

gpu-operator/life-cycle-policy.rst

Lines changed: 4 additions & 3 deletions
@@ -91,8 +91,9 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
      - ${version}
 
    * - NVIDIA GPU Driver |ki|_
-     - | `575.57.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-575-57-08/index.html>`_
-       | `570.172.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-172-08/index.html>`_ (default, recommended)
+     - | `580.65.06 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-580-65-06/index.html>`_ (recommended)
+       | `575.57.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-575-57-08/index.html>`_
+       | `570.172.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-172-08/index.html>`_ (default)
       | `570.158.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-158-01/index.html>`_
       | `570.148.08 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-148-08/index.html>`_
       | `535.261.03 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-261-03/index.html>`_
@@ -152,7 +153,7 @@ Refer to :ref:`Upgrading the NVIDIA GPU Operator` for more information.
 Known Issue: For drivers 570.124.06, 570.133.20, 570.148.08, and 570.158.01,
 GPU workloads cannot be scheduled on nodes that have a mix of MIG slices and full GPUs.
 This manifests as GPU pods getting stuck indefinitely in the ``Pending`` state.
-It's recommended that you downgrade the driver to version 570.86.15 to work around this issue.
+NVIDIA recommends that you downgrade the driver to version 570.86.15 to work around this issue.
 For more detailed information, see GitHub issue https://github.com/NVIDIA/gpu-operator/issues/1361.
 
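
For clusters that want the newly added recommended branch rather than the chart default, a minimal sketch of pinning it at install or upgrade time (release and namespace names are illustrative; confirm that a 580.65.06 driver image exists for your OS before switching):

    # Select the 580.65.06 driver branch instead of the chart default
    helm upgrade --install gpu-operator nvidia/gpu-operator \
      --namespace gpu-operator --create-namespace \
      --set driver.version=580.65.06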

gpu-operator/overview.rst

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ configuration of multiple software components such as drivers, container runtime
 and prone to errors. The NVIDIA GPU Operator uses the `operator framework <https://coreos.com/blog/introducing-operator-framework>`_
 within Kubernetes to automate the management of all NVIDIA software components needed to provision GPU. These components include the NVIDIA drivers (to enable CUDA),
 Kubernetes device plugin for GPUs, the `NVIDIA Container Toolkit <https://github.com/NVIDIA/nvidia-container-toolkit>`_,
-automatic node labelling using `GFD <https://github.com/NVIDIA/gpu-feature-discovery>`_, `DCGM <https://developer.nvidia.com/dcgm>`_ based monitoring and others.
+automatic node labeling using `GFD <https://github.com/NVIDIA/gpu-feature-discovery>`_, `DCGM <https://developer.nvidia.com/dcgm>`_ based monitoring and others.
 
 
 .. card:: Red Hat OpenShift Container Platform
