Add hyperdisk-ml support for gke-storage #3672

saltysoup · 2025-02-13T20:38:19Z

Hyperdisk-ml is the recommended storage for speeding up load times and inference of larger models (e.g. DeepSeek-R1, Llama3-405b), whose loading often takes a long time (~20+ min).

Source: https://cloud.google.com/kubernetes-engine/docs/tutorials/serve-multihost-gpu#speed_up_model_load_times_with_hyperdisk_ml

This PR adds hyperdisk-ml support by accepting this disk type as a valid value in the validation condition in variables.tf, and by adding new templates for the storage class and PVC.
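For readers unfamiliar with Terraform validation blocks, here is a minimal sketch of what accepting a new disk type could look like. The variable name `storage_type` and the list of accepted values are assumptions for illustration; they are not copied from the module's variables.tf.

```hcl
# Hypothetical variable: the real name, description, and accepted values
# in modules/file-system/gke-storage/variables.tf may differ.
variable "storage_type" {
  description = "Type of storage to provision (illustrative)."
  type        = string

  validation {
    # contains() returns true when the second argument is in the list.
    condition = contains(
      ["hyperdisk-balanced", "hyperdisk-extreme", "hyperdisk-ml"],
      lower(var.storage_type)
    )
    error_message = "The storage_type value must be one of: hyperdisk-balanced, hyperdisk-extreme, hyperdisk-ml."
  }
}
```

With a check of this shape, a value outside the list fails at plan time with the error message instead of reaching the templates.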

wiktorn · 2025-02-13T21:20:24Z

modules/file-system/gke-storage/storage-class/hyperdisk-ml-sc.yaml.tftpl

+  %{~ for key, val in labels ~}
+    ${key}: ${val}
+provisioner: pd.csi.storage.gke.io
+allowVolumeExpansion: true
+parameters:
+  %{~ endfor ~}


Was this `for` loop really meant to be that long?

saltysoup · 2025-02-14

What do you mean by long? FWIW, I used the same template as hyperdisk-extreme (I noticed it's slightly different from the hyperdisk-balanced template).

Hyperdisk-ml doesn't support customizable IOPS, so I took that out. I also didn't include custom throughput, as it is set automatically based on disk size.

wiktorn · 2025-02-14

I expected something like this:

Suggested change

-  %{~ for key, val in labels ~}
-    ${key}: ${val}
-provisioner: pd.csi.storage.gke.io
-allowVolumeExpansion: true
-parameters:
-  %{~ endfor ~}
+  %{~ for key, val in labels ~}
+    ${key}: ${val}
+  %{~ endfor ~}
+provisioner: pd.csi.storage.gke.io
+allowVolumeExpansion: true
+parameters:

I think you don't need to repeat `provisioner` and `allowVolumeExpansion` for every label, and most importantly, you don't want them missing if no labels are provided. To me it looks like hyperdisk-extreme has the same issue.
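The effect of the `endfor` placement can be checked in isolation. The sketch below is a reduced stand-in for the storage class template, not the module's file: it uses local values and heredoc templates instead of `templatefile()`, and the names `misplaced`, `fixed`, and `provisioner_lines` are made up for this example. It counts how many `provisioner:` lines each variant renders.

```hcl
# Illustrative only: a reduced stand-in for the storage class template.
locals {
  labels = {} # also try { team = "ml", env = "dev" }

  # endfor placed too late: the provisioner line is part of the loop body,
  # so it is rendered once per label and not at all for an empty map.
  misplaced = <<EOT
%{ for key, val in local.labels ~}
${key}: ${val}
provisioner: pd.csi.storage.gke.io
%{ endfor ~}
EOT

  # endfor closes the loop right after the label line, so the provisioner
  # line is rendered exactly once regardless of the labels.
  fixed = <<EOT
%{ for key, val in local.labels ~}
${key}: ${val}
%{ endfor ~}
provisioner: pd.csi.storage.gke.io
EOT
}

# regexall() returns every match, so its length is the number of
# provisioner lines in each rendered string.
output "provisioner_lines" {
  value = {
    misplaced = length(regexall("provisioner:", local.misplaced))
    fixed     = length(regexall("provisioner:", local.fixed))
  }
}
```

With the empty `labels` map, the `misplaced` count is 0 and the `fixed` count is 1; with two labels, `misplaced` rises to 2 while `fixed` stays at 1. The real template uses `%{~ ... ~}` strip markers on both sides, which affects whitespace only, not how many times the loop body is rendered.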

Adding hyperdisk-ml support

916059d

wiktorn reviewed Feb 13, 2025


saltysoup commented Feb 13, 2025

wiktorn Feb 13, 2025

saltysoup Feb 14, 2025

wiktorn Feb 14, 2025
