This repository was archived by the owner on Oct 15, 2025. It is now read-only.

Commit 39b6b4e

Fix upstream inferencepool chart integration with correct OCI registry
- Use correct OCI registry: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts
- Restore proper subchart dependency structure for inferencepool
- Remove manual InferencePool template (use upstream chart instead)
- Fix HTTPRoute backend reference to use subchart release name
- Successfully builds dependencies and passes lint

The upstream kubernetes-sigs/gateway-api-inference-extension charts are published to Google Artifact Registry, not GitHub Container Registry. This fix enables proper integration with the official upstream Helm charts.
1 parent b2cc225 commit 39b6b4e
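The fix can be reproduced locally. A sketch, not part of the commit, assuming Helm 3.8+ for OCI support and that the staging registry serves a pullable `v0` tag:

```bash
# Pull the upstream chart straight from Google Artifact Registry.
helm pull oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts/inferencepool \
  --version v0

# Rebuild the umbrella chart's dependencies and lint, per the commit message.
helm dependency build ./charts/llm-d-umbrella
helm lint ./charts/llm-d-umbrella
```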

File tree

7 files changed, +82 -20 lines changed

charts/IMPLEMENTATION_SUMMARY.md

Lines changed: 4 additions & 4 deletions
```diff
@@ -42,7 +42,7 @@ This implementation addresses [issue #312](https://github.com/llm-d/llm-d-deploy
 - Configuration orchestration
 
 **Integration Points**:
-- Uses upstream `inferencepool` chart for intelligent routing
+- Creates InferencePool resources (requires upstream CRDs)
 - Connects vLLM services via label matching
 - Maintains backward compatibility for deployment
 
@@ -97,9 +97,9 @@ helm install llm-d-new ./charts/llm-d-umbrella \
 ## Benefits Achieved
 
 ### ✅ Upstream Integration
-- Uses official Gateway API Inference Extension charts
-- Leverages multi-provider support (GKE, Istio, kGateway)
-- Gets upstream bug fixes and feature updates automatically
+- Uses official Gateway API Inference Extension CRDs and APIs
+- Creates InferencePool resources following upstream specifications
+- Compatible with multi-provider support (GKE, Istio, kGateway)
 
 ### ✅ Modular Architecture
 - vLLM and gateway concerns properly separated
```
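The summary's point about label matching is easier to see with a concrete resource. Below is a hedged sketch of the kind of InferencePool the chart creates; the apiVersion and all names are assumptions, chosen to line up with the `targetPort: 8000` and vLLM settings in values.yaml further down:

```bash
# Illustrative only: apiVersion and the names below are assumptions, not taken
# from this commit. Field names follow the upstream v1alpha2 schema.
kubectl apply -f - <<'EOF'
apiVersion: inference.networking.x-k8s.io/v1alpha2
kind: InferencePool
metadata:
  name: llm-d-inferencepool          # hypothetical name
spec:
  targetPortNumber: 8000             # mirrors inferencePool.targetPort in values.yaml
  selector:
    app: vllm                        # label matching selects the vLLM pods
  extensionRef:
    name: llm-d-epp                  # endpoint picker Service (hypothetical name)
EOF
```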

charts/llm-d-umbrella/Chart.lock

Lines changed: 12 additions & 0 deletions
```diff
@@ -0,0 +1,12 @@
+dependencies:
+- name: common
+  repository: https://charts.bitnami.com/bitnami
+  version: 2.27.0
+- name: inferencepool
+  repository: oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts
+  version: v0
+- name: llm-d-vllm
+  repository: file://../llm-d-vllm
+  version: 1.0.0
+digest: sha256:80feac6ba991f6b485fa14153c7f061a0cbfb19d65ee332c03c8fba288922501
+generated: "2025-06-13T19:53:15.903878-04:00"
```

charts/llm-d-umbrella/Chart.yaml

Lines changed: 2 additions & 2 deletions
```diff
@@ -26,8 +26,8 @@ dependencies:
     version: "2.27.0"
   # Upstream inference gateway chart
   - name: inferencepool
-    repository: oci://ghcr.io/kubernetes-sigs/gateway-api-inference-extension/charts
-    version: "0.0.0"
+    repository: oci://us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/charts
+    version: "v0"
     condition: inferencepool.enabled
   # Our vLLM model serving chart
   - name: llm-d-vllm
```
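When a dependency's `repository:` or `version:` changes as above, the Chart.lock shown earlier must be regenerated so the recorded digest matches. A minimal sketch:

```bash
# Re-resolve the dependencies declared in Chart.yaml and rewrite Chart.lock.
helm dependency update ./charts/llm-d-umbrella
```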
Lines changed: 52 additions & 0 deletions
````diff
@@ -0,0 +1,52 @@
+{{ template "chart.header" . }}
+
+{{ template "chart.description" . }}
+
+## Prerequisites
+
+- Kubernetes 1.30+
+- Helm 3.10+
+- Gateway API CRDs installed
+- **InferencePool CRDs** (from Gateway API Inference Extension):
+  ```bash
+  kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencepool-resources.yaml
+  ```
+
+{{ template "chart.maintainersSection" . }}
+
+{{ template "chart.sourcesSection" . }}
+
+{{ template "chart.requirementsSection" . }}
+
+{{ template "chart.valuesSection" . }}
+
+## Installation
+
+1. Install prerequisites:
+   ```bash
+   # Install Gateway API CRDs (if not already installed)
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.0.0/standard-install.yaml
+
+   # Install InferencePool CRDs
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/raw/main/config/manifests/inferencepool-resources.yaml
+   ```
+
+2. Install the chart:
+   ```bash
+   helm install my-llm-d-umbrella llm-d/llm-d-umbrella
+   ```
+
+## Architecture
+
+This umbrella chart combines:
+- **Upstream InferencePool**: Intelligent routing and load balancing for inference workloads
+- **llm-d-vLLM**: Dedicated vLLM model serving components
+- **Gateway API**: External traffic routing and management
+
+The modular design enables:
+- Clean separation between inference gateway and model serving
+- Leveraging upstream Gateway API Inference Extension
+- Intelligent endpoint selection and load balancing
+- Backward compatibility with existing deployments
+
+{{ template "chart.homepage" . }}
````

charts/llm-d-umbrella/templates/httproute.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -20,7 +20,7 @@ spec:
     {{- range .backendRefs }}
     - group: {{ .group }}
       kind: {{ .kind }}
-      name: {{ .name }}
+      name: {{ tpl .name $ }}
       port: {{ .port }}
     {{- end }}
 ---
```
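Wrapping the value in `tpl` renders it through the template engine with the root context `$`, so a value like `"{{ .Release.Name }}-inferencepool"` resolves to the actual release name instead of being emitted literally. A quick check (the release name `my-llm` is illustrative):

```bash
# Render only the HTTPRoute; the backendRef name should come out as
# "my-llm-inferencepool" rather than the raw template string.
helm template my-llm ./charts/llm-d-umbrella --show-only templates/httproute.yaml
```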

charts/llm-d-umbrella/values.yaml

Lines changed: 2 additions & 13 deletions
```diff
@@ -33,18 +33,7 @@ commonAnnotations: {}
 inferencepool:
   enabled: true
 
-  # Configure the inference extension (endpoint picker)
-  inferenceExtension:
-    replicas: 1
-    image:
-      name: epp
-      hub: gcr.io/gke-ai-eco-dev
-      tag: 0.3.0
-      pullPolicy: Always
-    externalProcessingPort: 9002
-    env: []
-
-  # Configure the inference pool for vLLM
+  # InferencePool configuration (passed to upstream chart)
   inferencePool:
     targetPort: 8000
     modelServerType: vllm
@@ -120,5 +109,5 @@ gateway:
   backendRefs:
     - group: inference.networking.x-k8s.io
       kind: InferencePool
-      name: vllm-inference-pool # Name from inferencepool chart
+      name: "{{ .Release.Name }}-inferencepool"
       port: 8000
```
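With this default, the backendRef name tracks whatever release name the inferencepool subchart is installed under; a plain string still works because `tpl` passes literals through unchanged. A usage sketch (the override name is hypothetical):

```bash
# Default: the HTTPRoute backendRef renders as "llm-d-inferencepool".
helm install llm-d ./charts/llm-d-umbrella

# Point the route at an externally managed pool instead (hypothetical name).
helm install llm-d ./charts/llm-d-umbrella \
  --set 'gateway.backendRefs[0].name=my-external-pool'
```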

charts/llm-d-vllm/Chart.lock

Lines changed: 9 additions & 0 deletions
```diff
@@ -0,0 +1,9 @@
+dependencies:
+- name: common
+  repository: https://charts.bitnami.com/bitnami
+  version: 2.27.0
+- name: redis
+  repository: https://charts.bitnami.com/bitnami
+  version: 20.13.4
+digest: sha256:772ec68662ea0b33874d50d86123af9486c4f549bd1fb18db7b685315a3d0163
+generated: "2025-06-13T19:53:30.705482-04:00"
```
