docs: Add documentation for LLMInferenceService label and annotation propagation #607

@khushiiagrawal

Describe the change you'd like to see

Add a documentation page for the LLMInferenceService label and annotation propagation feature, as requested by @sivanantha321 in a comment on kserve/kserve#5009.

This feature (merged in kserve/kserve#5009) enables users to propagate Kubernetes labels and annotations from an LLMInferenceService resource to the workload pods it manages. It supports all deployment modes: single-node Deployments, multi-node LeaderWorkerSets, disaggregated prefill-decode workloads, and the scheduler (EPP) Deployment.

The documentation should cover:

  • Two propagation layers: top-level metadata (prefix-filtered via an approved allowlist) vs. spec-level fields (spec.labels, spec.annotations, and per-component equivalents) which propagate all keys without filtering.
  • Approved prefix allowlists: which annotation prefixes (k8s.v1.cni.cncf.io, kueue.x-k8s.io, prometheus.io) and label prefixes (kueue.x-k8s.io) are propagated from .metadata.
  • Per-component spec fields: spec.prefill.labels/spec.prefill.annotations for prefill pods and spec.router.scheduler.labels/spec.router.scheduler.annotations for the scheduler pod.
  • Multi-node behaviour: propagation to both leader and worker pod templates.
  • Precedence rules: spec-level values override top-level metadata when the same key appears in both.
  • Practical examples: Kueue queue assignment, Multus CNI attachment, Prometheus scraping config, and custom platform labels for cost allocation.
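The two layers and the precedence rule above can be sketched in a few lines of Python. This is an illustrative model only, not the controller's code: the `propagate` helper and its parameter names are hypothetical, and matching a key against the allowlist with `str.startswith` is an assumption about how the prefixes are compared. The prefix values are the ones listed in this issue; the example keys and values are made up.

```python
from typing import Dict, Mapping, Tuple

# Prefixes as listed in this issue. How the controller matches them is
# not stated here; plain prefix matching is an assumption.
APPROVED_ANNOTATION_PREFIXES = ("k8s.v1.cni.cncf.io", "kueue.x-k8s.io", "prometheus.io")
APPROVED_LABEL_PREFIXES = ("kueue.x-k8s.io",)


def propagate(
    metadata: Mapping[str, str],
    spec_level: Mapping[str, str],
    approved_prefixes: Tuple[str, ...],
) -> Dict[str, str]:
    """Hypothetical helper: compute the keys that reach the pod template."""
    # Layer 1: top-level metadata is filtered through the allowlist.
    result = {k: v for k, v in metadata.items() if k.startswith(approved_prefixes)}
    # Layer 2: spec-level keys propagate unfiltered and win on conflict.
    result.update(spec_level)
    return result


# Labels: the Kueue queue is set in both layers, so the spec-level value wins.
metadata_labels = {
    "kueue.x-k8s.io/queue-name": "team-a",
    "app.kubernetes.io/part-of": "demo",  # not on the allowlist, dropped
}
spec_labels = {
    "kueue.x-k8s.io/queue-name": "team-b",
    "cost-center": "ml-platform",  # custom platform label, no filtering
}
pod_labels = propagate(metadata_labels, spec_labels, APPROVED_LABEL_PREFIXES)

# Annotations: only allowlisted top-level keys survive.
metadata_annotations = {
    "prometheus.io/scrape": "true",
    "k8s.v1.cni.cncf.io/networks": "macvlan-conf",
    "example.com/owner": "alice",  # not on the allowlist, dropped
}
pod_annotations = propagate(metadata_annotations, {}, APPROVED_ANNOTATION_PREFIXES)
```

Under these assumptions, `pod_labels` holds `kueue.x-k8s.io/queue-name: team-b` and `cost-center: ml-platform`, and `pod_annotations` holds only the Prometheus and Multus keys. The same merge would apply per component, with `spec.prefill.*` or `spec.router.scheduler.*` supplying the spec-level map for that component's pods.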

Suggested location: docs/model-serving/generative-inference/llmisvc/llmisvc-label-propagation.md under the existing LLMInferenceService sidebar category.
