Closed
Description
To create a good experience, and as part of splitting deployer into the modelservice charts, I believe modelservice needs a quick start that inherits many of the non-infrastructure pieces that used to live in deployer. This would include things like:
- Creating and validating the HF token
- Enabling or disabling metrics for that modelservice instance. This should cover only the prefill and decode service monitors; EPP metrics should be handled upstream (see "Inferencepool helm charts non GKE metrics enablement for EPP", kubernetes-sigs/gateway-api-inference-extension#1115)
- Creating PVCs and, if desired, downloading the model to the PVC
All of these tasks should be possible for every potential modelservice installation, so those aspects of the quickstart should move here. Much of this will be pulled from what will be removed in llm-d/llm-d-deployer#360.
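As a rough illustration of the first and third tasks in the list, here is a minimal sketch of what a quickstart helper might generate. It is not taken from deployer or modelservice: the function names, the default resource names (`llm-d-hf-token`), the `HF_TOKEN` key, and the default PVC size are all hypothetical. The token check is only a local sanity check based on the usual `hf_` prefix of Hugging Face user access tokens; it does not confirm the token is accepted by the Hub.

```python
import base64
import json


def build_hf_token_secret(token: str, name: str = "llm-d-hf-token",
                          namespace: str = "default") -> dict:
    """Return a Kubernetes Secret manifest holding the token.

    The secret name and the HF_TOKEN key are hypothetical placeholders.
    """
    token = token.strip()
    # Local sanity check only: user access tokens normally start with "hf_".
    if not token.startswith("hf_") or len(token) <= len("hf_"):
        raise ValueError("token does not look like a Hugging Face access token")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        # Secret "data" values must be base64-encoded.
        "data": {"HF_TOKEN": base64.b64encode(token.encode()).decode()},
    }


def build_model_pvc(name: str, size: str = "50Gi",
                    namespace: str = "default") -> dict:
    """Return a PersistentVolumeClaim manifest for storing model weights.

    The size and access mode are assumptions; adjust them per cluster.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Print both manifests as JSON, one after the other.
    print(json.dumps(build_hf_token_secret("hf_example"), indent=2))
    print(json.dumps(build_model_pvc("model-cache"), indent=2))
```

Each manifest is plain JSON, so it could be written to a file and applied with `kubectl apply -f`, or templated into the chart instead. Downloading the model into the PVC and toggling the service monitors are left out of this sketch.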