Labels
Triton Backend: Related to NVIDIA Triton Inference Server backend
Description
The nvcr.io/nvidia/tensorrt-llm/release:0.20.0 container does not include the triton_backend/ folder. Is this the norm going forward for future containers? Should these templates be copied out of the GitHub project separately before creating a model repository folder for further Triton backend hosting?
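The copy-out workflow described above might look something like the following. This is a minimal sketch, not official guidance: the tag name, the triton_backend/all_models layout, and the target directory name are assumptions based on the repository layout referenced in this issue.

```shell
# Hedged sketch: pull the Triton model templates out of the TensorRT-LLM
# GitHub project separately, then seed a local model repository from them.
# Assumptions: the v0.20.0 tag exists and the templates live under
# triton_backend/all_models/ in that tag; adjust paths for your release.
git clone --depth 1 --branch v0.20.0 https://github.com/NVIDIA/TensorRT-LLM.git

# Create a local model repository folder (name is illustrative).
mkdir -p triton_model_repo

# Copy the template models in; these still need their config.pbtxt
# placeholders filled in before Triton can load them.
cp -r TensorRT-LLM/triton_backend/all_models/. triton_model_repo/
```

The copied templates could then be mounted into the release container at whatever path its Triton server expects for the model repository.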
Is the guidance also to use llm_api/tensorrt_llm going forward, or is there a future for the inflight_batcher_llm and disaggregated-serving C++ methods?
Thanks!