Issues · triton-inference-server/tensorrtllm_backend · GitHub

[Issue Template] Short one-line summary of the issue
#270 · juney-nvidia opened on Jan 1, 2024

ci: add automated test coverage for llmapi (PyTorch backend) path

#854 · faradawn opened on Apr 16, 2026

Qwen2-VL SFT with extended vocab generates only vocab_size-1 (168063) tokens in TensorRT-LLM 0.17, while HF works fine

#849 · lhh0916 opened on Mar 4, 2026

Question to optimize build

#843 · geraldstanje opened on Feb 12, 2026

build error

#810 · geraldstanje1 opened on Oct 16, 2025

Batching documentation confusing - can you update the docs of main repository please

#809 · protonicage opened on Oct 10, 2025

Is it possible to obtain scores from the TRT-LLM model?

#806 · otvall opened on Oct 2, 2025

Whisper one-shot enc+dec path treats mel frames (3000) as “encoder length”, causing length assertion at 1500 and internal broadcast shape errors

#804 · YuBeomGon opened on Sep 29, 2025

LoRa weights not applied without warnings/errors when mismatch in type

#791 · rahchuenmonroe opened on Aug 21, 2025

Encoder-Decoder example doesn't actually use encoder?

#786 · srinath2022 opened on Aug 8, 2025

Quick start has outdated instructions

#770 · juanpabloguerra16 opened on Jun 30, 2025

Question about increased Docker image size in recent version

#754 · fclearner opened on May 28, 2025

lora_config shape mismatch when using converted LoRA at runtime

#750 · paulhendricks opened on May 16, 2025