- #270 · juney-nvidia opened on Jan 1, 2024
Issues
is:issue state:open
Issue creation is restricted in this repository
Search results
- #854 · Open · In triton-inference-server/tensorrtllm_backend
  Qwen2-VL SFT with extended vocab generates only vocab_size-1 (168063) tokens in TensorRT-LLM 0.17, while HF works fine
  Label: bug (Something isn't working)
- #849 · Open · In triton-inference-server/tensorrtllm_backend
- #843 · Open · In triton-inference-server/tensorrtllm_backend
- #810 · Open · In triton-inference-server/tensorrtllm_backend
  Batching documentation confusing - can you update the docs of main repository please
  Label: bug (Something isn't working)
- #809 · Open · In triton-inference-server/tensorrtllm_backend
- #806 · Open · In triton-inference-server/tensorrtllm_backend
  Whisper one-shot enc+dec path treats mel frames (3000) as “encoder length”, causing length assertion at 1500 and internal broadcast shape errors
  Label: bug (Something isn't working)
- #804 · Open · In triton-inference-server/tensorrtllm_backend
  LoRa weights not applied without warnings/errors when mismatch in type
  Label: bug (Something isn't working)
- #791 · Open · In triton-inference-server/tensorrtllm_backend
- #786 · Open · In triton-inference-server/tensorrtllm_backend
- #770 · Open · In triton-inference-server/tensorrtllm_backend
- #754 · Open · In triton-inference-server/tensorrtllm_backend
  lora_config shape mismatch when using converted LoRA at runtime
  Label: bug (Something isn't working)
- #750 · Open · In triton-inference-server/tensorrtllm_backend