[Bug]: Meaningless output when running long-context inference with Qwen2.5 models on vllm>=0.6.3 #10298
Your current environment
The output of `python collect_env.py`
Model Input Dumps
models: Qwen2.5-Coder-7B-Instruct, Qwen2.5-7B-Instruct
vllm: 0.6.3
input length: >8000 tokens
🐛 Describe the bug
I have tested vllm 0.6.0–0.6.2 and 0.5.5, and all of those older versions work correctly.
So this bug was introduced in 0.6.3.
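For reference, a minimal reproduction sketch of the setup described above. The model name and input length match the report; the prompt content, `max_model_len`, and sampling settings are assumptions, not taken from the original report:

```python
# Hypothetical reproduction sketch -- prompt content and sampling settings
# are assumptions; any prompt longer than ~8000 tokens should trigger it.
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/Qwen2.5-7B-Instruct", max_model_len=16384)

# Build a prompt longer than 8000 tokens (repeated filler text as a stand-in
# for the real long-context input from the report).
long_prompt = "Summarize the following document.\n" + ("lorem ipsum " * 4000)

params = SamplingParams(temperature=0.0, max_tokens=256)
outputs = llm.generate([long_prompt], params)

# On vllm >= 0.6.3 the reported behavior is meaningless/garbled text here;
# on 0.5.5 and 0.6.0-0.6.2 the output is reportedly normal.
print(outputs[0].outputs[0].text)
```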