[Bug]: Speculative Decoding + TP on Spec Worker + Chunked Prefill does not work. #10276
Open
Labels
bug
Your current environment
The output of `python collect_env.py`
Model Input Dumps
No response
🐛 Describe the bug
The server starts, but fails immediately on any prompt when speculative decoding, tensor parallelism (TP) on the speculative model, and chunked prefill are all enabled together. Sample error: