Description
System Info
on docker
os: ubuntu 24.04
transformers: 4.55.0.dev0
mistral_common: 1.8.3
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Command to launch the container:
docker run --gpus all -p 8000:8000 --ipc=host vllm/vllm-openai:latest --model mistralai/Voxtral-Mini-3B-2507
Expected behavior
The server should start and serve the model. Instead, startup fails and the output ends with:
vllm-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/tokenizer_group.py", line 24, in __init__
vllm-1 | self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
vllm-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
vllm-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/tokenizer.py", line 309, in get_tokenizer
vllm-1 | tokenizer = get_cached_tokenizer(tokenizer)
vllm-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
vllm-1 | File "/usr/local/lib/python3.12/dist-packages/vllm/transformers_utils/tokenizer.py", line 104, in get_cached_tokenizer
vllm-1 | tokenizer_all_special_tokens = tokenizer.all_special_tokens
vllm-1 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
vllm-1 | AttributeError: 'MistralCommonTokenizer' object has no attribute 'all_special_tokens'. Did you mean: '_all_special_ids'?
The vLLM Docker server expects the tokenizer to follow the pretrained tokenizer interface:
https://github.com/vllm-project/vllm/blob/49314869887e169be080201ab8bcda14e745c080/vllm/transformers_utils/tokenizer.py#L97-L101
That code reads the all_special_ids, all_special_tokens, and all_special_tokens_extended properties. However, MistralCommonTokenizer does not implement them. Is there a plan to standardize both tokenizers?
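To make the mismatch concrete, here is a minimal sketch of a duck-typing check for the three properties named above. It does not import vLLM or transformers: `missing_tokenizer_attrs`, `FakeMistralLikeTokenizer`, and `FakePretrainedLikeTokenizer` are hypothetical names invented for illustration, and the stand-in classes only mimic the shape suggested by the traceback (a private `_all_special_ids` and no public properties). Whether the real MistralCommonTokenizer lacks all three is an assumption here; the traceback only confirms `all_special_tokens`.

```python
# Properties that the linked vLLM code reads from the tokenizer, per the issue text.
REQUIRED_ATTRS = (
    "all_special_ids",
    "all_special_tokens",
    "all_special_tokens_extended",
)


def missing_tokenizer_attrs(tokenizer):
    """Return the names of required properties the tokenizer does not expose."""
    return [name for name in REQUIRED_ATTRS if not hasattr(tokenizer, name)]


class FakeMistralLikeTokenizer:
    """Hypothetical stand-in: only a private attribute, no public properties."""

    def __init__(self):
        self._all_special_ids = [0, 1, 2]


class FakePretrainedLikeTokenizer:
    """Hypothetical stand-in exposing the three public properties."""

    all_special_ids = [0, 1, 2]
    all_special_tokens = ["<unk>", "<s>", "</s>"]
    all_special_tokens_extended = ["<unk>", "<s>", "</s>"]


if __name__ == "__main__":
    # Reports which properties would trigger an AttributeError when read.
    print(missing_tokenizer_attrs(FakeMistralLikeTokenizer()))
    print(missing_tokenizer_attrs(FakePretrainedLikeTokenizer()))
```

Running the same check against the real tokenizer object before it is passed to vLLM would show exactly which properties need to be added or adapted.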
fay-askari72