
vLLM serves correctly but the output is garbled #26

@ZhonghaoWang

Description


I am serving with vLLM (version 0.9.1) on 8×A100 GPUs using the command below, and I have already changed `architectures` to `MiniMaxText01ForCausalLM` as described in the tutorial:

export SAFETENSORS_FAST_GPU=1
export VLLM_USE_V1=0
VLLM_LOGGING_CONFIG_PATH=vllm_log_config.json python -u -m vllm.entrypoints.openai.api_server \
    --model open_source_models/MiniMax-M1-80k \
    --tensor-parallel-size 8 \
    --trust-remote-code \
    --quantization experts_int8  \
    --max_model_len 4096 \
    --dtype bfloat16

The server starts normally, but when I send a request from the client, part of the output is garbled. The request code is as follows:

from openai import OpenAI

# Assumptions for a runnable snippet: the vLLM server is listening on the
# default port 8000, and the model name matches the --model path above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
model = "open_source_models/MiniMax-M1-80k"

chat_response = client.chat.completions.create(
    model=model,
    messages=[
        {"role": "system", "content": [{"type": "text", "text": "You are a helpful assistant."}]},
        {"role": "user", "content": [{"type": "text", "text": "Who won the world series in 2020?"}]}
    ],
    max_tokens=1024,
)

# print("Chat response:", chat_response)
print("Chat think response:", chat_response.choices[0].message.reasoning_content)
print("Chat response:", chat_response.choices[0].message.content)

The output is:

Chat think response: None
Chat response: 特点和(from co的背后 మ nameSuggestionxin physiologic……(garbled text loops from here)

What could be causing this?
