
Error when loading the model for inference with vLLM #21

@Maxhyl

Description


Environment:
CUDA 12.4
vLLM 0.9.1
8×A100 80GB

Command:
python -m vllm.entrypoints.openai.api_server \
    --model ./MiniMax-M1-80k \
    --tensor-parallel-size 8 \
    --trust-remote-code \
    --quantization experts_int8 \
    --max_model_len 1024 \
    --dtype bfloat16
Error:
ValueError: Cannot find model module. 'MiniMaxM1ForCausalLM' is not a registered model in the Transformers library (only relevant if the model is meant to be in Transformers) and 'AutoModel' is not present in the model config's 'auto_map' (relevant if the model is custom).

How can this issue be resolved?
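For reference, a minimal sketch (not part of the original report) that inspects the checkpoint's config.json and prints the two fields the error message refers to, "architectures" and "auto_map"; the ./MiniMax-M1-80k path is assumed from the command above.

# Sketch: check which architecture and auto_map entries the local
# checkpoint declares, since the ValueError complains about both.
import json
from pathlib import Path

config_path = Path("./MiniMax-M1-80k") / "config.json"
config = json.loads(config_path.read_text())

# 'MiniMaxM1ForCausalLM' should appear under "architectures"; the error
# also notes that 'AutoModel' is missing from "auto_map".
print("architectures:", config.get("architectures"))
print("auto_map:", config.get("auto_map"))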
