Replies: 1 comment
-
Either run it on 4 GPUs or buy another 2.
Given that you have generous GPU specs, another option would be to run 3 replicas, with each replica taking 2 GPUs.
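As a hedged sketch of the 3-replica layout suggested above (the model name, port numbers, and GPU indices are illustrative placeholders, not anything from the thread; assumes a standard vLLM installation on a single 6-GPU host):

```shell
# Sketch only: three independent vLLM replicas, each pinned to 2 GPUs
# via CUDA_VISIBLE_DEVICES and using tensor parallelism within the pair.
# MODEL is a hypothetical example; substitute your own.
MODEL=meta-llama/Llama-3.1-70B-Instruct

CUDA_VISIBLE_DEVICES=0,1 vllm serve "$MODEL" --tensor-parallel-size 2 --port 8000 &
CUDA_VISIBLE_DEVICES=2,3 vllm serve "$MODEL" --tensor-parallel-size 2 --port 8001 &
CUDA_VISIBLE_DEVICES=4,5 vllm serve "$MODEL" --tensor-parallel-size 2 --port 8002 &
```

A load balancer (e.g. nginx round-robin) in front of the three ports would then spread requests across the replicas.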
-
Dear vLLM experts, I am trying to deploy vLLM in distributed mode. At our research institute we have 4 nodes, each with 1x A100, and they work well as a distributed Ray cluster. We recently got another node with 2x L40S; Ray shows all 6 GPUs, but one node has 2 GPUs. How do I start vLLM so it uses all the GPUs?
currently we use:
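The poster's command is cut off above. As a hedged sketch only (not the poster's actual command; the model name is a placeholder), one way to span all 6 GPUs of an existing Ray cluster in vLLM is pipeline parallelism across the nodes, since tensor parallelism generally wants equal GPU counts per stage:

```shell
# Sketch: run on the Ray head node, assuming `ray status` already shows
# all 6 GPUs. One pipeline stage per GPU (tp=1, pp=6); MODEL is hypothetical.
MODEL=meta-llama/Llama-3.1-70B-Instruct

vllm serve "$MODEL" \
  --distributed-executor-backend ray \
  --tensor-parallel-size 1 \
  --pipeline-parallel-size 6
```

Note that mixing A100 and L40S GPUs in one pipeline is untested here; the slowest stage will bound throughput, so the per-replica layout suggested in the reply above may be simpler.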