
Can I deploy DeepSeek V3 on 4 nodes with 14 GPUs? #3527

Closed Answered by ispobock
groklab asked this question in Q&A
The TP size must evenly divide certain dimensions of the model weights. It is recommended to set the TP size to a power of 2.
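To make the divisibility constraint concrete, here is a minimal sketch that checks which TP sizes evenly divide a set of weight dimensions. The helper names (`valid_tp_sizes`, `blocking_dims`) are hypothetical and not part of any library, and the dimension values are illustrative assumptions that should be checked against the model's own `config.json`. Which dimensions actually need to be divisible depends on how the serving framework shards each layer.

```python
def valid_tp_sizes(dims, max_tp):
    """Return every TP size in 1..max_tp that evenly divides all dimensions."""
    return [
        tp for tp in range(1, max_tp + 1)
        if all(d % tp == 0 for d in dims.values())
    ]


def blocking_dims(dims, tp):
    """Return the names of the dimensions that tp does NOT evenly divide."""
    return [name for name, d in dims.items() if d % tp != 0]


# Illustrative values only (assumption): verify against the model's config.json.
example_dims = {
    "num_attention_heads": 128,
    "moe_intermediate_size": 2048,
}

if __name__ == "__main__":
    print("TP sizes that divide every dimension:", valid_tp_sizes(example_dims, 16))
    print("Dimensions not divisible by 14:", blocking_dims(example_dims, 14))
```

Under these assumed dimensions, 14 does not divide either value, while 8 and 16 do, which is consistent with the power-of-2 recommendation.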

Answer selected by HandH1998