Commit 02ca70d
Fix/preflight NUMA imbalance to mean uneven GPU distribution across nodes (#554)
### Changes:
Flag a NUMA imbalance only if the number of GPUs differs between nodes.
Examples:
- 4 GPUs on node 0, 4 on node 1 -> counts=[4,4] -> set={4} -> len=1 -> not imbalanced.
- 7 GPUs on node 0, 1 on node 1 -> counts=[7,1] -> set={7,1} -> len=2 -> imbalanced.
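The counting rule in the examples can be sketched in a few lines of Python. This is a minimal illustration, not the code from this commit: the function name `is_numa_imbalanced` and the input shape (one NUMA node ID per GPU) are assumptions made for the example.

```python
from collections import Counter


def is_numa_imbalanced(gpu_numa_nodes):
    """Return True if NUMA nodes host different numbers of GPUs.

    gpu_numa_nodes is a hypothetical input: one NUMA node ID per GPU,
    e.g. [0, 0, 0, 0, 1, 1, 1, 1] for 4 GPUs on each of two nodes.
    """
    # Count how many GPUs are attached to each node that has at least one GPU.
    counts = Counter(gpu_numa_nodes).values()
    # If every node has the same count, the set collapses to one element.
    return len(set(counts)) > 1


even = [0, 0, 0, 0, 1, 1, 1, 1]    # counts=[4,4] -> set={4}
skewed = [0, 0, 0, 0, 0, 0, 0, 1]  # counts=[7,1] -> set={7,1}
print(is_numa_imbalanced(even))
print(is_numa_imbalanced(skewed))
```

In this sketch a node with no GPUs never appears in the counts, so a system with all GPUs on one node is not reported as imbalanced; whether the real preflight check treats that case the same way is not stated in the commit message.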
### Reason for changes:
The previous logic issued a NUMA imbalance warning whenever the GPUs
were not all attached to the same NUMA node, which produced false
positives on multi-socket CPU systems with GPUs split evenly across nodes.
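To make the false positive concrete, here is a hedged reconstruction of the behavior described above. The removed line itself is not shown on this page, so `old_style_check` is a hypothetical stand-in that only mirrors the description "warn if not all GPUs are on the same node".

```python
def old_style_check(gpu_numa_nodes):
    """Hypothetical stand-in for the previous behavior.

    Warns as soon as the GPUs span more than one distinct NUMA node,
    regardless of how many GPUs each node hosts.
    """
    return len(set(gpu_numa_nodes)) > 1


# A dual-socket system with 4 GPUs per socket is evenly balanced,
# yet this check still reports an imbalance.
dual_socket_even = [0, 0, 0, 0, 1, 1, 1, 1]
print(old_style_check(dual_socket_even))
```

Comparing the distinct node IDs answers "are the GPUs on more than one node?", which is a different question from "are the GPUs spread unevenly?".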
---------
Co-authored-by: Xiaoming-AMD <Xiaoming.Peng@amd.com>
1 parent 78633ae commit 02ca70d
1 file changed
Lines changed: 5 additions & 1 deletion
Diff summary (line contents not preserved):
- Hunk 1 (original lines 7-12, new lines 7-13): one line added at new line 10.
- Hunk 2 (original lines 109-115, new lines 110-119): original line 112 removed; new lines 113-116 added.