Commit 573ac52

Merge branch 'master' into olruwase/update_nvme_offload_states
2 parents 5991d43 + ac295aa commit 573ac52

2 files changed, +3 -3 lines changed

blogs/deepspeed-gds/README.md

+2 -2
@@ -47,7 +47,7 @@ We used three benchmarking tools for our evaluations. The first is fio, the popu
 
 ## High-Performance I/O with CPU Buffers via NVMe Scaling
 
-Our first set of microbenchmark evaluations used fio and ds\_io to measure the performance of transferring 1GB data between NVMe and CPU memory. We configure fio to use the libaio backend for these experiments1. The results are summarized in Figure 1, from which we make two observations. First, DeepNVMe demonstrates high performance as it roughly matches fio, despite being more representative of DL applications. Second, DeepNVMe scales I/O performance almost linearly with available NVMe bandwidth, achieving rates of 10GB/sec reads and 5GB/sec writes.
+Our first set of microbenchmark evaluations used fio and ds\_io to measure the performance of transferring 1GB data between NVMe and CPU memory. We configure fio to use the libaio backend for these experiments. The results are summarized in Figure 1, from which we make two observations. First, DeepNVMe demonstrates high performance as it roughly matches fio, despite being more representative of DL applications. Second, DeepNVMe scales I/O performance almost linearly with available NVMe bandwidth, achieving rates of 10GB/sec reads and 5GB/sec writes.
 
 <img src="./media/figure1.png" style="width:6.5in;height:3.42153in" />
 

@@ -85,4 +85,4 @@ In this blog post, we introduced DeepNVMe, an I/O optimization technology create
 
 
 # Acknowlegements
-This work is the result of a deep collaboration between Microsoft and NVIDIA. The contributors include Joe Mayer, Martin Cai, and Olatunji Ruwase from Microsoft; Kiran Modukuri, Vahid Noormofidi, Sourab Gupta, and Sandeep Joshi from Nivida.
+This work is the result of a deep collaboration between Microsoft and NVIDIA. The contributors include Joe Mayer, Martin Cai, and Olatunji Ruwase from Microsoft; Kiran Modukuri, Vahid Noormofidi, Sourab Gupta, and Sandeep Joshi from Nvidia.

deepspeed/runtime/constants.py

+1 -1
@@ -249,7 +249,7 @@
 Optional comm data type for seq paralleism should be set as:
 "seq_parallel_communication_data_type": "fp32"
 '''
-SEQ_PARALLEL_COMMUNICATION_DATA_TYPE = "seq_parallel_comm_data_type"
+SEQ_PARALLEL_COMMUNICATION_DATA_TYPE = "seq_parallel_communication_data_type"
 SEQ_PARALLEL_COMMUNICATION_DATA_TYPE_DEFAULT = "fp32"
 
 #########################################
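
Note on the `constants.py` change: the docstring documents the key as `"seq_parallel_communication_data_type"`, while the old constant held `"seq_parallel_comm_data_type"`. The sketch below shows why such a mismatch matters for a dictionary-based config lookup. It is a minimal illustration, not DeepSpeed's parsing code: `get_comm_dtype` and `user_config` are hypothetical names, and the `"bf16"` value is chosen only to make the difference visible.

```python
# Hypothetical stand-in for a config lookup; NOT DeepSpeed's actual parsing code.
OLD_KEY = "seq_parallel_comm_data_type"           # constant value before this commit
NEW_KEY = "seq_parallel_communication_data_type"  # constant value after this commit
DEFAULT = "fp32"                                  # SEQ_PARALLEL_COMMUNICATION_DATA_TYPE_DEFAULT


def get_comm_dtype(config: dict, key: str) -> str:
    """Return the value stored under `key`, or DEFAULT if the key is absent."""
    return config.get(key, DEFAULT)


# A user config written against the documented key name (value is illustrative).
user_config = {"seq_parallel_communication_data_type": "bf16"}

old_result = get_comm_dtype(user_config, OLD_KEY)  # key not found, falls back to DEFAULT
new_result = get_comm_dtype(user_config, NEW_KEY)  # key found, user setting is honored

print(old_result, new_result)
```

Under these assumptions, a lookup with the old constant would silently return the default, which is the kind of behavior the one-line fix avoids.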
