Commit 102cd4d (parent e36aff2)

update_readme: add CMake, devcontainer, and SST documentation (#108)

* update_readme: add CMake, devcontainer, and SST
* update_readme: fix branch for sst-elements
* update_readme: fix typos

File tree: 2 files changed (+111, -23)
.github/workflows/sst_integration.yml (+3, -5)
@@ -40,10 +40,9 @@ jobs:
       - name: Prepare SST dependencies
         run: |
           apt install -y openmpi-bin openmpi-common libtool libtool-bin autoconf python3 python3-dev automake build-essential git
-          # Use personal repo for now
       - name: Build SST-Core
         run: |
-          git clone https://github.com/William-An/sst-core.git
+          git clone https://github.com/sstsimulator/sst-core.git
           cd sst-core
           git pull
           git checkout devel
@@ -53,14 +52,13 @@ jobs:
           make install
           cd ..
           rm -rf ./sst-core
-          # Use personal repo for now
       - name: Build SST-Elements
         run: |
-          git clone https://github.com/William-An/sst-elements.git
+          git clone https://github.com/sstsimulator/sst-elements.git
           source ./setup_environment
           cd sst-elements
           git pull
-          git checkout balar-mmio-vanadis-llvm
+          git checkout devel
           ./autogen.sh
           ./configure --prefix=`realpath ../sstelements-install` --with-sst-core=`realpath ../sstcore-install` --with-cuda=$CUDA_INSTALL_PATH --with-gpgpusim=$GPGPUSIM_ROOT
           make -j4

README.md (+108, -18)
@@ -1,3 +1,29 @@
+# GPGPU-Sim
+[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/accel-sim/gpgpu-sim_distribution)
+[![Short-Tests](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/main.yml/badge.svg)](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/main.yml)
+[![Short-Tests-AccelSim](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/accelsim.yml/badge.svg)](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/accelsim.yml)
+[![Short-Tests-CMake](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/cmake.yml/badge.svg)](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/cmake.yml)
+[![SST Integration Test](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/sst_integration.yml/badge.svg)](https://github.com/accel-sim/gpgpu-sim_distribution/actions/workflows/sst_integration.yml)
+- [GPGPU-Sim](#gpgpu-sim)
+  - [CONTRIBUTIONS and HISTORY](#contributions-and-history)
+    - [GPGPU-Sim](#gpgpu-sim-1)
+    - [AccelWattch Power Model](#accelwattch-power-model)
+  - [INSTALLING, BUILDING and RUNNING GPGPU-Sim](#installing-building-and-running-gpgpu-sim)
+    - [Step 1: Dependencies](#step-1-dependencies)
+      - [Step 1.1: Setup on a Linux machine](#step-11-setup-on-a-linux-machine)
+      - [Step 1.2: Setup with docker image](#step-12-setup-with-docker-image)
+      - [Step 1.3: Setup with devcontainer](#step-13-setup-with-devcontainer)
+    - [Step 2: Build](#step-2-build)
+      - [Step 2.1: Build with CMake](#step-21-build-with-cmake)
+      - [Step 2.2: Build with make](#step-22-build-with-make)
+    - [Step 3: Run](#step-3-run)
+  - [(OPTIONAL) Contributing to GPGPU-Sim (ADVANCED USERS ONLY)](#optional-contributing-to-gpgpu-sim-advanced-users-only)
+    - [Testing updated version of GPGPU-Sim](#testing-updated-version-of-gpgpu-sim)
+  - [MISCELLANEOUS](#miscellaneous)
+    - [Speeding up the execution](#speeding-up-the-execution)
+    - [Debugging failing GPGPU-Sim Regressions](#debugging-failing-gpgpu-sim-regressions)
+    - [SST-integration](#sst-integration)
+
 Welcome to GPGPU-Sim, a cycle-level simulator modeling contemporary graphics
 processing units (GPUs) running GPU computing workloads written in CUDA or
 OpenCL. Also included in GPGPU-Sim is a performance visualization tool called
@@ -6,7 +32,7 @@ GPGPU-Sim and AccelWattch have been rigorously validated with performance and
 power measurements of real hardware GPUs.
 
 This version of GPGPU-Sim has been tested with a subset of CUDA version 4.2,
-5.0, 5.5, 6.0, 7.5, 8.0, 9.0, 9.1, 10, and 11
+5.0, 5.5, 6.0, 7.5, 8.0, 9.0, 9.1, 10, 11, and 12
 
 Please see the copyright notice in the file COPYRIGHT distributed with this
 release in the same directory as this file.
@@ -74,9 +100,9 @@ See Section 2 "INSTALLING, BUILDING and RUNNING GPGPU-Sim" below to get started.
 
 See file CHANGES for updates in this and earlier versions.
 
-# CONTRIBUTIONS and HISTORY
+## CONTRIBUTIONS and HISTORY
 
-## GPGPU-Sim
+### GPGPU-Sim
 
 GPGPU-Sim was created by Tor Aamodt's research group at the University of
 British Columbia. Many have directly contributed to development of GPGPU-Sim
@@ -107,7 +133,7 @@ library (part of the CUDA toolkit). Code to interface with the CUDA Math
 library is contained in cuda-math.h, which also includes several structures
 derived from vector_types.h (one of the CUDA header files).
 
-## AccelWattch Power Model
+### AccelWattch Power Model
 
 AccelWattch (introduced in GPGPU-Sim 4.2.0) was developed by researchers at
 Northwestern University, Purdue University, and the University of British Columbia.
@@ -121,7 +147,7 @@ the University of California, San Diego. The McPAT paper can be found at
 http://www.hpl.hp.com/research/mcpat/micro09.pdf.
 
 
-# INSTALLING, BUILDING and RUNNING GPGPU-Sim
+## INSTALLING, BUILDING and RUNNING GPGPU-Sim
 
 Assuming all dependencies required by GPGPU-Sim are installed on your system,
 to build GPGPU-Sim all you need to do is add the following line to your
@@ -144,7 +170,8 @@ If the above fails, see "Step 1" and "Step 2" below.
 If the above worked, see "Step 3" below, which explains how to run a CUDA
 benchmark on GPGPU-Sim.
 
-## Step 1: Dependencies
+### Step 1: Dependencies
+#### Step 1.1: Setup on a Linux machine
 
 GPGPU-Sim was developed on SUSE Linux (this release was tested with SUSE
 version 11.3) and has been used on several other Linux platforms (both 32-bit
@@ -226,22 +253,64 @@ If running applications which use cuDNN or cuBLAS:
 export CUDNN_PATH=<Path To cuDNN Directory>
 export LD_LIBRARY_PATH=$CUDA_INSTALL_PATH/lib64:$CUDA_INSTALL_PATH/lib:$CUDNN_PATH/lib64
 
-
+#### Step 1.2: Setup with docker image
+
+You can also opt for the prebuilt images at https://github.com/accel-sim/Dockerfile/pkgs/container/accel-sim-framework.
+
+```bash
+# Pull Ubuntu 24.04 with CUDA 12.8
+docker pull ghcr.io/accel-sim/accel-sim-framework:ubuntu-24.04-cuda-12.8
+
+# Pull the code repo
+git clone git@github.com:accel-sim/gpgpu-sim_distribution.git
+cd gpgpu-sim_distribution
+
+# Run the image in interactive mode with gpgpu-sim mounted to `/accel-sim/gpgpu-sim_distribution` inside the container
+docker run -it --name gpgpusim -v ./:/accel-sim/gpgpu-sim_distribution ghcr.io/accel-sim/accel-sim-framework:ubuntu-24.04-cuda-12.8
+```
+
+#### Step 1.3: Setup with devcontainer
+
+If you are using [VSCode](https://code.visualstudio.com/) or [Codespaces](https://github.com/features/codespaces), you can set up the environment with a [devcontainer](https://code.visualstudio.com/docs/devcontainers/containers).
 
-## Step 2: Build
+- For VSCode, refer to [this guide](https://code.visualstudio.com/docs/devcontainers/containers) to set up the devcontainer.
+- For Codespaces, you can use this link: https://codespaces.new/accel-sim/gpgpu-sim_distribution to set up with a devcontainer.
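For orientation only, a minimal `.devcontainer/devcontainer.json` along these lines could point a devcontainer at the prebuilt image from Step 1.2. This is a hypothetical sketch, not the configuration this repository ships; the repo's own `.devcontainer` takes precedence:

```json
{
    "name": "accel-sim",
    "image": "ghcr.io/accel-sim/accel-sim-framework:ubuntu-24.04-cuda-12.8",
    "workspaceMount": "source=${localWorkspaceFolder},target=/accel-sim/gpgpu-sim_distribution,type=bind",
    "workspaceFolder": "/accel-sim/gpgpu-sim_distribution"
}
```

With such a file in place, VSCode's "Reopen in Container" command (or a new Codespace) starts the container and opens the workspace at the mounted path.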
+
+### Step 2: Build
 
 To build the simulator, you first need to configure how you want it to be
 built. From the root directory of the simulator, type the following commands in
 a bash shell (you can check you are using a bash shell by running the command
 "echo \$SHELL", which should print "/bin/bash"):
 
+```bash
 source setup_environment <build_type>
+```
 
-replace <build_type> with debug or release. Use release if you need faster
-simulation and debug if you need to run the simulator in gdb. If nothing is
-specified, release will be used by default.
+Replace <build_type> with `debug` or `release`. Use `release` if you need faster
+simulation and `debug` if you need to run the simulator in gdb. If nothing is
+specified, `release` will be used by default.
 
-Now you are ready to build the simulator, just run
+> Note: specifying `build_type` has no impact with the CMake build flow, which relies on the CMake variable `CMAKE_BUILD_TYPE` to determine whether to build for `release` or `debug`.
+
+#### Step 2.1: Build with CMake
+To build with `cmake`, simply run the following commands:
+```bash
+# Create a release build
+cmake -B build
+
+# Or specify a debug build
+cmake -DCMAKE_BUILD_TYPE=Debug -B build
+
+# Build with 8 processes
+cmake --build build -j8
+
+# Install the built .so to the lib/ folder
+cmake --install build
+```
+
+#### Step 2.2: Build with make
+To build the simulator with `make`, just run
 
 make
 
@@ -264,7 +333,7 @@ The documentation resides at doc/doxygen/html.
 
 To run Pytorch applications with the simulator, install the modified Pytorch library as well by following instructions [here](https://github.com/gpgpu-sim/pytorch-gpgpu-sim).
 
-## Step 3: Run
+### Step 3: Run
 
 Before we run, we need to make sure the application's executable file is dynamically linked to the CUDA runtime library. This can be done during compilation of your program by introducing the nvcc flag "-lcudart" in the makefile (quotes should be excluded).
 
@@ -343,7 +412,7 @@ distributed separately on github under the repo ispass2009-benchmarks.
 The README.ISPASS-2009 file distributed with the benchmarks now contains
 updated instructions for running the benchmarks on GPGPU-Sim v3.x.
 
-# (OPTIONAL) Contributing to GPGPU-Sim (ADVANCED USERS ONLY)
+## (OPTIONAL) Contributing to GPGPU-Sim (ADVANCED USERS ONLY)
 
 If you have made modifications to the simulator and wish to incorporate new
 features/bugfixes from subsequent releases the following instructions may help.
@@ -398,7 +467,7 @@ to open a graphical merge tool to do the merge:
 git mergetool
 ```
 
-## Testing updated version of GPGPU-Sim
+### Testing updated version of GPGPU-Sim
 
 Now you should test that the merged version "works". This means following the
 steps for building GPGPU-Sim in the _new_ README file (not this version) since
@@ -413,9 +482,9 @@ identify any compile time or runtime errors that occur due to the code merging
 process.
 
 
-# MISCELLANEOUS
+## MISCELLANEOUS
 
-## Speeding up the execution
+### Speeding up the execution
 
 Some applications take several hours to execute on GPGPU-Sim. This is because the simulator has to dump the PTX, analyze it, and get resource usage statistics. This can be avoided every time we execute the program in the following way:
 
@@ -429,7 +498,7 @@ Some applications take several hours to execute on GPGPUSim. This is because the
 3. Disable the -save_embedded_ptx flag and execute the code again. This will skip the dumping by cuobjdump and go directly to executing the program, thus saving time.
 
 
-## Debugging failing GPGPU-Sim Regressions
+### Debugging failing GPGPU-Sim Regressions
 
 Credits: Tor M Aamodt
 
@@ -500,3 +569,24 @@ To debug failing GPGPU-Sim regression tests you need to run them locally. The f
 ```
 This will put you at the (gdb) prompt. Set up any breakpoints needed and run.
+
+### SST-integration
+
+The `gpu->is_SST_mode()` conditionals in the codebase address architectural differences between GPGPU-Sim's original design and the SST integration, primarily in two areas:
+
+- SST-specific hardware configuration
+  - Cache bypass: SST mode intercepts interconnect packets and redirects them externally instead of using GPGPU-Sim's native cache system.
+  - Component initialization: guards hardware setup steps that only apply to SST's simulation environment.
+- Frontend-backend coupling
+  - In standard GPGPU-Sim:
+    - The CUDA frontend (a separate thread) asynchronously pushes operations to the stream manager.
+    - The backend simulator consumes operations independently via `cycle()` calls.
+  - In SST mode:
+    - Single-threaded execution requires synchronization callbacks between Balar's frontend event handlers and backend clock ticks.
+    - This prevents deadlocks where:
+      - blocking CUDA requests wait for stream clearance,
+      - stream processing depends on backend `cycle()` advancement, and
+      - SST progression halts until the current event handling completes.
+
+This coupling necessitated modified stream management, replacing GPGPU-Sim's native busy-wait approach with SST-compatible synchronization triggers.
+
+For detailed documentation on the SST integration, check out the [SST elements documentation](https://sst-simulator.org/sst-docs/docs/elements/intro).
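The mode-guard pattern described above can be sketched as follows. This is a minimal illustration, not GPGPU-Sim's actual classes: `gpu_sim`, the integer packet type, and the handler callback are hypothetical stand-ins for the real interconnect and Balar interfaces.

```cpp
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for GPGPU-Sim's top-level simulator object.
class gpu_sim {
public:
    explicit gpu_sim(bool sst_mode) : m_sst_mode(sst_mode) {}
    bool is_SST_mode() const { return m_sst_mode; }

    // In SST mode, memory packets are handed to an externally registered
    // callback (standing in for Balar/SST) instead of the native cache path.
    void set_sst_packet_handler(std::function<void(int)> h) { m_handler = std::move(h); }

    void push_packet(int pkt) { m_pending.push(pkt); }

    // One backend clock tick: drain pending interconnect packets.
    void cycle() {
        while (!m_pending.empty()) {
            int pkt = m_pending.front();
            m_pending.pop();
            if (is_SST_mode()) {
                m_handler(pkt);      // redirect externally to SST
            } else {
                ++serviced_locally;  // native cache hierarchy path
            }
        }
    }

    int serviced_locally = 0;

private:
    bool m_sst_mode;
    std::function<void(int)> m_handler;
    std::queue<int> m_pending;
};
```

The registered callback is the analogue of the synchronization triggers mentioned above: because SST mode is single-threaded, the backend notifies the frontend from within `cycle()` instead of letting a separate thread busy-wait on shared state.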
