# Run the image in interactive mode with gpgpu-sim mounted at `/accel-sim/gpgpu-sim_distribution` inside the container
docker run -it --name gpgpusim -v ./:/accel-sim/gpgpu-sim_distribution
```
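
If you exit the container later, you can re-attach to it by name rather than creating a new one; a minimal sketch, assuming the container was created with `--name gpgpusim` as above:

```bash
# Restart and attach to the previously created container in interactive mode
docker start -ai gpgpusim
```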
#### Step 1.3: Setup with devcontainer
If you are using [VSCode](https://code.visualstudio.com/) or [Codespaces](https://github.com/features/codespaces), you can set up the environment with a [devcontainer](https://code.visualstudio.com/docs/devcontainers/containers).
- For VSCode, refer to [this guide](https://code.visualstudio.com/docs/devcontainers/containers) to set up the devcontainer.
- For Codespaces, you can use this link to set up a devcontainer: https://codespaces.new/accel-sim/gpgpu-sim_distribution
### Step 2: Build
To build the simulator, you first need to configure how you want it to be
built. From the root directory of the simulator, type the following commands in
a bash shell (you can check you are using a bash shell by running the command
"echo \$SHELL", which should print "/bin/bash"):
```bash
source setup_environment <build_type>
```
Replace `<build_type>` with `debug` or `release`. Use `release` if you need faster simulation and `debug` if you need to run the simulator in gdb. If nothing is specified, `release` is used by default.
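
For example, to configure a debug build before compiling (the same command with a concrete `<build_type>`):

```bash
# Configure the environment for a debug build; omit the argument for a release build
source setup_environment debug
```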
> Note: specifying `<build_type>` has no effect on the CMake build flow, which relies on the CMake variable `CMAKE_BUILD_TYPE` to determine whether to build for `release` or `debug`.
#### Step 2.1: Build with CMake
To build with `cmake`, simply run the following commands:
```bash
# Create a release build for CMake
cmake -B build
# Or you can specify a debug build
cmake -DCMAKE_BUILD_TYPE=Debug -B build
# Build with 8 parallel jobs
cmake --build build -j8
# Install the built .so files into the lib/ folder
cmake --install build
```
#### Step 2.2: Build with make
To build the simulator with `make`, just run:
`make`
The doxygen-generated documentation resides at `doc/doxygen/html`.
To run PyTorch applications with the simulator, install the modified PyTorch library as well by following the instructions [here](https://github.com/gpgpu-sim/pytorch-gpgpu-sim).
### Step 3: Run
Before we run, we need to make sure the application's executable file is dynamically linked to the CUDA runtime library. This can be done during compilation of your program by adding the nvcc flag "-lcudart" in the makefile (quotes should be excluded).
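
For illustration, a minimal sketch of compiling and checking the dynamic link; the source file and binary names are placeholders, and `--cudart shared` is shown here only as one common way to force dynamic linking of the runtime:

```bash
# Placeholder names: replace my_app.cu / my_app with your own application
nvcc --cudart shared -o my_app my_app.cu -lcudart

# Verify the dynamic link; libcudart.so should appear in the output
ldd ./my_app | grep libcudart
```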
The ISPASS 2009 benchmarks are distributed separately on GitHub under the repo ispass2009-benchmarks.
The README.ISPASS-2009 file distributed with the benchmarks now contains
updated instructions for running the benchmarks on GPGPU-Sim v3.x.
## (OPTIONAL) Contributing to GPGPU-Sim (ADVANCED USERS ONLY)
If you have made modifications to the simulator and wish to incorporate new
features/bugfixes from subsequent releases, the following instructions may help.
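
As a hedged sketch of one such workflow (the `upstream` remote and `dev` branch names are assumptions; substitute the release you actually want to merge):

```bash
# Hypothetical merge workflow: remote and branch names are placeholders
git remote add upstream https://github.com/accel-sim/gpgpu-sim_distribution.git
git fetch upstream
git merge upstream/dev    # merge the newer release into your modified branch
```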
You can use the following command to open a graphical merge tool to do the merge:

```bash
git mergetool
```
### Testing the updated version of GPGPU-Sim
Now you should test that the merged version "works". This means following the
steps for building GPGPU-Sim in the _new_ README file (not this version), since the build instructions may have changed. Then rerun your usual regression tests to identify any compile time or runtime errors that occur due to the code merging process.
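
A minimal sketch of that sanity check, using the make-based flow from Step 2 and a placeholder name for whatever test application you normally run:

```bash
# Rebuild the merged tree (the CMake flow from Step 2.1 works equally well)
source setup_environment release
make

# Placeholder: run whatever regression or test application you normally use
./my_regression_app
```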
## MISCELLANEOUS
### Speeding up the execution
Some applications take several hours to execute on GPGPU-Sim. This is because the simulator has to dump the PTX, analyze it, and gather resource usage statistics. This overhead can be avoided on repeated runs of the same program in the following way:
3. Disable the -save_embedded_ptx flag and execute the code again. This skips the dump by cuobjdump and goes directly to executing the program, thus saving time.
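
A hedged sketch of toggling that flag, assuming `-save_embedded_ptx` is set in your `gpgpusim.config` and using a placeholder application name:

```bash
# Assumes the option is listed in gpgpusim.config; adjust the path/pattern to your setup
sed -i 's/-save_embedded_ptx 1/-save_embedded_ptx 0/' gpgpusim.config

# Re-run the application; the cuobjdump extraction step is now skipped
./my_app
```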
### Debugging failing GPGPU-Sim Regressions
Credits: Tor M Aamodt
To debug failing GPGPU-Sim regression tests you need to run them locally.
This will put you at the (gdb) prompt. Set up any breakpoints you need and run.
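
For reference, a hedged sketch of one way to reach that prompt (the binary name and arguments are placeholders):

```bash
# Use a debug build so the simulator is compiled with symbols for gdb
source setup_environment debug
make

# Launch the failing test under gdb (placeholder binary and arguments)
gdb --args ./my_failing_test arg1 arg2
```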
### SST-integration
The `gpu->is_SST_mode()` conditionals in the codebase address architectural differences between GPGPU-Sim's original design and SST integration, primarily focusing on two areas:
- SST-Specific Hardware Configuration
  - Cache bypass: SST mode intercepts interconnect packets to redirect them externally instead of using GPGPU-Sim's native cache system.
  - Component initialization: Guards hardware setup steps that only apply to SST's simulation environment.
- Frontend-Backend Coupling
  - In standard GPGPU-Sim:
    - CUDA frontend (separate thread) asynchronously pushes operations to stream manager
    - Backend simulator consumes operations independently via `cycle()` calls
  - In SST mode:
    - Single-threaded execution requires synchronization callbacks between Balar's frontend event handlers and backend clock ticks
    - Prevents deadlocks where:
      - Blocking CUDA requests wait for stream clearance
      - Stream processing depends on backend `cycle()` advancement
      - SST progression halts until current event handling completes
This coupling necessitated modified stream management to replace GPGPU-Sim's native busy-wait approach with SST-compatible synchronization triggers.
For detailed documentation on the SST integration, check out the [SST elements documentation](https://sst-simulator.org/sst-docs/docs/elements/intro).