Description: ExaSP2 is a proxy application: a reference implementation of typical linear algebra algorithms and workloads for a quantum molecular dynamics (QMD) electronic structure code. The algorithm is based on a recursive second-order Fermi-operator expansion method (SP2) and is tailored for density-functional-based tight-binding calculations of material systems. The SP2 algorithm variants are part of the Los Alamos Transferable Tight-binding for Energetics (LATTE) code and are based on a matrix expansion of the Fermi operator in a recursive series of generalized matrix-matrix multiplications. ExaSP2 is created and maintained by the Co-Design Center for Particle Applications (CoPA). The code is intended to serve as a vehicle for co-design, allowing others to extend or reimplement it as needed to test the performance of new architectures, programming models, etc.
URL: https://github.com/ECP-copa/ExaSP2
Team: CloudHPC
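To make the SP2 recursion in the description more concrete, here is a minimal pure-Python sketch. It is not taken from the ExaSP2 or LATTE sources. It assumes the standard trace-correcting form of SP2: the Hamiltonian is rescaled into [0, 1] using Gershgorin eigenvalue bounds, and each step applies either X² or 2X − X², whichever brings the trace closer to the number of occupied orbitals. The function names, the dense list-of-lists storage and the small test matrix are illustrative assumptions only.

```python
def matmul(a, b):
    """Dense product of two square matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))


def gershgorin_bounds(h):
    """Cheap lower and upper bounds on the eigenvalues of a symmetric matrix."""
    radii = [sum(abs(v) for j, v in enumerate(row) if j != i)
             for i, row in enumerate(h)]
    e_min = min(h[i][i] - radii[i] for i in range(len(h)))
    e_max = max(h[i][i] + radii[i] for i in range(len(h)))
    return e_min, e_max


def sp2_density_matrix(h, n_occ, tol=1e-10, max_iter=100):
    """Approximate the zero-temperature density matrix of h by SP2.

    n_occ is the number of occupied orbitals (hypothetical parameter name).
    """
    n = len(h)
    e_min, e_max = gershgorin_bounds(h)
    # X0 = (e_max * I - H) / (e_max - e_min): spectrum mapped into [0, 1],
    # with the lowest-energy states ending up closest to 1.
    x = [[((e_max if i == j else 0.0) - h[i][j]) / (e_max - e_min)
          for j in range(n)] for i in range(n)]
    for _ in range(max_iter):
        x2 = matmul(x, x)  # the one matrix-matrix multiply per iteration
        tr_x, tr_x2 = trace(x), trace(x2)
        # trace(X - X^2) tends to 0 as X becomes idempotent.
        if abs(tr_x - tr_x2) < tol:
            break
        # Pick the second-order polynomial that moves the trace towards n_occ.
        if abs(tr_x2 - n_occ) < abs(2.0 * tr_x - tr_x2 - n_occ):
            x = x2
        else:
            x = [[2.0 * x[i][j] - x2[i][j] for j in range(n)]
                 for i in range(n)]
    return x


# Made-up 3x3 symmetric test Hamiltonian, for illustration only.
h = [[-1.0, 0.5, 0.0],
     [0.5, 0.0, 0.5],
     [0.0, 0.5, 2.0]]
p = sp2_density_matrix(h, n_occ=1)
print("trace of density matrix:", trace(p))
```

The sketch shows why matrix-matrix multiplication dominates the workload: each iteration costs one multiply, and everything else is a trace or an element-wise update.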
Details of any changes to the Spack recipe used.
Git commit hash of checkout for package:
Pull request for Spack recipe changes:
spack install [email protected]
spack install [email protected]
spack external find cmake
spack external find python
spack add exasp2%nvhpc
spack install
spack install [email protected]
spack install [email protected]
spack external find cmake
spack external find python
spack add exasp2%nvhpc
spack install
spack install exasp2%[email protected] ^openmpi
$ spack spec -Il exasp2%[email protected] ^openmpi
Input spec
--------------------------------
- exasp2%[email protected]
- ^openmpi
Concretized
--------------------------------
[+] 4i67huk [email protected]%[email protected]+mpi arch=linux-amzn2-skylake_avx512
[+] rt6uajc ^[email protected]%[email protected]~ipo+mpi+shared build_type=RelWithDebInfo arch=linux-amzn2-skylake_avx512
- gozuirv ^[email protected]%[email protected]~doc+ncurses+openssl+ownlibs~qt build_type=Release arch=linux-amzn2-skylake_avx512
[+] xbybdoz ^[email protected]%[email protected]~symlinks+termlib abi=none arch=linux-amzn2-skylake_avx512
- i665ooz ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] larjnul ^[email protected]%[email protected]~docs+systemcerts arch=linux-amzn2-skylake_avx512
- fb3kjch ^[email protected]%[email protected]+cpanm+shared+threads arch=linux-amzn2-skylake_avx512
- i5lbkjo ^[email protected]%[email protected]+cxx~docs+stl patches=b231fcc4d5cff05e5c3a4814f6a5af0e9a966428dc2176540d2c05aff41de522 arch=linux-amzn2-skylake_avx512
- s36txvt ^[email protected]%[email protected]~debug~pic+shared arch=linux-amzn2-skylake_avx512
- kjoplsl ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] qmzfn6j ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- fgwgsih ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- i35suwy ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] q2x25kt ^[email protected]%[email protected]+optimize+pic+shared arch=linux-amzn2-skylake_avx512
[+] skexx3l ^[email protected]%[email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none arch=linux-amzn2-skylake_avx512
[+] pmn26hx ^[email protected]%[email protected]~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java~legacylaunchers~lustre~memchecker+pmi~singularity~sqlite3+static~thread_multiple+vt+wrapper-rpath fabrics=ofi patches=60ce20bc14d98c572ef7883b9fcd254c3f232c2f3a13377480f96466169ac4c8 schedulers=slurm arch=linux-amzn2-skylake_avx512
[+] xkz726a ^[email protected]%[email protected]~cairo~cuda~gl~libudev+libxml2~netloc~nvml+pci+shared arch=linux-amzn2-skylake_avx512
[+] a4nq5nh ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- ya47eic ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- 6y53od3 ^[email protected]%[email protected]+sigsegv patches=3877ab548f88597ab2327a2230ee048d2d07ace1062efe81fc92e91b7f39cd00,fc9b61654a3ba1a8d6cd78ce087e7c96366c290bc8d2c299f09828d793b853c8 arch=linux-amzn2-skylake_avx512
- 5qpmdxk ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- 4fouma3 ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] mztzlil ^[email protected]%[email protected]~python arch=linux-amzn2-skylake_avx512
[+] p7yqdpr ^[email protected]%[email protected]~pic libs=shared,static arch=linux-amzn2-skylake_avx512
[+] rt2yj4o ^[email protected]%[email protected]+openssl arch=linux-amzn2-skylake_avx512
[+] aodqozx ^[email protected]%[email protected]~debug~kdreg fabrics=sockets,tcp,udp arch=linux-amzn2-skylake_avx512
[+] uqxtsju ^[email protected]%[email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006 arch=linux-amzn2-skylake_avx512
- qx56ujy ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- xveamuz ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] 7t25qrr ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] 7523zhe ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
[+] 724okpi ^slurm@20-02-4-1%[email protected]~gtk~hdf5~hwloc~mariadb~pmix+readline~restd sysconfdir=PREFIX/etc arch=linux-amzn2-skylake_avx512
- p7mkxd4 ^[email protected]%[email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87 arch=linux-amzn2-skylake_avx512
- 256y6qy ^[email protected]%[email protected]+libbsd arch=linux-amzn2-skylake_avx512
- mxgrkle ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- aehweer ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- vfg4fms ^[email protected]%[email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz arch=linux-amzn2-skylake_avx512
- fajl3kg ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
- vv3r7pc ^[email protected]%[email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0 arch=linux-amzn2-skylake_avx512
- 6ox5zyb ^[email protected]%[email protected]+column_metadata+fts~functions~rtree arch=linux-amzn2-skylake_avx512
- uzq2g5d ^[email protected]%[email protected] arch=linux-amzn2-skylake_avx512
spack install exasp2%[email protected] ^openmpi
$ spack spec -Il exasp2
Input spec
--------------------------------
- exasp2
Concretized
--------------------------------
[+] wocdwe5 [email protected]%[email protected]+mpi arch=linux-amzn2-graviton2
[+] 4p22aej ^[email protected]%[email protected]~ipo+mpi+shared build_type=RelWithDebInfo arch=linux-amzn2-graviton2
[+] m7325ee ^[email protected]%[email protected]~doc+ncurses+openssl+ownlibs~qt build_type=Release arch=linux-amzn2-graviton2
[+] iwzirqc ^[email protected]%[email protected]~symlinks+termlib abi=none arch=linux-amzn2-graviton2
[+] s4pw7zm ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 5i3lgfb ^[email protected]%[email protected]~docs+systemcerts arch=linux-amzn2-graviton2
[+] ijjxlug ^[email protected]%[email protected]+cpanm+shared+threads arch=linux-amzn2-graviton2
[+] y42m6yr ^[email protected]%[email protected]+cxx~docs+stl patches=b231fcc4d5cff05e5c3a4814f6a5af0e9a966428dc2176540d2c05aff41de522 arch=linux-amzn2-graviton2
[+] rqrpmap ^[email protected]%[email protected]~debug~pic+shared arch=linux-amzn2-graviton2
[+] 2w7bert ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] y5ei3cm ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] wjwqncx ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 3zy7kxk ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] qepjcvj ^[email protected]%[email protected]+optimize+pic+shared arch=linux-amzn2-graviton2
[+] rv7gj6u ^[email protected]%[email protected]~bignuma~consistent_fpcsr~ilp64+locking+pic+shared threads=none arch=linux-amzn2-graviton2
[+] l7oony6 ^[email protected]%[email protected]~atomics~cuda~cxx~cxx_exceptions+gpfs~internal-hwloc~java~legacylaunchers~lustre~memchecker~pmi~singularity~sqlite3+static~thread_multiple+vt+wrapper-rpath fabrics=none schedulers=none arch=linux-amzn2-graviton2
[+] cukmqbg ^[email protected]%[email protected]~cairo~cuda~gl~libudev+libxml2~netloc~nvml+pci+shared arch=linux-amzn2-graviton2
[+] asgtk6a ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] z2uysov ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 3mz7xyt ^[email protected]%[email protected]+sigsegv arch=linux-amzn2-graviton2
[+] ltbv6bk ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 4xr3hhh ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] iyhm3wi ^[email protected]%[email protected]~python arch=linux-amzn2-graviton2
[+] ye3kcvv ^[email protected]%[email protected]~pic libs=shared,static arch=linux-amzn2-graviton2
[+] tadxrfp ^[email protected]%[email protected]+openssl arch=linux-amzn2-graviton2
[+] mhav5gn ^[email protected]%[email protected] patches=4e1d78cbbb85de625bad28705e748856033eaafab92a66dffd383a3d7e00cc94,62fc8a8bf7665a60e8f4c93ebbd535647cebf74198f7afafec4c085a8825c006 arch=linux-amzn2-graviton2
[+] gignjm7 ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] h3qfzfb ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] wturp6c ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] ivotdt7 ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 62czasr ^[email protected]%[email protected]+bz2+ctypes+dbm~debug+libxml2+lzma~nis~optimizations+pic+pyexpat+pythoncmd+readline+shared+sqlite3+ssl~tix~tkinter~ucs4+uuid+zlib patches=0d98e93189bc278fbc37a50ed7f183bd8aaf249a8e1670a465f0db6bb4f8cf87 arch=linux-amzn2-graviton2
[+] ychdz7l ^[email protected]%[email protected]+libbsd arch=linux-amzn2-graviton2
[+] ourxkez ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] nssrqfc ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] fqlpcsl ^[email protected]%[email protected]+bzip2+curses+git~libunistring+libxml2+tar+xz arch=linux-amzn2-graviton2
[+] v6cutkh ^[email protected]%[email protected] arch=linux-amzn2-graviton2
[+] 35cffos ^[email protected]%[email protected] patches=26f26c6f29a7ce9bf370ad3ab2610f99365b4bdd7b82e7c31df41a3370d685c0 arch=linux-amzn2-graviton2
[+] 2q753q6 ^[email protected]%[email protected]+column_metadata+fts~functions~rtree arch=linux-amzn2-graviton2
[+] 2non7qx ^[email protected]%[email protected] arch=linux-amzn2-graviton2
../bin/reframe -c benchmark.py -r --performance-report
Details of the validation for Test Case 1
.
==============================================================================
PERFORMANCE REPORT
------------------------------------------------------------------------------
****
------------------------------------------------------------------------------
Performance comparison of two compilers.
Cores | Compiler 1 | Compiler 2 |
---|---|---|
List of top-10 functions / code locations from a serial profile.
Profiling command used:
:
Position | Routine | Time (s) | Time (%) |
---|---|---|---|
1 | |||
2 | |||
3 | |||
4 | |||
5 | |||
6 | |||
7 | |||
8 | |||
9 | |||
10 | |||
List of top-10 functions / code locations from a full node profile.
Profiling command used:
:
Position | Routine | Time (s) | Time (%) | MPI (%) |
---|---|---|---|---|
1 | ||||
2 | ||||
3 | ||||
4 | ||||
5 | ||||
6 | ||||
7 | ||||
8 | ||||
9 | ||||
10 | ||||
On-node scaling study for two compilers.
Cores | Compiler 1 | Compiler 2 |
---|---|---|
1 | ||
2 | ||
4 | ||
8 | ||
16 | ||
32 | ||
64 | ||
Off-node scaling study comparing C6g and C6gn instances.
Nodes | Cores | C6g | C6gn |
---|---|---|---|
1 | 8 | ||
1 | 16 | ||
1 | 32 | ||
1 | 64 | ||
2 | 128 | ||
4 | 256 | ||
8 | 512 | ||
On-node scaling study for two architectures.
Cores | C6gn (Aarch64) | C5n (X86) |
---|---|---|
1 | ||
2 | ||
4 | ||
8 | ||
16 | ||
32 | ||
64 | ||
Details of steps taken to optimise performance of the application. Please document work with compiler flags, maths libraries, system libraries, code optimisations, etc.
Compiler flags before:
CFLAGS=
FFLAGS=
Compiler flags after:
CFLAGS=
FFLAGS=
Cores | Original Flags | New Flags |
---|---|---|
1 | ||
2 | ||
4 | ||
8 | ||
16 | ||
32 | ||
64 | ||
Report on use of maths library calls generated by [Perf Lib Tools](https://github.com/ARM-software/perf-libs-tools). Please attach the corresponding apl files.
Performance analysis of the use of different maths libraries.
Cores | OpenBLAS | ArmPL | BLIS |
---|---|---|---|
1 | |||
2 | |||
4 | |||
8 | |||
16 | |||
32 | |||
64 | |||
How fast can you make the code?
Use all of the above approaches and any others to make the code as fast as possible. Demonstrate your gains by providing a scaling study for your test case, showing the performance before and after.
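One possible way to turn measured runtimes into the figures a scaling study usually reports is sketched below. It assumes runtimes in seconds keyed by core count, and defines speedup and parallel efficiency relative to the smallest core count measured. The function names are hypothetical and the numbers in the usage example are made-up placeholders, not ExaSP2 results.

```python
def scaling_table(timings):
    """Return (cores, speedup, efficiency) rows from {cores: runtime_seconds}.

    Speedup and efficiency are relative to the smallest core count present.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup is cores / base_cores, so efficiency 1.0 is perfect.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


def optimisation_gain(before, after):
    """Per-core-count ratio of runtime before to runtime after optimisation."""
    return {cores: before[cores] / after[cores]
            for cores in sorted(before) if cores in after}


# Placeholder runtimes in seconds (NOT measured data).
before = {1: 100.0, 2: 50.0, 4: 30.0}
after = {1: 80.0, 2: 40.0, 4: 25.0}

print("Cores | Speedup | Efficiency |")
print("---|---|---|")
for cores, speedup, efficiency in scaling_table(after):
    print(f"{cores} | {speedup:.2f} | {efficiency:.2f} |")
print(optimisation_gain(before, after))
```

Reporting the before/after ratio per core count, rather than a single number, shows whether an optimisation helps serial performance, scalability, or both.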
Details of lessons from compiling the application.
Details of lessons from analysing the performance of the application.
Details of lessons from performance optimising the application.